-
Notifications
You must be signed in to change notification settings - Fork 20
Composite Time Series Design document. #1103
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
|
|
||
| * Get, through existing TimeSeries classes. | ||
|
|
||
| Does it make sense to support writing directly to a composite time series. While the write of each element *could* be sent to the underlying member, this seems ripe for error when editing or updating any data. It is likely that any edits would always be to the most recent time series, and configured in some other system. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that seems like a bad idea. It's probably better to write to the underlying timeseries.
| +----------------------+------------------------------------------------------------------------------------------------------------------------+ | ||
| |Duration -\> var |Duration of average or total may change over time with new members, duration will be indicated in the member definition | | ||
| +----------------------+------------------------------------------------------------------------------------------------------------------------+ | ||
| |Version |As Normal CWMS TS ID | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could a common version name denote it as authoritative? Or does just the existence of a composite timeseries imply that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a possibility. I had put in a place older "is-authoritative" flag in the composite definition.
Though I agree the version is technically a good place for that, it does seem to get a bit... overused at times.
There are certainly arguments to be me in either case, so we'll wait for commentary from others to tip the scales.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"is-authoritative" flag
To me this seems more explicit, because if what happens in a world when someone hears about composite and thinks it's okay to just have that in the version?
I expect the UI would throw an error saying "You can't make a composite TS manually this way, you must XYZ" or "" prompting for proper composite creation when a user tries to make one with this standard name in the version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To consider authoritative as part of the identifier, we would want to compare two composites for the same site/parameter one of which is authoritative and the other non-authoritative. How would those two time series identifiers be differentiated from one another? Does authoritative uniquely define the time series vs non-authoritative?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, not specifically. Not enough that we should rely on the time series id version field anyways.
jbkolze
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good summary of the design -- thanks for putting that together. Noted a few points of clarity / contention.
| #. The definition of the composite time series is stored within the CWMS database | ||
| #. The members of a composite time series define a continuous range | ||
| #. The date ranges of a composite time series *MUST* not overlap | ||
| #. The date ranges of a composite time series *MUST* have any gaps |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this missing a "not"? If so, I think it's reasonable to allow for gaps in the data. For example, if a gage gets washed out and isn't replaced for a month (and the new gage uses a different interval) then there simply wouldn't be data for the missing month. The user could theoretically extend the range of preceding or following record to include the gap, but that wouldn't feel very intuitive to me. I would expect that the interpreter could simply return gaps as missing data?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are correct, the NOT is missing.
I wasn't thinking about the processing returning the missings, but that is a good idea, and would required the gap to be defined... and probably some other information. So it may just be easier to put the information into the time series itself (e.g. in notes) and just let the missings be returned.
| { | ||
| "office": "<string>", | ||
| "name": "<ts id name>", | ||
| "is-period-of-record": true, // or is authoritative. to distinguish between other possible use-cases? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little unclear on the relationship between the "composite time series" and the "authoritative time series". It makes sense that the authoritative TS will generally be a composite TS, but will the functionality for both be handled within this one construct? What would be the implications of setting "is-period-of-record" to true here? I feel like this would be more appropriate as "isAuthoritative" based on my assumptions.
Under the CMA paradigm a separate construct exists to link the "authoritative" TS with the parameter for a location. It seems like this will handle that automatically? e.g. if I have a Buckhorn.Flow-Outflow composite record with this set to true, this will automatically be returned when a user requests Flow-Outflow data for Buckhorn (also, is there another endpoint created for this)?
Just to clarify -- I don't have strong feelings either way, just trying to understand the intent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think is-authoritative is probably better. Especially after the ideas @msweier provided in his responses.
Though idea behind this flag is that it makes it easier to decide what should be rendered. and now that I think about it I suspect A2W would always want to render the "authoritative" so it makes sense to actually use that language.
| "start-inclusive": true, | ||
| "end-inclusive": false | ||
| // suggest default of "start-inclusive": true, "end-inclusive": false | ||
| // it may also make sense to just make this *always* the above and not let the user set it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think just choosing a standard for this would suffice -- seems like a toggle would over-complicate things
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed.
I say this to anyone, if you think a few things need clarity use the "request changes" feature of the pull-request if you have the authority to do so. If not (e.g. the option just doesn't exist) go ahead make it blindingly obvious (bold text, big red letters like in Hitch Hikers Guide to the Galaxy, etc) something should change first. Especially in documents like this that will define things for years to come. |
| +----------------------+------------------------------------------------------------------------------------------------------------------------+ | ||
| |Parameter Type |As Normal CWMS TS ID, Instantaneous, average, total, etc | | ||
| +----------------------+------------------------------------------------------------------------------------------------------------------------+ | ||
| |Interval -\> Composite| Marker that this time series does not have a fix information and is build of various member time series. | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm of the opinion that:
- duration MUST be the same for all composite members. If you change the window size of an averaging function, then you are producing different data. It's not the same as simply swapping out a sensor.
- interval MUST be the same for all composite members. Trying to retrieve a composite time series where you have no idea what you're going to get is a nightmare to code for. If you want to combine different intervals, create a computation to create the requisite interval data from other intervals.
Tying back into an earlier section, that means the composite doesn't have to be irregular, and could potentially be regular, lrts, or prts depending on the sources.
If all timeseries in a composite are hourly regular, for example, empty values could be generated to fill in the gaps.
Now, that leads to the section I highlighted:
I say make the naming somewhat arbitrary, like it is now. Allow it to operate like an alias, so the user creates the name they want the timeseries to labeled, then say "this is a composite".
For example, that would allow the user to just specify .Composite in the version, and prevent confusion by overloading the other parameters.
<Location Id>.<Parameter>.<Parameter Type>.<Interval>.<Duration>.Composite
If that's not feasible, then perhaps adding to the interval, like lrts did:
Something like 1HourComp:
<Location Id>.<Parameter>.<Parameter Type>.1HourComp.<Duration>.<Version>
Otherwise, how would you specify composite data for different intervals and types, and keep them separate?
At SPK we have period-of-record data like this:
New Bullards Bar.Elev.Inst.1Hour.0.POR
New Bullards Bar.Elev.Inst.~1Day.0.POR
Currently that's a separate timeseries with duplicate data. But it doesn't have to be.
They could be changed internally to be composites, then maintain the same names, and everything else matches the rest of the system (parameter type, interval, duration).
Otherwise, you end up with something like:
New Bullards Bar.Elev.Inst.Composite.0.POR - What is that? Hourly, daily? What if I only want daily data?
New Bullards Bar.Elev.Composite.~1Day.0.POR - Is that averaged data, or instantaneous?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
interval MUST be the same for all composite members. Trying to retrieve a composite time series where you have no idea what you're going to get is a nightmare to code for. If you want to combine different intervals, create a computation to create the requisite interval data from other intervals.
The intent is to avoid duplicating data while providing a simple name to the entire range of the Period of Record for the measurement at a location (that's where the this started from was the period of record)
That said, I think I agree with the Duration. Or at least that if a duration is specified everything must match. But as a matter of historical record and how things changed, the durations do change and would be indicated in each.
I think one thing to point out, the results of this aren't meant for the entire PoR to be cleaning put into a display, but to provide the data as it was used at that was used at that time. Downstream users, say doing a study, would need to determine how they would interpolate/correct any gaps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about
New Bullards Bar.Elev[Composite]...
As you said, it's an alias, that will be processed before further work and we don't use [] for anything so far as I'm aware. Certainly other options.
That said, since it is an alias. maybe we should just add yet-another .
New Bullards Bar.Elev.Inst.0.0.POR.Compositebasically the full history regardless of intervalNew Bullards Bar.Elev.Inst.~1Day.0.Calc-val.Compositeuseful for situations of multiple sensors and setting an "active".
That said, it would be more work to use in say, OpenDCS and any client software, but it is an option. But I do agree that fixing it to one location does limit things that appear rather useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
create a computation to create the requisite interval data from other intervals.
Tossing this out there, but what if, this computation was done (not by OpenDCS?) and cached to (insert cache solution) and the composite is pulled from that.
Implementing this could be a pain but it lends into making any composite timeseries immediate when it's all well defined in the backend on what the data should be. Then you add additional metadata/fields to establish, much like the spec/templates for ratings, what is to be done with these composite timeseries given one of these situations.
just add yet-another .
Boy another part, I disagree there but then I wouldn't be doing all of that work!
Could instead add another sub-version, i.e. calc-val-composite and look for composite in the version. Issue there is if anyone already has a TS with that in the version...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me, the composite time series should be the most granular series of time value data pairs for a given site/measurement. From this most granular time series, math operations can be used to derive a common interval, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea primarily stems from having a "period of record" whose intervals will have changed over time as measurement technology improved. So the granularity of those would be variable. Additionally the exact usage is unknown and further interval manipulations should be sent to left to the end user of the data set.
| ---------------------------- | ||
|
|
||
| As the composite time series is comprised of multiple other time series should this always be an error to specify? | ||
| The marker for always latest or always first may make sense to allow, however, at the time series is supposed to be authoritative, that would add ambiguity. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have to define what "authoritative" means in order to properly answer this question.
At SPK we have two different types of data: real time, and analyzed/processed data. The analyzed data isn't available in our CWMS system until at least 6 months after the water year ends.
Do you want the data as it was in the system when operational decisions were made, or what the 'true' system state that data represents was?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My inclination for Period of Record, and thus authoritative, is the data when the operational decisions were made.
perhaps is-authorized should be, instead of boolean, <some name>: (authoritative|raw|por|analyzed) or something like that.
| #. Composite Time Series are Irregular | ||
| #. The definition of the composite time series is stored within the CWMS database | ||
| #. The members of a composite time series define a continuous range | ||
| #. The date ranges of a composite time series *MUST* not overlap |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The members of a composite time series define a continuous range
This could be worked around by using versioning in the underlying implementation. If there's ambiguity (which I 100% guarantee will happen), then pick whichever is the most recent version.
Alternatively, for a simpler implementation, it could use the "archived" flag on the timeseries. If a timeseries is archived, then we can decide on a behavior based on the start-end dates of that timeseries.
Or, let the creator specify priority order.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ranges are stored within the composite time series "object" so the creator/curator is always specifying the "this TS ID is for (start,end]".
But reading this the 4th time, the language isn't clear. I should've said "member" instead of composite time series in those.
95b6551 to
10e1840
Compare
MikeNeilson
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated the document with the things we seem to agree on.
For the other elements I've added some of the discussed options for historical reference and made some changes to the language of the text to avoid ambiguity.
| +----------------------+------------------------------------------------------------------------------------------------------------------------+ | ||
| |Parameter Type |As Normal CWMS TS ID, Instantaneous, average, total, etc | | ||
| +----------------------+------------------------------------------------------------------------------------------------------------------------+ | ||
| |Interval -\> Composite| Marker that this time series does not have a fix information and is build of various member time series. | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
interval MUST be the same for all composite members. Trying to retrieve a composite time series where you have no idea what you're going to get is a nightmare to code for. If you want to combine different intervals, create a computation to create the requisite interval data from other intervals.
The intent is to avoid duplicating data while providing a simple name to the entire range of the Period of Record for the measurement at a location (that's where the this started from was the period of record)
That said, I think I agree with the Duration. Or at least that if a duration is specified everything must match. But as a matter of historical record and how things changed, the durations do change and would be indicated in each.
I think one thing to point out, the results of this aren't meant for the entire PoR to be cleaning put into a display, but to provide the data as it was used at that was used at that time. Downstream users, say doing a study, would need to determine how they would interpolate/correct any gaps.
| +----------------------+------------------------------------------------------------------------------------------------------------------------+ | ||
| |Parameter Type |As Normal CWMS TS ID, Instantaneous, average, total, etc | | ||
| +----------------------+------------------------------------------------------------------------------------------------------------------------+ | ||
| |Interval -\> Composite| Marker that this time series does not have a fix information and is build of various member time series. | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about
New Bullards Bar.Elev[Composite]...
As you said, it's an alias, that will be processed before further work and we don't use [] for anything so far as I'm aware. Certainly other options.
That said, since it is an alias. maybe we should just add yet-another .
New Bullards Bar.Elev.Inst.0.0.POR.Compositebasically the full history regardless of intervalNew Bullards Bar.Elev.Inst.~1Day.0.Calc-val.Compositeuseful for situations of multiple sensors and setting an "active".
That said, it would be more work to use in say, OpenDCS and any client software, but it is an option. But I do agree that fixing it to one location does limit things that appear rather useful.
| #. Composite Time Series are Irregular | ||
| #. The definition of the composite time series is stored within the CWMS database | ||
| #. The members of a composite time series define a continuous range | ||
| #. The date ranges of a composite time series *MUST* not overlap |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ranges are stored within the composite time series "object" so the creator/curator is always specifying the "this TS ID is for (start,end]".
But reading this the 4th time, the language isn't clear. I should've said "member" instead of composite time series in those.
| "start-inclusive": true, | ||
| "end-inclusive": false | ||
| // suggest default of "start-inclusive": true, "end-inclusive": false | ||
| // it may also make sense to just make this *always* the above and not let the user set it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed.
| ---------------------------- | ||
|
|
||
| As the composite time series is comprised of multiple other time series should this always be an error to specify? | ||
| The marker for always latest or always first may make sense to allow, however, at the time series is supposed to be authoritative, that would add ambiguity. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My inclination for Period of Record, and thus authoritative, is the data when the operational decisions were made.
perhaps is-authorized should be, instead of boolean, <some name>: (authoritative|raw|por|analyzed) or something like that.
|
From Ruth (with a few responses):
That is a fair point, no gaps makes that difficult, the more important issue is no overlap. Would allowing gaps, but, so far as standards of use go, requiring information in the notes about seasons, or perhaps a link to a location level for seasonal placement? Though reading that again, if it's only in place for a year, isn't that just the complete record? Or are you saying sometimes that same collection will start again in some another year sometimes?
Perhaps each member should only be defined by a start date? that would make it clear that whatever the last member is valid until stated otherwise. I'll review the that section, allowing end to be null/not set for the last member seems a reasonable compromise.
We should be able to create a standard way to set that to reduce the burden, alternatively perhaps just require we have such information in the notes section for the whole composite time series? For a period of time, say when a station was catastrophically removed from service I think it does make sense to
That would be covered under the update operations as an editable field. However, the question does helpfully reiterate that we should describe exactly which fields are updateable or not.
A versioned time-series yes, depending on a specific version, we're so far thinking now, that when queried by "the composite time series" the data should always be the latest.
That will have to be determined by CWMSVue itself. We have decided that one cannot write to the composite time series. E.G. the system won't go figure out which member it would need to write to and then do so. However, CWMSVue and other interfaces, are free to behave in which ever way they desire. |
|
|
||
| It is a challenge for users to identity what the correct authoritative time series is for a given measurement at a location, when there are multiple time series at the same location. Additionally these time series often change over time, either being completely new or changing their interval as newer technologies become available. | ||
|
|
||
| Gather an entire Period of Record for the value at a location is also rather difficult. And the POR record and "authoritative timeseries" may be one-in-the same. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Period of Record
Establish Period of Record (POR)
POR record
nit; just say POR not Period of Record record
| Need | ||
| ==== | ||
|
|
||
| #. CWMS and Access-2-Water require a simple mechanism to allow users of data to retrieve the Authoritative Period of Record data for a given measurement without having to understand all of the possible component time series that may be involved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Access-2-Water
Are we still calling the new site A2W? I only see mention of this branding in the FAQ. It might not be immediately obvious that https://water.usace.army.mil is A2W if users did not have prior knowledge.
Could at least add a link here since this doc has nothing to do with that nomenclature!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is the official name in the various repos and how it's referred to.
|
|
||
| #. CWMS and Access-2-Water require a simple mechanism to allow users of data to retrieve the Authoritative Period of Record data for a given measurement without having to understand all of the possible component time series that may be involved. | ||
| #. Period-of-Record time series *should* not be created by duplicating data from the component time series and merging them into a new one. | ||
| #. The naming of the time series should fit within the excepting CWMS Time Series Identifier design and not unreasonably interfere with existing usages. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
excepting
nit: I believe this is "accepting"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually that should be existing.
| Description | ||
| ----------- | ||
|
|
||
| CDA should handle a concept of a "Composite Time Series". Whether a Time Series is considered composite will be determined by a specific element of the Time Series Identifier. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CDA
nit: CWMS Data API (CDA) - I realize this is probably a rough draft, so maybe just treat this as a reminder for the official doc?
|
|
||
| CDA should handle a concept of a "Composite Time Series". Whether a Time Series is considered composite will be determined by a specific element of the Time Series Identifier. | ||
| Data Administrators will configure which Time Series (members), and the date-time range there-in, to define the composite time series. | ||
| CDA will use this stored information to build the Composite Time Series during a query. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might elaborate more on "stored information", while I think it's insinuated here it might be worth saying this is not a duplicate of the data but a "reference to data that exists in their members that, when changed, will also update the composite TSID with such data"?
| #. The members of a composite time series define a continuous range | ||
|
|
||
| #. The date ranges of members *MUST* not overlap | ||
| #. The date ranges of members *MUST* not have any gaps |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To which I expect an error would be thrown
That is the data itself can overlap but the member date definitions cannot overlap or is thrown.
This leads more into the design but I imagine, if using CDA to build the composite, a 400 of some sort would get returned.
I think this is the way to go. That would take care of the issue with checking for gaps/overlap. We should allow for a timeseries to be listed more then once to allow instances where Timeseies1, timeseries2, then timeseries 1 again. If there are gaps in the timeseries like Ruth's examples those would just be displayed in the composite timeseries as it would be when looking at the original timeseries. Just because the composite is grabbing from a timeseries between Date 1 and Date2 doesn't mean that timeseries needs to have values consistently between Date1 and Date 2. The gaps would just pull as null. |
In this case I think the most likely scenario is that it would be grabbing from the aggregate not the actual dated versions. That should probably be the default behavior. But I guess having that ability to grab from a version might make sense in some cases I just can't really think of one at the moment. |
| * Add member | ||
| * List members | ||
| * Replace all members? | ||
| * Delete |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add Update?
I.e. you want to update members, TSIDs, or any other such operations. And then when an update is issued do any triggers, or recalculations (mentioned above), need to fire? I.e. send a message down the que and anything subscribed knows to respond. (Report generation?!?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, we don't allow saving directly to the member time series from this "object".
However, the messaging does provide a concern. Should saves to the underlying member time series also trigger a TS_STORED message for the composite. And if so how would it happen.
As there is a work around to that at the moment, simply pull the composite definition and filter client side for any TS_STORE message that for a member. That will be determined and handled at a later date.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a composite is depended on externally, shouldnt a change in the definition of the composite trigger an event/message for dependents to be able to re-retrieve the definition and react accordingly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a fair point. Added events to document.
| Storage of member information | ||
| ================================ | ||
|
|
||
| #. Store in Clob as we refine the design - cache appropriately in member to avoid any major performance issues. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there's no preference, might we go for Blob, to allow existing libraries/frameworks to make potential web pages easy to make for this.
i.e. if you store it in a JSON in BLOB we could build web interfaces with inputs to change them.
Doable with CLOB too, but Blob is the one being heavily looked at.
krowvin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added off-the-cuff thoughts. Nothing I said is a deal breaker. Any variation of what this becomes is better than the current proposed solutions that this would solve.
Considering one gets the list of member time series, it's not difficult, where that need exists, to then retrieve a specific version. Given this will generate a time series that would by definition include multiple versioned time series, it would be near impossible to even establish "correct" behavior, much less implement it. |
I'm changing the end to "can be null." There's a situation were a given measure may be completely discontinued (station permanently removed, river routed, whatever) and setting that last one would make it clear. |
19b4dd8 to
088a66c
Compare
06911d9 to
5c1472c
Compare
| location. | ||
|
|
||
| However, "best" is subjective. The POR of data used to make a given decision may not be the same as data that has formal | ||
| validation. Additionally having a POR of the raw, unedited data may be what |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
trailing statement
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oof, thanks.
| Versioned (date) time series | ||
| ---------------------------- | ||
|
|
||
| It is an error to specify a Version (date) when requesting composite data. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Meaning composite time series always represent the latest version date for all member time series?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added that to the text.
| Composite Time Series Definition | ||
| ================================ | ||
|
|
||
| .. code-block::jsonc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The markup view in GitHub doesn't show this code block.... Heads up in case anyone is only viewing the rich diff instead of the source
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will make sure that shows in the render output though.
| #. Composite Time Series are Irregular | ||
| #. The definition of the composite time series is stored within the CWMS database | ||
| #. The members of a composite time series define a continuous range | ||
| #. The date ranges of a member *MUST* not overlap |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definition of the JSON should define whether the start/end dates are open or closed
rma-psmorris
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feedback on design
| ~~~~~~~~~~~~~~~~~~~~~ | ||
|
|
||
| A Time Series that is comprised of multiple same measure time series. For example, a river gage that has two sensors | ||
| only one of which is valid for certain conditions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This statement is fairly broad and conditions could be restricted to time if that is the goal here. Conditions could be an evaluation of the values, quality, or some independent data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The statement is intentionally broad, it is in fact just a grouping of time series in some meaningful way. Meaningful ways vary.
| ~~~~~~~~~~~~~~~~ | ||
|
|
||
| The Period of Record (POR, period-of-record) for a measurement (such as the Stage at a river or the pool elevation of a | ||
| dam) is *ALL* available (time,value) pairs since recording began until either recording as ended or the most recent available |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found this statement to be confusing to read, perhaps:
"The Period of Record (POR, period-of-record) for a measurement (such as the Stage at a river or the pool elevation of a dam) is ALL available (time,value) pairs since recording began until recording has ended or the most recent available pair, regardless of changes in intervals or unavoidable changes in averaging."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, the end of that description definitely got away from me.
| An authoritative time series is a period-of-record time series with the additional constraint that is contains the best | ||
| official data. In other words the data that is determined to be "correct" by appropriate methods of validation. | ||
|
|
||
| The data provide will have validated or corrected after events when additional information become available. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will have been validated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks.
|
|
||
| .. NOTE:: | ||
|
|
||
| At time of design we are considering a boolean flag to indicate whether a time series is "the authoritative correct" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this were an enum/state value instead of a boolean, it would support more than just authoritative vs non-authoritative.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we're all just not entirely sure how that will work out. I may just make it a "state" field or something. Once we have some solid example we'll likely refine the design with additional discussions.
| #. Composite Time Series are Irregular | ||
| #. The definition of the composite time series is stored within the CWMS database | ||
| #. The members of a composite time series define a continuous range | ||
| #. The date ranges of a member *MUST* not overlap |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the time series is irregular, you could go with setting start/end to opposing open/closed dates. That way you dont have the end at 11:59 and the start at 12:00 with a possible miss of data at 11:59:30. You would be able to set both to 12:00 with convention stating that one is < while the other is >= (if end is open and start is closed).
|
|
||
| From Daniel | ||
|
|
||
| Argument Against: the "Version" field it freeform and we often encode other information in it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a problem inherent with the identifier design pattern. Once set, it becomes hard to capture new concepts/metadata/attributes. Use of an identifying data structure (XML/JSON) would capture this better.
| time series retrieval. | ||
|
|
||
|
|
||
| Composite Time Series Definition |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that this is not showing up in the marked up view of this file.
| Operations required: | ||
| -------------------- | ||
|
|
||
| * Create |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A filtered catalog of composites will be useful capturing a set of composites including the list members content.
| * Add member | ||
| * List members | ||
| * Replace all members? | ||
| * Delete |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a composite is depended on externally, shouldnt a change in the definition of the composite trigger an event/message for dependents to be able to re-retrieve the definition and react accordingly?
0345a49 to
237abcd
Compare
Initial design work to meet the needs of #956 as previously discussed.