-
Notifications
You must be signed in to change notification settings - Fork 20
Creation of initial decision records. #959
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
|
I've tagged some specific people for initial review, please feel free to bring in anyone else that may have an opinion. This is a discussion of both the decision record format as well as the actual decisions being recorded. More will follow, I'm starting small...ish. |
|
@jvanaalsburg considering you started with with cwms-python, probably good to get your feedback about how I'm starting these as well. |
|
|
||
| @MikeNeilson | ||
|
|
||
| By versioning the data, and using the Content-Type and Accept headers and the full features of MIME types we appropriately |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think REST principal accuracy is a good goal, but I think we should consider the following:
- API clutter
- industry standards
- browser support
- clear documentation
I would argue that accept header versioning provides the same amount of clutter both in implementation and downstream usage as path parameter versioning. However, it fails at the other three points.
While I would not want to have a tool (Swagger webpage) dictate how we implement an API, I do think it is a non-trivial point when it is our only source of API documentation.
The OpenAPI does not provide a meaningful way to distinguish between different versions on the version marker itself and the Swagger webpage does not make it immediately obvious that there are multplie versions of an endpoint.
Adding/removing/editing behavior of query parameters require extra documentation on the parameter (e.g. "This parameter is not supported by application/json;version=1, application/json;version=2 but is supported by application/json;version=2025-05-01. Now make sure you check what the default is to make sure the parameter is supported by your version."). Why do that to ourselves and users?
Path versioning on the other hand, communicates very clearly to end users what different versions' behaviors will be, both in query parameter behaviors and in what formats are accepted/returned.
Switching to calendar versioning on the accept header gives away the API clutter benefit as now downstream usages will have to manage many different versions beyond easy-to-grok integers, in which case, might as well put the version in the path or as a query parameter to at least provide easier documentation and browser support.
I could see a slight mitigation for the explosion of versions to manage by looping it into the release cycle so they are at least versioned along with the REST API itself, rather than at PR creation date. At that point though, we're just adding more maintenance burden and over time with faster release cycles this becomes moot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: we are expanding the API documentation, Actually I need to move these to rst and that read-the-docs infrastructure.
But I'm still not convinced path versioning is correct. I can see the point about not using dates since they will be somewhat all over the place. But while path version may be an industry standard... it seems a really sloppy industry standard.
Over time, while our outputs have varied slightly, the inputs have only been expanded, not broken in a backwards incompatible way (at least intentionally)
And the few places were they have drastically change, when ended up with a better name of the endpoint anyways, usually something more refined and specific that /cwms-data/v1/locations -> /cwms-data/v2/locations (just for a concrete example).
I have found some arguments and discussion in favor of versioning data, I will take the time to look them up.
Another option, instead of ;version= could just be more expanded data types like application/vnd.json-ts-yada yada (I can't remember off the top of my end the right vnd syntax but I think that gets the idea across)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to see some examples of how industry does versioning. What is done by Stripe, Twitter, OpenAI etc? Do they version data-types too?
|
|
||
| ### Opinion 2 | ||
|
|
||
| Summary: Each datatype under "catalog" should be a full path" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
trailing quote
| If it makes sense to group all catalogs under catalog, perhaps for grouping in the SWAGGER-UI, making each catalog it's own | ||
| path under `/catalog` instead of the current path parameter is a better approach. | ||
|
|
||
| We would maintain the grouping, but each catalog can have it's appropriate search criteria. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does seem in opposition to the accept header reasoning. We already implement getAll for most (all?) data types. However, most of them return the full data object rather than a listing of refs/ids. The catalog endpoint is supposed to be the lightweight alternative to a getAll, which seems to be a change in the shape of the data. A application/json;catalog=true or something similar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair point, I hadn't thought of just using a different content-type of content-type parameter in that situation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will say, while I brought up using the header, I still don't like that solution given my arguments above about discoverability and lack of clear Swagger UI documentation....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my opinion we haven't done a great job of explaining what a catalog is vs just a bulk-retrieve end-point. I think they are (or can be) different things. In some places we've added flags to enable the controllers to returned reduced data fields or added new *Identifier endpoints but that hasn't been done consistently
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my opinion we haven't done a great job of explaining what a catalog is vs just a bulk-retrieve end-point. I think they are (or can be) different things. In some places we've added flags to enable the controllers to returned reduced data fields or added new *Identifier endpoints but that hasn't been done consistently
We also haven't done a great job of decide if we should even have a "catalog" endpoint or just treat the <dataset>/ as the catalog for each.
4f028a3 to
96f8650
Compare
3083c2f to
cc24a03
Compare
|
NOTE: I will be forcibly merge this in as-is next Friday barring any additional input. That said this is "initial decision records" should any one decide they think our design goals and intentions are working in the future, you are of course free, and encouraged, to propose changes. But we have to start someone and I don't want this to linger anymore. |
docs/source/libraries/java.rst
Outdated
| @@ -1,0 +1,3 @@ | |||
| #### | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a decision record for determining which version(s) of java to support?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, this is a placeholder that was missing from the original sphinx setup that was putting errors in the build output.
| [comment:] <> (Status: request for comments | proposed | accepted | rejected | deprecated | superseded) | ||
|
|
||
| References | ||
| ========== |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
recommend adding a resource referencing the law: It's not just a good idea, it's technically the law.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair point, that one should actually be that hard to dig up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
References were added.
|
|
||
| @MikeNeilson | ||
|
|
||
| If it makes sense to group all catalogs under catalog, perhaps for grouping in the SWAGGER-UI, making each catalog it's own |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Groups in the swagger-ui are created using the TAG annotation configuration, not by the paths themselves.
I find catalog grouped with the data type easier for discoverability:
timeseries/catalogtimeseries/<name>location/cataloglocation/<name>
It looks like swagger-ui does support multiple tags and allowing the same endpoint to show up under both groups in the UI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to know.
| .. NOTE:: | ||
|
|
||
| Or is that confusing and we should just allows add a new endpoint to the highest endpoint version? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth calling out how various methods would get versioned with path versioning:
I presume we would want cwms-data/v3/locations GET, cwms-data/v3/locations/<name> GET, cwms-data/v3/locations/<name> DELETE, cwms-data/v3/locations/<name> POST would all get updated to the v3 even if only the cwms-data/v3/locations/<name> GET became backwards incompatible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably best to duplicate
| to attempt to consildate search query parameters. For TimeSeries and Locations this works reasonably well since there | ||
| is parity between the concepts. | ||
|
|
||
| However, if we tried to add ratings into the mix, the list of query parameters grows, and it would rather difficult to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we define what a "catalog" request means under this concept? We already have Javalin CrudHandler getAll implementations under the data types for every data type. There is a redundant /locations getAll already that does similar cataloging as the /catalog/locations. If we are adding an explicit /locations/catalog would there be a difference in implementation? Would the returned results be a smaller payload than the other CRUD endpoints? We would want that to be clear and explicit in this decision doc to avoid confusion.
More questions on this: if we are separating out the CrudHandler getAll endpoints from a /catalog endpoint, would one handle aliases, and would one return a smaller subset of metadata? Ex. /locations/catalog would return office+loc id only but /locations (CrudHandler getAll) would return all physical location metadata and all aliases for each location?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair point. "Catalog" here is "I want to search for data", not necessarily I want to get all the data.
My thought behind having <data>/catalog/ vs /catalog/<data> was to try and simplify things; however, since that was written I think we also consider moving the /catalog/<data> to have the <data> part be it's own independent Handler vs a single data type (the amount of query parameters on the /catalog endpoint are getting a bit... nuts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and having it specifically named "catalog" what to indicate that it's more like a library card catalog than for actual data.
MikeNeilson
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, definitely going to have to consider some of those to fill out some explinations.
docs/source/libraries/java.rst
Outdated
| @@ -1,0 +1,3 @@ | |||
| #### | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, this is a placeholder that was missing from the original sphinx setup that was putting errors in the build output.
| to attempt to consildate search query parameters. For TimeSeries and Locations this works reasonably well since there | ||
| is parity between the concepts. | ||
|
|
||
| However, if we tried to add ratings into the mix, the list of query parameters grows, and it would rather difficult to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair point. "Catalog" here is "I want to search for data", not necessarily I want to get all the data.
My thought behind having <data>/catalog/ vs /catalog/<data> was to try and simplify things; however, since that was written I think we also consider moving the /catalog/<data> to have the <data> part be it's own independent Handler vs a single data type (the amount of query parameters on the /catalog endpoint are getting a bit... nuts.
|
|
||
| @MikeNeilson | ||
|
|
||
| If it makes sense to group all catalogs under catalog, perhaps for grouping in the SWAGGER-UI, making each catalog it's own |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to know.
| [comment:] <> (Status: request for comments | proposed | accepted | rejected | deprecated | superseded) | ||
|
|
||
| References | ||
| ========== |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair point, that one should actually be that hard to dig up.
| .. NOTE:: | ||
|
|
||
| Or is that confusing and we should just allows add a new endpoint to the highest endpoint version? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably best to duplicate
424961d to
bfb7d7d
Compare
MikeNeilson
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated have been provided based the the given feedback.
| Catalog - a complete list of items, for examples of things that people can look at or buy [3] | ||
|
|
||
| By having a well defined structure of information users can more easily discover what they are looking for. While the | ||
| Sagger-UI, if used, presented all of the types of data that can be found. The `catalog` for each type should present |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| Sagger-UI, if used, presented all of the types of data that can be found. The `catalog` for each type should present | |
| Swagger-UI, if used, presented all of the types of data that can be found. The `catalog` for each type should present |
| a clear way to find the available data of each type. | ||
|
|
||
| The catalog of each data set would include only metadata associated with each data type. For example a time series | ||
| catalog would include support to discover primary timeseries names, aliases, extends, and the like but not actual time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| catalog would include support to discover primary timeseries names, aliases, extends, and the like but not actual time | |
| catalog would include support to discover primary timeseries names, aliases, extents, and the like but not actual time |
Co-authored-by: Adam Korynta <47677856+adamkorynta@users.noreply.github.com>
bfb7d7d to
347de19
Compare
In reference to #942, this is the initial creation of decision records for CDA.
In the current state, as little feedback has been provided on #942 at this time, this is to get feedback on the format of the decision records themselves, but anyone looking at this PR should comment, maybe open PR to this PR to add opinions.
I do not intent to merge these until an actual decision is reached with appropriate feedback; however, barring feedback and discussion such decisions will be made.