-
Notifications
You must be signed in to change notification settings - Fork 0
Adds healthchecks using the OKComputer gem #68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
bfc4994 to
4ed4889
Compare
awilfox
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs a few minor changes, but overall good.
| relative paths are resolved against the data dir | ||
| --> | ||
| <str name="healthcheckFile">server-enabled.txt</str> | ||
| <!-- <str name="healthcheckFile">server-enabled.txt</str> --> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need to remain commented or should it be removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see arguments for both, but I went with this because it minimizes the diff with the solrconfig.xml that ships with GeoBlacklight. If we deleted it, we'd also want to delete the preceding lines/comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you explain more about why are we changing this? more context: https://solr.apache.org/docs/8_0_0/solr-core/org/apache/solr/handler/PingRequestHandler.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That config requires the named file to exist in the data directory. If it's not there, then even though /admin/ping is configured it returns a 503. I figured this was the easiest way to get it working, the alternative being scaffolding that file into each core's datadir.
Note that Solr is firewall'd in staging / prod, so we're not opening things up any more than they currently are.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 ; i think this is fine to go, then.
config/initializers/okcomputer.rb
Outdated
| # See config/routes.rb for the routing configuration, which is trimmed | ||
| # to just /health (mapped to what would usually be /health/all.json) | ||
| OkComputer.mount_at = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we say more about the rationale here? i'm not sure i understand it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By default, OKComputer will perform an up-only check where it is mounted, and also offer each defined check as an endpoint underneath that. The special all check will run all checks.
This is described in more detail in the readme that was hard to find because the repo moved recently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The rationale was simply that we didn't need check-specific routes, but we did want /health to run all checks and return JSON by default (to be consistent with our other apps). By default, OKC's /health endpoint returns plaintext and only runs the "up" check; oddly enough, adding .json causes it to run all your checks.
I'm somewhat ambivalent toward keeping the check-specific routes, but I do think we should keep /health the same across all apps (runs everything, returns JSON).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay, that sounds good to me. i don't see a strong argument for disabling check-specific routes but 🤷
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
meta question: do we care about doing healthchecks for geoserver and spatial within geodata?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good suggestion! Yes, both are critical to GeoData's functionality and the checks should be pretty easy to write. I'll see what I can do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@anarchivist I worked through this a bit last night, and I think these checks are a big enough lift to warrant a separate PR. The crux is that unlike with the Solr URL, there's no easy way to get at the GeoServer and Spatial URLs worth monitoring. That data is in Solr, so either we query it at startup or check-time, or we inject it (e.g. ENV['CHECKS_GEOSERVER_URL'] and some conditional logic).
So there's quite a bit more decision-making there. I think we should get this basic configuration out now, then address GeoServer/Spatial in a follow-up PR.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
| relative paths are resolved against the data dir | ||
| --> | ||
| <str name="healthcheckFile">server-enabled.txt</str> | ||
| <!-- <str name="healthcheckFile">server-enabled.txt</str> --> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you explain more about why are we changing this? more context: https://solr.apache.org/docs/8_0_0/solr-core/org/apache/solr/handler/PingRequestHandler.html
| resources :download, only: [:show] | ||
|
|
||
| # Map OkComputer's /health/all.json to /health | ||
| get '/health', to: 'ok_computer/ok_computer#index', defaults: { format: :json } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@awilfox @anarchivist So here's an alternative — we mount OKComputer with its default setting (all routes under /okcomputer) but map /health to the endpoint we actually want. This way we get the simple default OKC configuration but with the same /health behavior as our other apps. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉
- Add the OKComputer gem as a framework for health checking. - Configure checks for ActiveRecord, Migrations, and Solr. - Set up OKComputer's default routes, plus a route for /health which runs all checks and returns JSON (consistent with our other apps).
No description provided.