Adds healthchecks using the OKComputer gem#68
Conversation
bfc4994 to
4ed4889
Compare
awilfox
left a comment
There was a problem hiding this comment.
Needs a few minor changes, but overall good.
| relative paths are resolved against the data dir | ||
| --> | ||
| <str name="healthcheckFile">server-enabled.txt</str> | ||
| <!-- <str name="healthcheckFile">server-enabled.txt</str> --> |
There was a problem hiding this comment.
Does this need to remain commented or should it be removed?
There was a problem hiding this comment.
I see arguments for both, but I went with this because it minimizes the diff with the solrconfig.xml that ships with GeoBlacklight. If we deleted it, we'd also want to delete the preceding lines/comments.
There was a problem hiding this comment.
can you explain more about why are we changing this? more context: https://solr.apache.org/docs/8_0_0/solr-core/org/apache/solr/handler/PingRequestHandler.html
There was a problem hiding this comment.
That config requires the named file to exist in the data directory. If it's not there, then even though /admin/ping is configured it returns a 503. I figured this was the easiest way to get it working, the alternative being scaffolding that file into each core's datadir.
Note that Solr is firewall'd in staging / prod, so we're not opening things up any more than they currently are.
There was a problem hiding this comment.
👍 ; i think this is fine to go, then.
| # See config/routes.rb for the routing configuration, which is trimmed | ||
| # to just /health (mapped to what would usually be /health/all.json) | ||
| OkComputer.mount_at = false |
There was a problem hiding this comment.
can we say more about the rationale here? i'm not sure i understand it.
There was a problem hiding this comment.
By default, OKComputer will perform an up-only check where it is mounted, and also offer each defined check as an endpoint underneath that. The special all check will run all checks.
This is described in more detail in the readme that was hard to find because the repo moved recently.
There was a problem hiding this comment.
The rationale was simply that we didn't need check-specific routes, but we did want /health to run all checks and return JSON by default (to be consistent with our other apps). By default, OKC's /health endpoint returns plaintext and only runs the "up" check; oddly enough, adding .json causes it to run all your checks.
I'm somewhat ambivalent toward keeping the check-specific routes, but I do think we should keep /health the same across all apps (runs everything, returns JSON).
There was a problem hiding this comment.
okay, that sounds good to me. i don't see a strong argument for disabling check-specific routes but 🤷
There was a problem hiding this comment.
meta question: do we care about doing healthchecks for geoserver and spatial within geodata?
There was a problem hiding this comment.
Good suggestion! Yes, both are critical to GeoData's functionality and the checks should be pretty easy to write. I'll see what I can do.
There was a problem hiding this comment.
@anarchivist I worked through this a bit last night, and I think these checks are a big enough lift to warrant a separate PR. The crux is that unlike with the Solr URL, there's no easy way to get at the GeoServer and Spatial URLs worth monitoring. That data is in Solr, so either we query it at startup or check-time, or we inject it (e.g. ENV['CHECKS_GEOSERVER_URL'] and some conditional logic).
So there's quite a bit more decision-making there. I think we should get this basic configuration out now, then address GeoServer/Spatial in a follow-up PR.
What do you think?
| relative paths are resolved against the data dir | ||
| --> | ||
| <str name="healthcheckFile">server-enabled.txt</str> | ||
| <!-- <str name="healthcheckFile">server-enabled.txt</str> --> |
There was a problem hiding this comment.
can you explain more about why are we changing this? more context: https://solr.apache.org/docs/8_0_0/solr-core/org/apache/solr/handler/PingRequestHandler.html
| resources :download, only: [:show] | ||
|
|
||
| # Map OkComputer's /health/all.json to /health | ||
| get '/health', to: 'ok_computer/ok_computer#index', defaults: { format: :json } |
There was a problem hiding this comment.
@awilfox @anarchivist So here's an alternative — we mount OKComputer with its default setting (all routes under /okcomputer) but map /health to the endpoint we actually want. This way we get the simple default OKC configuration but with the same /health behavior as our other apps. Thoughts?
- Add the OKComputer gem as a framework for health checking. - Configure checks for ActiveRecord, Migrations, and Solr. - Set up OKComputer's default routes, plus a route for /health which runs all checks and returns JSON (consistent with our other apps).
No description provided.