ORES/Metrics

We collect some metrics in the Python backend and forward them to Grafana (TODO: via...)
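
The exact transport is still a TODO, but each metric below behaves like a counter. As a minimal sketch, assuming a statsd-style Python client (the host, port, and prefix here are illustrative, not the production configuration):

<syntaxhighlight lang="python">
from statsd import StatsClient

statsd = StatsClient("localhost", 8125, prefix="ores")

# Each metric on this page is a counter bumped at the relevant
# point in the request lifecycle, for example:
statsd.incr("precache_request")
</syntaxhighlight>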

precache_request - Incremented once for each "precache" request. Note that this is not the same scale as scores_request.

revision_scored - TODO: Unused? Seems to be redundant with scores_request.count?

scores_request - Tally of the number of revisions scored, which may be many in a single API request. Not incremented for "precache" requests. TODO: Give a more descriptive name, like "spontaneous_revision_scored"
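
To make the precache_request / scores_request split concrete, here is a hedged sketch; handle_score_request, rev_ids, and the precache flag are hypothetical names, not the actual handler:

<syntaxhighlight lang="python">
from statsd import StatsClient

statsd = StatsClient("localhost", 8125, prefix="ores")


def handle_score_request(rev_ids, precache=False):
    """Hypothetical handler showing which counter fires for which request."""
    if precache:
        # "precache" requests are counted once per request...
        statsd.incr("precache_request")
    else:
        # ...while scores_request is bumped once per revision scored,
        # which may be many in a single API request.
        statsd.incr("scores_request", count=len(rev_ids))
</syntaxhighlight>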

score_processed - Incremented after each scoring worker thread finishes. TODO: I'm not sure how this maps to the other scales; for instance, is a large request's work batched across multiple threads?

score_processor_overloaded - Sent when a request is rejected due to the Celery queue exceeding the configured maximum queue size. We also return a "503 server overloaded" to the client.
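
As an illustration of the rejection path (the queue_size() helper and MAX_QUEUE_SIZE value are assumptions; only the metric name and the 503 behavior come from this page):

<syntaxhighlight lang="python">
from statsd import StatsClient

statsd = StatsClient("localhost", 8125, prefix="ores")
MAX_QUEUE_SIZE = 100  # hypothetical; the real maximum is configured


def queue_size():
    """Stand-in for querying the Celery queue depth (assumed helper)."""
    return 0


def maybe_enqueue(run_task):
    """Hypothetical submission path illustrating the overload rejection."""
    if queue_size() >= MAX_QUEUE_SIZE:
        statsd.incr("score_processor_overloaded")
        return "server overloaded", 503  # response the client receives
    return run_task()
</syntaxhighlight>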

score_cache_hit - Incremented for each revision score found in the Redis cache.

score_cache_miss - Incremented for each score cache lookup that fails.
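
Together, the two cache counters might be emitted roughly like this (the key layout and connection details are assumptions):

<syntaxhighlight lang="python">
import redis
from statsd import StatsClient

statsd = StatsClient("localhost", 8125, prefix="ores")
cache = redis.StrictRedis()  # connection details are assumptions


def cached_score(model, rev_id):
    key = f"score:{model}:{rev_id}"  # hypothetical key layout
    value = cache.get(key)
    if value is not None:
        statsd.incr("score_cache_hit")
        return value
    statsd.incr("score_cache_miss")
    return None
</syntaxhighlight>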

score_errored - TODO: Unused? Also renders as null in the dashboard. We should wire this up to last-chance error handling rather than deprecate the metric.

score_timed_out - Bumped when score processing takes longer than the scoring system's configured timeout (15 seconds [Oct 2016]).
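
A minimal sketch of where this counter would fire, assuming the scoring work runs in a thread pool (the wrapper and score_fn are hypothetical):

<syntaxhighlight lang="python">
import concurrent.futures
from statsd import StatsClient

statsd = StatsClient("localhost", 8125, prefix="ores")
executor = concurrent.futures.ThreadPoolExecutor()
TIMEOUT = 15  # seconds, matching the configured timeout above


def score_with_timeout(score_fn, rev_id):
    """Hypothetical wrapper showing when score_timed_out is bumped."""
    future = executor.submit(score_fn, rev_id)
    try:
        return future.result(timeout=TIMEOUT)
    except concurrent.futures.TimeoutError:
        statsd.incr("score_timed_out")
        raise
</syntaxhighlight>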

datasources_extracted - Unused. TODO: FIXME?

precache_scores - Unused; sent only from unused functions.

precache_score - Only incremented by the precache utility, so effectively unused. TODO: Unify with precache_revision_scored or deprecate.

precache_scoring_error - Effectively unused; only sent by the precache utility.

TODO:
 * One more pass to tighten up how each event stream maps to a score, e.g. per model.
 * Deprecate the "precache" request dichotomy.