User:EEvans (WMF)/Scratch/RateLimiting

= Rate Limiting =

== Requirements ==

 * Work for services that are, or are not, deployed to k8s
 * Fail-safe: an outage of the rate limiter should not cause service outages
 * Kill switch to safely disable limiter(s)
 * Simulation mode (deploy to see what limiting would do, without enforcing it)
 * Metrics (and alerts) to indicate how often limits are kicking in
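
The fail-safe, kill switch, simulation mode, and metrics requirements above could be combined in a thin wrapper around whatever limiter backend is chosen. The following is only a minimal sketch under assumptions: <code>GuardedLimiter</code>, its <code>check</code> callable, and the counter names are hypothetical, and the in-memory counters stand in for a real metrics client.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class GuardedLimiter:
    """Hypothetical wrapper adding a kill switch, simulation mode, and counters."""

    check: Callable[[str], bool]  # assumed backend call: True means "within limit"
    enabled: bool = True          # kill switch: False bypasses the backend entirely
    simulate: bool = False        # simulation mode: record decisions, never reject
    counts: Dict[str, int] = field(default_factory=lambda: {
        "allowed": 0, "limited": 0, "would_limit": 0, "errors": 0})

    def allow(self, key: str) -> bool:
        if not self.enabled:
            self.counts["allowed"] += 1
            return True
        try:
            within_limit = self.check(key)
        except Exception:
            # Fail open: a limiter outage must not become a service outage.
            self.counts["errors"] += 1
            return True
        if within_limit:
            self.counts["allowed"] += 1
            return True
        if self.simulate:
            # Count what would have been rejected, but let the request through.
            self.counts["would_limit"] += 1
            return True
        self.counts["limited"] += 1
        return False
```

In this sketch the <code>limited</code>, <code>would_limit</code>, and <code>errors</code> counters are the values one would export and alert on; flipping <code>enabled</code> or <code>simulate</code> at runtime is assumed to come from some configuration mechanism not shown here.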

== Questions ==

 * Overhead?
 * TCP?
 * Persistent connections?
 * UDP?
 * Limit by request rate: restrict a user to N requests per second?
 * Limit by concurrent requests: restrict a user to N requests in progress at once?
 * Load shedding: reserve capacity for critical requests by shedding less critical requests?
 * HTTP 429 (Too Many Requests) or HTTP 503 (Service Unavailable)?
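
To make the difference between limiting by request rate and limiting by concurrent requests concrete, here is a rough, single-process sketch. It uses a token bucket for the rate limit and a semaphore for the concurrency limit; both are common techniques picked for illustration, not a decision recorded on this page. The class names, parameters, and the injectable <code>clock</code> are hypothetical, and a shared deployment would need state held outside the process.

```python
import threading
import time
from typing import Callable


class TokenBucket:
    """Request-rate limit: `rate` tokens per second, bursts up to `burst`."""

    def __init__(self, rate: float, burst: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.burst = burst
        self.clock = clock            # injectable so the behaviour can be tested
        self.tokens = float(burst)    # start with a full bucket
        self.updated = clock()
        self.lock = threading.Lock()

    def allow(self) -> bool:
        with self.lock:
            now = self.clock()
            # Refill in proportion to elapsed time, capped at the burst size.
            elapsed = now - self.updated
            self.tokens = min(float(self.burst), self.tokens + elapsed * self.rate)
            self.updated = now
            if self.tokens >= 1.0:
                self.tokens -= 1.0
                return True
            return False


class ConcurrencyLimit:
    """Concurrent-request limit: at most `limit` requests in progress at once."""

    def __init__(self, limit: int) -> None:
        self._slots = threading.BoundedSemaphore(limit)

    def try_acquire(self) -> bool:
        # Non-blocking: the caller rejects the request when no slot is free.
        return self._slots.acquire(blocking=False)

    def release(self) -> None:
        # Must be called exactly once for every successful try_acquire().
        self._slots.release()
```

A rate limit only needs a decision when the request arrives, whereas a concurrency limit also needs a reliable signal when the request finishes (the <code>release()</code> call above), which is one reason the two options may differ in overhead and failure modes.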