Topic on Talk:Core Platform Team/Initiatives/API Gateway

All API calls count the same for rate limiting

Pchelolo (talkcontribs)

This will make the system entirely useless. If all the calls had roughly the same cost, this could work. However, in our system endpoints that return responses within a few milliseconds run alongside endpoints that are extremely costly and can take tens of seconds and a lot of resources to compute. If the goal of the rate limiter project is to protect the infrastructure, the limit would need to be set to a number low enough to protect it from the expensive calls, which would be way too low for the inexpensive calls. And vice versa - if we set a high limit targeted at allowing use of the cheap endpoints, a client can easily destroy the infrastructure just by spending its whole pool on expensive requests.


For example (NOT REAL NUMBERS I PROPOSE), for a cheap endpoint a reasonable limit could be 1000 r/s, while for an expensive one, like parsing a huge page with Parsoid, it could be 5 r/s. These are so vastly different that it would be impossible to come up with a single number that covers both. We need to have per-endpoint limits at least.
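
To make that concrete, here is a minimal sketch of what per-endpoint limits could look like on the gateway side. The endpoint paths and numbers are made up purely for illustration, not a proposal:

    import time

    # Hypothetical endpoint names and limits, for illustration only -- NOT a proposal.
    ENDPOINT_LIMITS_PER_SECOND = {
        "/core/v1/page": 1000,        # cheap: returns within milliseconds
        "/parsoid/v1/transform": 5,   # expensive: parsing a huge page
    }

    class TokenBucket:
        """Refills at `rate` tokens per second and holds at most `rate` tokens."""

        def __init__(self, rate):
            self.rate = rate
            self.tokens = float(rate)
            self.last = time.monotonic()

        def allow(self):
            now = time.monotonic()
            self.tokens = min(self.rate, self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= 1:
                self.tokens -= 1
                return True
            return False

    # One bucket per endpoint (in practice, per client *and* endpoint) instead of
    # a single shared counter for all calls.
    buckets = {ep: TokenBucket(rate) for ep, rate in ENDPOINT_LIMITS_PER_SECOND.items()}

    def allow_request(endpoint):
        bucket = buckets.get(endpoint)
        return bucket.allow() if bucket is not None else True
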

And while this might sound more complicated, it's actually easier to use. As a developer building a large project on top of the API, maintaining local limits at the places where you make the calls, slowing down if needed, and specifying concurrency per call site is much easier than maintaining a global pool of API requests from all your systems towards us.
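
For instance, a client could keep a small throttle right next to the code that makes the expensive calls. A rough sketch, where the limits and the plain-stdlib HTTP call are assumptions rather than anything prescribed by the gateway:

    import threading
    import time
    from urllib.request import urlopen

    # Hypothetical local policy for one expensive endpoint: at most 2 requests
    # in flight and at most 5 requests per second from this process.
    MAX_CONCURRENCY = 2
    MIN_INTERVAL = 1 / 5

    _slots = threading.Semaphore(MAX_CONCURRENCY)
    _pace_lock = threading.Lock()
    _next_allowed = 0.0

    def call_expensive_endpoint(url):
        """Throttle right where the call is made, instead of coordinating a
        global pool of requests across all of the client's systems."""
        global _next_allowed
        with _slots:                          # cap concurrency for this endpoint only
            with _pace_lock:                  # reserve the next send slot
                now = time.monotonic()
                wait = _next_allowed - now
                _next_allowed = max(now, _next_allowed) + MIN_INTERVAL
            if wait > 0:
                time.sleep(wait)              # slow down if we're ahead of the budget
            return urlopen(url)
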

EvanProdromou (talkcontribs)

There's a big advantage in usability for client developers in having a simple policy, "X API calls in Y seconds".

How can we balance that against our resource-management requirements? I can see a few ways to do it. First, we could just cost it out based on our most expensive API calls. So if we can handle X very expensive calls per hour, that's the number we use.

Another is that we take a statistical approach: 75% of the calls by all clients are to API endpoint 1, 15% to API endpoint 2, and 10% to API endpoint 3, each with different resource costs. We could set the rate limit so on average clients can make this mix of calls, even if individual developers might make more or fewer expensive calls.
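
A toy calculation of both sizing approaches, with made-up costs (not measurements) just to show the arithmetic:

    # Made-up numbers purely to illustrate the arithmetic, not real costs.
    # Say the per-client budget is 1000 cost units per minute, where one unit is
    # roughly the cheapest call and the expensive call costs 200 units.
    BUDGET_PER_MINUTE = 1000

    # Sizing off the most expensive call: everyone gets the worst-case limit.
    worst_case_cost = 200
    print(BUDGET_PER_MINUTE // worst_case_cost)                # 5 calls/minute

    # Sizing off the observed mix: 75% / 15% / 10% of calls at different costs.
    mix = [(0.75, 1), (0.15, 10), (0.10, 200)]                 # (share of calls, cost per call)
    average_cost = sum(share * cost for share, cost in mix)    # 0.75 + 1.5 + 20 = 22.25
    print(int(BUDGET_PER_MINUTE / average_cost))               # ~44 calls/minute

The second number only holds for the average mix, though: a client that spends all 44 calls on the 200-unit endpoint would consume about 8800 units against a 1000-unit budget, which is the weakness raised in the reply below.
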

I'm working on a review of the other major API platforms to get an idea of what their rate limit policies are. I want to fall squarely in the middle of the pack.

Pchelolo (talkcontribs)

> There's a big advantage in usability for client developers in having a simple policy, "X API calls in Y seconds".

I would argue with that. Getting rate-limited in one part of your system because something is misbehaving in an entirely different part of the system will be a nightmare to debug. Maintaining a global outgoing request rate limit is no fun in distributed systems. The only benefit I see in this simplicity would be that the docs are simpler.


> So if we can handle X very expensive calls per hour, that's the number we use.

That would make the system unusable for clients that mostly make cheap GET requests.


> Another is that we take a statistical approach:

If the goal of the limit is to protect against malicious agents, this approach would clearly not work: a malicious agent can deliberately make requests only to the most expensive endpoint.


> I'm working on a review of the other major API platforms to get an idea of what their rate limit policies are. I want to fall squarely in the middle of the pack.

This would be an interesting data point in this discussion; eager to see it. But let's not do cargo-cult programming either :)

Reply to "All API calls count the same for rate limiting"