Mention m:User-Agent policy?
I think this page should mention Wikimedia's m:User-Agent policy and the general principle of using an informative, useful User-Agent in scripts and tools. Thoughts? --MZMcBride (talk) 22:27, 15 March 2012 (UTC)
- Not really. We should keep MW documentation MW-specific where possible rather than chucking in WMF stuff as well, although a small note along the lines of "Other communities such as WMF may have access/usage restrictions, etc. For an example of WMF's, look at meta: etc." wouldn't be too bad. But the bit about informative user-agents, for sure. Peachey88 (talk) 23:29, 16 March 2012 (UTC)
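For illustration, here is a minimal sketch of what "an informative User-Agent" could look like in a script, using only Python's standard library. The tool name, version, URL and contact address are all hypothetical placeholders, and the "name/version (contact)" layout is just one common convention, not something this thread prescribes.

```python
import urllib.request

# Hypothetical identity: replace with your own tool name, version and contact.
USER_AGENT = "example-bot/0.1 (https://example.org/example-bot; bot-owner@example.org)"


def build_request(url: str) -> urllib.request.Request:
    """Return a request that identifies the tool and how to reach its operator."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


# Usage (performs a real HTTP request, so it is left commented out):
# with urllib.request.urlopen(build_request("https://example.org/w/api.php?action=query&format=json")) as resp:
#     data = resp.read()
```

The point is only that a site operator who sees the traffic can tell which tool it is and whom to contact about it.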
New Wikimedia request limits
- Wasn't that reverted? "50/s, burst of 250" is very low by our standards; I would be very surprised to hear there is a real need for it. Nemo 10:26, 14 November 2015 (UTC)
- I don't think gerrit:241643 has been reverted. A revert changeset was submitted in gerrit:252385, but that changeset has not been merged and deployed, as far as I can tell.
- Assuming "50/s" means 50 requests per second, that seems quite fast. --MZMcBride (talk) 14:14, 14 November 2015 (UTC)
- I hit the request limit after 40 update/POST requests in a 3-minute window on wikidata.org :( --Ceefour (talk) 15:59, 12 December 2016 (UTC)
This page doesn't mention anything about the limits
Why does this page read like a general "What not to do" advisory article? Can we have some data on what the different kinds of limits are and how to configure them for a wiki? --Nischayn22 (talk) 03:44, 12 August 2016 (UTC)
Guidelines for parallelism and load
I'm trying to code to the suggestions here, but finding it very hard to put into practice in a concrete way. This seems a bit outdated: "we ask that you be considerate and try not to take a site down," for example. It would take a DDoS to take the site down, since we're throttling individual IPs to 50 requests/second in Varnish.
I've also heard rules of thumb mentioned, like a parallelism of 1, 2, or 4, but don't know what to trust. I'd prefer to clarify this page. Actually, parallelism is a bad stand-in for impact on our API servers, because it doesn't take into account the delay between requests for each thread, server resource differences for different types of request, or slow client data pipes hogging sockets. I think a better measure would be requests per second, and we should define a suggested limit in some kind of absolute numbers, which are adjusted as our capacity grows or congestion increases. If we're feeling fancy, we could even provide an API for the recommended throttling at a given moment.
Here's an example of rate limiting elsewhere in industry: https://developer.twitter.com/en/docs/basics/rate-limits.html Adamw (talk) 20:00, 7 May 2018 (UTC)
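To make the "requests per second" idea concrete, here is a minimal client-side sketch that spaces requests so a script never exceeds a chosen rate. The class name `RequestPacer` and the rate used in the usage note are hypothetical; no particular number is recommended in this thread. The clock and sleep functions are injectable only so the behaviour can be checked without real waiting.

```python
import time


class RequestPacer:
    """Enforce a minimum gap between requests (a simple requests-per-second cap)."""

    def __init__(self, max_per_second, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = None  # earliest time the next request may start

    def wait(self):
        """Block until the next request is allowed; return the seconds slept."""
        now = self.clock()
        delay = 0.0
        if self.next_allowed is not None and now < self.next_allowed:
            delay = self.next_allowed - now
            self.sleep(delay)
            now = self.next_allowed
        self.next_allowed = now + self.min_interval
        return delay
```

A script would call `pacer.wait()` immediately before each request, for example with `pacer = RequestPacer(max_per_second=2)`. Unlike a thread count, this bounds the request rate directly, though it still ignores how expensive each request is for the server.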
- A danger of absolute numbers is that they more quickly become outdated since no one actually does adjust them when capacity increases.
The rule of thumb on this page is "don't parallelize". Or, to directly quote it,
"If you make your requests in series rather than in parallel (i.e. wait for the one request to finish before sending a new request, such that you're never making more than one request at the same time), then you should definitely be fine."
With respect to database load, the maxlag parameter (mentioned on this page) implements a "dynamic" throttling of a sort by temporarily failing requests if the lag gets too high. There isn't currently an equivalent for things like appserver load, although there could be. But I don't think an API request to fetch recommended client-side throttling settings is all that great of a way to go versus the existing "fail this request if load is too high" model. Anomie (talk) 13:44, 8 May 2018 (UTC)
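As a rough sketch of how the two pieces of advice in this thread combine (requests in series, plus the maxlag parameter), the loop below sends one request at a time and backs off when the API reports a `maxlag` error. The `send` callable is a hypothetical stand-in for whatever HTTP layer a script uses: it is assumed to take a dict of API parameters and return a `(headers, decoded_json)` pair. The `maxlag=5` value, the retry count and the fallback wait are illustrative assumptions.

```python
import time


def fetch_serially(param_sets, send, sleep=time.sleep, max_retries=5, default_wait=5.0):
    """Yield one decoded response per parameter set, never overlapping requests."""
    for params in param_sets:
        # Ask the server to refuse the request if database lag exceeds 5 seconds.
        query = dict(params, maxlag=5, format="json")
        for _attempt in range(max_retries + 1):
            headers, body = send(query)  # hypothetical transport, see lead-in
            if body.get("error", {}).get("code") != "maxlag":
                yield body
                break
            # The server is lagged: wait as long as it suggests, then retry.
            sleep(float(headers.get("Retry-After", default_wait)))
        else:
            raise RuntimeError("server still lagged after %d retries" % max_retries)
```

Because the next request is only sent after the previous response has been handled, the script never has more than one request in flight, and the maxlag check lets the server push back when it is under load.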