Talk:Requests for comment/Reducing image quality for mobile

Impact on infrastructure
Keep in mind that, depending on how you implement this, you may significantly increase imagescaler/Swift/upload CDN requests and storage requirements for Swift. I'm not saying we shouldn't do it, but I'd like to see some risk assessment & load estimates in the RFC itself. I'd also very much prefer for this to be blocked on the implementation of the simplify thumbnail cache RFC, as it may alleviate half of my concerns (but not all). Faidon Liambotis (WMF) (talk) 01:23, 19 March 2014 (UTC)
 * Added some basic guesstimates and reasoning. Please provide some basic Swift stats (such as how many images have actually been scaled down, how many different versions of a single image we currently have on average, and what their size is vs. the original) so that we have better statistics. Having a Varnish-based thumbnail cache would obviously be a great improvement, but I think we should treat them as two independent projects, unless we know for sure that the estimated image-size increase cannot be handled by our current infrastructure. --Yurik (talk) 04:44, 19 March 2014 (UTC)
 * Your guesstimations are very wrong; your numbers are off by about 300-600x. We currently have 24,148,337 originals taking up 40.9TB. These have 302,195,630 corresponding thumbnails, taking up 19.8TB and growing. They consume space in the Swift backends with replica count 3 (i.e. 60TB), as well as on the SSD disks of backend caches (eqiad, esams, ulsfo), in the pagecache of Swift backends & Varnish backends, and in memory of Varnish frontends (yes, it's wasteful, that's why we have been discussing it over at the simplification RFC). Finally, note that a lot of the scalability problems arise from the count of files that we keep, not from their aggregate size, a dimension that you haven't considered at all -- just imagine how different it is to handle e.g. 10 files of 2G each vs. 10 million files of 2K each. If this proposal is to double the number of thumbnails that we keep, I'm afraid it's going to need serious ops & platform work, with many months needed to make significant improvements to the architecture, and it would definitely need to be blocked on the simplified cache RFC. -- The preceding unsigned comment was added by Faidon Liambotis (WMF) (talk • contribs).
 * Thanks for the numbers, let me try to go through them. 25 million images turn into 300 million thumbnails. That's 12 per 1. There is an automatic gallery size (shown on category pages in commons), and all image pages have these options on the file page: 800×600, 320×240, 640×480, 1024×768, 1280×960, 2048×1536 (assuming the image is larger than those options). Any dumb crawler that simply follows all URLs would trigger image scaling. Also, when we generate HTML, the srcset attribute automatically adds 1.5x and 2x options, tripling the thumbnail count (this does not happen for the gallery or file page, only for articles).
 * So if we assume that 12 consists of ~5 preset options on the file/category pages plus 1 original, what remains are 2 article usages (x3 because of srcset). Mobile javascript would only replace those 2 usages (removing the srcset attribute and ignoring pages in the file namespace), so we end up with 50 million extra pictures of 5-10KB each, i.e. 250-500GB (twice the size of my original calc). In other words, we are looking at about half a terabyte of growth in disk space and ~15% growth in the number of files. Hope my calculations make sense and I did not miss anything major. --Yurik (talk) 16:00, 21 March 2014 (UTC)
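
For anyone who wants to re-run the arithmetic in the two comments above, here is a minimal sketch. The original and thumbnail counts are the figures quoted in this thread; the two article usages per image and the 5-10KB size range are assumptions of the estimate, not measured values, and the function and constant names are made up for illustration.

```python
# Counts quoted in the thread above.
ORIGINALS = 24_148_337
THUMBNAILS = 302_195_630

# Assumptions of the estimate (not measurements).
ARTICLE_USAGES_PER_IMAGE = 2   # usages that mobile would re-request at low quality
LOW_QUALITY_KB = (5, 10)       # assumed size range of one low-quality thumbnail


def estimate_growth(originals, thumbnails, usages, size_kb_range):
    """Estimate extra files and disk space from low-quality mobile thumbnails."""
    extra_files = originals * usages
    # Decimal units: 1 GB = 1,000,000 KB.
    low_gb, high_gb = (extra_files * kb / 1e6 for kb in size_kb_range)
    return {
        "thumbs_per_original": thumbnails / originals,
        "extra_files": extra_files,
        "extra_gb": (low_gb, high_gb),
        "file_count_growth": extra_files / thumbnails,
    }


result = estimate_growth(
    ORIGINALS, THUMBNAILS, ARTICLE_USAGES_PER_IMAGE, LOW_QUALITY_KB
)
for key, value in result.items():
    print(key, value)
```

With the unrounded counts this lands slightly below the rounded 50 million files and 250-500GB used in the comment, and the file-count growth comes out near 16%. The estimate covers a single copy only; replica count and cache layers multiply it.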

Authors shouldn't have to worry about compression settings
The idea to have authors specify a quality setting, unless I'm fundamentally misunderstanding the intent, seems very misguided. We should write our software to take performance and efficiency considerations into account, not shift that responsibility to authors. If indeed images should be delivered at a higher compression factor / lower quality in certain use cases, let's identify those use cases, and go as far as we can without introducing markup to specify compression settings before even thinking about doing so.--Eloquence (talk) 02:26, 19 March 2014 (UTC)
 * This feature is not for authors, but for internal/advanced use - when we are serving an image to a mobile device on a mobile network, the goal is to automatically reduce image quality to reduce wait time. Additionally, this might also benefit users who pay for their data plan per MB, as it would allow us to create low/high quality mobile settings. --Yurik (talk) 04:44, 19 March 2014 (UTC)


 * If it's not for authors, I fail to see why it should be added to the markup (and your proposal contradicts that statement: "[t]his parameter might be used by various template authors").--Eloquence (talk) 06:35, 19 March 2014 (UTC)


 * Sorry, I wasn't clear. This is MOSTLY for internal use, but there might be advanced authors out there who decide to reduce image quality in addition to reducing pixel size for some obscure template. After all, if we provide rotation and scaling, why limit the toolset? And since the generated image URL is exposed to the world, why not make it a well-documented parameter as well? In any case, I am ok with not including it if the community is against it or there is a technical reason not to. --Yurik (talk) 07:16, 19 March 2014 (UTC)


 * Image compression quality is a technical concern rather than an editorial one. It doesn't belong in the markup.--Eloquence (talk) 03:00, 20 March 2014 (UTC)
 * Yeah, I'm struggling to understand why we would want to reduce image quality (very strange to say aloud...) manually rather than programmatically. I agree with your posts here. --MZMcBride (talk) 03:20, 20 March 2014 (UTC)

Ok, for now the core patch 119661 does not let users specify quality reduction via image link param, but only via modifying the URL itself. --Yurik (talk) 17:41, 24 March 2014 (UTC)

File insertion syntax
File insertion syntax is already an abomination. It really shouldn't be extended any further. --MZMcBride (talk) 02:04, 20 March 2014 (UTC)
 * The image with a different quality must have a different URL -- the image is not scaled by the initial request during HTML rendering, but rather it is processed on 404 by extracting needed parameters from the URL. The URL syntax is what seemed the most straightforward to me. How do you suggest we pass that information to the image processor? --Yurik (talk) 17:13, 20 March 2014 (UTC)
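
To make the "parameters are extracted from the URL on 404" point concrete, here is a rough sketch of how such a handler could read them. The `qlow-` prefix is purely hypothetical and is not taken from the patch; only the `<width>px-<name>` part follows the thumbnail names discussed on this page. All function and class names are illustrative.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical thumbnail name: optional "qlow-" prefix, then "<width>px-<file name>".
THUMB_RE = re.compile(r"^(?:(?P<quality>qlow)-)?(?P<width>\d+)px-(?P<name>.+)$")


class ThumbParams(NamedTuple):
    name: str
    width: int
    low_quality: bool


def parse_thumb_url(path: str) -> Optional[ThumbParams]:
    """Extract scaling parameters from the last path segment of a thumbnail URL.

    Returns None when the segment does not look like a thumbnail name.
    """
    last_segment = path.rsplit("/", 1)[-1]
    match = THUMB_RE.match(last_segment)
    if match is None:
        return None
    return ThumbParams(
        name=match["name"],
        width=int(match["width"]),
        low_quality=match["quality"] is not None,
    )


print(parse_thumb_url("/commons/thumb/a/ab/Example.jpg/220px-Example.jpg"))
print(parse_thumb_url("/commons/thumb/a/ab/Example.jpg/qlow-220px-Example.jpg"))
```

Because the quality marker is part of the file name, the two variants are distinct URLs and are cached and stored as separate objects, which is where the file-count concern in the infrastructure section comes from.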

+1 on MZ (and Erik in ). --Nemo 14:45, 26 March 2014 (UTC)
 * , are you proposing an alternative to changing URL for the image? --Yurik (talk) 17:13, 26 March 2014 (UTC)
 * I have no opinion about that. MZ, Erik and I all commented on the image markup, i.e. what's written in wikitext. --Nemo 18:06, 26 March 2014 (UTC)


 * Ok, so is ok with the URL change, or is there another option?  This is NOT related to the above discussion about image link parameter, which has been removed. --Yurik (talk) 22:25, 27 March 2014 (UTC)

Bikeshed
"Reducing image quality" feels like a strange name to me. Perhaps the name of this RFC could be "Compressing images for mobile" or something like that? Just a suggestion. :-) --MZMcBride (talk) 03:22, 20 March 2014 (UTC)
 * Renamed. --Yurik (talk) 19:51, 20 March 2014 (UTC)

Fundamentals
I'm unable to comment on this proposal, because it doesn't provide any background on how you reached it. All we're given as use case and background is one line: "Many mobile devices with low bandwidth or small screen/slow processor could benefit from showing JPEG images with reduced quality". All the main points seem to be taken for granted: they shouldn't be.
 * Why JPEG? We have lots of PNG thumbnails, are we sure those are not taking more bandwidth?
 * Why for mobile? If we can have quality degradations without noticeable problems, why not tweak the thumbnailing settings for everyone in core? Kiwix also compresses images a lot, for instance, and the quality is still rather good: worth exploring. (About 30 GB in addition to the 10 GB of text in the first and last full ZIM release.)
 * Why quality? Isn't it easier to change default thumbnail size on the mobile site, from 220px to something else we already have thumbs for, like 120px? --Nemo 14:45, 26 March 2014 (UTC)
 * Thanks, I expanded the Rationale section; hopefully that answers your questions. --Yurik (talk) 17:12, 26 March 2014 (UTC)
 * Thank you. I summarise that with "because we can [easily]". That's not particularly convincing, I must say, even though it may still be a good idea. On the alleged lack of alternatives:
 * changing default thumb size is something we've done before without noise;
 * as for PNG we don't even know how much traffic they produce, maybe it's substantial and we can make vipscaler compress them lossily or otherwise tweak them;
 * for JPEG we have 51451 and claims a couple options can have a big impact. --Nemo 18:06, 26 March 2014 (UTC)
 * , I am not against vips, but it seems its biggest benefit is execution efficiency - it runs much faster and consumes less memory. These are great qualities, but they are orthogonal to this RFC - I will be very happy if our scaler switches to a more efficient one, but getting there should be a separate issue. I ran stats against one day:
 * PNG: 2,199,455 / 18,749 MB / 8.7 KB/file
 * JPEG: 1,708,366 / 28,548 MB / 17 KB/file
 * So even though JPEGs are only about 44% by count, they are 60% of total traffic, and about twice the size per file. Targeting JPEGs seems to give more bang for the buck, without introducing a new backend scaler.
 * Method: ran zgrep '/commons/thumb/.*image/png' sampled-1000.tsv.log-20140325.gz, counted with -c, summed with cut -d$'\t' -f7 | awk '{s+=$1} END {printf "%.0f\n", s/1024/1024}'
 * --Yurik (talk) 22:21, 27 March 2014 (UTC)
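
The shares quoted above can be re-derived from the two stat lines with a few lines of Python. This is only a sketch: the counts and megabyte totals are copied from the comment, and the dictionary layout and function name are illustrative.

```python
# (request count, total MB) per format, as quoted in the comment above.
STATS = {
    "PNG": (2_199_455, 18_749),
    "JPEG": (1_708_366, 28_548),
}


def summarize(stats):
    """Return each format's share of requests and traffic, plus average KB per file."""
    total_count = sum(count for count, _ in stats.values())
    total_mb = sum(mb for _, mb in stats.values())
    return {
        fmt: {
            "count_share": count / total_count,
            "traffic_share": mb / total_mb,
            "kb_per_file": mb * 1024 / count,
        }
        for fmt, (count, mb) in stats.items()
    }


for fmt, row in summarize(STATS).items():
    print(
        f"{fmt}: {row['count_share']:.1%} of requests, "
        f"{row['traffic_share']:.1%} of traffic, "
        f"{row['kb_per_file']:.1f} KB/file"
    )
```

On these numbers JPEG is a little under 44% of requests and about 60% of bytes, and the per-file ratio between JPEG and PNG comes out just under 2.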