Architecture meetings/RFC review 2014-04-16

2100-2200 UTC, April 16th.

Requests for Comment to review

 * 1) Requests for comment/Reducing image quality for mobile

Meeting summary

 * LINK: https://www.mediawiki.org/wiki/Architecture_meetings/RFC_review_2014-04-16 (sumanah, 21:03:18)
 * Today is probably going to be a short meeting - just 1 RfC on the agenda (sumanah, 21:03:27)
 * Reducing image quality for mobile (sumanah, 21:03:31)
 * LINK: https://www.mediawiki.org/wiki/Requests_for_comment/Reducing_image_quality_for_mobile (sumanah, 21:03:54)
 * I asked Yuri what he wanted: 1) an ok from ops to increase thumbnail storage by 2-3% and number of files by 15%, 2) from core/tim/etc to proceed with the proposed patch assuming my proposed path is satisfactory to everyone's involved (sumanah, 21:04:15)
 * LINK: https://gerrit.wikimedia.org/r/#/c/119661/ Gerrit changeset, "Allow mobile to reduce image quality" (sumanah, 21:09:33)
 * comments were provided on the image quality gerrit patch (TimStarling, 21:42:33)
 * LINK: https://www.mediawiki.org/wiki/Talk:Requests_for_comment/Reducing_image_quality_for_mobile#File_insertion_syntax on wikitext addition (sumanah, 21:42:56)
 * image scaler backend relatively uncontroversial -- HTML/URL manipulation to access that API is more complex (TimStarling, 21:43:27)
 * gwicke predictably favours Node.JS service (TimStarling, 21:44:42)
 * ok, all settled, will implement the first step (core patch), and start implementing JS magic (sumanah, 21:48:52)
 * required modifications: use string instead of integer "qlow-100px-image.jpg", make it JPG only (no png)  (yurik, 21:50:31)
 * Tim skeptical about client-side JS rewrite: potential for CPU usage, flicker, image load aborts, browser incompatibilities, etc. (TimStarling, 21:54:34)
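The naming rule recorded above (a string quality class prefixed to the usual thumbnail name, JPG only) can be sketched as a small parser. This is an illustration only, not the code in the Gerrit changeset: the regular expression, the `ThumbRequest` type, the `parse_thumb_name` helper and the single allowed class `qlow` are all assumptions made for the example.

```python
import re
from typing import NamedTuple, Optional

# Assumed set of quality classes; the meeting only mentions "qlow".
QUALITY_CLASSES = ("qlow",)

# Hypothetical pattern: optional quality class, then width, then the file name,
# e.g. "qlow-100px-image.jpg" or the normal-quality "100px-image.jpg".
THUMB_RE = re.compile(
    r"^(?:(?P<quality>" + "|".join(QUALITY_CLASSES) + r")-)?"
    r"(?P<width>\d+)px-(?P<name>.+)$"
)


class ThumbRequest(NamedTuple):
    quality: Optional[str]  # None means normal quality
    width: int
    name: str


def parse_thumb_name(thumb: str) -> Optional[ThumbRequest]:
    """Split a thumbnail file name into quality class, width and file name."""
    match = THUMB_RE.match(thumb)
    if match is None:
        return None
    quality = match.group("quality")
    name = match.group("name")
    # Reduced quality was agreed for JPG only (no PNG), so reject other types.
    if quality is not None and not name.lower().endswith((".jpg", ".jpeg")):
        return None
    return ThumbRequest(quality, int(match.group("width")), name)
```

Putting the quality class before the width keeps the existing `100px-image.jpg` form untouched for normal-quality thumbnails, which is the reason given in the log for preferring `qlow-100px-` over `100px-qlow-`.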


 * Next week - Associated namespaces (sumanah, 21:57:01)
 * LINK: https://www.mediawiki.org/wiki/Requests_for_comment/Associated_namespaces Next week David Cuenca wants to find out whether there are any objections to the "Namespace registry and association handlers" that Mark proposed, discuss possible problems with his proposed approach, and see if there would be any hands available to work on it. He mentioned that "I hope this RFC moves forward because it affects important upcoming and already deployed projects (Commons migration, templates, Visual editor, WD, etc)." (sumanah, 21:57:07)


 * RfC news (sumanah, 21:57:45)
 * LINK: https://www.mediawiki.org/wiki/Talk:Requests_for_comment/Simplify_thumbnail_cache Mark Bergsma and Aaron Schulz just left some comments on the "Simplify thumbnail cache" RfC - if you're into that one, check them out (sumanah, 21:58:03)
 * Pau Giner has updated his grid system RfC https://www.mediawiki.org/wiki/Talk:Requests_for_comment/Grid_system with more detail, and has submitted a patchset to Gerrit https://gerrit.wikimedia.org/r/#/c/125387/ so that the discussion can get more specific. Also see the example implementation http://pauginer.github.io/agora-grid/ (sumanah, 21:58:16)
 * http://www.gossamer-threads.com/lists/wiki/wikitech/451921 "REST and SOA within MediaWiki - is my understanding right?" includes gwicke saying, "ideally the only code that directly talks to the database would live in a storage service, which exposes a REST API." which refers to https://www.mediawiki.org/wiki/Requests_for_comment/Storage_service in case you want to take a look at that (sumanah, 21:58:34)
 * bd808 needs feedback on his structured logging patch - see http://lists.wikimedia.org/pipermail/wikitech-l/2014-April/075921.html (sumanah, 21:58:49)

Full log
See in HTML or see below.

21:02:02 #startmeeting RfC review: reducing image quality for mobile | Channel is logged and publicly posted (DO NOT REMOVE THIS NOTE). https://meta.wikimedia.org/wiki/IRC_office_hours
21:02:02  Meeting started Wed Apr 16 21:02:02 2014 UTC and is due to finish in 60 minutes. The chair is sumanah. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:02:02  Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:02:02  The meeting name has been set to 'rfc_review__reducing_image_quality_for_mobile___channel_is_logged_and_publicly_posted__do_not_remove_this_note___https___meta_wikimedia_org_wiki_irc_office_hours'
21:02:21 * sumanah waits for Brion
21:02:38 #chair sumanah TimStarling
21:02:38  Current chairs: TimStarling sumanah
21:03:13 #chair sumanah TimStarling brion
21:03:13  Current chairs: TimStarling brion sumanah
21:03:15 * brion waves
21:03:18 #link https://www.mediawiki.org/wiki/Architecture_meetings/RFC_review_2014-04-16
21:03:27 #info Today is probably going to be a short meeting - just 1 RfC on the agenda
21:03:31 #topic Reducing image quality for mobile
21:03:42  the patch seems quite different to what yurik and I discussed at the architecture summit
21:03:50 ( but brion TimStarling - I may ask some follow-up questions at the end about a few other RfCs and pending things)
21:03:54 #link https://www.mediawiki.org/wiki/Requests_for_comment/Reducing_image_quality_for_mobile
21:04:15 #info I asked Yuri what he wanted: 1) an ok from ops to increase thumbnail storage by 2-3% and number of files by 15%, 2) from core/tim/etc to proceed with the proposed patch assuming my proposed path is satisfactory to everyone's involved
21:04:19  I thought that you should have only quality classes exposed, not expose an API allowing any integer percentage quality
21:04:59 TimStarling, it would be fairly easy to change from a number to a string constant
21:05:11  you suggest 30% but probably every mobile app will choose something different
21:05:12 if this is a requirement of course
21:06:37 TimStarling, this is similar to the problem we face with the thumbnail dimension - every wiki varying images by a few pixels. I propose a somewhat different solution here - an extension that does filtering/rounding of these numbers during the rendering
21:07:04 thedj: dfoy_ - http://bots.wmflabs.org/~wm-bot/logs/%23wikimedia-office/20140416.txt for the logs up till now
21:07:21  I don't see any filtering or rounding in the patch
21:07:41 example: user requested 240x250 image - the ext would say 250x250 already exists, or it is a multiple of 50, hence render it as a link to 250x250, with width=240
21:08:00 * aude waves
21:08:11 yurik: is this something your extension would do? rather than core?
21:08:12 Hi :)
21:08:12 separate patch - as an extension - to address all such rounding requirements for both image size & quality
21:08:16  yeah, you can read my thoughts on that on the relevant RFC
21:08:47 aude, not ours, a new extension whose job is only to "standardize" on thumbnail generation
21:08:54  gah
21:08:59 but not core?
21:09:11 AaronSchulz: I presume you think that's the wrong approach :)
21:09:28 * brion just added comment on the patch agreeing with idea to use quality classes rather than expsoing full integer range
21:09:33 #link https://gerrit.wikimedia.org/r/#/c/119661/ Gerrit changeset, "Allow mobile to reduce image quality"
21:09:34 no, i think core should be more flexible - depending on the site
21:09:46 * aude prefers we allow any size, but not keep cached so long if it's not requested
21:09:58 if that's feasible
21:10:09  me too
21:11:37 can i ask what the primary purpose is ?
21:11:46 reduce time to load ?
21:12:01 thedj: honest question: does the RfC address that? do you think the RfC should be clearer about the problem being solved?
21:12:03 reducing quality? to lower bandwidth consumption
21:12:36 yurik: so download time and download cost ?
21:13:30 both
21:13:36 Do we have some metrics/ideas to give us indications of how much benefit that would translate into ?
21:13:43 especially when the bandwidth is donated
21:14:22 thedj, 30-40%
21:14:55 ah k. so it's to a large degree from the zero perspective that we want to do this.
21:15:01 correct
21:15:36 i could see it being handy for hi-dpi devices as well, we could serve the double-size images with a medium quality setting to trade-off brandwidth and visual quality
21:15:38 BTW, for those who haven't looked, we now have a few more comments on the changeset https://gerrit.wikimedia.org/r/119661 in the last few minutes
21:15:50 but definitely the incentive is where we’re pushing donated bandwidth :)
21:15:55 (there's our brion always looking out for responsive design & gadget stuff :) )
21:16:19 My comment was just that it shouldn't touch the -quality setting on pngs, and a nitpick on the commit message
21:17:11 once we move to HTMl storage, is the idea to implement this as a DOM post-processing step?
21:17:39 TimStarling, brion, please take a look at the https://www.mediawiki.org/wiki/Requests_for_comment/Reducing_image_quality_for_mobile#Possible_approaches
21:18:15 it discusses the 3 paths to do this, with 1 path doing everything internally without exposing it via URL
21:18:43 *nod* i was assuming the first pass implementation once the qualitys etting was available...
21:18:52  probably option 2
21:18:54 … was to do it as a dom postprocess step in mf+zero
21:19:09  that's not on the list
21:19:24 that's #3 i think
21:19:26 agh, i confused that with the js one
21:19:59 tim, you think it is better to let varnish do automagical image url rewrite?
21:20:19 * AaronSchulz prefers js if possible
21:20:31  how would it work with JS?
21:20:38 because we won't have as much info in varnish, plus we would have to put too much biz-logic in varnish (ops won't like it)
21:20:43 one issue I see with Varnish is transparent downstream caches
21:20:52 yes, that too
21:20:55  a DOM ready event?
21:20:55 the third option (JS) avoids that
21:21:03 JS would rewrite the URL
21:21:10 hmm
21:21:24 my main concern with that is rewriting urls in JS without often loading the original url is tricky
21:21:35 <TimStarling> I am wondering what the CPU requirements of option 3 are
21:21:37 <AaronSchulz> gwicke: related to downstream caches is handling purges
21:21:51 <TimStarling> and whether there will be flicker, browser incompatibilities, etc.
21:21:58 AaronSchulz, *nod*
21:22:02 <AaronSchulz> I guess if it's the very frontend cache it's fine
21:22:09 <TimStarling> we can't really waste the CPU of phones the same way we can desktop browsers
21:22:17 we'd have to send s-maxage-0
21:22:19 workflow:   zero ext changes src= to low quality,   JS changes it back to highres if device/network is good
21:22:21 =0
21:22:34 :\
21:22:45 how expensive is a JS image tag search?
21:22:55 it's pretty cheap I believe
21:23:07 replacing them may be slow if it’s a big page with lots of images though
21:23:19 one querySelectorAll call
21:23:24 and you’ve got the issue of loading the original images and then the new ones....
21:23:25 <TimStarling> image loading will start as soon as the img tag is created, right?
21:23:27 percentage wise i still think it won't be much
21:23:35 <AaronSchulz> TimStarling: I think so :/
21:23:45 yeah, I think that's the bigger issue
21:23:57 we have a similar issue with the thumb size pref
21:23:58 that's the big question - can the low->high quality img tag replacement be done before browser starts loadnig them?
21:24:07 <TimStarling> what about what brion said, why is that not an option?
21:24:19 <TimStarling> … was to do it as a dom postprocess step in mf+zero
21:24:22 if we can find a way to suppress the original thumb load before resizing / quality downgrading, then that would be awesome
21:24:46 TimStarling, we would have to do it anyway, but there will be users who would want high-end images
21:25:03 i think we’re trying to avoid having php-time cacheable differences on zero….. it’s all very scary
21:25:35 in general, trying to scale for estimated network bandwidth is just a tricky tricky business
21:26:45 tfinc: http://bots.wmflabs.org/~wm-bot/logs/%23wikimedia-office/20140416.txt for chat so far
21:27:18 there is another question - i am pretty sure there are many mobile users out there who don't have zero and who might want low bandwidth too
21:27:24 <TimStarling> what about having a separate new service to do DOM rewriting?
21:27:49 so we really should have a mobile setting "auto/always high/always low"
21:28:01 <TimStarling> yurik: those users can put up with what we give them
21:28:05 TimStarling, that's doable for low volume
21:28:26 which zero is afaik
21:28:57 what are the peak request rates on zero in pages / s ?
21:29:00 well, not those who are still on 2G, or who is paying high price for their internet.
21:29:24 <TimStarling> it's out of scope
21:29:31 yurik: then i'd want no images, if concerned about bandwidth (imho)
21:29:40 maybe my mobile browser allows that
21:29:45 Those who have questions for Max, he's here now
21:29:54 <TimStarling> the problem is complicated enough when it is just Zero
21:30:09 fwiw, MobileFrontend already has an Images on/off toggle
21:30:13 (OK, maybe today's meeting WON'T be a short one after all.)
21:31:17 <TimStarling> dr0ptp4kt: does it work?
21:31:26 <TimStarling> or do the images start loading and then get aborted?
21:31:34 TimStarling: it is completely rewritten html
21:31:40 it works
21:31:40 <MaxSem> it works via DOM rewriting on PHP side
21:31:53 there are ways to parse html without loading images, using https://developer.mozilla.org/en-US/docs/Web/API/DOMParser for example
21:32:27 or XMLHttpRequest
21:32:27 one caveat is supporting devices that don't support javascript, or rather "advanced javascript" as determined by rl
21:32:35 <TimStarling> gwicke: well, that's the kind of thing that I would expect to use a lot of client-side CPU
21:32:48 not really- it's using the normal html parser
21:32:59 it does rely on JS support though
21:33:05 and a non-sucky browser
21:33:13 HA!
21:33:20 ;)
21:33:26 <MaxSem> I don't think that many devices we want to support will work well with this
21:33:53 are you using XMLHttpRequest currently?
21:34:06 <MaxSem> libxml2
21:34:14 <MaxSem> be its name foreveer cursed
21:34:22 <TimStarling> can someone give me a quick overview of how HTML delivery in MF works and what the plans for it are?
21:34:42 we use xhr opportunistically. so it's usually to upgrade the experience, like avoid server roundtrips for newer phones
21:34:58 er, bigger roundtrips
21:35:15 <MaxSem> шеэы ыешдд мукн кщгпр щт увпуы
21:35:18 I see, so you are hesitant to require it
21:35:25 <TimStarling> preferably in a latin script
21:35:28 MaxSem, +2
21:35:34 <MaxSem> it's still quite buggy so is used only in alpha
21:35:54 sumana, would you please wire up a translation bot now? :)
21:36:02 <MaxSem> plans are to fix it
21:36:06 <MaxSem> ...eventually
21:36:11 <MaxSem> ...maybe
21:36:35 I don't see an issue with DOM post-processing on the server and storing that HTML back
21:36:36 yeah, the xhr for w0 is more like getting runtime config to do things ahead of caches being purged (e.g., add zero-rated support for an additional language)
21:36:58 dr0ptp4kt: I think here it would just emit those cartoon profanity things, like $%#%@
21:37:17 as long as there are only a few variants and the transforms build on a known DOM spec that should work well
21:37:52 gwicke, zero already does a DOM post-parse rewrite to replace all external URL links with special warning URLs
21:37:59 <MaxSem> I would reeeeeally love to avoid doing it in PHP again
21:38:21 it's fairly easy in JS
21:38:27 you can use jquery etc
21:38:37 <MaxSem> wouldn't be lethal for zero which already does HTML transformations, but still sucks
21:38:50 gwicke, assuming flip phone has it :(
21:38:59 yurik, I mean on the server
21:39:38 do we have a framework for node.js extensions?
21:39:57 yurik, we have HTTP..
21:40:08 set up a service, make requests to it
21:40:24 So we're about 2/3 through the hour and I'm not sure what to #info :)
21:40:46 gwicke, you mean PHP becomes a proxy to another service on internal network?
21:41:07 in any case, this is an optimization for the future, outside of the scope imho
21:41:19 <TimStarling> sumanah: three of us wrote comments on the gerrit change
21:41:36 yurik, you can go through PHP if you want; depends on whether it adds info that would be hard to get otherwise
21:42:04 <AaronSchulz> do we actually need the wikitext syntax addition too?
21:42:14 <MaxSem> definitely not
21:42:16 * AaronSchulz leans toward not adding it
21:42:17 i think we don’t need the wikitext addition no
21:42:28 keep it opaque to that layer
21:42:32 <AaronSchulz> right
21:42:33 <TimStarling> #info comments were provided on the image quality gerrit patch
21:42:33 it’s a presentation-layer decision
21:42:34 !link https://www.mediawiki.org/wiki/Talk:Requests_for_comment/Reducing_image_quality_for_mobile#File_insertion_syntax
21:42:50 er
21:42:53 -1 on the extra syntax
21:42:56 #link https://www.mediawiki.org/wiki/Talk:Requests_for_comment/Reducing_image_quality_for_mobile#File_insertion_syntax on wikitext addition
21:43:03 can I say #agreed ? :)
21:43:07 +1 on not adding extra options to file syntax
21:43:27 bawolff, how do you mean?
21:43:27 <TimStarling> #info image scaler backend relatively uncontroversial -- HTML/URL manipulation to access that API is more complex
21:43:38 we need to distinguish low-quality URLs from the highs
21:43:41 yurik: I'm agreeing with everyone
21:44:03 good position :)
21:44:05 yurik: as in not adding ui
21:44:15 gotcha
21:44:42 <TimStarling> #info gwicke predictably favours Node.JS service
21:44:48 <AaronSchulz> lol
21:44:53 hehe
21:45:06 that's in response to MaxSem's lament about libxml2 ;)
21:45:34 <MaxSem> having a service for that would be even more cruffty
21:45:39 ok, i will change the URL syntax to   image.jpg/100px-qlow-image.jpg   this way we can later change it to some other magic keywords
21:45:45 So it's sounding like people think this is a relatively uncontroversial idea overall and we're just talking about implementation, right?
21:46:08 * tfinc reads the backscroll
21:46:20 any objections to that URL format?
21:46:32 yurik: Maybe re-order those parameters. Easier to regex out qlow-100px from the actual name of the file
21:46:37 "this" being the RfC as a whole
21:46:53 <MaxSem> +1
21:46:54 since we're going to be presumably keeping 100px-image.jpg for the normal quality image
21:47:15 are we sure that we need a different URL?
21:47:21 <MaxSem> yes
21:47:29 <MaxSem> varnish rewrites are evil
21:48:16 do we already have info about zero ip ranges in varnish?
21:48:27 ok, all settled, will implement the first step (core patch), and start implementing JS magic
21:48:43 <MaxSem> gwicke, for all that is holy, don't
21:48:44 gwicke, yes, varnish detects zero based on ip
21:48:52 #info ok, all settled, will implement the first step (core patch), and start implementing JS magic
21:49:15 <MaxSem> especially since now only mobile varnishes know about zero
21:49:19 hmm, then it might not actually be that hard to use that for image request rewriting
21:50:31 #info required modifications: use string instead of integer "qlow-100px-image.jpg", make it JPG only (no png)
21:50:31 I'd be against adding that info if it wasn't there already; but since it's already there it seems that the extra complexity would be fairly limited
21:50:46 <TimStarling> varnish doesn't have a lot of string handling built in, but you can use inline C, I did it once...
21:51:17 <MaxSem> regexping it would actually be possiblee
21:51:34 <MaxSem> but still this would SUCK
21:52:00 I have a few min of "what's up next week + other RfC news you should be aware of" to say before the end of the hour.
21:52:04 Any closing statements?
21:52:39 <TimStarling> modules/varnish/templates/vcl/wikimedia.vcl.erb was my own little bit of varnish URL manipulation
21:53:30 if we are done, would love to get +2 for https://gerrit.wikimedia.org/r/#/c/109853/
21:54:34 <TimStarling> #info Tim skeptical about client-side JS rewrite: potential for CPU usage, flicker, image load aborts, browser incompatibilities, etc.
21:55:21 avoiding a double-load is hard afaik
21:55:41 <TimStarling> which is an argument for doing it on the server side
21:55:49 <AaronSchulz> yeah it may not be possible to use JS
21:55:50 or in Varnish
21:56:04 <AaronSchulz> so it's 1-2
21:56:18 <TimStarling> we have so many powerful tools on the server side now, we shouldn't be so keen to offload processing
21:56:36 ok, I'm gonna wrap up with a couple other #topics
21:57:00 for normal desktop page views the thumb size pref is pretty much the only one that can't be easily handled in CSS
21:57:01 #topic Next week - Associated namespaces
21:57:07 #link https://www.mediawiki.org/wiki/Requests_for_comment/Associated_namespaces Next week David Cuenca wants to find out whether there are any objections to the "Namespace registry and association handlers" that Mark proposed, discuss possible problems with his proposed approach, and see if there would be any hands available to work on it. He mentioned that "I hope this RFC moves forward because it affects important upcoming and already depl
21:57:08 oyed projects (Commons migration, templates, Visual editor, WD, etc)."
21:57:15 er: "it affects important upcoming and already deployed projects (Commons migration, templates, Visual editor, WD, etc).""
21:57:45 #topic RfC news
21:57:53 so if we can find a way to do this in Varnish it might be possible to implement those prefs purely in CSS
21:58:03 #link https://www.mediawiki.org/wiki/Talk:Requests_for_comment/Simplify_thumbnail_cache Mark Bergsma and Aaron Schulz just left some comments on the "Simplify thumbnail cache" RfC - if you're into that one, check them out
21:58:16 #info Pau Giner has updated his grid system RfC https://www.mediawiki.org/wiki/Talk:Requests_for_comment/Grid_system with more detail, and has submitted a patchset to Gerrit https://gerrit.wikimedia.org/r/#/c/125387/ so that the discussion can get more specific. Also see the example implementation http://pauginer.github.io/agora-grid/
21:58:34 #info http://www.gossamer-threads.com/lists/wiki/wikitech/451921 "REST and SOA within MediaWiki - is my understanding right?" includes gwicke saying, "ideally the only code that directly talks to the database would live in a storage service, which exposes a REST API." which refers to https://www.mediawiki.org/wiki/Requests_for_comment/Storage_service in case you want to take a look at that
21:58:49 #info bd808 needs feedback on his structured logging patch - see http://lists.wikimedia.org/pipermail/wikitech-l/2014-April/075921.html
21:59:24 And as always I welcome your suggestions of what RfCs to talk about in these meetings next - and who specifically needs to be in those chats so we can sometimes change the timing
21:59:27 That's all from me.
21:59:57 yurik: did you get what you wanted (somewhat) today? :)
22:00:03 MaxSem: ^ (same question)
22:00:13 TrevorParscal: do you have an RfC that needs chatting about sometime soon?
22:00:17 (for instance)
22:00:32 #endmeeting
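
The size-rounding idea raised in the log at 21:06:37 to 21:08:47 (render a standard size such as 250 and display it at the requested 240) can be sketched as follows. This is a minimal illustration under assumptions, not the proposed extension: the `standardize_width` helper is hypothetical, and the step of 50 is taken only from the "multiple of 50" example given in the discussion.

```python
from typing import Tuple


def standardize_width(requested: int, step: int = 50) -> Tuple[int, int]:
    """Return (rendered_width, display_width) for a requested thumbnail width.

    The thumbnail is rendered at the next multiple of `step` at or above the
    requested width, while the page still displays it at the requested width.
    """
    if requested <= 0 or step <= 0:
        raise ValueError("requested width and step must be positive")
    # Ceiling division without floats: -(-a // b) rounds a / b upwards.
    rendered = -(-requested // step) * step
    return rendered, requested
```

Rounding up rather than to the nearest step means the browser only ever scales an image down, at the cost of transferring slightly more data than an exact-size thumbnail would.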