User:Leucosticte/IRC

[17:02] #startmeeting
[17:02] TimStarling: Error: A meeting name is required, e.g., '#startmeeting Marketing Committee'
[17:03] == rfarrand [~rfarrand@guest-tan1.corp.wikimedia.org] has quit [Quit: Computer has gone to sleep.]
[17:03] #topic API roadmap | https://meta.wikimedia.org/wiki/IRC_office_hours | Please note: Channel is logged and publicly posted (DO NOT REMOVE THIS NOTE). | Logs: http://bots.wmflabs.org/~wm-bot/logs/%23wikimedia-office/
[17:03] #link https://www.mediawiki.org/wiki/Requests_for_comment/API_roadmap
[17:03] == lbenedix [~lbenedix@dslb-092-078-133-026.092.078.pools.vodafone-ip.de] has quit [Quit: Leaving.]
[17:04] == DarTar [~DarTar@wikimedia/DarTar] has quit [Quit: DarTar]
[17:04] do we have anomie and yurikR?
[17:04] == rdaiccherlb [~rdaiccher@wikimedia/rdicerb-wmf] has quit [Quit: Computer has gone to sleep.]
[17:04] TimStarling: I'm here
[17:04] yep
[17:05] == ryasmeen has changed nick to ryasmeen|Away
[17:05] == J-Mo [~jtmorgan@199.231.242.26] has joined #wikimedia-office
[17:05] == rfarrand [~rfarrand@198.73.209.5] has joined #wikimedia-office
[17:06] == DangSunM|cloud [sid13042@wikimedia/DangSunM] has quit [Ping timeout: 260 seconds]
[17:06] == Revi [sid12940@wikimedia/Hym411] has quit [Ping timeout: 260 seconds]
[17:06] can you tell us what has been done on this API work since the architecture summit?
[17:07] == alantz [~Anna@tan4.corp.wikimedia.org] has joined #wikimedia-office
[17:07] I've started working on the stuff in the document. I added Gerrit links to each item as patches got submitted, and moved a few things to a "completed" section.
[17:08] Since we're taking things slow as far as deprecation, some have their patches merged but need an analysis of whether people have actually changed their code.
[17:08] == TrevorParscal has changed nick to TrevorP|Away
[17:09] == Guest24138 [sid13042@gateway/web/irccloud.com/x-sygihooeqvhmvnnv] has joined #wikimedia-office
[17:10] you mean like token handling?
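The "token handling" referred to at 17:10 is the roadmap's consolidation of token fetching into a single action=query&meta=tokens request, replacing the older per-module token parameters. As a hedged illustration only (the endpoint URL and helper name below are for the example, not part of any client library), a client-side request under the new scheme looks roughly like this:

```python
from urllib.parse import urlencode

# Sketch of the consolidated token handling: one action=query request
# with meta=tokens fetches a CSRF token, which the client then passes
# to whichever write action needs it.
API = "https://en.wikipedia.org/w/api.php"  # any MediaWiki api.php endpoint

def token_request_url(api=API):
    """Build the URL that fetches a CSRF token under the new scheme."""
    return api + "?" + urlencode({
        "action": "query",
        "meta": "tokens",
        "type": "csrf",
        "format": "json",
    })

print(token_request_url())
```

The JSON response carries the token under query.tokens; the slow deprecation discussed above is why the merged patches still need an analysis of whether callers have actually switched over.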
[17:10] == rdaiccherlb [~rdaiccher@wikimedia/rdicerb-wmf] has joined #wikimedia-office
[17:10] Yes
[17:10] == James_F|Away has changed nick to James_F
[17:10] == Revi [sid12940@wikimedia/Hym411] has joined #wikimedia-office
[17:11] == ryasmeen|Away has changed nick to ryasmeen
[17:11] == alantz [~Anna@tan4.corp.wikimedia.org] has quit [Ping timeout: 258 seconds]
[17:12] == alantz [~Anna@tan1.corp.wikimedia.org] has joined #wikimedia-office
[17:12] it looks like you need code review on some changes
[17:12] Yes, I do
[17:15] is there anything else you need?
[17:15] == alantz [~Anna@tan1.corp.wikimedia.org] has quit [Client Quit]
[17:16] Not really.
[17:16] I'm still not too fond of the decision to go with format=json2 for https://www.mediawiki.org/wiki/Requests_for_comment/API_roadmap#Changes_to_JSON_output_format, but I still agree with the points you made at Wikimania that clean breaking is better than random mystery breaking.
[17:16] I have a set of API feature requests from Tomasz that he sent in June
[17:17] I would like to see those
[17:17] I'll forward
[17:17] == alantz [~Anna@tan1.corp.wikimedia.org] has joined #wikimedia-office
[17:17] the main one that is relevant is a request for "chain queries"
[17:18] "The fewer queries we have to send, the better it gets for our users' batteries."
[17:18] so I suppose we are talking about doing multiple actions in a single POST request
[17:19] == InezK_away has changed nick to InezK
[17:19] We already have generators for a common instance in action=query. Details on what other "chains" he's thinking of would be useful.
[17:19] yeah, he didn't give details, but I assume he knows about generators already
[17:20] == DarTar [~DarTar@wikimedia/DarTar] has joined #wikimedia-office
[17:21] == wisdom has changed nick to alpha
[17:22] don't forget that SPDY / HTTP2 is around the corner
[17:22] == alantz [~Anna@tan1.corp.wikimedia.org] has quit [Quit: Computer has gone to sleep.]
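To make the generator discussion above concrete: a generator chains a page-listing module into the property modules of the same action=query request, so the client gets both steps in one round trip. A minimal sketch of such a request URL (the helper function and endpoint are illustrative, not part of any client library):

```python
from urllib.parse import urlencode

# The chaining that action=query already supports: generator=links lists
# the pages linked from the given title, and prop=info then runs against
# that generated page set -- two logical steps, one HTTP round trip.
API = "https://en.wikipedia.org/w/api.php"

def generator_query_url(title, api=API):
    """Build a single request chaining a generator into a prop module."""
    return api + "?" + urlencode({
        "action": "query",
        "generator": "links",
        "titles": title,
        "prop": "info",
        "format": "json",
    })

print(generator_query_url("Main Page"))
```

This covers the "take results from one query, feed them into the next" case within action=query; the open question in the discussion is what Tomasz's "chain queries" would need beyond it.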
[17:23] <TimStarling> well, if we are just talking about doing several unconnected API queries in a row, that could be done with pipelining, if the client supported that
[17:23] which eliminates some of the issues that generators are designed to address
[17:23] gwicke: No it doesn't.
[17:23] <TimStarling> but what if you are taking some data from one query and using it in the next query?
[17:24] == DanielK_WMDE [~daniel@wikipedia/duesentrieb] has joined #wikimedia-office
[17:24] == alantz [~Anna@tan3.corp.wikimedia.org] has joined #wikimedia-office
[17:24] <TimStarling> then it could be arbitrarily complicated
[17:24] TimStarling: right, that is the bit that isn't addressed
[17:25] security is another relevant aspect to consider
[17:25] DOS in particular
[17:25] == alantz [~Anna@tan3.corp.wikimedia.org] has quit [Client Quit]
[17:25] == Jyothis [~Jyothis@wikipedia/Jyothis] has quit [Remote host closed the connection]
[17:26] == Jyothis [~Jyothis@wikipedia/Jyothis] has joined #wikimedia-office
[17:26] <TimStarling> you mean DOS by means of an expensive query batch?
[17:26] we shouldn't provide entry points that allow somebody to take down the API cluster by visiting some static web page with their cell phone
[17:27] there is a security bug with an example page
[17:28] == TrevorP|Away has changed nick to TrevorParscal
[17:28] #62615
[17:28] <yurikR> i would actually prefer to keep queries separate too
[17:28] == parent5446 [parent5446@mediawiki/parent5446] has joined #wikimedia-office
[17:29] <yurikR> if you want to chain requests, let's rely on http-level protocol
[17:29] == alantz [~Anna@tan4.corp.wikimedia.org] has joined #wikimedia-office
[17:29] == DarTar [~DarTar@wikimedia/DarTar] has quit [Quit: DarTar]
[17:29] <TimStarling> but splitting it up implies duplicated overhead
[17:29] <yurikR> if some data is needed for a consequent request, we either create a specific api that understands that (e.g. generators for query and other)
[17:29] or rely on gzip compression to take care of it
[17:30] == alantz [~Anna@tan4.corp.wikimedia.org] has quit [Client Quit]
[17:30] <yurikR> well, the overhead will be negligible if they reuse the same connection, plus caching might make it much more efficient
[17:30] <yurikR> with combining done on the api level, caching is totally busted
[17:30] == Jyothis [~Jyothis@wikipedia/Jyothis] has quit [Ping timeout: 240 seconds]
[17:30] == aharoni [~chatzilla@di8-33084.dialin.huji.ac.il] has quit [Remote host closed the connection]
[17:30] <TimStarling> I mean in varnish, apache and HHVM
[17:31] I don't think anybody is proposing to get rid of generators or chaining in general altogether -- it's just that we should be careful about what we use them for, and keep in mind how HTTP/2 affects the trade-offs
[17:31] <TimStarling> there is per-request overhead at each level
[17:31] <TimStarling> especially in HHVM/MW
[17:31] <JetLaggedPanda> re: Tomasz's requests, I think the problem there is action=mobileformat, which apps use (this was the reason for asking about pipelining, IIRC)
[17:31] <TimStarling> also in MySQL
[17:31] <JetLaggedPanda> and that doesn't support generators or anything
[17:31] <JetLaggedPanda> so over time slowly things have been tacked on to it
[17:32] * anomie sees no action=mobileformat on enwiki
[17:32] <TimStarling> there's a big difference in MySQL CPU usage between doing a single query that gets information about 100 pages, and doing 100 queries, one for each page
[17:32] <JetLaggedPanda> anomie: gah, action=mobileview
[17:32] == flyingclimber [~tfinc@wikipedia/Tfinc] has quit [Remote host closed the connection]
[17:32] TimStarling: the same is not necessarily true if each of those pages is stored on a different node
[17:33] == mhurd [~anonymous@tan4.corp.wikimedia.org] has quit [Quit: mhurd]
[17:33] <yurikR> JetLaggedPanda, there was a big change a while ago that allowed any module to use generators
[17:33] JetLaggedPanda: I'd have to look at what exactly action=mobileview is doing, but offhand it sounds like it needs any unique bits rolled into core. Much like a lot of MobileFrontend.
[17:33] <yurikR> so now the mobileview simply needs to be updated to use generators
[17:33] <JetLaggedPanda> anomie: i agree, yeah
[17:33] <TimStarling> we're not going to split storage across hundreds of nodes
[17:33] <JetLaggedPanda> yurikR: yeah, that would be good too, although perhaps it needs general query prop= as well
[17:33] perhaps not hundreds, but we already use dozens
[17:34] <JetLaggedPanda> yurikR: *also*, perhaps this could be solved by simply making mobileview html a prop= for action=query, but I guess that'll have caching implications
[17:34] <TimStarling> I don't think so
[17:34] <yurikR> JetLaggedPanda, yes, i think it should have been done that way :)
[17:35] TimStarling: I agree with your general point, it's just that it might not be an eternal truth to the same degree it's right now
[17:35] == tfinc [~tfinc@wikipedia/Tfinc] has joined #wikimedia-office
[17:36] == alantz [~Anna@tan3.corp.wikimedia.org] has joined #wikimedia-office
[17:37] <TimStarling> #info implementation by anomie is proceeding, some changes just need code review and merge
[17:37] there are for example wins in making more API requests static by storing or caching them; combined with different cost structures in HTTP/2 some applications might actually perform better when they do a few parallel requests vs. hitting a custom, uncached entry point
[17:38] <TimStarling> #info Tomasz requested a "chain query" feature, but we need specific requirements
[17:39] I see it more as a gradual shift
[17:39] <TimStarling> you can't cache API responses
[17:40] it's not technically impossible
[17:40] <TimStarling> maybe you could if it were REST, but it is too difficult to invalidate the multiple URL variants enabled by the action API
[17:40] Some API responses can be cached, mostly action=query. We already emit cache-control headers indicating what MediaWiki thinks about cacheability.
[17:41] True, people might have stale caches then.
[17:41] <TimStarling> the client requests cache-control headers
[17:41] <TimStarling> the client is explicitly requesting a stale cache since there is no way to update those caches once they are generated
[17:41] == James_F has changed nick to James_F|Away
[17:42] * gwicke nods
[17:42] == bearND [~bearnd@198.73.209.5] has quit [Remote host closed the connection]
[17:42] <TimStarling> maybe we could normalize requests in varnish...
[17:42] <DanielK_WMDE> there's a lot of stuff on that rfc page. perhaps it would be good to split it to ease discussion.
[17:42] The major opportunity for caching is revision content, for which gwicke is already working on a REST API specifically intended for heavy caching.
[17:43] <DanielK_WMDE> the way things are structured now, i'm afraid some high profile discussions may drown out talk about some finer points
[17:43] I think it might be worth looking for other resources that could potentially be cacheable with the right URL structure
[17:43] and have the right granularity / access pattern for this to make sense
[17:43] <TimStarling> even with normalization, you still have things like rvprop
[17:44] == bearND [~bearnd@guest-tan1.corp.wikimedia.org] has joined #wikimedia-office
[17:44] <TimStarling> with REST, you just send all the data, but with api.php, each application will request a different rvprop
[17:44] == TrevorParscal has changed nick to TrevorP|Away
[17:45] <TimStarling> so even in that simple case, you multiply the cache space requirement by several
[17:46] == alantz [~Anna@tan3.corp.wikimedia.org] has quit [Quit: Computer has gone to sleep.]
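TimStarling's point about rvprop multiplying cache variants can be shown in a few lines: the same logical request, with its multi-value parameter in a different order, yields a different URL and therefore a separate cache object, unless something normalizes the request first. The function below is only a sketch of the "normalize requests in varnish" idea, not actual Varnish configuration:

```python
# Two requests for the same revision data, differing only in the order of
# the rvprop multi-value parameter: byte-for-byte different URLs, so a
# cache keyed on the raw URL stores two copies of identical data.

def normalize_multivalue(value):
    """Canonicalize a pipe-separated multi-value parameter (sketch only)."""
    return "|".join(sorted(value.split("|")))

a = "content|ids|timestamp"
b = "timestamp|content|ids"

assert a != b  # two distinct cache keys without normalization
assert normalize_multivalue(a) == normalize_multivalue(b)  # one key with it
print(normalize_multivalue(b))  # -> content|ids|timestamp
```

Even with ordering normalized, each distinct *set* of requested props is still a separate variant, which is Tim's remaining objection; a REST resource that always returns all the data has exactly one variant to purge.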
[17:46] yeah, it only makes sense if the number of variants is more limited
[17:46] which is something we could try to move towards for newer modules
[17:46] where the trade-offs make sense
[17:47] <TimStarling> for purging, imagine if you had to send an HTCP purge request for each rvprop combination
[17:47] == alantz [~Anna@tan4.corp.wikimedia.org] has joined #wikimedia-office
[17:47] DanielK_WMDE: There's basically no discussion happening there at the moment, so I doubt anything is being drowned out. Although at some point (not now) I'd still like to hear your thoughts on what makes things like ApiResult::setIndexedTagName hard for you to use (without getting into redesigning the whole thing around a forest of objects, that was discussed enough at Wikimania IMO).
[17:48] returning more props by default would probably not make a big difference in request size, and could still result in a faster response if the response is cached in exchange
[17:49] == alantz [~Anna@tan4.corp.wikimedia.org] has quit [Read error: Connection reset by peer]
[17:49] == Guest24138 [sid13042@gateway/web/irccloud.com/x-sygihooeqvhmvnnv] has quit [Changing host]
[17:49] == Guest24138 [sid13042@wikimedia/DangSunM] has joined #wikimedia-office
[17:49] == alantz [~Anna@tan1.corp.wikimedia.org] has joined #wikimedia-office
[17:49] == Guest24138 has changed nick to DangSunM|cloud
[17:49] there are some entry points where the choices could perhaps be reduced a bit without major ill effects
[17:50] <TimStarling> #info gwicke suggests we consider a gradual shift towards greater edge caching coupled with the use of SPDY, as a replacement for batches embedded in single queries (incl. generators)
[17:50] that's overstating it quite a bit
[17:51] <TimStarling> the meetbot command is unprivileged, you can do your own #info if you like
[17:52] == alantz [~Anna@tan1.corp.wikimedia.org] has quit [Read error: Connection reset by peer]
[17:52] #info s/as a replacement for batches/as a replacement for *some* batches and expensive generators/
[17:53] == alantz [~Anna@tan3.corp.wikimedia.org] has joined #wikimedia-office
[17:53] == PPena [~PPena@tan1.corp.wikimedia.org] has quit [Quit: Computer has gone to sleep.]
[17:54] <TimStarling> should we mark this RFC as approved?
[17:54] I also think that we could do some of the assembly and orchestration in an intermediate layer
[17:54] == JetLaggedPanda has changed nick to YuviPanda|zzz
[17:54] Netflix for example has been doing something like that: http://techblog.netflix.com/2012/07/embracing-differences-inside-netflix.html
[17:54] == awight [~adamw@wikimedia/Adamw] has quit [Remote host closed the connection]
[17:55] * Krenair thinks we should
[17:55] <TimStarling> I think the reason RFCs don't get approved is that we worry that by marking an RFC approved, we are approving every little aspect
[17:55] It's fine with me to mark it as approved; I've been treating it that way for a while now.
[17:56] The only drawback might be that it might discourage further discussion and further things for my "TODO" list.
[17:56] == mhurd [~anonymous@tan2.corp.wikimedia.org] has joined #wikimedia-office
[17:56] == moizsyed [~moizsyed@tan1.corp.wikimedia.org] has joined #wikimedia-office
[17:57] == kaity|away has changed nick to kaity
[17:57] <TimStarling> yeah, maybe it makes sense for something this complex to be a living document
[17:57] We could move the living document portion of it out of the RFC, although I'm not sure what would be left in the RFC then.
[17:57] <DanielK_WMDE> ...or factor out some parts that can be considered agreed on and treated as a "plan".
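The "intermediate layer" mentioned above (and described in the linked Netflix post) can be sketched as an orchestration tier that fans out to a few simple, individually cacheable backend resources and assembles a client-specific payload. Everything below is hypothetical (the resource paths, helper names, and data are invented for illustration; no such Wikimedia service is implied):

```python
# Hypothetical orchestration tier: the device-facing endpoint requests a
# few simple, individually cacheable backend resources and assembles the
# client-specific payload itself, instead of the core API exposing a
# custom, uncacheable do-everything entry point.

def fetch(resource):
    # Stand-in for a backend call; a real tier would issue these as
    # parallel HTTP requests (cheap under SPDY / HTTP/2).
    backend = {
        "page/Main_Page/html": "<p>...</p>",
        "page/Main_Page/meta": {"length": 7, "touched": "2014-09-17"},
    }
    return backend[resource]

def assemble_for_mobile(title):
    """Combine backend resources into one response for a mobile client."""
    key = "page/" + title
    return {
        "title": title,
        "html": fetch(key + "/html"),
        "meta": fetch(key + "/meta"),
    }

print(assemble_for_mobile("Main_Page"))
```

The design trade-off under discussion: each backend resource stays simple enough to cache and purge at the edge, while per-client shaping moves into the orchestration tier.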
[17:57] <yurikR> gwicke and I just spoke about caching a bit, and it seems ideally we should somehow cache certain requests, and devise a well established way to flush them when they become obsolete
[17:57] == jhobs [~jhobson@tan3.corp.wikimedia.org] has quit [Ping timeout: 246 seconds]
[17:57] <TimStarling> it suggests a status flow "in draft" -> "archived complete" for big RFCs
[17:58] == jhobs [~jhobson@tan2.corp.wikimedia.org] has joined #wikimedia-office
[17:58] <yurikR> this caching won't apply to every api request, but we really ought to move in that direction
[17:58] <DanielK_WMDE> what does "archived complete" mean?
[17:58] <DanielK_WMDE> "we are done talking"?
[17:58] <TimStarling> it means it will be listed at https://www.mediawiki.org/wiki/Requests_for_comment/Archive#Implemented
[17:59] <TimStarling> yes, which means we are done talking
[17:59] == alantz [~Anna@tan3.corp.wikimedia.org] has quit [Quit: Computer has gone to sleep.]
[17:59] == Jeff_Green [~jgreen@wikipedia/jgreen] has left #wikimedia-office []
[18:00] <TimStarling> we presumably won't discuss archived RFCs in public IRC meetings or architecture committee meetings
[18:00] == alantz [~Anna@tan1.corp.wikimedia.org] has joined #wikimedia-office
[18:00] <TimStarling> for the API roadmap, the work could theoretically be eternal
[18:00] <DanielK_WMDE> yea, makes sense
[18:00] == ori [~ori@wikipedia/ori-livneh] has joined #wikimedia-office
[18:00] <TimStarling> but I prefer to see RFCs as change requests that can be approved and completed
[18:01] == zz_MissGayle [~gyoung@ec2-50-112-50-28.us-west-2.compute.amazonaws.com] has joined #wikimedia-office
[18:01] == zz_MissGayle has changed nick to MissGayle
[18:01] <DanielK_WMDE> in such a case, the goal of the rfc is not to implement a feature, but to agree on a general plan
[18:01] == MissGayle [~gyoung@ec2-50-112-50-28.us-west-2.compute.amazonaws.com] has quit [Changing host]
[18:01] == MissGayle [~gyoung@wikimedia/gyoung] has joined #wikimedia-office
[18:01] == PPena [~PPena@tan2.corp.wikimedia.org] has joined #wikimedia-office
[18:01] <TimStarling> maybe the RFC should be called "API roadmap 1"
[18:01] DanielK_WMDE: I think that's a good summary of RFCs in general.
[18:01] <TimStarling> which can be marked approved
[18:02] <TimStarling> then while that is being implemented, an "API roadmap 2" RFC can be the parking lot for design of the next batch of features
[18:03] <TimStarling> then we can schedule a meeting to discuss "API roadmap 2" and we will know that that means we are looking forward not back
[18:03] <AaronS> heh
[18:03] <TimStarling> you know it is nice when people don't have to read so much
[18:03] == James_F|Away has changed nick to James_F
[18:03] == mhurd [~anonymous@tan2.corp.wikimedia.org] has quit [Quit: mhurd]
[18:04] <TimStarling> Daniel complained about the RFC being big already, but it has a lot of complete stuff mixed with plans for the near future, plus a few plans for the somewhat more distant future
[18:04] <DanielK_WMDE> anomie: that's my understanding too, but the final status is currently called "implemented". That's a lot more than "agreed on a plan".
[18:05] <TimStarling> we have "accepted" also
[18:05] == Ltrlg [~ltrlg@ppp-seco21th2-46-193-173-180.wb.wifirst.net] has quit [Quit: Leaving.]
[18:05] So, to summarize: RFC is approved, the "living document" aspect should be abstracted out into a project page of some sort (I'll do that), and when we have enough of a backlog of non-trivial changes we'll make a new RFC (I'll probably do that too when the time comes).
[18:05] == parent5446 [parent5446@mediawiki/parent5446] has left #wikimedia-office ["wikimedia-office"]
[18:05] <TimStarling> yeah, makes sense I think
[18:06] == moizsyed [~moizsyed@tan1.corp.wikimedia.org] has quit [Remote host closed the connection]
[18:07] == kaity has changed nick to kaity|away
[18:07] == alantz [~Anna@tan1.corp.wikimedia.org] has quit [Quit: Computer has gone to sleep.]
[18:07] == kaity|away has changed nick to kaity
[18:07] <TimStarling> #action anomie to abstract the "living document" aspect of the RFC out to a project page
[18:07] <TimStarling> ok, anything else before I end the meeting?
[18:07] == kristenlans [~kristenla@wikimedia/KLans-WMF] has quit [Quit: kristenlans]
[18:07] Not from me, I was about to leave the meeting anyway
[18:08] == bearND [~bearnd@guest-tan1.corp.wikimedia.org] has quit [Remote host closed the connection]
[18:08] <TimStarling> #endmeeting
[18:08] <TimStarling> oh, there was no meeting started, oops
[18:09] <TimStarling> #action TimStarling to start meeting properly in future so that we have logs