Architecture meetings/RFC review 2013-12-18

Wednesday, December 18, 2013 at 10:00 PM UTC.

Requests for Comment to review

 * Requests for comment/Localisation format
 * Requests for comment/PHP web service interface
 * Requests for comment/Json Config pages in wiki

Meeting summary

 * 1) wikimedia-meetbot Meeting

Meeting started by drdee at 22:03:29 UTC (full logs).

 * https://www.mediawiki.org/wiki/Architecture_meetings/RFC_review_2013-12-18 (bd808, 22:05:57)
 * http://etherpad.wikimedia.org/p/RFC%20review (bd808, 22:06:37)
 * Localisation format RFC (drdee, 22:09:13)
    * https://www.mediawiki.org/wiki/Requests_for_comment/Localisation_format (bd808, 22:09:25)
    * ACTION: RoanKattouw to remove groups (TimStarling, 22:24:58)
    * ACTION: RoanKattouw to look at the number of stat calls and consider optimisations (TimStarling, 22:34:34)
 * PHP web service interface (TimStarling, 22:37:39)
    * https://www.mediawiki.org/wiki/Requests_for_comment/PHP_web_service_interface (TimStarling, 22:37:46)
    * https://www.mediawiki.org/wiki/Requests_for_comment/Services_and_narrow_interfaces (gwicke, 22:42:40)
    * ACTION: AaronSchulz to propose an API (TimStarling, 22:51:13)
    * ACTION: AaronSchulz to survey existing HTTP client libraries for ideas and potential bundling (TimStarling, 22:59:12)

Meeting ended at 23:15:26 UTC (full logs).

Action items

 * 1) RoanKattouw to remove groups
 * 2) RoanKattouw to look at the number of stat calls and consider optimisations
 * 3) AaronSchulz to propose an API
 * 4) AaronSchulz to survey existing HTTP client libraries for ideas and potential bundling
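
The second action item follows from the suggestion in the log to batch freshness checks "by storing a timestamp and only checking once every, say, 1 minute". A minimal sketch of that idea is given here, written in Python for brevity rather than as MediaWiki PHP. The class name `ThrottledFreshnessCheck`, its parameters, and the injectable clock are hypothetical and do not come from the RFC or from any CacheDependency code.

```python
import os
import time


class ThrottledFreshnessCheck:
    """Stat a set of files at most once per `interval` seconds (hypothetical helper)."""

    def __init__(self, paths, interval=60.0, clock=time.monotonic):
        self.paths = list(paths)
        self.interval = interval
        self.clock = clock
        self.stat_calls = 0        # number of stat() calls made, for before/after comparison
        self._last_check = None    # clock value at the last real check
        self._mtimes = {}          # path -> mtime recorded at the last real check

    def is_stale(self):
        """Return True if any file differs from what the last real check recorded."""
        now = self.clock()
        if self._last_check is not None and now - self._last_check < self.interval:
            return False           # still inside the window: skip every stat call
        self._last_check = now
        changed = False
        for path in self.paths:
            self.stat_calls += 1
            mtime = os.stat(path).st_mtime_ns
            if self._mtimes.get(path) != mtime:
                changed = True
            self._mtimes[path] = mtime
        return changed
```

The very first call reports stale because nothing has been recorded yet. The trade-off is the one implied in the discussion: fewer stat calls on slow storage, in exchange for edits to a message file going unnoticed for up to one interval.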

Action items, by person

 * 1) AaronSchulz
    * AaronSchulz to propose an API
    * AaronSchulz to survey existing HTTP client libraries for ideas and potential bundling
 * 2) RoanKattouw
    * RoanKattouw to remove groups
    * RoanKattouw to look at the number of stat calls and consider optimisations
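
For context on RoanKattouw's two items: the log describes a loader that visits every configured directory, ignores group names, and for a non-English language reads both that language's file and `en.json`. The sketch below illustrates that behaviour in Python. It is not the WIP PHP implementation; the function name `load_messages`, its signature, and the single-step `("en",)` fallback chain are assumptions made for the example.

```python
import json
import os


def load_messages(directories, lang, fallbacks=("en",)):
    """Merge messages for `lang` from every directory, falling back per message key.

    `directories` may be a flat list of paths or a dict of group name -> path.
    Group names are ignored, as in the behaviour described in the log.
    """
    if isinstance(directories, dict):
        paths = list(directories.values())
    else:
        paths = list(directories)
    # Lowest-priority language first, so later files override earlier ones.
    chain = [code for code in reversed(fallbacks) if code != lang] + [lang]
    messages = {}
    for code in chain:
        for directory in paths:
            filename = os.path.join(directory, code + ".json")
            if os.path.exists(filename):
                with open(filename, encoding="utf-8") as handle:
                    messages.update(json.load(handle))
    return messages
```

Because the keys of `directories` are never inspected, dropping group names later (or adding them back) would not change what this loader returns, which is the compatibility argument made in the meeting.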

People present (lines said)

 * 1) RoanKattouw (83)
 * 2) gwicke (71)
 * 3) TimStarling (62)
 * 4) parent5446 (34)
 * 5) siebrand (20)
 * 6) ori-l (17)
 * 7) James_F (14)
 * 8) AaronSchulz (10)
 * 9) Nikerabbit (9)
 * 10) drdee (9)
 * 11) bd808 (8)
 * 12) MaxSem (6)
 * 13) robla (5)
 * 14) Nemo_bis (3)
 * 15) meetbot-wm (3)

Generated by MeetBot 0.1.4.

Full log
22:03:29 #startmeeting
22:03:29 <meetbot-wm> Meeting started Wed Dec 18 22:03:29 2013 UTC. The chair is drdee. Information about MeetBot at https://bugzilla.wikimedia.org/46377.
22:03:29 <meetbot-wm> Useful Commands: #action #agreed #help #info #idea #link #topic.
22:04:50 It appears there is no architect present.
22:05:01 yup, let's wait a couple of more minutes
22:05:06 <Nemo_bis> engineers are usually happy about that
22:05:57 #link https://www.mediawiki.org/wiki/Architecture_meetings/RFC_review_2013-12-18
22:06:10 I sms'ed TIm just now
22:06:28 thanks robla
22:06:37 #link http://etherpad.wikimedia.org/p/RFC%20review
22:07:36 <TimStarling> hi, sorry about that
22:07:49 #chair TimStarling
22:07:49 <meetbot-wm> Current chairs: TimStarling drdee
22:07:55 hi Tim
22:08:01 are we good to go?
22:08:30 <TimStarling> yes
22:08:33 first RFC Localisation format?
22:09:09 <TimStarling> ok
22:09:13 #topic Localisation format RFC
22:09:23 <TimStarling> I did write a few comments about this one on the talk page
22:09:25 #link https://www.mediawiki.org/wiki/Requests_for_comment/Localisation_format
22:09:37 OK, so how exactly are groups being done here. There was a brief mention of message prefixing in Discussion
22:09:54 If messages are separated into groups, how does core know which messages are where?
22:10:06 RoanKattouw: Can you comment?
22:10:18 <RoanKattouw> Yeah so the message groups are mostly for future application
22:10:32 <RoanKattouw> In the WIP implementation that I wrote, they are ignored
22:10:41 Groups are not relevant for the PHP implementation. All messages are still in the server side localisation cache.
22:10:55 <RoanKattouw> That is to say, you can specify multiple directories with JSON files in them, and you're required to name each of them as a group
22:11:04 James_F: we are still drafting those, so better to wait until Jan
22:11:08 <James_F> gwicke: OK.
22:11:09 <RoanKattouw> But the PHP message loader doesn't actually care that there are multiple directories or what their names are
22:11:16 <RoanKattouw> It just visits all of them and extracts all the messages
22:11:51 <RoanKattouw> In the future, I think that message grouping could be useful as a replacement for messages arrays in ResourceLoader definitions, or at least for us to identify which messages are needed in the frontend
22:12:43 <TimStarling> groups are less flexible than message lists
22:12:50 <RoanKattouw> That's true
22:12:58 <RoanKattouw> I'm not convinced that we'll use them for this purpose yet
22:13:03 <TimStarling> say, if someone needs one group plus one message from another group
22:13:16 <TimStarling> they might be inclined to grab both groups
22:13:23 <RoanKattouw> I think it came up while discussing a potential future RfC for changing how we do client-side localization (moving to jquery.i18n perhaps)
22:13:25 <RoanKattouw> Yeah, that's a valid concern
22:13:40 <RoanKattouw> I personally am fine with dropping the group names and just making it a flat array
22:13:47 <MaxSem> woudn't per-language i18n files make things slower on wikis without manual cache rebuild?
22:13:48 <RoanKattouw> That wouldn't even break the code I wrote
22:13:57 <RoanKattouw> MaxSem: Why would they?
22:14:10 <MaxSem> more stats?
22:14:21 <RoanKattouw> On the topic of group names for another second, does anyone object to dropping the group names?
22:14:31 <RoanKattouw> Asking for the opinions of the RfC co-authors in particular
22:14:34 <Nikerabbit> I thought one point of the message groups was to allow automatic prefixing... but was that already moved out of the RfC?
22:14:34 <James_F> RoanKattouw: We're we planning to use them for the follow-up RfCs?
22:14:37 <MaxSem> admittedly, I'm not very knowledgeable in LocalisationCache
22:14:41 <RoanKattouw> Because if no one is particularly attached to them, let's just kill them
22:14:46 I agree that the best idea for now would be to drop group names.
22:14:47 <James_F> RoanKattouw: The automatic prefixing in particular, but other things too.
22:14:47 I think the groups also remove some maintenance burden on the developers. I wouldn't drop it, as adding it back later may prove to be a lot of work.
22:14:48 <TimStarling> MaxSem: no, it's a good point
22:15:02 <RoanKattouw> James_F: Yeah but I'm not convinced they're particularly useful. Autoprefixing could be useful, though, yes
22:15:21 <James_F> Adding groups later could be a real pain, as siebrand says.
22:15:24 <RoanKattouw> siebrand makes a good point, it's easy to remove them later but hard to add them later
22:15:27 If you add groups now but don't do something like auto-prefixing, adding in auto-prefixing later will be a lot harder than adding in groups later.
22:15:51 What would auto prefixing accomplish?
22:16:00 And auto prefixing what exactly?
22:16:06 <Nikerabbit> of message keys
22:16:09 <Nikerabbit> not that much I think
22:16:14 <TimStarling> MaxSem: there won't be a huge number of extra stats, just the length of the fallback sequence
22:16:15 <RoanKattouw> Auto prefixing of message keys so you could share the same i18n file between different applications
22:16:22 <TimStarling> it'll be like loading core messages
22:16:26 <RoanKattouw> But I don't see much benefit in that
22:16:40 <RoanKattouw> (Let's hold the stat discussion for just one minute)
22:17:00 <RoanKattouw> You could just as well prefix the messages with the name of your application/extension
22:17:01 <James_F> RoanKattouw: It's so the author of an extension doesn't need to be prescient about what other extensions may be called, I thought.
22:17:11 <RoanKattouw> And that would presumably be unique enough in any context you integrated it into
22:17:15 <James_F> RoanKattouw: And the same messages be used in MW and non-MW context easily.
22:17:39 If you implement groups now and do *not* include a method of telling the software which messages are where, adding in a method to do that later will be very difficult.
22:17:50 Auto-prefixing is one of those methods.
22:18:30 <RoanKattouw> To be clear, auto-prefixing is not primarily intended as a way to tell the software which messages are where
22:18:32 I don't like auto prexing. namespaces or domains: possibly..
22:18:36 <RoanKattouw> Although it could be used that way to optimize loading
22:18:45 Then there is also the future possibility that, eventually, (even if out of scope for this current RFC), that the CDB cache will be split based on groups.
22:18:46 <RoanKattouw> Anyway
22:18:49 auto prefixing is a pain, because for example special page names, etc. do not participate in the prefixes.
22:18:53 So that'll be a pain to implement properly.
22:19:01 <RoanKattouw> It's clear that there is no consensus whatsoever for auto-prefixing
22:19:09 That'll significantly delay the implementation of the RfC as it is phrased now.
22:19:19 <RoanKattouw> We haven't discussed it properly and it's out of scope of this RfC
22:19:26 +1
22:19:33 <RoanKattouw> So let's keep things in scope
22:19:33 <James_F> So…
22:19:41 <RoanKattouw> Should we have group names or should we not have them?
22:20:06 <RoanKattouw> I argue that until we have a use for them, we should not have them now, and perhaps have them later. They will be optional for b/c, and using purely numeric group names will be forbidden
22:20:11 Like I said before, unless there is a comprehensive plan on exactly what to do with groups other than just as a means of organizing messages, they should not be added.
22:20:13 <RoanKattouw> That way we can distinguish between flat arrays and named groups
22:20:43 <RoanKattouw> If any group-related name-mangling is to happen, that needs to be introduced at the same time as the grouping system, otherwise b/c will be a massive pain
22:20:53 RoanKattouw: Can we have both with little effort?
22:21:01 <RoanKattouw> Both of what?
22:21:29 RoanKattouw: support for group names, or a flat array.
22:21:45 <RoanKattouw> Yes, that's what I'm saying
22:21:55 <RoanKattouw> We should support both in both directions
22:22:02 RoanKattouw: Okay, I must have missed some text.
22:22:12 <RoanKattouw> My current WIP implementation supports both groups and flat arrays because it completely ignores the array keys
22:22:32 <RoanKattouw> Any future implementation of groups should be tolerant of flat arrays without group names (hence the ban on numbers as group names)
22:23:20 is may result in reduced capabilities to not have group names, but that would be future functionality that will not harm existing code.
22:23:31 s/is may/It may/
22:23:48 <Nikerabbit> I think a concerete example here would make this clearer
22:24:09 <Nemo_bis> like the one bd808 added?
22:24:23 ULS currently implements it's own API class to serve a JSON file through RL.
22:24:24 <RoanKattouw> Does anyone object to dropping groups (both the subject in this meeting, and from the RfC) at this point?
22:24:28 <TimStarling> ok, can the implementation omit group names for now, since it's obvious that that's the only solution parent5446 wants, and everyone else seems to be content with it?
22:24:41 <TimStarling> then we can move on to the next issue
22:24:44 <James_F> Sure.
22:24:45 In the future, we see this file being served by making a generic request.
22:24:47 <RoanKattouw> Yeah let's discuss it later
22:24:58 <TimStarling> #action RoanKattouw to remove groups
22:25:05 <RoanKattouw> siebrand: I think that should work completely different anyway. But that's a different discussion for a different day and a different RfC
22:25:09 <RoanKattouw> *differently
22:25:16 <RoanKattouw> OK, so MaxSem said something about stats
22:25:36 <RoanKattouw> There was a concern that splitting languages into separate files would harm performance for wikis without pre-built caches
22:25:39 <TimStarling> yeah, for non-english page views, you would expect this feature to roughly double the number of stats
22:26:01 <TimStarling> since say fr.json and en.json will both have to be checked for freshness
22:26:03 <RoanKattouw> I haven't tested fallbacks with my code yet
22:26:15 <RoanKattouw> But my code doesn't exhibit this behavior
22:26:18 <MaxSem> also, cache rebuilds would be slower but that's not critical
22:26:25 <RoanKattouw> I'm also not clear on how MessageCache handles fallbacks
22:26:35 <Nikerabbit> Why would they be slower?
22:26:45 <RoanKattouw> They wouldn't necessarily be slower overall
22:26:48 If this would turn out to be an issue (not sure if it is), we could always do what jQuery i18n allows: having all languages in one file as a fallback.
22:26:57 <RoanKattouw> What would slow them down is the need to open more files (both fr.json and en.json for French)
22:27:10 <RoanKattouw> However, the amount of data it has to read in is still 100x less for an extension
22:27:30 <RoanKattouw> Because ExtensionName.i18n.php contains the messages for all 200+ languages and you can't selectively read from it
22:27:48 RoanKattouw: MessageCache doesn't really handle fallbacks at the moment. This only really affects LocalisationCache.
22:27:57 <MaxSem> this is access speed vs. latency. I'm all for making SSDs a requirement for MW:)
22:27:59 <RoanKattouw> Right, sorry I meant to say LocalisationCache
22:28:07 Ah sorry.
22:28:12 <RoanKattouw> My bad
22:28:24 <RoanKattouw> I'm not entirely unconfused as to how the i18n system in core works :)
22:28:28 <TimStarling> well, the case you have to think about is NFS
22:28:47 <TimStarling> since a lot of shared web hosting is apparently done over NFS or some equivalent slow network storage
22:29:07 <RoanKattouw> I see now that I have made a mistake in my implementation and that fallbacks will most likely be broken
22:29:19 <Nikerabbit> In that case one would hope they do manual localisation cache rebuilds
22:29:42 <James_F> RoanKattouw: That's why it's WIP. :-)
22:29:44 * RoanKattouw -1s his own code
22:29:53 <TimStarling> Nikerabbit: you mean someone technically competent who also uses shared hosting instead of a VPS?
22:29:53 <MaxSem> Nikerabbit, if they only had shell access...;)
22:30:04 <TimStarling> I'm not sure such people exist...
22:30:11 <RoanKattouw> Right, so we'd roughly double stats for them
22:30:38 <RoanKattouw> Which hurts the freshness checks
22:30:52 <Nikerabbit> TimStarling: I'm confident some of them would able to read and follow a documentation that states it can make MediaWiki faster
22:31:04 <RoanKattouw> I wonder if we can get the mtime of the directory instead of the individual files? I'm not quite sure what the semantics of that are
22:31:39 <TimStarling> what if we batch the checks, by storing a timestamp and only checking once every, say, 1 minute?
22:31:39 `$stat = stat('\path\to\directory');`
22:31:49 <TimStarling> RoanKattouw: no, you can't
22:32:02 <TimStarling> the mtime of the directory is only updated when a file is created or removed
22:32:07 <RoanKattouw> Blegh
22:32:09 <James_F> Helpful.
22:32:09 <RoanKattouw> Thanks UNIX
22:32:28 <Nikerabbit> do we have any idea how big issue the stat calls can be?
22:32:36 <RoanKattouw> TimStarling: That sounds like a reasonable idea. We can probably work some magic in a custom CacheDependency subclass
22:32:50 <RoanKattouw> Nikerabbit: Not until we try it on a slow NFS setup? :)
22:33:01 <Nemo_bis> Uh! At last a use for gluster
22:33:02 <RoanKattouw> More seriously, we should compare stat calls before and after
22:33:05 We could implement a manual stat, i.e., have a file in the directory called mtime.txt or something. Every time the cache file is changed, update that file. Then it will act as a pseudo-mtime for all files in the directory.
22:33:05 <RoanKattouw> hahahaha
22:33:12 Not the cleanest solution but a possibility.
22:33:30 <RoanKattouw> parent5446: There's no need for that, CacheDependency will let us do nicer things
22:33:48 Ah OK. Didn't think about CacheDependency.
22:33:50 <RoanKattouw> Anyway
22:34:04 <Nikerabbit> I'm worried that we are spending a lot of effort on fine-tuning stat calls why other parts of the code have bigger effect...
22:34:19 <RoanKattouw> Are we agreed that I'll look at the number of stat calls and maybe write a CacheDependency subclass if we need it?
22:34:34 <TimStarling> #action RoanKattouw to look at the number of stat calls and consider optimisations
22:34:37 Yep
22:34:38 <RoanKattouw> Niklas is right, we don't even know if this is an issue or if it'll be eclipsed by something else
22:35:06 <RoanKattouw> Although this is the one thing that happens on every request (freshness check) so that's not a great thing to slow down
22:35:10 <RoanKattouw> Alright
22:35:14 <RoanKattouw> What else
22:35:20 "While handling JSON may be slower than using PHP (we have no benchmarks on this)"
22:35:30 Can we get benchmarks on this?
22:35:41 There's a stat on every request for every message group?
22:35:42 <ori-l> json_encode is often faster than serialize, I've found
22:35:43 RoanKattouw: you can always concatenate those json files, store an offset index and check for updates every <n> accesses so that those stats are amortized
22:36:03 not much fun, but doable..
22:36:04 <James_F> We now have VE messages in parallel in i18n.php and *.json, so theoretically.
22:36:07 <TimStarling> should we go to the next RFC?
22:36:21 <TimStarling> we've just about got enough time to fit another one in
22:36:22 <James_F> Is getting benchmarks needed?
22:36:34 <ori-l> no, IMO.
22:36:39 <RoanKattouw> Not really IMO
22:36:43 <James_F> OK, saves another action item.
22:36:45 <RoanKattouw> It may make sense to measure the entire process
22:36:48 The one thing I'm concerned is that using json_encode will cause much more memory usage.
22:36:57 Since the entire file is loaded into memory before being parsed.
22:36:58 <RoanKattouw> But we don't care terribly about the recache operation, since it writes to a cache
22:37:06 <RoanKattouw> And so it's done infrequently
22:37:08 <James_F> In that case, move to the next RfC.
22:37:12 <RoanKattouw> IMO the freshness check is more important
22:37:13 <ori-l> yeah, let's do another one
22:37:16 parent5446: Is that not the case with a php file?
22:37:19 next RFC PHP web service interface ?
22:37:22 * ori-l was late to the party and wants in on some RFC action
22:37:27 <RoanKattouw> Yeah if no one else has questions about this one, let's move on
22:37:32 bd808: No, it's read and parsed incrementally.
22:37:36 And yeah let's just move on.
22:37:39 <TimStarling> #topic PHP web service interface
22:37:45 <RoanKattouw> Link to RFC?
22:37:46 <TimStarling> #link https://www.mediawiki.org/wiki/Requests_for_comment/PHP_web_service_interface
22:37:47 Thanks for the comments and discussion, everyone.
22:37:48 <RoanKattouw> I haven't seen this one
22:38:01 that's still at an early draft stage
22:38:04 So I mentioned this on the discussion page, but this is literally Guzzle.
22:38:21 Oh if it's too early to discuss we can leave it for later.
22:38:31 Aaron and me have been discussing implementation options
22:38:39 <TimStarling> so this is coming out of the cloudfiles work?
22:39:00 partly, that's what Aaron is working on
22:39:17 my motivation is making it easy to work with web services from PHP
22:39:18 <ori-l> I think the name is misleading. You're proposing a generic, library-like load-balancing function that takes a collection of URLs as input, right?
22:39:44 <RoanKattouw> I think there is a lot of context missing from this RfC
22:39:58 <RoanKattouw> Does this replace the routing engine in MW?
22:40:02 ori-l, it works on paths; the storage backends map those to URIs
22:40:10 For the record, Guzzle has a really nice tool where it keeps an array of various web services, all with different configurations.
22:40:13 <RoanKattouw> Would, say, /wiki be one of the paths that would be matched?
22:40:15 it can also support full URIs, but that is not the main motivation
22:40:19 <TimStarling> is AaronSchulz actually online?
22:40:51 <ori-l> gwicke: so an expanded ArrayUtils::consistentHashSort, right?
22:41:01 parent5446: there are nice features in guzzle, it just does not seem to be certain that the implementations would work for us
22:41:06 * AaronSchulz is around, yes
22:41:19 that is an implementation question though
22:41:27 the RFC is more about the API than the implementation
22:41:54 <ori-l> what existing services would we port to use the API?
22:42:00 <TimStarling> do we have an immediate second application, or would it just be an abstraction of cloudfiles stuff?
22:42:01 ori-l: how load balancing is implemented depends on the backend handler
22:42:05 <ori-l> it'd be helpful to have a list; that way we can identify commonalities
22:42:29 TimStarling: my motivation is the storage service and related service apis
22:42:40 https://www.mediawiki.org/wiki/Requests_for_comment/Services_and_narrow_interfaces
22:42:47 to be discussed later
22:43:43 <ori-l> there is functionality in the objectcache classes for doing this that I have wanted to use in the past (I can't remember what for, frustratingly) and that I found to be too tightly coupled to objectcache specifically
22:43:44 there are already a bunch of web services that we are using including swift, parsoid, the math service and the pdf renderer
22:43:59 there will be more, so making is easy to work with them might be a good idea
22:44:06 <TimStarling> ok, well that is what I want to see on the RFC, I think
22:44:09 <TimStarling> a list of subclasses
22:44:32 you mean a list of services to abstract over?
22:44:53 the API is intended to be open-ended regarding handlers
22:45:01 <TimStarling> well, the RFC has
22:45:04 <ori-l> existing classes that would be ported to use this API and projected classes that would use it
22:45:05 <TimStarling> / General Rashomon storage service for all remaining buckets
22:45:06 <TimStarling> $wgStoreBackends['/'] = new RashomonBackend ( array (
22:45:28 <TimStarling> are you saying RashomonBackend is not a subclass of something in this RFC? it is its own thing?
22:45:45 that is an implementation detail to be figured out
22:45:55 IMO we won't need subclassing there
22:46:01 implementing an interface would be enough
22:46:02 Sorry, but I have no idea what Rashomon is.
22:46:17 <TimStarling> presumably a storage service
22:46:31 parent5446: https://www.mediawiki.org/wiki/Requests_for_comment/Storage_service
22:46:49 it is the revision storage service we wrote for HTML storage
22:46:54 <TimStarling> if you need more applications, maybe you could include EhcacheBagOStuff?
22:47:22 anything that speaks HTTP basically, and is worth making more convenient to work with
22:47:56 * gwicke looks up EhcacheBagOStuff
22:47:58 <TimStarling> I'm just worried that if the only immediate application is swift, it will end up looking swift-like, and everything that uses it in the future will have to fit into a swift-like API
22:48:22 <TimStarling> unless the API is planned to be solely following HTTP?
22:49:03 it is supposed to be a very convenient and parallel way to do HTTP
22:49:30 <TimStarling> ok, so who is going to write the API, because that is going to be an action item
22:49:41 <ori-l> tying it to HTTP seems a bit odd
22:49:45 <ori-l> what do you gain by that?
22:49:51 the idea is to put effort into the design of the HTTP APIs so that they don't need much extra wrapping apart from some convenience like auth, Content-MD5 etc
22:49:52 <TimStarling> ori-l: that's just what it is
22:49:54 <ori-l> rather, what would something more generic not be able to provide?
22:49:56 <TimStarling> an HTTP client
22:50:10 <TimStarling> more generic things can be built on top of it
22:50:20 * AaronSchulz doesn't want things to be too generic
22:50:28 <ori-l> hmmm
22:50:29 the paths can map to non-http stuff too
22:50:32 <TimStarling> like MaxSem's key/value store
22:50:32 <AaronSchulz> you end up not able to assume anything, or something very complex
22:50:35 but the abstraction is still HTTP-like
22:50:46 paths, headers and HTTP verbs
22:50:48 <ori-l> so the thought is that this would steer people toward designing restful services with nice APIs?
22:51:09 <ori-l> i like that, but it makes it especially important that the API be intuitive, easy, and well-documented
22:51:13 <TimStarling> #action AaronSchulz to propose an API
22:51:17 that too
22:51:53 ori-l: see the problem statement
22:52:12 so any comments on the API that is proposed in the RFC?
22:52:47 <TimStarling> that's half an API
22:53:05 <TimStarling> not even half
22:53:24 I don't see an API so much as a list of lists
22:54:01 you might be able to extrapolate past the things that are not spelled out explicitly
22:54:14 <TimStarling> it's a bit tricky since I'm only reading this for the first time in this meeting
22:54:35 *nod*
22:54:45 <TimStarling> but I don't get what $wgStoreBackends is and what a generic store is for
22:54:48 It's definitely going to be a bit restricting having requests represented as arrays rather than proper objects with properties.
22:55:02 <TimStarling> if it's an HTTP client, it shouldn't need configuration, it should be configured by its constructor
22:55:31 TimStarling, it is a service client that is close to HTTP
22:55:33 <ori-l> I think the idea is that $wgStoreBackends is Rashomon-specific, but the class (not included in the RFC) that transforms the array of URLs into an API is what is being proposed
22:55:55 The internal implementation could use lists but it should have a builder interface of some sort I would think.
22:56:05 but the configured services are additionally load-balanced and can have some more service-specific behavior
22:56:10 s/lists/arrays/
22:56:14 they might not actually speak http to the backend for example
22:56:22 dispatching is based on paths
22:56:31 So it's HTTP-oriented but not necessarily HTTP-backed.
22:56:48 <TimStarling> ok, well I'm sure an API with class names and methods in classes will make this clearer
22:56:49 the default storage service is closely related to the bucket idea in the storage service RFC
22:57:25 TimStarling: are you mainly interested in the API backends should implement?
22:57:30 <TimStarling> AaronSchulz: have you surveyed existing HTTP client libraries in PHP?
22:57:44 AKA, Guzzle
22:57:51 ;)
22:58:20 the store API is fairly thin so far
22:58:33 a run method and maybe some convenience wrappers for get/post etc
22:58:39 and arrays as input
22:58:56 <AaronSchulz> I don't think we anything too complex so far
22:59:12 <TimStarling> #action AaronSchulz to survey existing HTTP client libraries for ideas and potential bundling
22:59:15 * AaronSchulz remembers encountering guzzle in some AWS code...it's bit complex
22:59:18 the interesting stuff would happen in the handlers selected by path prefix
22:59:44 <AaronSchulz> unless a good portion of it's features were useful and non-trivial I wouldn't bother
22:59:54 Guzzle really isn't complex at all. In fact, if you're working with an actual REST API, most of the configuration can be done in JSON.
22:59:57 * AaronSchulz doubts that the auth stuff would be adaquate
23:00:10 It also has OAuth support and other auth.
23:00:13 * AaronSchulz is talking about source code
23:01:09 one alternative: https://github.com/kriswallsmith/Buzz
23:01:28 the basic functionality is already pretty much there in the curl_multi clients we have been using
23:01:37 (I know nothing about it other than quick search for Guzzle alternatives)
23:01:41 afaik much of the missing stuff is auth
23:02:02 * AaronSchulz snickers at https://github.com/kriswallsmith/Buzz/blob/master/lib/Buzz/Client/MultiCurl.php
23:02:28 I don't know about Guzzle's source code, but as a library it is definitely feature-ful and easy to use. There are very few things it does not support that we would need (except for, of course, non-HTTP backends).
23:02:57 parent5446, it might or might not be useful as a backend- we'll see
23:03:31 IMO that should not matter at this point
23:03:42 But it's not a backend. It has an API to use. If you're thinking of covering up Guzzle with *another* API then it's pointless.
23:04:26 if you don't see any value in the abstractions mentioned in the RFC, then yes
23:04:52 <TimStarling> alright, any other action items for this RFC?
23:05:12 <TimStarling> we're out of time now
23:05:54 the hope was to get some feedback on the backend abstractions in the RFC
23:06:11 <AaronSchulz> gwicke: well you could build it over guzzle, though that would be pretty heavy
23:06:15 we can just go ahead and implement parts of it though
23:06:39 AaronSchulz: yup
23:07:05 <TimStarling> I don't understand the bit about storage backends
23:07:58 the backends do whatever is needed to convert a path into an URI and actual request data (in case it is actually HTTP)
23:07:58 generally speaking, I would hope we either take off-the-shelf components, or play to win (i.e. create a library that many others would want to use)
23:08:17 so they can massage headers, handle auth, encode query strings etc
23:08:29 ^this. The reason I'm pushing Guzzle so much is because it would require little, if any, new code inside of MediaWiki itself.
23:08:31 do load balancing and fail-over
23:08:50 or use a non-HTTP transport
23:09:03 if that can be integrated into the curl event loop
23:09:28 <TimStarling> and how does this relate to the idea of a generic curl_multi client?
23:09:53 <TimStarling> would it a subclass or just be integrated?
23:09:55 can you be more specific?
23:10:10 they are backend handlers that just implement an interface
23:10:21 for auth they might want to cache some state
23:10:25 my fear whenever someone talks about doing a "lightweight" abstraction is that it gets heavyweight pretty quickly, and no effort is put into making it any sort of standard so bitrot sets in
23:10:27 so using an object seems to be reasonable
23:11:32 <TimStarling> would you have a class with an event loop, which holds a collection of backends to delegate certain logic to?
23:11:58 that would have a method to massage new requests and responses and possibly some to handle errors and maybe retries
23:12:40 the requests are all handled in the curl_multi event loop
23:12:51 trivially so if everything is actually handled by curl itself
23:13:19 by doing additional calls out of the curl_multi loop in case we also use other transports that have a similar 'do some work' interface
23:13:52 in the simple http case, all it does is complete the request with auth etc and pass it to curl
23:14:06 on the way back maybe do some error handling and retry
23:14:14 that's pretty much it
23:15:17 <TimStarling> ok, well if you could write that on the RFC, and ideally propose an interface, that would be helpful
23:15:26 <TimStarling> #endmeeting
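
The meeting closed with a request for a concrete interface. As a reading aid only, the shape described in the log (a thin store with "a run method and maybe some convenience wrappers for get/post", "arrays as input", and handlers "selected by path prefix") might look roughly like the Python sketch below. Every name in it (`Store`, `EchoBackend`, `handle`, the `/images/` prefix) is hypothetical, and the sketch runs requests one after another rather than in a curl_multi event loop. It is not the API that the action item asks AaronSchulz to propose.

```python
class EchoBackend:
    """Stand-in backend (hypothetical): reports how it would handle a request."""

    def __init__(self, name):
        self.name = name

    def handle(self, request):
        # A real handler would add auth, map the path to a URI, retry on error, etc.
        return {"backend": self.name, "method": request["method"], "path": request["path"]}


class Store:
    """Dispatch dict-shaped requests to the backend with the longest matching path prefix."""

    def __init__(self, backends):
        # Longest prefix first, so '/images/' is tried before the catch-all '/'.
        self.backends = sorted(backends.items(), key=lambda item: len(item[0]), reverse=True)

    def run(self, requests):
        """Handle a batch of requests sequentially and return responses in order."""
        responses = []
        for request in requests:
            for prefix, backend in self.backends:
                if request["path"].startswith(prefix):
                    responses.append(backend.handle(request))
                    break
            else:
                raise LookupError("no backend for " + request["path"])
        return responses

    def get(self, path):
        """Convenience wrapper around run() for a single GET."""
        return self.run([{"method": "GET", "path": path}])[0]
```

For example, `Store({"/": EchoBackend("default"), "/images/": EchoBackend("images")}).get("/images/a.png")` is answered by the `images` backend, while any other path falls through to the catch-all `/` entry, which mirrors the `$wgStoreBackends['/']` fragment quoted in the log.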