Architecture meetings/RFC review 2015-08-19

Topic: https://phabricator.wikimedia.org/T107595

Actions / infos:

[14:53] #action daniel to describe blob storage services in more detail
[14:53] #info my observation/summary: to me, i guess the first step is to establish that there are, indeed, multiple editable independent (and hence non-derived) streams per revision .. and so far, based on the discussion, it seems that there are.
[14:57] #action daniel to write more on the RFC about what benefits this system would provide for derived data
[14:57] #action DanielK_WMDE_ to clarify interaction with services like RESTBase
[14:58] #info James_F thinks B/C adds complexity with minimal benefits. Brion thinks that 3rd party code that accesses the database directly should die.
[14:58] #info may be good to specifically mention access APIs
[14:59] #info gwicke strongly cheers for establishing clear APIs for the storage layer
[15:00] #info gwicke prefers the association of data with revisions to be programmatic, rather than materialized in the sql database
[15:00] #info gwicke sceptical about the utility of the indirection between blob and per-blob metadata
[15:01] #endmeeting

Full IRC log:

[14:08] DanielK_WMDE_: would you like to give an intro?
[14:08] marktraceur might be able to restart meetbot
[14:08] gwicke: yea, can do
[14:09] I can't get logged into tools-dev or tools-login at all; probably the maintenance they are doing...
[14:09] ok, let's start then
[14:09] So, I'd like to get a first round of feedback today on my proposal for supporting multiple content streams per page
[14:09] #link https://phabricator.wikimedia.org/T107595
[14:10] The idea is that we want to have a) multiple user-editable content objects on a single page (e.g. wikitext plus structured data for categories plus extra info for, say, the lead image for mobile)
[14:10] ...and b) we'd want to permanently store various derived kinds of data for a given revision (rendered html, diff, blame map, etc)
[14:11] To allow this, I propose to introduce another level of indirection between the revision record and the actual data blob (currently in the text table or external store)
[14:11] we now have: page -> revision -> text (-> ext store)
[14:12] <DanielK_WMDE_> we would then have: page -> revision -> slots -> urls; urls can refer to the text table, or ext store, or whatever other storage mechanism we like, e.g. RESTbase
[14:13] <DanielK_WMDE_> any questions about the idea so far?
[14:13] the capability to store multiple bits of content per revision is definitely something we need
[14:13] * aude thinks this would be nice from a caching perspective
[14:13] reminds me of resource forks on classic mac os :)
[14:14] <DanielK_WMDE_> brion: NTFS calls them streams, I think
[14:14] the same need prompted the creation of RESTBase in the first place
[14:14] e.g. we need to invalidate the site links html on wikidata (because of dom change), but not have to invalidate everything
[14:14] <TimStarling> what do you think will be the first use case?
[14:14] i'm a little nervous about being able to update 'derived' slot data
[14:14] <DanielK_WMDE_> TimStarling: structured data associated with file description pages would be an obvious use case for two "primary" content objects
[14:15] * brion likes immutable things
[14:15] <DanielK_WMDE_> brion: either we have updates, or we have sub-revisions. think of the derived data as a persistent parser cache
[14:15] basic idea of multiple data blobs sounds useful for many cases
[14:15] *nod*
[14:15] <DanielK_WMDE_> (which would be one function it may actually have)
[14:15] we have a lot of use cases in RB land, like HTML, data-parsoid, data-mw, revscore data, derived mobile html, etc
[14:15] it seems like the problem of managing multiple resources gets a lot harder with the strictly linear revision model MW currently has
[14:15] <TimStarling> sub-revisions?
[14:16] this does bring us back to classic debates such as 'should viewing an old revision show you the old versions of the images/templates'
[14:16] and this extends into current revisions & template/etc updates
[14:16] <DanielK_WMDE_> robla: with respect to the page history, any edit to any of the resources would create a new revision
[14:16] DanielK_WMDE_: edit conflicts will be a lot more fun! :-)
[14:16] while template updates for example would probably not create a new revision
[14:17] they'd only add a new 'render'
[14:17] <DanielK_WMDE_> if the page has streams A, B, and C, revision 1 would be (A1, B1, C1). If an edit changes only stream B, the second revision would be (A1, B2, C1). The unchanged slots would point to the same data blobs again
[14:18] <TimStarling> sure
[14:18] <TimStarling> and each slot is immutable?
[14:18] <DanielK_WMDE_> robla: the code for displaying diffs and handling conflicts would need to be extended to support multiple objects.
[14:18] <DanielK_WMDE_> robla: but it's not a new problem. it's like git doing a diff/patch over multiple files
[14:18] <DanielK_WMDE_> it's not conceptually difficult. but yes, code needs to be written
[14:18] <TimStarling> you call it a subrevision but it has a new rev_id, it is not like we have a subrevision ID
[14:18] the main cases we are talking about here are a) primary data, and b) derived data
[14:18] <TimStarling> it is an actual revision
[14:19] derived data would normally be updated whenever the primary is updated
[14:19] adding revisions to the visible page history on template/image update would drive editors mad, I worry
[14:19] <TimStarling> it will appear in page history as such
[14:19] the relationship between primary data items is more interesting
[14:19] but it might actually be a good idea
[14:19] <DanielK_WMDE_> TimStarling: well, there are user-editable slots, which would be immutable. and there are derived slots, which are mutable. The data url associated with that slot of the revision would be replaced with a different data url
[14:20] <TimStarling> if you have structured data which is not derived from wikitext, and allow that to be edited, then you need to represent that in history
[14:20] DanielK_WMDE_: would it be too complicated to have the mutable and immutable slots be distinct?
[14:20] eg stored/accessed separately
[14:20] <DanielK_WMDE_> TimStarling: my current idea is to not have sub-revisions, but to update derived data of revisions silently, just like we update the parser cache silently when templates change
[14:20] i can see convenience in putting them together (one common infrastructure)
[14:20] TimStarling: it depends on how central that information is to the article
[14:20] but also the other way (immutable storage can be "really" immutable, archived to different disks, etc)
[14:20] for example, it is debatable whether a lead image update is critical to the article content
[14:20] brion: i think having images/template updates in page history is something that users would want (generally)
[14:20] <DanielK_WMDE_> brion: it would add complexity, but it's a possibility, I think
[14:20] <TimStarling> lead images should be reviewable
[14:20] * aude imagines a filter for it though
[14:21] TimStarling: +2
[14:21] aude: +2 :)
[14:21] <DanielK_WMDE_> brion: my idea was to code the distinction between mutable and immutable deep into the storage service.
[14:21] yeah, but that doesn't necessarily mean that everything has to be the same kind of 'edit'
[14:21] it just means that it needs to be trackable & reviewable
[14:21] <DanielK_WMDE_> the storage layer would just refuse to update primary (user editable) slots
[14:21] <DanielK_WMDE_> there would be a flag for that in the database
[14:21] DanielK_WMDE_: so same API to access them, but potentially could be separate backend storage (something that the frontend wont have to worry about)
[14:22] <DanielK_WMDE_> TimStarling: lead images would count as primary content. it's not derived data. changing them would create a new revision.
[14:22] DanielK_WMDE_: annotations about the page content too?
[14:22] well today, lead images are extracted from the list of images available in a file -- they literally are derived data
[14:22] so what happens if something that's derived today becomes content tomorrow?
[14:22] <DanielK_WMDE_> brion: the storage of the actual blob could be configured per slot. wikitext could go one place, wikibase json another, html yet another. media content could live on the file system directly.
[14:23] <DanielK_WMDE_> data urls are flexible. that's another layer of abstraction that i have only hinted at in the rfc, since i didn't want to overburden it
[14:23] <TimStarling> so derived data is not treated as an archive, for backups etc.?
[14:23] <DanielK_WMDE_> gwicke: annotations could be stored separately, sure
[14:23] TimStarling: depends on the interest
[14:24] there is interest in HTML dumps, for example
[14:24] <DanielK_WMDE_> TimStarling: that's my current thinking, yes: primary data is archived and revied, derived data is treated much like the parser cache.
[14:24] <DanielK_WMDE_> doesn't even have to be persistent, if that is not desired.
[14:24] dropping caches can be hell on performance though
[14:24] <DanielK_WMDE_> of course
[14:24] i would recommend keeping the data unless it's actually invalidated :)
[14:25] <DanielK_WMDE_> yes, i just mean to say that the architecture can accommodate volatile as well as persistent data
[14:25] And speaking of invalidation... I'm envisioning suppression could get a little messy
[14:25] <DanielK_WMDE_> csteipp: why?
[14:25] <DanielK_WMDE_> you could still suppress a revision, just like you do now
[14:25] derivation makes suppression definitely more interesting
[14:25] <Krenair> People might want to be able to suppress parts of content rather than the whole content?
[14:26] there are more dependencies to track
[14:26] Right, if something derived includes user_text, then renames / suppression gets hard. But we can manage it.
[14:26] <DanielK_WMDE_> Krenair: that's a new feature, which would be doable, but i'm not sure whether it's terribly useful, or worth the cost
[14:26] <DanielK_WMDE_> csteipp: just like the parser cache.
[14:27] it's a bit more interesting than that
[14:27] once you store derived content that's composed, you have to track those historical dependencies
[14:27] that's what we realized while working through this in RB
[14:28] <DanielK_WMDE_> gwicke: yes, any content blob can depend on any other content blob
[14:28] <DanielK_WMDE_> dependency tracking would happen slot-to-slot.
[14:28] nice
[14:29] <DanielK_WMDE_> implementing this is no requirement though. we don't *have* to store html for all revisions. we don't have to put any html in there at all. that's just one possible use case.
[14:29] <DanielK_WMDE_> full fine grained dependency tracking would be cool, but poses some challenges wrt scalability.
[14:29] <DanielK_WMDE_> blob-to-blob dependencies can easily go into the billions
[14:30] we'll see what we can do ;)
[14:30] it's definitely not a trivial problem
[14:30] <DanielK_WMDE_> one thing i'm wondering about is backwards compat of the database schema.
[14:31] <James_F> Is it even possible?
[14:31] would extensions be able to add arbitrary slots to pages?
[14:31] 'easy' way -- store primary slot in text table, others in other table ...
[14:31] <DanielK_WMDE_> do we still want the revision table to point to the text table at least for the "main" slot, so tools working directly against the database wouldn't break?
[14:31] legoktm: or some service
[14:31] <DanielK_WMDE_> but on labs, the text table is useless anyway...
[14:31] tools working directly against the database should die
[14:31] <DanielK_WMDE_> legoktm: yes.
[14:31] at least for reading text :)
[14:31] it's fairly easy to set up a service that keys on title/revision
[14:31] <TimStarling> no, the easy way is to use rev_text_id for the main content
[14:31] <DanielK_WMDE_> brion: +1
[14:31] <James_F> What if the "main" content changes?
[14:32] <James_F> E.g. move from wikitext to HTML.
[14:32] <TimStarling> there's no need to have a second text table
[14:32] (late to the party .. ignore if already addressed) there seems to be some similarity in this rfc and what restbase wants to do .. but i suppose this is more a proposal to change core mediawiki storage abstractions, and not just about a storage implementation?
[14:32] brion: yeah
[14:32] <James_F> Are we just kicking that down the road?
[14:32] James_F: ideally, you only switch format on new revisions
[14:32] <DanielK_WMDE_> TimStarling: yes. for b/c, we could duplicate the link to the main content there
[14:32] means you keep a wikitext parser around forever of course
[14:32] <James_F> brion: Is that ideal? Yeah. :-(
[14:32] <DanielK_WMDE_> for consistency, i'd also want to have it in the new table
[14:32] James_F: immutable data 4evahhhh
[14:32] we have it in separate storage right now
[14:33] <DanielK_WMDE_> subbu: exactly. in my mind, RESTBase would be one of the storage mechanisms used by core
[14:33] <James_F> brion: But "main type is X for rev < A, Y for A < rev < B, Z for rev >= B…
[14:33] subbu: yeah imo the abstraction & how it affects our internal and external apis is the important part
[14:33] DanielK_WMDE_: one question I had on the task is whether you see this as a wrapper for every service providing revision-related data out there from MW's perspective
[14:33] DanielK_WMDE_, brion thanks .. (will read backlog after).
[14:33] <James_F> gwicke: Do you mean, would we also want to do page properties like this?
[14:34] <TimStarling> the obvious alternative to this proposal is to have multipart content for what we are calling "primary content", and keep derived content merely linked, like it is now
[14:34] James_F: well, type of the main revision in a particular item should probably be able to change over time (even from arbitrary rev to rev maybe)
[14:34] <DanielK_WMDE_> gwicke: yes, pretty much. i mean, nothing keeps some 3rd party service from providing extra data associated with a page revision, without it being recorded in the db. but we'd get the infrastructure that could record any such association of extra content
[14:34] <James_F> brion: Where would we store that knowledge?
[14:34] <James_F> brion: Just the current CH type?
[14:34] James_F: type should be stored along with the revision i think, conceptually at least :D
[14:35] * brion rereads
[14:35] <TimStarling> with multipart content, the interface changes would be limited to EditPage and its consumers, instead of also touching Revision
[14:35] <DanielK_WMDE_> TimStarling: it would be nice to at least have a standard infrastructure for storing such associated data, and for storing and querying the links to the revisions.
[14:35] <DanielK_WMDE_> which is pretty much what i'm proposing
[14:35] DanielK_WMDE_: we are looking into storing wikitext in Cassandra as well
[14:35] <DanielK_WMDE_> brion, James_F: type already is stored with the revision. then it would be stored per slot per revision.
[14:35] nothing short term, but it's a possibility
[14:36] DanielK_WMDE_, what are the non-derived streams in a revision that you envision, concretely?
[14:36] <TimStarling> the RFC says
[14:36] metadata, html, images, are all derived data .. from wikitext.
[14:36] <TimStarling> about why not to use multipart...
[14:36] <TimStarling> "1. they are more flexible and more efficient with respect to blob storage"
[14:36] <TimStarling> which could be addressed at the storage layer
[14:36] <DanielK_WMDE_> gwicke: yea, you can implement blob storage services based on Cassandra, or whatever you like
[14:36] <TimStarling> "2. they avoid breaking changes to APIs the allow access to raw page content, by presenting the content of "main" slot there per default. Attempting the same with multi-part revisions would lead to round-trip issues when only the main part of the content gets posted back from an edit."
[14:36] there is definitely a lot of overlap between this RFC and RESTBase
[14:37] <TimStarling> which could be addressed by having a reassembly layer between the API and Revision
[14:37] it's basically moving the MW-internal storage to a model that's closer to RESTBase's
[14:37] <DanielK_WMDE_> subbu: wikitext (obviously), media info (associated with file description pages), lead image data, possibly tags and categories (no more need to put them into the wikitext)
[14:38] <DanielK_WMDE_> gwicke: yes, I think the two go well together.
[14:38] <James_F> And TemplateData, Graphs.
[14:38] <TimStarling> why is it nice to have a standard infrastructure for derived data?
[14:38] <James_F> Other things that abuse PageProps.
[14:38] <DanielK_WMDE_> James_F: templatedata, yes!
[14:38] <TimStarling> it does not seem modular
[14:38] i'd say there's some definite use for things that are done as subpages today
[14:38] but that opens the 'why not just use subpages?' question ;)
[14:39] subtitles in various languages for a video
[14:39] <DanielK_WMDE_> subbu, James_F: also, template definition and documentation could be separate wikitext objects on the same page. no more need for cruft.
[14:39] <James_F> brion: Because sub-pages suck.
[14:39] data table for a graph
[14:39] to me, the question is mostly 'why should derived data be stored in MediaWiki'?
[14:39] James_F: indeed. ;) being able to treat something as a unit is nice
[14:39] DanielK_WMDE_, ok, so, you are proposing that the current monolithic wikitext model .. to, at the very least, consider metadata and core data as separately and independently editable.
[14:40] <TimStarling> gwicke: yeah, that's what I mean, the MW borg is trying to assimilate all your data
[14:40] <James_F> So #REDIRECT would be deprecated (or, at least, not actually stored in the wikitext blob)?
[14:40] <DanielK_WMDE_> TimStarling: to define a new slot for derived data, you give the name of the slot, and register the blob storage handle to be used with it. that's it. you can plug in whatever you like.
[14:40] <TimStarling> it's not modular
[14:40] <James_F> DanielK_WMDE_: What derived data would you put there?
[14:40] <DanielK_WMDE_> TimStarling: the advantage is that it is simple to find all data associated with a revision
[14:40] TimStarling: yeah
[14:40] <TimStarling> I mean, if you had blame maps, that is probably a massive system written in some other language
[14:40] <DanielK_WMDE_> James_F: diffs, blame maps, rendered html, ...
[14:40] <James_F> DanielK_WMDE_: I was imagining multiple 'real' slots ('primary' in your language).
[14:40] <TimStarling> listening on a change bus for article text changes
[14:40] DanielK_WMDE_: https://en.wikipedia.org/api/rest_v1/?doc is providing a listing currently
[14:41] <TimStarling> so now it has to be half written in PHP and store its text in derived slots?
[14:41] it's not an exhaustive list, for sure, but it's growing
[14:41] <DanielK_WMDE_> James_F: yes, multiple primary slots for things like template data, template docs, media info, etc
[14:41] <James_F> DanielK_WMDE_: OK. So there'd be 'content' (one of which was 'primary') and 'derived' (which are derived from… all the content? just the primary content?)?
[14:42] to me, i guess the first step is to establish that there are, indeed, multiple editable independent (and hence non-derived) streams per revision .. and so far, based on the discussion, it seems that there are.
[14:42] <James_F> DanielK_WMDE_: Because we can't have multiple 'primary' slots (not least because we're talking about putting this into the current DB schema).
[14:42] <DanielK_WMDE_> TimStarling: it doesn't have to. the point is having an infrastructure to easily store associated data with revisions without rolling your own for every extension that needs this
[14:42] <DanielK_WMDE_> TimStarling: you still *can* roll your own, if you want to.
[14:42] subbu: yeah, I think it's pretty clear that we are moving in that direction
[14:42] <DanielK_WMDE_> and tie it in, or not
[14:43] DanielK_WMDE_: an important requirement for a storage service is low-latency API access; how would this compare to what RESTBase currently provides?
[14:43] <DanielK_WMDE_> James_F: that's why i'm proposing to change the db schema. that's the core of the RFC
[14:43] gwicke, that looks like an implementation detail unrelated to the abstraction / rfc?
[14:43] <James_F> DanielK_WMDE_: The backwards-compatibility part.
[14:44] subbu: the multi-primary-content bit, or performance?
[14:44] i mean, if the discussion is about whether the proposed abstraction makes sense.
[14:44] performance.
[14:44] <DanielK_WMDE_> James_F: ah - for backwards compat, there would be a "main" slot. You can have multiple primary (user edited) slots, but only one "main".
[14:44] <DanielK_WMDE_> the concept of "main" is really only needed for backwards compat
[14:44] <DanielK_WMDE_> though it may come in handy for other things
[14:45] <James_F> DanielK_WMDE_: Yeah. Maybe we should reserve the 'primary' label for that one.
[14:45] <TimStarling> ok, if we admit that it's not required to use it for derived data then it becomes somewhat less scary
[14:45] <James_F> But whatever
[14:45] <TimStarling> as for brion's subpage idea
[14:45] <DanielK_WMDE_> gwicke: i don't think it would add much overhead. though going through mediawiki of course is slower than accessing RESTbase directly.
[14:45] subbu: sure, performance mainly relates to whether it is desirable to store all derived content in MW or not
[14:46] <TimStarling> maybe we should look at what a page is and how it is represented
[14:46] yes, that seems like an impl. detail .. potentially configurable.
[14:46] <James_F> Difference between having the feature and using it in WMF production?
[14:46] so to me it sounds useful to have an abstraction for revision-related content, but I'm not sold on keeping any kind of shadow state in MySQL for data that's primarily stored elsewhere
[14:47] <TimStarling> a page should probably be a single UI-exposed unit with a self-contained history
[14:47] <DanielK_WMDE_> TimStarling: my idea is that a page is logically composed of multiple editable "streams" of primary content. changes to any of the streams create a new revision. there can be additional derived data associated with each revision.
[14:47] <James_F> TimStarling: One of the nice ideas of the RfC is to merge the file and file-page histories into a single history.
[14:47] <TimStarling> theoretically you could make the page history UI display a union of subpage changes, but I don't think that is a good way to go
[14:47] <TimStarling> yeah for files maybe
[14:47] <DanielK_WMDE_> TimStarling: the page forms a logical unit that is addressable from the outside. the content of the different streams may be shown separately, or combined, depending on the rendering mechanism used
[14:48] <James_F> TimStarling: So that a new file version and the new description are the same revision, and shown as such.
[14:48] <TimStarling> but the API and the UI should have a similar conceptual model
[14:48] <James_F> TimStarling: Or the documentation and TemplateData both change /with/ the template.
[14:48] <TimStarling> right?
[14:48] <DanielK_WMDE_> TimStarling: yes, i agree, a single history. each revision changes one or more primary streams.
[14:48] <TimStarling> if we present a union to the user, we should present a union API
[14:48] <James_F> With the ability to filter to a single stream?
[14:48] a page history is a timeline of events related to a title
[14:49] <DanielK_WMDE_> well, it would already be a union in the database
[14:49] * robla notes that we're about 10 minutes from the end of the scheduled time
[14:49] <DanielK_WMDE_> TimStarling: something similar can be emulated using subpages. but then you have talk pages for each subpage, and you have to watch them separately (and move, and delete, and protect, etc)
[14:49] * subbu waves at robla
[14:50] <DanielK_WMDE_> TimStarling: also, subpages don't work for the derived data use case
[14:50] <TimStarling> yep
[14:50] <TimStarling> ok, 50 minutes past the hour, let's start wrapping up
[14:50] <TimStarling> action items and summaries only please
[14:51] <DanielK_WMDE_> main questions: do we need something like this? is adding an indirection between revision and data blob the right approach?
[14:51] my observation/summary: to me, i guess the first step is to establish that there are, indeed, multiple editable independent (and hence non-derived) streams per revision .. and so far, based on the discussion, it seems that there are.
[14:52] <DanielK_WMDE_> in what way should the rfc be extended for further discussion?
[14:52] subbu: yeah, I think there is fairly broad agreement that we are headed in that direction, and that it is useful to support it
[14:52] DanielK_WMDE_: it might be wise to explicitly call out how this will interact with restbase on sites that use it
[14:52] great.
[14:52] since there's some overlap in the idea of multiple data items
[14:52] yeah, generally the interaction with services is a bit unclear
[14:53] (and restbase or something like it may be a good alternate backing store as well)
[14:53] <DanielK_WMDE_> brion: basically: it can use restbase, or it can coexist with restbase without using it.
[14:53] q: is the multi-part content alternative question resolved?
[14:53] <DanielK_WMDE_> #action daniel to describe blob storage services in more detail
[14:53] <TimStarling> I think I am leaning towards accepting, despite my skeptical comments
[14:53] <DanielK_WMDE_> #info my observation/summary: to me, i guess the first step is to establish that there are, indeed, multiple editable independent (and hence non-derived) streams per revision .. and so far, based on the discussion, it seems that there are.
[14:54] I am okay with parts of the RFC, but not okay with others
[14:55] the disagreement is more about the how, not the what
[14:55] i am okay with the idea of the rfc and the need for an abstraction, but i have to think about the details.
[14:55] <TimStarling> you are OK with primary content stored in MW?
[14:55] * aude would like more details on how caching is handled and cache invalidation
[14:55] <TimStarling> I think that is the part that I am happiest with
[14:55] <DanielK_WMDE_> gwicke: do i understand correctly that you prefer the association of data with revisions to be programmatic, rather than materialized in the sql database?
[14:55] <TimStarling> DanielK_WMDE_: can you write more on the RFC about what benefits this system would provide for derived data?
[14:55] but agree this is a direction i think we need to go
[14:56] may be good to specifically mention access APIs
[14:56] eg most things should not be pulling straight out of the DB
[14:56] <DanielK_WMDE_> aude: me too, but I'm trying to keep the scope of the rfc reasonable.
[14:56] DanielK_WMDE_: ok :)
[14:56] (that also makes it easier to use restbase or similar techs as a backend *for* this)
[14:56] DanielK_WMDE_: yes, that's a part of it
[14:56] * AaronSchulz will need to re-read the rfc a few times
[14:56] AaronSchulz: you aren't the only one :-)
[14:56] brion, +1
[14:56] <James_F> Can I ask for an #action to propose a couple of options around B/C with pros and cons? I worry it adds complexity at minimal value.
[14:57] <DanielK_WMDE_> TimStarling: yes, will make that a todo. I think the answer is "it makes it straightforward to add your own".
[14:57] <DanielK_WMDE_> #action daniel to write more on the RFC about what benefits this system would provide for derived data
[14:57] <DanielK_WMDE_> James_F: "it" being the entire RFC, or the B/C bit?
[14:57] <James_F> DanielK_WMDE_: The B/C.
[14:57] #action DanielK_WMDE_ to clarify interaction with services like RESTBase
[14:58] <DanielK_WMDE_> #info James_F thinks B/C adds complexity with minimal benefits. Brion thinks that 3rd party code that accesses the database directly should die.
[14:58] * James_F grins.
[14:58] <TimStarling> ok, I suppose we will need another meeting on this some time?
[14:58] <DanielK_WMDE_> #info may be good to specifically mention access APIs
[14:58] my earlier question: is the multi-part content alternative question resolved or needs to be addressed in the RFC?
[14:58] :)
[14:59] #info gwicke strongly cheers for establishing clear APIs for the storage layer
[14:59] <DanielK_WMDE_> subbu: i tried to describe the downsides of the multi-part approach there. but we didn't discuss it
[14:59] * subbu will read rfc
[14:59] <DanielK_WMDE_> gwicke: i'm all with you there :)
[14:59] <DanielK_WMDE_> subbu: please comment!
[15:00] <TimStarling> in next week's RFC meeting we will discuss my tidy RFC https://phabricator.wikimedia.org/T89331
[15:00] <DanielK_WMDE_> #info gwicke prefers the association of data with revisions to be programmatic, rather than materialized in the sql database
[15:00] #info gwicke sceptical about the utility of the indirection between blob and per-blob metadata
[15:00] <TimStarling> although we may be short on numbers since there is a management offsite
[15:01] <TimStarling> thanks everyone
[15:01] <TimStarling> #endmeeting
[15:01] * TimStarling says, as if meetbot is listening
[15:01] hehe
[15:01] :)
[15:01] * James_F grins.
[15:01] <DanielK_WMDE_> thanks for running the show, tim! thanks everyone for your feedback!
[15:02] DanielK_WMDE_: thanks for taking the time to write up the RFC!
[15:03] as much as we quibble about the details, I think we are in agreement on a lot of the big picture
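
Illustrative sketch (not part of the meeting log): the "page -> revision -> slots -> urls" model discussed above can be pictured with a small in-memory toy. This is only a sketch under stated assumptions, not the RFC's design or any MediaWiki API. The names Slot, RevisionStore, add_revision and replace_derived, and the "text:1"-style URLs, are hypothetical. It shows two points from the discussion: a new revision re-points unchanged primary slots at the same blobs, as in (A1, B1, C1) -> (A1, B2, C1), and the storage layer refuses to update primary slots in place while allowing the URL of a derived slot to be replaced. Not carrying derived slots over to the next revision is a choice made for this sketch only.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Slot:
    """One named content stream of a revision, pointing at a blob by URL."""
    url: str       # hypothetical data URL, e.g. "text:1"
    primary: bool  # True for user-editable content, False for derived data


class RevisionStore:
    """Toy in-memory model of revision -> slots -> urls (hypothetical)."""

    def __init__(self) -> None:
        self._revisions: Dict[int, Dict[str, Slot]] = {}
        self._next_id = 1

    def add_revision(self, changed: Dict[str, Slot],
                     parent: Optional[int] = None) -> int:
        """Create a revision; unchanged primary slots reuse the parent's blobs."""
        inherited: Dict[str, Slot] = {}
        if parent is not None:
            # Assumption of this sketch: derived slots are not inherited,
            # since they describe the parent revision's content.
            inherited = {name: slot
                         for name, slot in self._revisions[parent].items()
                         if slot.primary}
        rev_id = self._next_id
        self._next_id += 1
        self._revisions[rev_id] = {**inherited, **changed}
        return rev_id

    def replace_derived(self, rev_id: int, name: str, url: str) -> None:
        """Silently re-point a derived slot; refuse to touch primary slots."""
        current = self._revisions[rev_id].get(name)
        if current is not None and current.primary:
            raise PermissionError(
                f"slot {name!r} is primary and cannot be updated in place")
        self._revisions[rev_id][name] = Slot(url, primary=False)

    def slots(self, rev_id: int) -> Dict[str, Slot]:
        """Return a copy of the slot table for one revision."""
        return dict(self._revisions[rev_id])
```

For example, creating revision 1 with primary slots A, B and C and then calling add_revision({"B": Slot("text:4", True)}, parent=1) gives a second revision whose A and C slots still carry the URLs from revision 1, while only B points at a new blob.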