Technical decision making/Decision records/T291120

= What are your constraints? =

= Decision =

= What are your options? =

Resource:

https://www.atlassian.com/blog/inside-atlassian/make-team-decisions-without-killing-momentum

= Use Cases and required MediaWiki state events =

 * Even though option 2 (EventBus) could solve many of the comprehensiveness problems, it still leaves a gap: using that data to compute something new. We likely need a centralized platform to compute new datasets.

= Decision Record Drafting Meeting Notes =

2022-03-01: Otto and Giuseppe discussion:

 * Possible we want to enrich events with stuff that might come from other places than MW.
 * Want to free MW app worker as soon as possible.
 * 1 or 100 more API requests per edit is still okay.
 * Stream processing approach is the more long term sustainable one.
 * BUT, if you want something here and now for some short term goal, EventBus is okay. Worry: that thing will remain there forever. Don’t want to maintain both forever.
 * Need to make developing services around the big thing easier.  They tend to want to store the data in docker image now.
 * Preference for stream processing over eventbus.
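The enrichment idea above (events enriched with state that may come from places other than MW, while freeing the MW app worker as soon as possible) can be sketched as a small stream-processing step. This is purely illustrative: the field names (`wiki`, `rev_id`, `content`) are not an agreed schema, and `fetch_content` stands in for whatever API or store the enrichment would call.

```python
def enrich_event(event, fetch_content):
    """Return a copy of `event` with revision content attached.

    `fetch_content` stands in for a call out to the MW API or a content
    store; injecting it keeps the MW request path untouched and makes
    the enrichment step testable in isolation.
    """
    enriched = dict(event)
    enriched["content"] = fetch_content(event["wiki"], event["rev_id"])
    return enriched

# Usage with a stubbed content fetcher standing in for the MW API:
fake_store = {("enwiki", 42): "Hello, world"}
out = enrich_event({"wiki": "enwiki", "rev_id": 42},
                   lambda wiki, rev: fake_store[(wiki, rev)])
```

Because the enrichment happens downstream of MW, the "1 or 100 more API requests per edit" cost lands on the stream processor rather than on the app worker serving the edit.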

Feb 15, 2022 | Petr & Otto discussion
Attendees: Petr Pchelko Andrew Otto Dan Andreescu

Notes


 * PP: we should dismiss the job queue idea.  Worst of both worlds.  Still in PHP, but jobs are delayed and can get lost.  All downsides.
 * Making just content events might be ok in eventbus.  But if we have 500 new events, maintaining in MW might be difficult.
 * PP: What about consistency?
 * DA: perhaps Debezium on just the content table for content events.  Rev_id and content, that’s it.  This should be a considered solution.
 * PP: then we could generalize it: when MW table schema is ‘reasonable’ we could just use Debezium for other things too.  When not reasonable, use EventBus.
 * AO: people also will want html content, and page links changes.
 * PP: maybe sending 4 MB of wikitext and 12 MB of HTML on every edit in a PHP deferred update (EventBus) isn’t great.
 * PP: my preferred solution: start with EventBus, then do separate streaming service.  If fat events gets traction and we need more and more, then we do streaming service solution.
 * DA: would be easiest now, but what about the performance of producing all that data from the app servers after an edit?
 * What would Giuseppe say? Will this bog down app servers?
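DA's suggestion of running Debezium on just the content table would in practice mean registering a CDC connector with Kafka Connect. A hedged sketch of that registration payload follows, written as the Python dict you would serialize and POST to the Connect REST API. All hostnames, credentials, and database/table names are placeholders, and the property names follow Debezium's MySQL connector (`topic.prefix` is the Debezium 2.x naming).

```python
import json

# Hypothetical Debezium connector registration: capture only the content
# table, as discussed (rev_id and content, that's it). Placeholder values
# throughout -- this is a sketch, not a working production config.
connector_config = {
    "name": "mw-content-cdc",  # hypothetical connector name
    "config": {
        "connector.class": "io.debezium.connector.mysql.MySqlConnector",
        "database.hostname": "db1.example.org",
        "database.port": "3306",
        "database.user": "debezium",
        "database.password": "CHANGE_ME",
        "database.server.id": "184054",
        "topic.prefix": "mw",
        # Restrict capture to the content table only:
        "table.include.list": "enwiki.content",
    },
}

payload = json.dumps(connector_config)
```

Per PP's generalization: where the MW table schema is "reasonable", more `table.include.list` entries could be added; where it is not, EventBus fills the gap.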

Action items


 * Talk to SRE about emitting from EventBus; if okay with them, let’s do it.
 * However, if this needs to emit many different kinds of events, then maybe doing it in EventBus is not that flexible and we should do streaming service anyway.
 * Talk with Giuseppe: he prefers the streaming service idea. Doing this in EventBus will likely just be tech debt.

Feb 14, 2022 | Discuss Comprehensive MediaWiki Events Decision Record
Attendees: Luke Bowmaker Andrew Otto Petr Pchelko Leszek Manicki David Causse Andy Craze

Notes


 * What MW state would be most useful to have in streams now?
 * Wikitext content
 * Wikitext diffs
 * Html content
 * Page links changes
 * Wikibase entity data
 * Citation changes? (Is this different from links?)
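The list above hints at the shape of a hypothetical "fat" revision event carrying MW state in the stream. None of these field names are an agreed schema; they just make the payload size discussion concrete (wikitext plus HTML plus link deltas in a single event).

```python
# Illustrative only: a made-up comprehensive revision event. Every field
# name here is an assumption, not part of any existing MW event schema.
fat_event = {
    "meta": {
        "stream": "mediawiki.revision-content",  # hypothetical stream name
        "domain": "en.wikipedia.org",
    },
    "rev_id": 123456,
    "wikitext": "== Heading ==\nBody text.",
    "wikitext_diff": "@@ -1 +1 @@",  # diff body elided
    "html": "<h2>Heading</h2><p>Body text.</p>",
    "page_links_added": ["Other_page"],
    "page_links_removed": [],
}
```

An event like this makes consumers self-sufficient (no API round-trips), at the cost PP raises below of shipping megabytes of content on every edit.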


 * AC: ORES preprocessing for models?
 * Most are just fetching article text or diff.
 * Every ORES model is at the revision level; text and diffs are most useful.
 * In the future, lots of things we can do, depends on use case.


 * LM: From wikidata/wikibase
 * Could be rubbish!? :)
 * Wikidata edits are slow sometimes because of AbuseFilter. Could we build this functionality outside of the request pipeline?
 * AbuseFilter: Community can set up their own filters, which can slow things down.  This is done before page save.
 * DC: Redirects? These are separate from pages.  When a redirect is added to a page, we would like to have an event for this.  Consider page as an object with its redirects.
 * Existing events have a page_is_redirect flag. We could include where the redirect points to by asking MW.
 * The other side too: what pages redirect TO a page? Page A is redirected from pages X, Y, Z.
 * PP: redirect sources are stored in a denormalized table, i think page links.
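DC's "other side" question, answering "which pages redirect TO page X", amounts to maintaining a reverse index over redirect (source, target) pairs. A minimal in-memory sketch under that assumption; a real service would back this with a store keyed by target (or read it from the denormalized table PP mentions).

```python
from collections import defaultdict

def build_reverse_redirects(redirects):
    """Map each target page to the set of source pages redirecting to it."""
    index = defaultdict(set)
    for source, target in redirects:
        index[target].add(source)
    return index

# Pages X, Y, and Z all redirect to page A:
index = build_reverse_redirects([("X", "A"), ("Y", "A"), ("Z", "A")])
```

Fed from a stream of redirect-change events, the same fold keeps the index current incrementally rather than requiring a full rebuild.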

Solution discussion:


 * MW Job Queue vs Stream Processor
 * PP: page content is immutable.  You can attach it to whatever event at any time in the future.  Async is okay here, it will be correct.  Doesn’t really matter if job queue or not.
 * Option 2: at request time (EventBus) is actually okay.
 * MW Job doesn’t really gain us much. It’s just more async, adding a step that doesn’t really give you anything.
 * Option 4 is cool, especially if MySQL External Store had its own API separate from MW.
 * Option 4 isn’t really decoupled; it’s a separate deployment unit, that’s something. But is it worth it?
 * AO: Options 2 and 3 have to POST to EventGate.
 * PP: there are maybe OK PHP Kafka producers now?
 * PP: There are a ton of things that are coded in MW PHP.  Having to recode that in other languages is annoying. E.g. MW normalizing page titles.
 * PP: What are you getting from doing this, versus just having all consumers ask the API for what they need?
 * DC: reading directly from MW events: ordering is hard to accomplish when reading multiple topics. A streaming processor helps, but it is complicated.
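AO's point that options 2 and 3 both end in an HTTP POST to EventGate can be sketched as a small producer call. The intake URL and stream name below are placeholders, and the accepted event schema is defined by EventGate's configuration, not by this sketch; EventGate accepts a JSON array of events in the request body.

```python
import json
import urllib.request

EVENTGATE_URL = "https://eventgate.example.org/v1/events"  # placeholder URL

def make_eventgate_request(events):
    """Build the POST request for a batch of events.

    Returning the Request object (rather than sending it) keeps the
    payload construction separate from I/O and easy to inspect.
    """
    body = json.dumps(events).encode("utf-8")
    return urllib.request.Request(
        EVENTGATE_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = make_eventgate_request([{"meta": {"stream": "test.event"}, "rev_id": 1}])
# urllib.request.urlopen(req) would actually send it; omitted here.
```

Whether this POST happens from a PHP deferred update (option 2/3) or from a downstream stream processor is exactly the trade-off debated above.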