Wikimedia Technical Conference/2018/Session notes/Architecting Core: extension interfaces

Theme: Architecting our code for change and sustainability

Type: Technical Challenges

Leader(s): Timo Tijhof

Facilitator: Kate

Scribe: Irene

Description: Extensions are the key way that we add and modify the functionality of MediaWiki. This session looks into the interface for extensions and how they impact the architecture of MediaWiki. The primary goal for this session is identifying (potentially breaking) changes we can make to the extension interfaces to enable the underlying architecture to be changed without breaking compatibility in the future. -- T206081

Would be useful to look at how some other software does extensions. Two that have recently done overhauls: Firefox and WordPress. Both made major breaking changes and invested a lot of effort in this overhaul process. Need to look at what they did, and why.

Question 1: what’s bad about the extension interface that we have?


 * We expose so many internals that we’re not able to make changes any longer
 * Restricting the model will allow for changes without destroying existing extensions

Question 2: Is there only one extension interface? Or are there multiple? Can we classify the existing extension ecosystem into a limited number of interfaces that, collectively, cover most of our use cases?


 * One is a listener, that just receives information
 * One is a filter, that has the potential to modify something before sending it along.
 * One registers additional implementations of an existing abstract interface
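
The three kinds of interface listed above can be sketched roughly as follows. This is only an illustration in Python, not MediaWiki's actual API: `ExtensionPoints`, `MediaHandler`, and all method and event names are hypothetical and were not part of the session.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List

# Hypothetical names throughout; a sketch of the three categories only.
Listener = Callable[[dict], None]
Filter = Callable[[str, dict], str]


class MediaHandler(ABC):
    """An abstract interface that extensions can implement and register."""

    @abstractmethod
    def render(self, filename: str) -> str:
        ...


class ExtensionPoints:
    def __init__(self) -> None:
        self._listeners: Dict[str, List[Listener]] = {}
        self._filters: Dict[str, List[Filter]] = {}
        self._handlers: Dict[str, MediaHandler] = {}

    # 1. Listener: only receives information.
    def on(self, event: str, listener: Listener) -> None:
        self._listeners.setdefault(event, []).append(listener)

    def notify(self, event: str, payload: dict) -> None:
        for listener in self._listeners.get(event, []):
            listener(dict(payload))  # pass a copy so the caller's dict is untouched

    # 2. Filter: may modify a value before it is passed along.
    def add_filter(self, name: str, fn: Filter) -> None:
        self._filters.setdefault(name, []).append(fn)

    def apply_filters(self, name: str, value: str, context: dict) -> str:
        for fn in self._filters.get(name, []):
            value = fn(value, dict(context))
        return value

    # 3. Registry: adds another implementation of an abstract interface.
    def register_handler(self, media_type: str, handler: MediaHandler) -> None:
        if not isinstance(handler, MediaHandler):
            raise TypeError("handler must implement MediaHandler")
        self._handlers[media_type] = handler

    def handler_for(self, media_type: str) -> MediaHandler:
        return self._handlers[media_type]
```

In this sketch a listener has no return value, a filter must return the (possibly changed) value, and a registry entry must implement the abstract interface, so each category exposes only what it needs.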

Attendees list

 * Cindy, Daren, Alexia, DJ, Leszak, Subbu, Florian, Raz, Vogel, TimS, DanielK, Kate, Irene, AdamBasso, Gergo, BrionVibber, Timo, AntoineL

Detailed notes

 * Timo’s Presentation:
 * Extension interfaces - should we have them?
 * Hoping to focus on the two specific solutions we are currently thinking of
 * What are the current problems?
 * Hooks expose a lot
 * Internal services, state and methods are implicitly public; which is both good and bad
 * Any change is a breaking change to stable API; there are some historically stable APIs but they might change tomorrow, because we don’t mark them as stable
 * Desired outcomes:
 * Stable hooks - Extensions can be supported by core for a long time
 * Small hooks - extensions can still change defaults without duplication
 * Need to decide what to do about big hooks - extensions can still easily replace an entire service
 * Improvable core - ideally without breakage
 * Hooks are slow and fragile
 * Run synchronously during the operation, which makes them easy to edit while running: for an extension feature you don’t have to identify all of the hooks, and a hook can abort the action in one of various ways, replace the action in its entirety, or extend it, such as with a notification after the action is performed
 * Can do “anything” but only implicitly.
 * All stuck together so it is hard to pull them apart
 * This raises issues of scalability: database transactions, such as saving an edit, without a workaround mean no one else can modify while someone is editing, which scales very poorly and leads to cascading failure
 * Availability: cascading failure
 * Performance: async is difficult. Would be better to separate actions into chunks
 * What are some questions we should answer together?
 * Q1: What types of modifications can an extension perform right now?
 * Q2: Which ones work well?
 * Q3: Which ones currently suffer from these problems?
 * What are our solutions?
 * Small groups section where folks are discussing the questions above
 * Q1: (see photo for full list)
 * Extensions can do anything, so multiple groups
 * MW core: a lot of things, such as auth mechanisms and new actions (e.g. exporting)
 * Change whatever the parser is operating
 * Changing the skin
 * Prevent actions
 * Two question marks about where they go
 * Apply data schema
 * Is the api something has [?] or has filters, apparently apis that allow modification of what the module actually does
 * Q2: (see photo for full list)
 * Things that are atomic and don’t store data work well; no side effects
 * Registry and filter hooks work well, but can be improved (esp in messaging)
 * Q3: (see photo for full list)
 * Media handlers: interpret what kind of media type a file is, make a registration for that, and dictate what happens with the media output - other sessions have identified that there’s work to be done here; not well defined. Registry, but too much for it to do.
 * Parser extensions - have storage and async effects
 * Lack of clarity on scale of hooks
 * When they expose implementation detail as opposed to concept detail
 * Passing arbitrary data structures around
 * Subclassing wikipage factory → hook for wikipage factory, but no wikipage factory actually exists. Too broad and should kill it
 * Skinning is broken
 * How do we solve all of these problems?
 * Without making things worse ideally
 * Still in flux on how we do this. What works well is:
 * We categorize hooks into two broad areas
 * Something happens and you, as another extension, want to respond to that (e.g. mirroring edits elsewhere), not changing semantics of the default (user re-name)
 * Filters - modifying a value in some way based on other parameters, don’t want side-effects
 * Services (not currently a hook but should be), e.g. kitchen sink; are these hooks or are these services? Timo says services
 * Overall, these would allow these to be more bundled
 * If you miss a hook you have a case where you get errors; using an abstract class would help with that
 * What are the caveats to these solutions?
 * Concerns (see photo for full list)
 * Replacing services does not work for defining → addressed by defining in a registry
 * How long will we maintain the existing hook system? Can we turn it into this with certain caveats? How long can we support the old system? Cindy is in charge of answering this question. Trade-off of requiring one line change and migration, or do we not require one line change with caveats (Timo leaning towards the former)
 * More restrictive with parameters - these principles don’t prevent exposing internals. Need principles on what kind of parameters we allow on a filter.
 * Things that we would solve with implementation (see photo for full list)
 * Industry standard pattern, provides confidence and streamlines work for devs, also limits the start-up cost for onboarding new folks
 * Increased predictability
 * We have inconsistency in hooks and whether or not they defer/can be aborted, so we’d like to have this more consistent
 * We need some guarantees here, or several classes, for some hooks to limit the time it takes; some hooks can be run in a job, yes
 * All the constraints that are there from logic perspective in code definitions, this is a type of hook, regardless of queueing types
 * Who gets to decide this? The implementer, or whoever defines the interface?
 * Generally avoid patterns where you call a hook from a deferred update
 * How much isolation do we want to guarantee by default? → depends on the use case, depends on the shared cache in the drop-order
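
The point in the notes that an abstract class would help when an extension misses a hook can be illustrated with a minimal Python sketch. It assumes related hooks are bundled into one abstract class; `PageSaveExtension`, `TrimSummary`, and `Incomplete` are hypothetical names, not existing MediaWiki classes.

```python
from abc import ABC, abstractmethod
from typing import List


class PageSaveExtension(ABC):
    """Hypothetical bundle of the extension points around saving a page."""

    @abstractmethod
    def filter_summary(self, summary: str) -> str:
        """Filter: return the (possibly modified) edit summary."""

    @abstractmethod
    def on_saved(self, title: str) -> None:
        """Listener: react after the save without changing its outcome."""


class TrimSummary(PageSaveExtension):
    """A complete extension: implements every method in the bundle."""

    def __init__(self) -> None:
        self.saved: List[str] = []

    def filter_summary(self, summary: str) -> str:
        return summary.strip()

    def on_saved(self, title: str) -> None:
        self.saved.append(title)


class Incomplete(PageSaveExtension):
    """Forgets on_saved, so it cannot be instantiated at all."""

    def filter_summary(self, summary: str) -> str:
        return summary
```

Here `TrimSummary()` can be created normally, while `Incomplete()` raises a `TypeError` as soon as the extension is loaded, rather than failing later when the missed hook would have run.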

See also: some question-annotations in the original notes.