Wikimedia Release Engineering Team/Pretrain/Progress reports/2026-05-01

From mediawiki.org

Report on activities in the Pretrain project for the week ending 2026-05-01.

[FY25-26 WE6.1.4] Establish Pretrain production design for MVP

If we document reliability risks and compensating controls to mitigate those risks, we will create a shared understanding of their implications for the Pretrain productionization design, sufficient to establish consensus on a path forward to enable an MVP.

Progress update
  • Stage: Complete
Was this hypothesis supported or contradicted? Why?
The hypothesis was supported. We emerged from this effort with a more complete shared understanding of risks posed or potentially exacerbated by Pretrain and possible mitigation strategies. We also believe that the path forward identified as part of this effort is sufficient to get us to the testwiki-on-Pretrain MVP.
Briefly describe what was accomplished over the course of the hypothesis work (list of deliverables, links to documents, etc.).
The core deliverable from this effort is Productionizing Pretrain :: Risks and mitigations (WMF-only google doc). It contains an analysis that catalogs risk areas and mitigation themes, together with a concrete proposal that aims to balance effort and risk in a pragmatic way. It is supported by additional artifacts focused on specific details (e.g., Automated supervision of MediaWiki deployments (WMF-only google doc)).
What are the key lessons from this hypothesis? How do those lessons contribute to this KR and guide next steps?
One key property we sought to assert was that bad (e.g., defective) code deployed to Pretrain should not impact production outside testwiki. However, we quickly found that this is not tractable given the lack of hard isolation across wikis (due to, e.g., shared data). This led us to refocus on ways to reduce risk under these architectural constraints, which informs both this work and continuing work under this KR.
That key lesson also led to an important question: how to decide which risk areas most warrant our focus. We found it useful to invert the problem by starting from a hypothetical implementation of Pretrain and using it to identify the risks it can and cannot address, the new risks it may introduce (e.g., increased complexity), and the opportunities it creates for future work to remediate risks we cannot tackle yet (e.g., by starting to differentiate workloads by wiki).
Finally, there is one key lesson to note related to the structure of the hypothesis itself. If an intended outcome focuses on consensus, the hypothesis needs to make the stakeholders and decision criteria explicit. We did not make these sufficiently concrete, which made it challenging to know when we had completed the hypothesis and could turn our focus to implementation.