Wikimedia Enterprise/Updates

2021-06: Parsing HTML, Schema, API Organization, and Public Access

 * Parsing HTML
 * As we near having reliable pipes as the core of the Enterprise product, we are turning to the question of what we can do to make the data easier to use.
 * First stop, parsing HTML. We are working with the Parsing team to find ways that Enterprise can support the open-source project to make parsing Parsoid HTML easier at scale for our end users.
 * Data Model / API Schema:
 * We are sending our schema work into the technical decision-making process at the Wikimedia Foundation. Follow along on this ticket from the Architecture team.
 * We have decided to adopt snake_case in our APIs, as it offers more flexibility with non-English languages as we look ahead to more accessible APIs.
 * Launch API Organization
 * Next week we will add our final API namespacing and structure for launch to our docs page. We are including endpoints to quickly discern whether anything has changed from project to project. Stay tuned here; I'm just typing them up in draft.
 * Public Access
 * We are finalizing the space to add 2-week Enterprise exports into the Wikimedia Dumps. Track progress here.
 * We are also finalizing the allow-listing for WMCS users. Track progress here.
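
To illustrate the snake_case convention mentioned under Data Model / API Schema above, here is a minimal Python sketch that rewrites camelCase keys as snake_case. The record and its field names are hypothetical and are not the published Enterprise schema; the helper names `to_snake_case` and `snake_case_keys` are also made up for this example.

```python
import re


def to_snake_case(name: str) -> str:
    """Convert a camelCase identifier to snake_case."""
    # Put an underscore between a lowercase letter or digit and the
    # uppercase letter that follows it, then lowercase everything.
    return re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", name).lower()


def snake_case_keys(value):
    """Recursively rename dictionary keys to snake_case."""
    if isinstance(value, dict):
        return {to_snake_case(k): snake_case_keys(v) for k, v in value.items()}
    if isinstance(value, list):
        return [snake_case_keys(v) for v in value]
    return value


# Hypothetical record, for illustration only.
record = {
    "dateModified": "2021-06-01T00:00:00Z",
    "isPartOf": {"projectName": "enwiki"},
}

converted = snake_case_keys(record)
print(converted)
# {'date_modified': '2021-06-01T00:00:00Z', 'is_part_of': {'project_name': 'enwiki'}}
```

The simple pattern used here does not split runs of capitals (for example acronyms), so treat it as a sketch of the naming style and not as a general-purpose converter.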

2021-05: Schema, Public Access, Documentation, and Firehose

 * Data Model / API Schema:
 * We are finalizing the v1 schema for the July launch of our Bulk, Structured Content, and Firehose APIs. We need something universally usable across the entire customer base that we will not need to change, at least not often.
 * As of now, we are planning to steer near schema.org's "CreativeWork". Track progress here.
 * Public Access:
 * We are aiming to provide access to some of our work publicly by mid-June:
 * Wikimedia Dumps: we are providing 2-week entire-corpus exports for every text-based project. Track progress here.
 * Wikimedia Cloud Services: we are providing access to our Bulk APIs to users of WMCS. Track progress here.
 * Documentation:
 * For now, we are hosting our documentation on-wiki here until we build out our larger sitemap for the Wikimedia Enterprise product. This work is in progress but feel free to watch that page for updates.
 * We are live on Phabricator, and all Wikimedia Enterprise-related technical work is documented on our board!
 * Firehose API:
 * We have scoped the v1 release of the Firehose API. It will include filtering by project and page type (namespace) for easier ingestion. Track progress here.
 * The Firehose will include the data from the above schema in a real-time feed.
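
As a rough illustration of what filtering by project and namespace means for a consumer of a real-time feed, here is a minimal Python sketch that applies the same idea client-side to newline-delimited JSON events. It is not the Firehose API itself: the event shape, the field names `project`, `namespace`, and `name`, and the function `filter_events` are all assumptions made for this example.

```python
import json
from typing import Iterable, Iterator, Optional, Set


def filter_events(
    lines: Iterable[str],
    projects: Optional[Set[str]] = None,
    namespaces: Optional[Set[int]] = None,
) -> Iterator[dict]:
    """Yield decoded events matching the requested projects and namespaces.

    A filter left as None is not applied.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank keep-alive lines
        event = json.loads(line)
        if projects is not None and event.get("project") not in projects:
            continue
        if namespaces is not None and event.get("namespace") not in namespaces:
            continue
        yield event


# Hypothetical events standing in for a live feed.
sample_feed = [
    '{"project": "enwiki", "namespace": 0, "name": "Earth"}',
    '{"project": "enwiki", "namespace": 1, "name": "Talk:Earth"}',
    '{"project": "dewiki", "namespace": 0, "name": "Erde"}',
]

matches = list(filter_events(sample_feed, projects={"enwiki"}, namespaces={0}))
print([event["name"] for event in matches])  # ['Earth']
```

Filtering at the source, as scoped for the v1 release, would spare consumers from receiving and discarding events like the second and third ones here.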

2021-04: Beta, Transparency, and Roadmap

 * Beta Launch!:
 * The team launched a "closed beta" for our bulk and structured-content API endpoints! So far we have had great feedback, but we are still working through the kinks that come with a beta offering.
 * Follow this ticket for more information on when public access will be available via Wikimedia Database Dumps. Note that these will be experimental. If you are interested in providing feedback, feel free to post on our Phabricator board - we appreciate it!
 * We are finalizing a timeline with the Technical Engagement team to work out how we can provide access to people who have access to their tools. Stay tuned.
 * Project transparency improvements:
 * We are moving all of Wikimedia Enterprise's project management to our Phabricator board over the next week or two.
 * We are reflecting and iterating on our open-source workflow to provide a better window into our GitHub push schedule for those who are interested in following along. More to come here.
 * Roadmap:
 * The next big roadmap item is refining the "data schema" work we have already done and publishing updates here. We are looking to add more contextual data to revisions as part of our ingestion feeds.

2021-03: Community conversations

 * Refreshed documentation
 * We published completely refreshed documentation on MediaWiki.org and Meta. See the Meta talk page, which has a significant amount of community feedback and comment.
 * Landing-page website
 * Launched! We are making incremental improvements to the temporary code.
 * The website content itself is a temporary placeholder until a fully featured page is launched alongside the product in a few months.