Architecture Repository/Patterns

Patterns enable us to design for emergence

Patterns
Patterns enable us to design for emergence: to create interrelated capabilities that can become greater than the sum of their parts. We focused on patterns that enable stable, predictable, changeable and encapsulated parts. These patterns let us design a system by focusing on:


 * the data model (the shape of "knowledge")
 * the parts that deliver the necessary capabilities (things the system does)
 * the relationship between those parts
 * and the structure of their interaction

The patterns we've explored include:

Canonical data modeling
Allows content/knowledge to be understood by people, programs and machines outside the traditional boundaries of MediaWiki and, as far as possible, lets consumers request only what they need.

What is the structure of "knowledge" and how does it flow across the system? Building this data model requires defining boundaries around data objects and their interrelationships. A page, for example, is a collection of sections (and templates, which we did not tackle here). Sections are also part of collections about a topic (physics, for example). In our modeling, we:


 * Defined a predictable structure[5] using industry-standard formats like schema.org (to support predictability and reusability)
 * Broke down preexisting structures (all the content on the Philadelphia page) into parts (a section on the History of Philadelphia) and established interrelationships between the parts using hypermedia linking (to support "only what they need").
 * Enhanced the structure with contextual information by associating parts with Wikidata (to enable natural collections like US Cities) and indexing collections with Elasticsearch.
 * Enabled interaction with the structure via API calls. Multiple API calls can be wrapped into a single payload -- or not.
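
The modeling steps above can be illustrated with a minimal sketch. It assumes a hypothetical base URL, hypothetical helper names (make_section, make_page) and a placeholder Wikidata identifier; none of these come from the PoV itself. Only the schema.org property names (name, articleBody, isPartOf, hasPart, about) are real vocabulary terms.

```python
import json

# Hypothetical base URL; the source does not name real endpoints.
BASE = "https://example.org/api"


def make_section(page_slug: str, section_slug: str, name: str,
                 text: str, wikidata_id: str) -> dict:
    """Describe one section as a standalone, addressable part."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "@id": f"{BASE}/pages/{page_slug}/sections/{section_slug}",
        "name": name,
        "articleBody": text,
        # Hypermedia link back to the page this section belongs to.
        "isPartOf": {"@id": f"{BASE}/pages/{page_slug}"},
        # Contextual link to a Wikidata entity (placeholder ID below).
        "about": {"@id": f"https://www.wikidata.org/wiki/{wikidata_id}"},
    }


def make_page(page_slug: str, name: str, sections: list,
              embed: bool = False) -> dict:
    """Describe a page as a collection of sections.

    embed=False returns links only, so a consumer fetches what it needs.
    embed=True wraps the full sections into a single payload.
    """
    if embed:
        parts = sections
    else:
        parts = [{"@id": s["@id"], "name": s["name"]} for s in sections]
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "@id": f"{BASE}/pages/{page_slug}",
        "name": name,
        "hasPart": parts,
    }


history = make_section("Philadelphia", "History", "History of Philadelphia",
                       "Section text goes here.", "Q-PLACEHOLDER")
linked = make_page("Philadelphia", "Philadelphia", [history])
bundled = make_page("Philadelphia", "Philadelphia", [history], embed=True)
print(json.dumps(linked, indent=2))
```

The linked form keeps payloads small and lets a consumer follow the @id of only the section it wants; the bundled form trades payload size for fewer round trips.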

Loose coupling
Capabilities (new ways to interact with, enhance or process content) operate independently and are built on top of, or adjacent to, the data model.

Event-based interactions and event sourcing
Activities in the system happen only when they need to happen (asynchronously) and with only the information they need to accomplish their aim. We're also exploring event streaming and how to design a source of truth in a distributed system.
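
As a rough illustration of these two ideas, the sketch below pairs publish/subscribe with an append-only event log that can be replayed to rebuild state. It is an assumption-laden toy: the event name section_changed, the payload fields and the EventBus class are hypothetical, and handlers are dispatched synchronously in-process for brevity, where a real system would likely use a message broker and asynchronous consumers.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Event:
    kind: str      # hypothetical event name, e.g. "section_changed"
    payload: dict  # only what a consumer needs, not the whole page


class EventBus:
    """In-process stand-in for a broker, with an append-only log."""

    def __init__(self) -> None:
        self.log: list = []
        self._handlers: dict = defaultdict(list)

    def subscribe(self, kind: str, handler: Callable[[Event], None]) -> None:
        self._handlers[kind].append(handler)

    def publish(self, event: Event) -> None:
        self.log.append(event)            # record first: the log is the history
        for handler in self._handlers[event.kind]:
            handler(event)                # then notify interested parts only


def replay(log: list) -> dict:
    """Rebuild state (latest revision per section) from the log alone."""
    state: dict = {}
    for event in log:
        if event.kind == "section_changed":
            state[event.payload["section_id"]] = event.payload["revision"]
    return state


bus = EventBus()
reindexed = []
bus.subscribe("section_changed",
              lambda e: reindexed.append(e.payload["section_id"]))
bus.publish(Event("section_changed",
                  {"section_id": "philadelphia/history", "revision": 1}))
bus.publish(Event("section_changed",
                  {"section_id": "philadelphia/history", "revision": 2}))
latest = replay(bus.log)
```

The publisher knows nothing about who reacts to an event, and any new consumer can derive its own view by replaying the log.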

CQRS
Differentiating between reading and editing. In the PoV, the current structure inside of MediaWiki is left alone; it is the "trusted source". When changes happen in MediaWiki, the new system reacts by getting the necessary information and translating it into the canonical data model. This means the design works for reading but not for editing. If more than 90% of the requests are for reads, can editing be a separate part of the system? We're looking at the editing workflow next.
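
The read/write split described here might look roughly like the following sketch. Everything in it is illustrative: TrustedSource is a stand-in for MediaWiki, the page_changed event name is hypothetical, and to_canonical uses a deliberately naive heading split rather than a real wikitext parser.

```python
class TrustedSource:
    """Stand-in for MediaWiki: the only place edits are written."""

    def __init__(self) -> None:
        self._pages = {}

    def save(self, title: str, wikitext: str) -> dict:
        # Command side: store the edit and announce that it happened.
        self._pages[title] = wikitext
        return {"kind": "page_changed", "title": title}

    def fetch(self, title: str) -> str:
        return self._pages[title]


def to_canonical(title: str, wikitext: str) -> dict:
    """Naively split '== Heading ==' blocks into named sections."""
    sections = []
    current = {"name": "Lead", "lines": []}
    for line in wikitext.splitlines():
        stripped = line.strip()
        is_heading = (stripped.startswith("==") and stripped.endswith("==")
                      and len(stripped) > 4)
        if is_heading:
            sections.append(current)
            current = {"name": stripped.strip("=").strip(), "lines": []}
        else:
            current["lines"].append(line)
    sections.append(current)
    return {
        "name": title,
        "sections": [{"name": s["name"], "text": "\n".join(s["lines"]).strip()}
                     for s in sections],
    }


class ReadModel:
    """Query side: serves canonical documents, never accepts edits."""

    def __init__(self, source: TrustedSource) -> None:
        self._source = source
        self._docs = {}

    def handle(self, event: dict) -> None:
        # React to a change by fetching and translating the new content.
        if event["kind"] == "page_changed":
            title = event["title"]
            self._docs[title] = to_canonical(title, self._source.fetch(title))

    def get_section(self, title: str, name: str) -> dict:
        for section in self._docs[title]["sections"]:
            if section["name"] == name:
                return section
        raise KeyError(name)


wiki = TrustedSource()
reads = ReadModel(wiki)
reads.handle(wiki.save("Philadelphia", "Intro text.\n== History ==\nHistory text."))
history_section = reads.get_section("Philadelphia", "History")
```

Because ReadModel has no write method, edits can only reach it through change events from the trusted source, which is why this shape serves reads but leaves the editing workflow as an open question.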

Leverage points
The scope of modernization -- transforming the world's largest reference website into the world's largest knowledge system -- is monumental. To understand where to focus our time and attention, we've identified three leverage points.

"Folks who do systems analysis have a great belief in “leverage points.” These are places within a complex system where a small shift in one thing can produce big changes in everything." -- Donella Meadows

However we approach it, the first step is a doozy. There is no iterative path towards transformation. Neither is there a lift-and-shift migration option. We need to find capabilities in the system that we can decouple from the current day-to-day operations. As challenging as leverage points may be to find and to change, they unlock highly valuable opportunities while simultaneously laying a strong and cohesive foundation for the future system.

The leverage points explored so far include:

Giving shape and structure to Knowledge
Honestly, we don't know if it's humanly possible to "structure" Wikipedia content sufficiently. The knowledge we want to share with the world isn't made for modern distribution, but we must try. Knowledge is also currently shaped by the context of the "web page", and that doesn't fit emerging contexts.

Designing inherent relationships between knowledge parts to create collections
Collections are relationships developed, programmatically or by editors, between pieces of knowledge. The way humans envision and plan these relationships shapes the way the knowledge is developed. The PoV pre-builds the knowledge payload (an answer to the queries) based on the relationships we know are the most valued. How would we expand this over time?
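
One way to picture pre-building a payload from known relationships is the small sketch below. The section records, the topics field and the build_collections helper are all hypothetical; the PoV's actual approach (Wikidata associations indexed with Elasticsearch) is not reproduced here.

```python
from collections import defaultdict


def build_collections(sections: list) -> dict:
    """Group section IDs by topic ahead of time, so a query for a
    topic is answered by a lookup instead of a search."""
    collections = defaultdict(list)
    for section in sections:
        for topic in section["topics"]:
            collections[topic].append(section["id"])
    return {topic: sorted(ids) for topic, ids in collections.items()}


# Hypothetical records; topic labels stand in for Wikidata associations.
sections = [
    {"id": "philadelphia/history", "topics": ["us-cities", "history"]},
    {"id": "philadelphia/geography", "topics": ["us-cities"]},
]
prebuilt = build_collections(sections)
```

Expanding this over time would mean deciding which additional relationships are valuable enough to pre-build, which is the open question the paragraph above raises.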

Building decoupled relationships between parts of the system
This means connecting independent parts rather than building capabilities into the software itself. It includes changing the choreography of essential activities ... in many ways, the paradigm itself is changing.

Exploring patterns and identifying leverage points helped us prioritize questions that need our attention.