Architecture Repository/Artifacts/Phoenix books


References are a rich opportunity to create data objects from an article's page content

Status: v1 released May 2021

Work we did before
As an initial prototype, we built a structured content store (aka knowledge store) as a “mid tier” between a content source (in this case, Simple English Wikipedia) and consumers. The video linked below shows the demo site we created.

The goal was to build:


 * A tiny, experimental modern platform
 * that can serve collections of knowledge
 * created from multiple trusted sources (although we only used one source, more could be added using the same patterns)
 * to many product experiences and other platforms.

The easiest way to familiarize yourself with it is to read the summary, watch the video and peruse the repository.


 * Structured content summary
 * Model
 * Video
 * Repo

Why references?
References (aka citations) are a rich opportunity to create data objects from an article's page content. We could, for example, interrelate books with the pages that cite them and the subjects they cover. At the moment, references are a wikitext list at the bottom of each page, and they are formatted differently across the ecosystem. For an example, see the Albert_Einstein article.

Distributed as data, references could be reused by consumers to present richer content instead of a flat list of citations.
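As a rough illustration of turning a wikitext reference into a data object, here is a minimal Python sketch. It assumes the citation uses the `{{cite book}}` template with `|key=value` parameters; the sample wikitext, the book, and the function name `parse_cite_book` are hypothetical. Real citations are far less consistent, and this sketch does not handle nested templates or pipes inside links.

```python
import re

# Hypothetical sample; real citations on Simple English Wikipedia vary widely.
SAMPLE_WIKITEXT = (
    "He published many papers.<ref>{{cite book |last=Doe |first=Jane "
    "|title=An Example Book |year=2001}}</ref>"
)

# Matches the body of a {{cite book ...}} template (no nested templates).
CITE_BOOK = re.compile(r"\{\{\s*cite book\s*\|(.*?)\}\}", re.IGNORECASE | re.DOTALL)


def parse_cite_book(wikitext: str) -> list[dict[str, str]]:
    """Return one dict of template parameters per {{cite book}} found."""
    books = []
    for match in CITE_BOOK.finditer(wikitext):
        fields = {}
        for part in match.group(1).split("|"):
            key, sep, value = part.partition("=")
            if sep:  # skip positional parameters that have no "="
                fields[key.strip().lower()] = value.strip()
        books.append(fields)
    return books


books = parse_cite_book(SAMPLE_WIKITEXT)
```

Each resulting dict could then be mapped onto whatever book structure the canonical data model ends up defining.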


 * References future ideas
 * Event storming overview (we asked people to model how the current editing process works)
 * Group 2 model
 * Group 1 model
 * Simplified version

What we’ll do
We will make a first attempt at untangling references, beginning with books that are referenced on Simple English Wikipedia. There are questions we’ll need to answer together as we go. We can use as much of the previous work as we’d like.

At a high level, we will:


 * Get an article from Simple English Wikipedia
 * Break it down into parts (sections and citations)
 * Structure it according to the canonical data model (we’ll add a structure for books as citations)
 * Save it to the knowledge store (S3) as data objects interrelated by hypermedia links (aka a graph)
 * Import the topics associated with those objects from the previous knowledge store
 * Associate each book with its page and topic(s) in Elasticsearch (lots to talk about here)
 * Repeat for all articles
 * Add a query language (GraphQL) on top and configure it to return:
    * Books associated with a page or section (TBD which level we associate)
    * Books associated with a topic
    * A single book (TBD: how would you find it?)
 * Consider how this might also help the editing process (see models above)
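The linking and lookup steps in this list can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a plain dict stands in for the knowledge store (S3), the key scheme (`books/…`, `pages/…`, `topics/…`), the function names, and the sample books are all hypothetical, and association is done at page level even though page versus section is still TBD.

```python
def _add_link(obj: dict, relation: str, target: str) -> None:
    """Append a hypermedia-style link if it is not already present."""
    links = obj["links"].setdefault(relation, [])
    if target not in links:
        links.append(target)


def link_book(store: dict, book_id: str, title: str, page: str, topics: list[str]) -> None:
    """Save a book object and link it both ways to its page and topics."""
    book_key = f"books/{book_id}"
    book = store.setdefault(book_key, {"title": title, "links": {}})

    page_key = f"pages/{page}"
    page_obj = store.setdefault(page_key, {"links": {}})
    _add_link(book, "pages", page_key)
    _add_link(page_obj, "books", book_key)

    for topic in topics:
        topic_key = f"topics/{topic}"
        topic_obj = store.setdefault(topic_key, {"links": {}})
        _add_link(book, "topics", topic_key)
        _add_link(topic_obj, "books", book_key)


def books_for(store: dict, key: str) -> list[str]:
    """Return titles of books linked from a page or topic object."""
    obj = store.get(key, {"links": {}})
    return [store[book_key]["title"] for book_key in obj["links"].get("books", [])]


# Hypothetical data: two books cited on one page.
store: dict = {}
link_book(store, "example-1", "An Example Book", "Albert_Einstein", ["Physics"])
link_book(store, "example-2", "Another Example Book", "Albert_Einstein", ["Physics", "Biography"])

page_books = books_for(store, "pages/Albert_Einstein")
topic_books = books_for(store, "topics/Biography")
```

The two lookups correspond to "books associated with a page" and "books associated with a topic"; a GraphQL layer would expose the same traversals as resolvers rather than direct dict access.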

We will also deliver an artifact that enables others to understand our thinking.

Job Stories
(most are epics we’ll break down)

Risks & notes

 * The source data is inconsistent wikitext
 * Do we associate citations with the page or sections?
 * There are tools that structure citations for wikitext editing; can we leverage them?
 * There are lots of ways we can interrelate this data, but we need to decide how much effort each is worth
 * We need to write down any issues we choose to ignore or tradeoffs we make for the summary artifact