Wikimedia Architecture Repository

Canonical data modeling

Allows content to be understood by people, programs, and machines outside the boundaries of the system

Last updated: 2022-12-16 by APaskulin (WMF)
Status: v1 published September 2021


A canonical data model is a predictably structured, technology-agnostic data structure that represents the data of the system as a whole, so that components share one representation instead of each maintaining its own. Discrete pieces of information are connected according to the relationships between them and contextualized with metadata. This lets people and machines consume content easily without needing to know which underlying technologies drive the system.
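As a rough illustration of the idea, the sketch below maps two component-specific records onto one shared shape. Everything in it is an assumption made for the example rather than part of the pattern's definition: the names `CanonicalUnit`, `Relation`, `from_page_section`, and `from_media_file`, the field names, and the identifier format are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Relation:
    """A typed link from one unit to another (hypothetical shape)."""
    kind: str       # e.g. "part_of", "illustrates"
    target_id: str  # id of the related unit


@dataclass
class CanonicalUnit:
    """One discrete piece of information in the shared, system-wide shape."""
    id: str
    kind: str                                      # e.g. "section", "image"
    content: str
    metadata: dict = field(default_factory=dict)   # context such as language
    relations: list = field(default_factory=list)  # links to other units


def from_page_section(row: dict) -> CanonicalUnit:
    """Map a hypothetical page-storage row onto the canonical shape."""
    return CanonicalUnit(
        id=f"section:{row['page']}#{row['anchor']}",
        kind="section",
        content=row["text"],
        metadata={"language": row["lang"]},
        relations=[Relation("part_of", f"page:{row['page']}")],
    )


def from_media_file(record: dict) -> CanonicalUnit:
    """Map a hypothetical media-storage record onto the same shape."""
    return CanonicalUnit(
        id=f"image:{record['file_name']}",
        kind="image",
        content=record["caption"],
        metadata={"language": record["caption_language"]},
        relations=[Relation("illustrates", f"page:{record['used_on']}")],
    )


def describe(unit: CanonicalUnit) -> str:
    """A consumer that knows only the canonical shape, not the source system."""
    links = ", ".join(f"{r.kind} {r.target_id}" for r in unit.relations)
    return f"{unit.kind} {unit.id} [{unit.metadata['language']}] -> {links}"
```

In this sketch, `describe` never needs to know whether a unit came from page storage or media storage, which is the property the definition above is after.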

Related to

  • Knowledge as a service: This strategic initiative transforms knowledge created as a single web page into discrete units of predictably structured information that are interrelated.
  • Federated API: Defines a unified, consistent response to all API queries regardless of the module or product being queried, while allowing individual subsystems to evolve and change independently.

Product benefits

  • Structured content: An agreed-upon, standardized, technology-agnostic data structure lets content be structured the same way across our different products.
  • Interconnectedness: Each piece of structured data includes information about how it relates to other pieces (for example, by keywords or by hierarchy). This makes it possible to find and use contextual links between pieces of information to produce powerful product outputs.
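To make the interconnectedness benefit concrete, here is a minimal sketch that follows the two kinds of relationship mentioned above, keywords and hierarchy. The sample data in `UNITS`, the `keywords` and `parent` field names, and the helper functions are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical units: each one carries its own relationship information.
UNITS = [
    {"id": "moon", "parent": None, "keywords": ["astronomy", "satellite"]},
    {"id": "moon/orbit", "parent": "moon", "keywords": ["astronomy", "orbit"]},
    {"id": "tides", "parent": None, "keywords": ["ocean", "orbit"]},
]


def build_keyword_index(units):
    """Map each keyword to the set of unit ids that carry it."""
    index = defaultdict(set)
    for unit in units:
        for keyword in unit["keywords"]:
            index[keyword].add(unit["id"])
    return index


def related_by_keyword(unit_id, units, index):
    """Return ids of other units sharing at least one keyword with unit_id."""
    by_id = {unit["id"]: unit for unit in units}
    related = set()
    for keyword in by_id[unit_id]["keywords"]:
        related |= index[keyword]
    related.discard(unit_id)  # a unit is not related to itself
    return sorted(related)


def children_of(parent_id, units):
    """Return ids of units whose hierarchical parent is parent_id."""
    return sorted(unit["id"] for unit in units if unit["parent"] == parent_id)
```

With this sample data, `related_by_keyword("moon/orbit", UNITS, build_keyword_index(UNITS))` finds both `moon` (shared keyword `astronomy`) and `tides` (shared keyword `orbit`), while `children_of("moon", UNITS)` follows the hierarchy instead.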

Example product narratives

This architecture pattern enables the following product narrative examples:

Read more