Editor campaigns/Technical design

This document presents the technical design of the Editor campaigns project. Our most significant development decision is to use domain-driven design, because we expect the domain to grow more complex quickly. We've also created general-purpose facilities to encapsulate persistence.

As noted below, aspects of how this software will work beyond the first release are not fully scoped out.

Use cases and stories

Here's a use case diagram based on the user stories:

Some additional possible use cases and stories, most involving more advanced functionality, are:

Campaign organizers, participants and others want to study and compare campaigns using a variety of metrics. They might:
- Link data about campaigns to data from other sources, including Wikidata.
- Find campaigns by the Wikidata categories or geotags of the articles they work on.
- Place a campaign's activities on a timeline together with events described in Wikidata. (For example: visualize edits by a project about Ukraine alongside major events there.)
- Create automatic text analysis of articles and link it to campaigns to compare their results. (Campaign results could be compared in terms of content persistence or the types of discourse on related Talk or Flow pages, for example.)
- Describe campaign participant networks. See, for example, this study of Twitter networks.
Campaign organizers and participants want to organize their work in steps (i.e., as workflows) according to the requirements of a specific campaign or type of campaign, and link those workflows to UI elements.
Campaign organizers and participants want a variety of UXs and data points for different types of campaign.
Users want flexible, productive, fun and easy-to-use tools for working and hanging out on Wikipedia in groups.

Architecture

Persistence: This layer is the only link between domain objects and a persistence store (i.e., a database, at least initially). It uses the store to persist, find and instantiate domain objects, following the data mapper pattern, and it is the only layer that interacts directly with MediaWiki's database classes. It doesn't depend on anything Campaigns-specific, and could be included in MediaWiki core; see the related RFC (todo: add link).
Domain: This layer expresses core Editor Campaigns domain logic and provides a narrow interface for simple operations on campaigns and user participations. It depends only on the persistence layer. See below for more details.
Services (tentative): We may find there are typical Campaigns actions that are more complex than the actions available via the domain layer and that involve additional MediaWiki components (like logging and page revisions). Such actions could be encapsulated in this layer, which would depend on the Campaigns domain layer and whatever MediaWiki components are involved. It seems likely that services would be created mainly for modifying Campaigns entities (in the spirit of CQRS).
UI (mostly tentative) and Web API: Details of the UI and its internal structure remain to be worked out. So far, we've implemented a read-only Web API and opt-out controls on the Create account page. The UI and the Web API may depend on the domain and services layers. See below for more.
Components that build on Campaigns: Campaigns provides core functionality for using MediaWiki in groups. Further specialization for specific use cases (Education Program courses, projects, edit-a-thons, etc.) should be built in other components (i.e., other extensions), which may depend on the Campaigns domain and services layers and, if they add data points, the generic persistence layer. It is unclear whether any sharing of UI functionality will be appropriate. Other functionality that's related to Campaigns but is not part of its core mission could also go in other components. Examples of such functionality are workflow management and activity feeds. Those components could have hard or soft dependencies on Campaigns.
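
To make the persistence layer's role concrete, here is a minimal sketch of the data mapper pattern. It is written in Python for brevity (the extension itself is PHP), and sqlite3 stands in for MediaWiki's database classes. The names Campaign, CampaignMapper, save and find_by_name, and the table layout, are assumptions made for illustration; they are not taken from the actual code.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Campaign:
    """Plain domain object: it knows nothing about how it is stored."""
    name: str
    id: Optional[int] = None  # assigned by the mapper on first save


class CampaignMapper:
    """Hypothetical data mapper: the only code that touches the database."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        # Assumed table layout, for illustration only.
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS campaign "
            "(id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL)"
        )

    def save(self, campaign: Campaign) -> Campaign:
        # Persist the object and hand back the identity the store assigned.
        cur = self.conn.execute(
            "INSERT INTO campaign (name) VALUES (?)", (campaign.name,)
        )
        campaign.id = cur.lastrowid
        return campaign

    def find_by_name(self, name: str) -> Optional[Campaign]:
        row = self.conn.execute(
            "SELECT id, name FROM campaign WHERE name = ?", (name,)
        ).fetchone()
        # Instantiate a domain object from the row, or report "not found".
        return Campaign(name=row[1], id=row[0]) if row else None
```

The point of the pattern is that Campaign carries no SQL: swapping the store means rewriting only the mapper.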

Domain-driven design

In domain-driven design, you build software around a model of what the software is “about” in the real world. That model goes in the domain layer. Two authors we've drawn on for ideas about this are Martin Fowler and Eric Evans.

According to Fowler, “With a Domain Model […] we build a model of our domain which, at least on a first approximation, is organized primarily around the nouns in the domain.”^[1]

Here are some points Evans makes:

“The goal of domain-driven design is to create better software by focusing on a model of the domain rather than the technology.”^[2]
“In a model-driven design, the software constructs of the domain layer mirror the model concepts. It is not practical to achieve that correspondence when the domain logic is mixed with other concerns of the program. Isolating the domain implementation is a prerequisite for domain-driven design.”^[3]
“The domain objects, free of the responsibility of displaying themselves, storing themselves, managing application tasks, and so forth, can be focused on expressing the domain model. This allows a model to evolve to be rich enough and clear enough to capture essential business knowledge and put it to work.”^[4]
Domain-driven design makes a lot of sense when you expect a system to grow more complex.^[5]
The problem with not following this methodology is that as a system grows, “more and more domain rules become embedded in query code or simply lost.”^[6] In such cases “[W]e are no longer thinking about concepts in our domain model. Our code will not be communicating about the business; it will be manipulating the technology of data retrieval.”^[7]

Domain layer

The domain layer has this outward-facing PHP interface:

** Method not yet implemented.

A few elements of this design are still in flux.
IParticipationRepository and ICampaignRepository are repositories in Evans's terminology. According to Evans, a repository “represents all objects of a certain type as a conceptual set (usually emulated). It acts like a collection, except with more elaborate querying capability. Objects of the appropriate type are added and removed, and the machinery behind the repository inserts them or deletes them from the database.”^[8]
In Evans's terminology, a campaign is an entity (since it has a persistent identity) and a participation is a value object (since all that matters about it are the values it contains).
The purpose of the ITransactionManager is to allow the interface consumer to control transaction scope. Consistency rules will only be enforced when flush() is called.
By using a separate persistence layer, we push existing MediaWiki classes for database access out of domain logic and into a lower, infrastructure level.
The current implementation only stores current participations. The details of storing and querying a campaign's participation history have yet to be worked out.
The details of campaign termination/retirement/deletion are also still up in the air.
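
The notes above can be illustrated with a small sketch of an entity, a value object, a repository, and a transaction manager whose flush() is the single point where consistency is checked. It is written in Python for brevity, whereas the real interface is PHP. IParticipationRepository, ITransactionManager and flush() come from the design; every other name, the in-memory storage, and the "no duplicate participation" rule are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Set


class Campaign:
    """Entity: identified by its id, not by its attribute values."""

    def __init__(self, campaign_id: int, name: str) -> None:
        self.id = campaign_id
        self.name = name

    def __eq__(self, other: object) -> bool:
        return isinstance(other, Campaign) and other.id == self.id

    def __hash__(self) -> int:
        return hash(self.id)


@dataclass(frozen=True)
class Participation:
    """Value object: equal values mean interchangeable objects."""
    campaign_id: int
    user_id: int


class TransactionManager:
    """Stand-in for ITransactionManager: collects changes until flush()."""

    def __init__(self) -> None:
        self.stored: Set[Participation] = set()
        self._pending: List[Participation] = []

    def queue(self, participation: Participation) -> None:
        self._pending.append(participation)

    def flush(self) -> None:
        # Hypothetical consistency rule, checked only here: a user may
        # hold at most one participation in a given campaign.
        pending, self._pending = self._pending, []
        seen = set(self.stored)
        for p in pending:
            if p in seen:
                raise ValueError(f"duplicate participation: {p}")
            seen.add(p)
        self.stored = seen  # commit only if every pending change passed


class ParticipationRepository:
    """Stand-in for IParticipationRepository: behaves like a collection."""

    def __init__(self, tm: TransactionManager) -> None:
        self._tm = tm

    def add(self, participation: Participation) -> None:
        self._tm.queue(participation)

    def for_campaign(self, campaign: Campaign) -> List[Participation]:
        return sorted(
            (p for p in self._tm.stored if p.campaign_id == campaign.id),
            key=lambda p: p.user_id,
        )
```

In this sketch, a participation added through the repository is not visible to queries until the caller invokes flush(), which is how the interface consumer controls transaction scope.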

Web API

Initially, two API query modules are provided: (1) for listing and searching for campaigns, and (2) for retrieving lists of participants.

list=allcampaigns

Parameters:

allcprefix Search for campaigns whose name begins with this value. Optional; if omitted, get a list of all campaigns.

allclimit Maximum number of results to return.

Example:

api.php?action=query&list=allcampaigns&allclimit=10
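
As a rough sketch of how a client might assemble this request, the following Python helper builds the query string from the two parameters described above. The function name all_campaigns_url and its defaults are hypothetical; only action, list, allcprefix and allclimit come from this document.

```python
from typing import Optional
from urllib.parse import urlencode


def all_campaigns_url(base: str, prefix: Optional[str] = None, limit: int = 10) -> str:
    """Build a list=allcampaigns query URL (helper name is hypothetical)."""
    params = {"action": "query", "list": "allcampaigns", "allclimit": limit}
    if prefix is not None:
        # allcprefix is optional; omitting it lists all campaigns.
        params["allcprefix"] = prefix
    return f"{base}?{urlencode(params)}"
```

Calling all_campaigns_url("api.php") reproduces the example above; passing a prefix appends an URL-encoded allcprefix parameter.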

list=campaignparticipants

Parameters:

camppid The id of the campaign to get a list of participants for.

campplimit Maximum number of results to return.

Example:

api.php?action=query&list=campaignparticipants&camppid=1&campplimit=100

The purpose of these modules is to let Wikimetrics create cohorts from campaign participant lists.
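
A consumer such as Wikimetrics might turn a participant response into a cohort along the following lines. This is only a sketch: this document does not specify the response format, so the JSON shape used here (a "query" object holding a "campaignparticipants" list whose items carry a "name" field) is an assumption modelled on how MediaWiki list modules usually respond, and the function name is hypothetical.

```python
import json
from typing import List


def cohort_from_response(body: str) -> List[str]:
    """Extract a sorted, de-duplicated list of user names from a response.

    ASSUMPTION: the response shape and the "name" field are guesses,
    not taken from the Campaigns API itself.
    """
    data = json.loads(body)
    participants = data.get("query", {}).get("campaignparticipants", [])
    names = {p["name"] for p in participants if "name" in p}
    return sorted(names)
```

De-duplicating and sorting keeps the cohort stable across repeated fetches, which helps if results are later merged across several requests.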

UI (mostly tentative)

Implemented:

A notice and opt-out checkbox for users who go to Create account via a Campaigns URL.

Tentative, needs fleshing out:

A Special page for listing and searching for campaigns. Each campaign name would be a link to the page for that campaign. Authorized users would see an add button and/or delete buttons.
Campaigns would have a page in their own namespace. Pages would be implemented using ContentHandler, and would have tabs for viewing, editing, viewing history, moving and deleting (again, depending on user rights). The edit page would let users modify general campaign information and (for admins) remove users. The view page would also list the participants/organizers and provide a means of inviting more users.

Operation on account creation

See the main Editor Campaigns page. Legacy operation won't be affected: account creation from a Campaigns URL will still be logged via event logging (unless the user opts out). Also note that a new campaign entity will be created the first time any user visits Create account via a given Campaigns URL.
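
The "created on first visit" behaviour amounts to a get-or-create lookup keyed on the campaign name. The Python sketch below shows the idea with an in-memory registry; CampaignRegistry and get_or_create are hypothetical names, and the real extension would go through the domain and persistence layers instead.

```python
from typing import Dict


class CampaignRegistry:
    """Hypothetical in-memory stand-in for campaign storage."""

    def __init__(self) -> None:
        self._ids_by_name: Dict[str, int] = {}

    def get_or_create(self, name: str) -> int:
        # The first visit for a campaign name creates the entity;
        # later visits reuse the existing one.
        if name not in self._ids_by_name:
            self._ids_by_name[name] = len(self._ids_by_name) + 1
        return self._ids_by_name[name]
```

A real implementation would also need to handle two first visits arriving concurrently, for example by relying on a unique constraint on the campaign name.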

Deleted users (tentative)

Only users with the hideuser right can view deleted users. Since we don't require authentication to access campaign participations, deleted users should be automatically removed from campaigns. It remains to be determined how this will work for campaign participation histories.

References

[1] Fowler, Martin, with David Rice, Matthew Foemmel, Edward Hieatt, Robert Mee, and Randy Stafford (2003). Patterns of Enterprise Application Architecture. Boston: Addison-Wesley, 26.

[2] Evans, Eric (2004). Domain-Driven Design: Tackling Complexity in the Heart of Software. Boston: Addison-Wesley, 148.

[3] Ibid., 75.

[4] Ibid., 70–71.

[5] Ibid., 78.

[6] Ibid., 149.

[7] Ibid., 150.

[8] Ibid., 151.
