Documentation/Tools/Documentation metrics dashboard

This page contains requirements for the documentation metrics dashboard. For more information, see. For initial design ideas, see Documentation metrics dashboard/Design.

Rationale
Technical writers are missing a single source of relevant information about the ecosystem of technical documentation. This makes it difficult to effectively plan work that focuses on anything more than a few pages at a time. Reasoning about documentation collections, or technical documentation as a whole, is practically impossible.

Goal
Provide staff and community members with access to statistical data about technical documentation pages and collections they're interested in.

Predicted result
Establish a baseline of data to collect about single pages and collections of pages, and then present that data on a publicly available dashboard. For the MVP, the dashboard tracks readership and edits, Phabricator mentions, and other details, so that insights relevant to the stewardship of technical documentation are easy to access in one place.

Data collection
The purpose of this project isn't to introduce any new tracking, but to collect the already available metrics in a single place for ease of use.

Open questions

 * How to deal with page translations? Specifically, should we treat translations of documentation pages as the same page or different pages?

PagePile
PagePile is a service that allows you to create lists of wiki pages. It's available on https://pagepile.toolforge.org. Each list, a page pile, has a unique identifier.

Instead of providing an interface to specify lists of pages directly, the current version of the dashboard uses PagePile as a way of defining collections.

A rewrite of PagePile - GULP - is underway, with the new version available on https://gulp.toolforge.org. The plan is to support GULP in a later version of the dashboard.

Throughout this document I refer to the service as PagePile, and refer to individual lists available on PagePile as page piles.

Application, dashboard
I use the terms "application" and "dashboard" interchangeably to refer to the documentation metrics dashboard we are building.

Listing page pile contents
As a user

I want to specify a page pile ID and view the pile's contents before opening the dashboard

So that I can ensure I'm requesting the dashboard for the correct list

Given that I am a user on the application's landing page
When I provide a valid page pile ID with pages from tech wiki, meta wiki, or mediawiki.org
And I click Load
Then I will see the list of pages from that page pile

Given that I am a user on the application's landing page
When I provide an ID of a page pile that doesn't exist
And I click Load
Then I will see an error message that explains this page pile doesn't exist

Given that I am a user on the application's landing page
When I provide an ID of a page pile that exists but contains pages from a wiki other than tech wiki, meta wiki, or mediawiki.org
And I click Load
Then I will see an error message that explains this page pile is unsupported
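
The three scenarios above can be sketched as a single lookup function. This is a minimal illustration, not the dashboard's actual code: the hostnames in `SUPPORTED_WIKIS`, the `PagePileError` class, and the `"wiki"` and `"pages"` keys are all hypothetical names, and the `piles` mapping stands in for the real PagePile service. It also assumes that "tech wiki" refers to wikitech.wikimedia.org.

```python
# Hypothetical identifiers for the three supported wikis (assumed hostnames).
SUPPORTED_WIKIS = {
    "wikitech.wikimedia.org",
    "meta.wikimedia.org",
    "www.mediawiki.org",
}


class PagePileError(Exception):
    """Raised when a page pile cannot be listed; the message is user-facing."""


def list_pile_pages(pile_id, piles):
    """Return the pages of a pile, or raise PagePileError with an explanation.

    `piles` is a stand-in for the PagePile service: a mapping from pile ID
    to a record with hypothetical "wiki" and "pages" keys.
    """
    pile = piles.get(pile_id)
    if pile is None:
        raise PagePileError(f"Page pile {pile_id} doesn't exist.")
    if pile["wiki"] not in SUPPORTED_WIKIS:
        raise PagePileError(
            f"Page pile {pile_id} is unsupported: "
            f"it contains pages from {pile['wiki']}."
        )
    return list(pile["pages"])
```

Keeping both failure cases in one exception type with distinct messages lets the landing page show the explanation directly, whichever scenario applies.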

Page pile history
As a user

I want to see a list of page piles I accessed previously

So that I can quickly find information about collections interesting to me

Given that I have explicitly set the Remember my page piles setting
And I have consented to the application's usage of cookies and local storage
When I successfully load content of a valid page pile
Then this page pile is added to my page pile history

Given that I am a user who previously used the application in the same browser
And I have explicitly set the Remember my page piles setting and consented to the application's usage of cookies and local storage
When I open the metrics dashboard
Then I will see the list of collections I opened previously
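
The history rule above depends on two independent conditions: the setting and the consent. The sketch below models it as a pure function, with a plain list standing in for the browser's cookie or local storage. The function name and parameters are hypothetical, and showing the most recent pile first without duplicates is an assumption that the requirements do not state.

```python
def record_pile(history, pile_id, *, remember_enabled, consent_given):
    """Return the page pile history after a pile has loaded successfully.

    The history changes only when the user has both enabled
    "Remember my page piles" and consented to cookies and local storage.
    """
    if not (remember_enabled and consent_given):
        return list(history)  # nothing is stored without both conditions
    # Assumption: most recent pile first, each pile listed once.
    return [pile_id] + [p for p in history if p != pile_id]
```

For example, `record_pile([1, 2], 2, remember_enabled=True, consent_given=True)` moves pile 2 to the front, while the same call with `consent_given=False` returns the history unchanged.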

Opening the dashboard
As a user

I want to access the statistics dashboard for pages in the page pile I requested

So that I can view the relevant page statistics

Given that I have loaded pages from a valid page pile
And I am seeing the list of pages from that page pile
When I click Open page pile statistics
Then I will see the dashboard for that page pile

Dashboard - default view
As a user

I want to see aggregated information about an entire page pile based on information about individual pages

So that I can make high-level decisions about work needed to improve specific collections

Given that I am a user with the dashboard for a page pile open
When I scroll down the page
Then I will see different representations of aggregated documentation metrics

As a user

I want to see a list of pages in a page pile

So that I can view the statistics for a single page instead of the entire page pile

Given that I am a user with the dashboard for a page pile open
When I click a specific page in that page pile
Then I will instead see the statistics for that page only

As a user

I want to sort pages listed on the dashboard based on statistics

So that I can find pages where my contributions could have the highest impact

Dashboard - page view
As a user

I want to see the statistics for a single page instead of the entire page pile

So that I can understand which pages require documentation work

Dashboard - available metrics
As a user

I want to have access to the following information on the dashboard, so that I can reason about the quality of pages and collections:
 * Number of page views, breakdown by day - in graphical or numerical form
 * Number of page edits, breakdown by day - in graphical or numerical form
 * Number of major edits vs minor edits - in graphical or numerical form, number of small (<1000b) and big (>=1000b) edits
 * List of most active contributors with their statistics
 * List and number of Phabricator tasks that mention the page
 * General page information, page categories, incoming and outgoing links (only for individual pages), numbers of incoming and outgoing links (aggregated for the entire page pile), information about whether the page is marked for translation (and the percentage of pages in a page pile that are marked for translation).
 * Indication whether a page uses specific templates, such as "Draft", "DoNotTranslate", "Outdated", or "Historical".
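
The edit metrics in the list above can be derived from one pass over a page's edits. The following is a rough sketch under stated assumptions: each edit is a hypothetical `(day, size_delta, is_minor)` tuple, "1000b" is read as 1000 bytes, and edit size is taken as the absolute change in page size. None of these details are fixed by the requirements.

```python
from collections import Counter
from datetime import date

SIZE_THRESHOLD = 1000  # assumed to be bytes; edits at or above it count as "big"


def summarize_edits(edits):
    """Summarize edits given as (day, size_delta_in_bytes, is_minor) tuples."""
    per_day = Counter()
    summary = {"minor": 0, "major": 0, "small": 0, "big": 0}
    for day, delta, is_minor in edits:
        per_day[day] += 1
        summary["minor" if is_minor else "major"] += 1
        # Assumption: removals count by magnitude, so -1000 bytes is "big".
        summary["big" if abs(delta) >= SIZE_THRESHOLD else "small"] += 1
    summary["per_day"] = dict(sorted(per_day.items()))
    return summary
```

The sketch treats the major/minor flag and the small/big size class as two separate breakdowns, which is one possible reading of the requirement.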

Cookies or local storage usage
As a user

I want to see a blocking cookie or local storage popup or modal window when I attempt to use any application feature that uses cookies or local storage

So that I can grant the application my explicit consent to store data on my device

Functional requirements (later)
This section describes functional requirements that we plan to meet in a later version of the dashboard.

Support for GULP
As a user

I want to provide a GULP list number

So that I can see the statistics dashboard for the collection represented by that list

Support for manually constructed lists and doc.wikimedia.org
As a user

I want to manually specify a list of valid pages from doc.wikimedia.org, tech wiki, mediawiki.org, and meta wiki to include in a collection

So that I can see statistics for more complicated page lists

This feature will allow us to construct lists with valid pages from multiple wikis and doc.wikimedia.org. PagePile only supports lists with pages from a single wiki.

Note that some statistics might not be available for doc.wikimedia.org pages. We need to consider and implement this carefully so that we don't undermine the accuracy of our data.

Sharing custom lists
(depends on Support for manually constructed lists)

As a user

I want to share my custom list using a simple code or number

So that I can help other users gain insight about lists of pages I have already built

Custom collections as a tree or table of contents
(depends on Support for manually constructed lists)

As a user

I want to see the hierarchy of pages within a collection created directly in the application represented as a tree instead of a list

So that I and others can reason about the information architecture of the collection

Flat lists have limited use for technical writers. Trees are better at representing hierarchies of information. While this will not impact the statistics, having access to hierarchies of information in the form of trees will definitely help us reason about collections.
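
One way to picture this requirement is a small tree structure that can still be flattened into the page list the statistics are computed from. The sketch below is illustrative only: `Node`, `flatten`, and `render` are hypothetical names, and the requirements do not prescribe how the hierarchy is stored.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A page in a custom collection, with its child pages."""
    title: str
    children: list["Node"] = field(default_factory=list)


def flatten(node):
    """Yield page titles depth-first, each parent before its children."""
    yield node.title
    for child in node.children:
        yield from flatten(child)


def render(node, depth=0):
    """Return the tree as indented lines, like a table of contents."""
    lines = ["  " * depth + node.title]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines
```

Because `flatten` recovers the plain list of pages, the statistics stay the same whether a collection is shown as a tree or as a list.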

Linter integration
As a user

I want to have access to linter output for a specific page on the dashboard

So that I can see content change recommendations based on commonly used style guides

To implement this using vale.sh, we would need to create and maintain support for wikitext syntax ourselves (see https://vale.sh/docs/topics/scoping/#formats), or come up with a workaround.

Export to JSON
As a user

I want to be able to export dashboard information to a well-structured JSON file

So that I can use and process the data in another tool
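
As a rough illustration of what "well-structured" could mean here, the sketch below serializes a page pile's metrics with Python's standard `json` module. The function name and the field names `page_pile_id`, `pages`, and `title` are hypothetical; the actual export schema is not yet defined.

```python
import json


def export_dashboard(pile_id, pages):
    """Serialize per-page metrics to a JSON string.

    `pages` maps a page title to a dict of metric names and values.
    """
    document = {
        "page_pile_id": pile_id,
        # Sort by title so repeated exports of the same data are identical.
        "pages": [
            {"title": title, **metrics}
            for title, metrics in sorted(pages.items())
        ],
    }
    # ensure_ascii=False keeps non-ASCII page titles readable in the file.
    return json.dumps(document, indent=2, ensure_ascii=False, sort_keys=True)
```

A stable ordering of pages and keys makes exported files easier to compare between runs.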

Dashboard - additional metrics
As a user

I want to have access to the following additional information on the dashboard, so that I can reason about the quality of pages and collections and consider translation impacts in my work:
 * Number and percentage of pages that have been translated. Number of translation pages for each page.
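
The translation metric above is simple arithmetic over per-page translation counts. The sketch below assumes a hypothetical input that maps each page title to its number of translation pages, and counts a page as translated when that number is at least one.

```python
def translation_summary(translations_per_page):
    """Return the number and percentage of pages with at least one translation."""
    total = len(translations_per_page)
    translated = sum(1 for count in translations_per_page.values() if count > 0)
    # Avoid dividing by zero for an empty page pile.
    percentage = 100 * translated / total if total else 0.0
    return {"translated": translated, "total": total, "percentage": percentage}
```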

Public access
The application is publicly available on the internet, under a URL that's easy to use and remember.

Personal preferences
The application doesn't require login and supports only one type of user - an anonymous user. Any user preferences and list history are stored either in a cookie or in local storage (subject to the user’s consent).

Localization
The application supports localization and allows users to set their preferred interface language. This setting is stored in their cookies or local storage if they have consented to their usage.

The application supports both LTR and RTL languages.

Inclusive design
The governing design principle is: "The application is easy to see, easy to hear, easy to interact with, and easy to understand".

The application allows users to set their preferred interface font to a dyslexia-friendly typeface.

The application's design follows the principles of neurodiversity design available on https://neurodiversity.design.

The application is designed from the ground up to use all the modern accessibility features, with WCAG being the baseline.

The application is designed to minimize the CPU and RAM requirements of a device it is used on.

The application is designed to work well on internet connections with lower bandwidth and worse connection stability.

The application aims to serve users at the 75th percentile of devices and networks.

Caching
The application uses caching to minimize the number of requests to external APIs. Consecutive requests for the same data do not result in consecutive calls to these APIs.
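
A minimal sketch of this behavior, assuming a time-based expiry that the requirements do not specify: `CachedFetcher`, `ttl_seconds`, and the injected `fetch` and `clock` callables are hypothetical names, and `fetch` stands in for any call to an external API.

```python
import time


class CachedFetcher:
    """Wrap a fetch function so repeated requests reuse a stored result."""

    def __init__(self, fetch, ttl_seconds=3600, clock=time.monotonic):
        self._fetch = fetch        # stand-in for an external API call
        self._ttl = ttl_seconds    # assumed expiry; not part of the requirements
        self._clock = clock
        self._entries = {}         # key -> (time stored, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # fresh entry: no external call
        value = self._fetch(key)
        self._entries[key] = (now, value)
        return value
```

With this wrapper, two consecutive `get` calls for the same key trigger only one call to `fetch` until the entry expires.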

Non-requirements
This section describes what we've decided not to implement.

User login and management
The dashboard does not provide any mechanisms of user management, login, or registration. Any preferences, custom data, and (eventually) manually-created lists are stored locally and are visible only to their creator.

Content moderation
The dashboard requires no moderation as it doesn't allow users to edit any publicly-visible data. Any user-created data is only visible to them.

Usage instructions

 * 1) Navigate to the application’s home page.
 * 2) In the PagePile ID field, specify the identifier of the page pile you want to analyze and click Load.
 * 3) The application displays the list of pages in the page pile. Click Open page pile statistics to proceed with the loaded pile, or specify a different PagePile ID and click Load again.
 * 4) (Optional) Click Remember my page piles and consent to the use of cookies and local storage if you want to enable page pile history on the landing page.
 * 5) Clicking Open page pile statistics opens the statistics dashboard for the page pile. Scroll down the page to analyze the documentation metrics.
 * 6) The dashboard features a table with pages in the page pile. You can sort that table by some metrics. To view statistics for a specific page, click that page.

To access your preferences, click Preferences. Note that preferences work only if you have consented to the application's usage of cookies and local storage.