Wikimedia Technical Documentation Team/Doc metrics/Prototype
This page explains the data sources and calculations implemented in the documentation metrics generator tool.
Quick reference: doc data used in metrics
The table below summarizes the information explained in more detail in the following sections:
| Metric | Data retrieved | Type of signal | Data source | Bucketized based on benchmarks? |
|---|---|---|---|---|
| Succinctness | Section count | Content | XTools API | Benchmarks + buckets |
| Succinctness | Page size in bytes | Content | Action API | Benchmarks + buckets |
| Developer relevance | Percentage of page watchers who visited in the last 30 days | Traffic/edits (popularity) | Action API | Benchmarks + buckets |
| Developer relevance | More than one edit in the last 6 months? | Traffic/edits (popularity) | Analytics API | Buckets only, no benchmarks needed |
| Developer relevance | Links to code repos from wiki pages | Content relevance | Action API | Buckets only, no benchmarks needed |
| Developer relevance | Presence of code samples on page | Content relevance | Action API | Buckets only, no benchmarks needed |
| Developer relevance | Incoming links from the same wiki | Content relevance | XTools API | Benchmarks + buckets |
How the doc metrics generator works
Gather and combine raw doc data from APIs
The script uses the Action API Info module to get three pieces of data:
- "length": Page length in bytes, used in Succinctness metric.
- "watchers" and "visitingwatchers": Used to calculate the percentage of watchers who visited the page ("visitingwatcherpercent") in the past 30 days, which feeds into the Developer relevance metric. If a page has fewer than 30 watchers, the API returns no watcher data for it.
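The Info module query described above can be sketched as follows. The API parameters (`prop=info`, `inprop=watchers|visitingwatchers`) are part of the real Action API; the helper function names are illustrative, not taken from the script:

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://www.mediawiki.org/w/api.php"

def fetch_page_info(title):
    """Query the Action API Info module for length, watchers, and visitingwatchers."""
    params = urllib.parse.urlencode({
        "action": "query",
        "prop": "info",
        "inprop": "watchers|visitingwatchers",
        "titles": title,
        "format": "json",
        "formatversion": "2",
    })
    with urllib.request.urlopen(f"{API_URL}?{params}") as resp:
        return json.load(resp)["query"]["pages"][0]

def visiting_watcher_percent(page):
    """Percent of watchers who visited in the past 30 days, rounded to two
    decimal places. Returns None when the API withheld the counts (pages
    with fewer than 30 watchers)."""
    watchers = page.get("watchers")
    visiting = page.get("visitingwatchers", 0)
    if not watchers:
        return None
    return round(100 * visiting / watchers, 2)
```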
Then, the script uses the XTools API Page Links endpoint to get three pieces of data:
- "number_of_sections": Always at least one section. Used in the Succinctness metric along with page length, to calculate a ratio reflecting how structured the page is.
- "incominglinks" and "redirects": Used to calculate "incominglinksnoredirects", the number of incoming links to the page from other mediawiki.org pages. Excludes transclusions and redirects, but includes translation pages. Used in the Developer relevance metric.
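A minimal sketch of the XTools step. The endpoint URL shape is an assumption based on the XTools REST API layout, and subtracting redirects from the raw incoming-link count is one plausible reading of how "incominglinksnoredirects" is derived:

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint shape; field names follow the description above.
XTOOLS_LINKS_URL = "https://xtools.wmcloud.org/api/page/links/www.mediawiki.org/{title}"

def fetch_link_data(title):
    """Fetch link data for a mediawiki.org page from the XTools Page Links endpoint."""
    url = XTOOLS_LINKS_URL.format(title=urllib.parse.quote(title, safe=""))
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def incoming_links_no_redirects(data):
    """Incoming links from other mediawiki.org pages, excluding redirects."""
    return max(data["incominglinks"] - data["redirects"], 0)
```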
Next, the script uses the Analytics API Edits endpoint to get a single datapoint indicating whether the page had any edits in the past 6 months:
- "sixmonth_edits": Calculated based on the time range starting six months before the current date (the date when the script is run). The API returns a 404 status if no edits occurred during the requested time range. Used in the Developer relevance metric.
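The 404-as-no-edits behavior described above can be sketched like this. The per-page edits endpoint shape is assumed from the Wikimedia Analytics REST API, and "six months" is approximated here as 182 days:

```python
import datetime
import urllib.error
import urllib.parse
import urllib.request

# Assumed endpoint shape for the Analytics API Edits data.
EDITS_URL = ("https://wikimedia.org/api/rest_v1/metrics/edits/per-page/"
             "mediawiki.org/{title}/all-editor-types/monthly/{start}/{end}")

def six_month_range(today=None):
    """YYYYMMDD start/end pair covering roughly the past six months,
    counted back from the date the script is run."""
    today = today or datetime.date.today()
    start = today - datetime.timedelta(days=182)
    return start.strftime("%Y%m%d"), today.strftime("%Y%m%d")

def sixmonth_edits(title):
    """True if the page had any edits in the past six months. The API
    answers 404 when no edits occurred in the requested range, so a 404
    is treated as "no edits" rather than as an error."""
    start, end = six_month_range()
    url = EDITS_URL.format(title=urllib.parse.quote(title, safe=""),
                           start=start, end=end)
    try:
        with urllib.request.urlopen(url):
            return True
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise
```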
Finally, the script uses the Action API to parse wikitext and check for the presence of two types of content relevant to technical audiences:
- "code_samples_in_content": Checks for classes used in code snippets. Used in Developer relevance metric.
- "links_to_gerrit", "links_to_github", and "links_to_gitlab": Checks for links to the primary code repositories used by Wikimedia projects. Used in Developer relevance metric.
These last signals are noisy: they may contain links to upstream code, or snippets that no longer work. These constraints are further discussed in the v0 metrics test assessment.
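The content checks above might look like the following sketch. The detection patterns are assumptions for illustration (the `mw-highlight` class is what the SyntaxHighlight extension emits, but the real script may check different classes or hosts):

```python
import re

# Hypothetical detection rules; the script's actual patterns may differ.
CODE_CLASS_RE = re.compile(r'class="[^"]*\bmw-highlight\b')
REPO_HOSTS = {
    "links_to_gerrit": "gerrit.wikimedia.org",
    "links_to_github": "github.com",
    "links_to_gitlab": "gitlab.wikimedia.org",
}

def content_signals(parsed_html):
    """Scan a page's parsed HTML for code samples and code-repository links."""
    signals = {"code_samples_in_content": bool(CODE_CLASS_RE.search(parsed_html))}
    for key, host in REPO_HOSTS.items():
        # Count anchors pointing at each repository host.
        signals[key] = parsed_html.count(f'href="https://{host}')
    return signals
```

As the surrounding text notes, these signals are noisy: a match may be a link to upstream code or a stale snippet, so the counts are treated as rough indicators rather than exact measures.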
Calculate metrics scores
[edit]Succinctness metric outputs
- Succinct score: the overall score for this metric. A lower score means the page is less succinct (worse). The overall score is calculated by adding together the scores for "Length in bytes" and "Section to length ratio". Those sub-scores are not returned as output; instead, the output includes the actual values for those data elements, since that is more useful information to guide documentation work.
- Length in bytes: The bucket min/max values are based on benchmarks from the v0 metrics testing. Warning: page length in bytes is not always an accurate reflection of actual, rendered page length: templates can include content that makes pages very long to a reader, but shorter in byte size.
- Section to length ratio: Calculated from the number of sections and the page length. The ratio itself (not the score it generates) is returned in the results output, because this is more useful data to guide documentation work. The ratio is scored against benchmark data: ranges of values are placed into buckets whose scores reflect whether the page is dense and unstructured or more succinct and structured.
The table below summarizes how the scores are calculated. To see the math behind this, look at the calculate_succinct function in the script.
| Raw input | Impact on score | Min/Max possible score |
|---|---|---|
| Page length in bytes | Scored according to which benchmark-based bucket the length falls into; higher scores reflect more succinct pages | 10 - 50 |
| Section to length ratio | Scored according to which benchmark-based bucket the ratio (number of sections divided by page length in bytes) falls into. Every page has at least 1 section. | 10 - 50 |
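The scoring shape can be sketched as below. The bucket thresholds here are placeholders for illustration only; the real thresholds come from the v0 benchmark data, and the actual implementation is the `calculate_succinct` function in the script:

```python
# Placeholder buckets: (upper_bound, score) pairs checked in order,
# with None as the catch-all. NOT the real benchmark values.
LENGTH_BUCKETS = [(5_000, 50), (20_000, 30), (None, 10)]
RATIO_BUCKETS = [(0.0001, 10), (0.0005, 30), (None, 50)]

def bucket_score(value, buckets):
    """Return the score of the first bucket whose upper bound exceeds value."""
    for upper, score in buckets:
        if upper is None or value < upper:
            return score

def calculate_succinct(length_bytes, number_of_sections):
    """Sum the length sub-score and the section-to-length-ratio sub-score
    (each 10-50), returning the raw values alongside the overall score."""
    ratio = number_of_sections / length_bytes
    score = bucket_score(length_bytes, LENGTH_BUCKETS) + bucket_score(ratio, RATIO_BUCKETS)
    return {
        "succinct_score": score,
        "length_in_bytes": length_bytes,
        "section_to_length_ratio": ratio,
    }
```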
Developer relevance metric outputs
Developer relevance score: the overall score for this metric. Lower score is worse. The overall score is calculated by adding together two sub-scores:
- Technical content score: the sub-score for technical content signals. Lower score is worse. The raw values for these input signals are included in the metrics output to help guide documentation work:
- Links to code: The number of links to code repositories found by parsing the content.
- Code samples: TRUE/FALSE, whether code samples were detected on the page.
- Popularity score: the sub-score for popularity (revisions/traffic) signals. Lower score is worse. The raw values for these input signals are included in the metrics output to help guide documentation work:
- Incoming links: raw number of incoming links
- Visiting watcher percent: rounded to two decimal places; calculated based on watchers and visiting watchers.
- More than 1 edit in past 6mo: boolean (1 or 0); 0 indicates the page did not have at least 1 edit in the past 6 months, counted back from the date the script is run.
The table below summarizes how the scores are calculated. To see the specific implementation, look at the calculate_devrelevance function in the script.
| Raw input | Impact on score | Min/Max possible score |
|---|---|---|
| Links to code repos? | If any links to code repositories are found: Technical content score + 50 | 0 or 50 |
| Code samples on page? | If code samples are detected: Technical content score + 50 | 0 or 50 |
| Total Technical Content score | | 0, 50, or 100 |
| Incoming links | Scored according to which benchmark-based bucket the count falls into | 0 - 40 |
| Visiting watcher percent | Scored according to which benchmark-based bucket the percentage falls into | 0 - 30 |
| More than 1 edit in past 6 months? | If true: Popularity score + 30 | 0 or 30 |
| Total Popularity score | | 0 - 100 |
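The combination of sub-scores can be sketched as follows. The +50 and +30 increments match the description above, but the bucket thresholds are placeholders; the actual implementation is the `calculate_devrelevance` function in the script:

```python
# Placeholder buckets: (upper_bound, score) pairs, None as catch-all.
# NOT the real benchmark values.
INCOMING_BUCKETS = [(5, 0), (20, 20), (None, 40)]
WATCHER_BUCKETS = [(10, 0), (30, 15), (None, 30)]

def bucket_score(value, buckets):
    """Return the score of the first bucket whose upper bound exceeds value."""
    for upper, score in buckets:
        if upper is None or value < upper:
            return score

def calculate_devrelevance(links_to_code, has_code_samples,
                           incoming_links, watcher_percent, recent_edits):
    """Add the technical-content sub-score (0-100) to the popularity
    sub-score (0-100). watcher_percent may be None when the API withheld
    watcher data (fewer than 30 watchers)."""
    technical = (50 if links_to_code else 0) + (50 if has_code_samples else 0)
    popularity = bucket_score(incoming_links, INCOMING_BUCKETS)
    if watcher_percent is not None:
        popularity += bucket_score(watcher_percent, WATCHER_BUCKETS)
    if recent_edits:
        popularity += 30
    return {
        "devrelevance_score": technical + popularity,
        "technical_content_score": technical,
        "popularity_score": popularity,
    }
```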
See also
- Metrics generator user guide
- Metrics generator app source code and developer documentation
- Metrics generator PAWS notebook
Benchmarks from test data
Our test dataset included 140 pages from mediawiki.org. For the data elements listed below, the value distributions from the test dataset helped define the buckets that the metrics prototype uses for scoring. For full details of the metrics testing, see Doc metrics/v0#Outcomes of metrics testing.
- Incoming links: Average: 31.85; Median: 17.5; Min: 0; Max: 271
- Visiting watcher percent: Average 36%; Median 25%.