Documentation/Planning for better technical documentation

This page contains information to help you audit a collection of technical documentation and determine a strategy to improve it.

Define the goals of your documentation collection
Before you start assessing your docs, take a moment to consider why they should exist.


 * Why does someone need to know about the topics your docs cover? Try to fill in the following sentences:
 * "These docs will help ______ [audience] to understand what _____ [technology/project/process] is, and how to use it to _____ [do what?]."
 * "After visiting these docs, ______ [audience] should be able to ______ [do what?]."


 * What will happen if you don't create or improve these docs? What is the problem?
 * "Without these docs, ______ [audience] will continue to be confused about ______ [technology/project/process]."
 * "Without these improvements, it will be harder for ______ [audience] to ______ [do what?]."

Identify the docs to improve
Docs can be related to each other in multiple, overlapping ways. To make it easier to identify and prioritize the documentation improvements that are in scope for your topic area, start by collecting the content relevant for your goal.

Make a tracking document
Create a spreadsheet or another document to track the docs you identify during this step. Using a tracking doc for this process helps you standardize your analysis, understand the boundaries of your doc collection, and prioritize improvements across many docs.

Avoid adding a generic "notes" field to your tracking doc. If you do that, much of the meaningful information you need will end up in a column that you can't use to sort and filter your list of docs.
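As a rough illustration of why dedicated columns are more useful than a free-text notes field, here is a minimal Python sketch of a tracking document kept as CSV. Only the "Relevance" column comes from this guide. The other column names, the sample rows, and the helper names `to_csv` and `filter_rows` are hypothetical.

```python
import csv
import io

# Hypothetical columns; only "Relevance" is named in this guide.
FIELDS = ["Title", "Wiki", "Audience", "Relevance", "Needs update"]

# Made-up rows, for illustration only.
ROWS = [
    {"Title": "Example/Setup", "Wiki": "mediawiki.org", "Audience": "new developers",
     "Relevance": "in scope", "Needs update": "yes"},
    {"Title": "Example/History", "Wiki": "mediawiki.org", "Audience": "new developers",
     "Relevance": "potentially not in scope", "Needs update": "yes"},
    {"Title": "Example/API", "Wiki": "wikitech", "Audience": "new developers",
     "Relevance": "in scope", "Needs update": "no"},
]


def to_csv(rows):
    """Serialize the tracking rows as CSV text with one column per field."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def filter_rows(rows, criteria):
    """Return the rows whose columns match every key/value pair in criteria."""
    return [row for row in rows
            if all(row.get(column) == value for column, value in criteria.items())]


# Example: in-scope docs that still need an update.
todo = filter_rows(ROWS, {"Relevance": "in scope", "Needs update": "yes"})
print([row["Title"] for row in todo])
```

Because each judgment lives in its own column, the same filter works in a spreadsheet. A single notes column would not allow this kind of query.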

Find the docs
To populate your tracking document with the list of docs you should assess, use some or all of the following techniques.

Landing page
Does your team or project already have a landing page? Use Special:PrefixIndex to identify subpages of your landing page and add each of those pages to your tracking document.
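If you prefer to collect subpages with a script instead of the Special:PrefixIndex page, the MediaWiki Action API offers a similar listing through `list=allpages` with the `apprefix` parameter. The following is a minimal sketch, not an official tool. The endpoint URL, the User-Agent string, and the function names are assumptions, and it reads only the first batch of results.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the Action API endpoint of the wiki that hosts your landing page.
API_URL = "https://www.mediawiki.org/w/api.php"


def subpage_query(landing_page, namespace=0):
    """Build Action API parameters that list pages titled 'landing_page/...'.

    'landing_page' is the title without its namespace prefix; the namespace
    is passed separately as a number (0 is the main namespace).
    """
    return {
        "action": "query",
        "list": "allpages",
        "apprefix": landing_page + "/",
        "apnamespace": namespace,
        "aplimit": "max",
        "format": "json",
    }


def titles_from_response(data):
    """Extract page titles from a decoded list=allpages response."""
    return [page["title"] for page in data.get("query", {}).get("allpages", [])]


def fetch_subpages(landing_page, namespace=0):
    """Fetch one batch of subpage titles (continuation is not handled here)."""
    url = API_URL + "?" + urllib.parse.urlencode(subpage_query(landing_page, namespace))
    # Hypothetical User-Agent; replace it with one that identifies you.
    request = urllib.request.Request(url, headers={"User-Agent": "doc-audit-sketch/0.1"})
    with urllib.request.urlopen(request) as response:
        return titles_from_response(json.load(response))
```

Calling `fetch_subpages("Documentation")` would return titles you can paste into your tracking document. For large collections you would also need to follow the `continue` values in the response.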

Category
Check whether pages discovered in the previous step (landing page and its subpages) belong to a specific category or set of categories. Then, investigate if these categories contain any additional pages, not identified in the previous step, that should be considered part of your collection.
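The category check can also be scripted with the Action API's `list=categorymembers` module. This sketch only builds the request parameters and compares a decoded response with the pages you already track. The function names and sample titles are hypothetical, and sending the request is left out.

```python
def category_query(category):
    """Build Action API parameters that list the members of one category."""
    return {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": "Category:" + category,
        "cmlimit": "max",
        "format": "json",
    }


def untracked_members(data, already_tracked):
    """Return category members that are not yet in the tracking document."""
    tracked = set(already_tracked)
    members = data.get("query", {}).get("categorymembers", [])
    return [member["title"] for member in members if member["title"] not in tracked]


# Made-up response and tracking list, for illustration only.
sample = {"query": {"categorymembers": [
    {"pageid": 1, "ns": 0, "title": "Example/Setup"},
    {"pageid": 2, "ns": 0, "title": "Example troubleshooting"},
]}}
print(untracked_members(sample, ["Example/Setup"]))
```

The pages this returns are candidates only. You still need to decide whether each one belongs in your collection.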

Linked pages and information hubs
Do your docs use a navbox or other navigation template? If so, add to your tracking document all the docs that are linked in the navigation template.

Pull additional links from key docs:
 * If you already know which docs in your topic area are the most important, add to your tracking document any page that those core docs link to.
 * As you do this step, note which links are superfluous or unnecessary. These links go to pages that neither contain crucial information related to your topic nor help the reader complete the essential tasks that you identified when you defined your documentation's goals. In your tracking spreadsheet, create a column or mechanism to mark these pages as "potentially not in scope" (in the template, this is the "Relevance" column).
 * If you don't already know which docs in your topic area are the most important, move on to the next step.
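One way to pull the links from your core docs is the Action API's `prop=links` module. The sketch below assumes `formatversion=2`, where `query.pages` is a list, and shows only how to build the parameters and read a decoded response. The function names and sample titles are hypothetical.

```python
def links_query(titles):
    """Build Action API parameters that list the wiki links on the given pages."""
    return {
        "action": "query",
        "prop": "links",
        "titles": "|".join(titles),
        "pllimit": "max",
        "format": "json",
        "formatversion": "2",
    }


def linked_titles(data):
    """Collect the distinct titles linked from every page in the response."""
    found = set()
    for page in data.get("query", {}).get("pages", []):
        # Pages without links have no "links" key at all.
        for link in page.get("links", []):
            found.add(link["title"])
    return sorted(found)


# Made-up response, for illustration only.
sample = {"query": {"pages": [
    {"pageid": 1, "ns": 0, "title": "Example", "links": [
        {"ns": 0, "title": "Example/Setup"}, {"ns": 12, "title": "Help:Example"}]},
    {"pageid": 2, "ns": 0, "title": "Example/API", "links": [
        {"ns": 0, "title": "Example/Setup"}]},
    {"pageid": 3, "ns": 0, "title": "Example/Empty"},
]}}
print(linked_titles(sample))
```

Each returned title still needs a human judgment about relevance before you mark it as in scope.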

Search
Search across multiple wikis (like MediaWiki, Meta, and Wikitech) for keywords related to your project or topic. You can do this one wiki at a time, or use this PAWS notebook. If you find relevant results, add the docs at the top of the search results to your tracking spreadsheet if they're not already there (or just add them and deduplicate later).
 * If you didn't know which docs were most important for your topic area in the previous step, go back and complete that step using the docs you just found: add to your tracking document the pages that those key docs link to.
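If you choose to add search results first and deduplicate later, a small script can do the cleanup. This sketch assumes that results are stored as (wiki, title) pairs, that underscores and spaces are equivalent in titles, and that the first letter of a title is case-insensitive, which is the default on many wikis but not all. The function names are hypothetical.

```python
def normalize_title(title):
    """Normalize a title: underscores become spaces, first letter is upper case."""
    cleaned = title.replace("_", " ").strip()
    return cleaned[:1].upper() + cleaned[1:]


def deduplicate(results):
    """Drop repeated (wiki, title) pairs while keeping first-seen order."""
    seen = set()
    unique = []
    for wiki, title in results:
        key = (wiki, normalize_title(title))
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


# Made-up search results, for illustration only.
results = [
    ("mediawiki.org", "API:Main_page"),
    ("mediawiki.org", "API:Main page"),
    ("meta.wikimedia.org", "API:Main page"),
]
print(deduplicate(results))
```

The same title on two different wikis is kept twice on purpose, because those are separate pages that may both need review.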

Search code repositories and static sites for relevant documentation: (TODO: link to a script to pull all the links from a given wiki page, perhaps some nice dashboard that can pull page information but for a set of pages based on prefix (or other criteria), instead of just per-page).
 * Use the Code Search tool and/or search in GitHub, Gerrit, GitLab, and https://doc.wikimedia.org/.

If you identify individual pages that should belong to your collection, but are not currently connected to it, consider how to fix that. The best options are adding pages to your navigation template, adding them to a Category, or linking to them in an appropriate and prominent location.

Understand how the docs relate to others
This step helps you understand whether any of your content exists in isolation and is self-contained, or if it requires additional resources to provide readers with the full picture. Consider:
 * What upstream technology, processes, or systems should a reader be aware of before they can understand or use your docs?
 * What technical, social, or other dependencies impact your reader's ability to use the information in your docs? For example: must they have certain software installed, or must they have access to specific systems?

Explore: use Special:WhatLinksHere to review the docs that link to your landing page and to several of your key docs. Do you understand why that connection exists?


 * Is the content that links to your docs something that is essential for readers of your docs to understand? If so, do your docs cover that topic? Do they cover it by linking to the other doc, or do they duplicate that information?
 * What might readers who land on your docs from those other docs need to know? What would they be trying to achieve? Do your docs cover that?
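The list behind Special:WhatLinksHere is also available from the Action API as `list=backlinks`. As a hedged sketch, the code below builds the parameters and groups a decoded response by namespace number, which can help separate content pages (namespace 0) from talk or user pages. The function names and sample data are hypothetical.

```python
def backlinks_query(title):
    """Build Action API parameters that list pages linking to 'title'."""
    return {
        "action": "query",
        "list": "backlinks",
        "bltitle": title,
        "bllimit": "max",
        "format": "json",
    }


def group_by_namespace(data):
    """Group backlink titles by their namespace number."""
    groups = {}
    for link in data.get("query", {}).get("backlinks", []):
        groups.setdefault(link["ns"], []).append(link["title"])
    return groups


# Made-up response, for illustration only.
sample = {"query": {"backlinks": [
    {"pageid": 1, "ns": 0, "title": "Example/Setup"},
    {"pageid": 2, "ns": 2, "title": "User:Example/Notes"},
    {"pageid": 3, "ns": 0, "title": "Example overview"},
]}}
print(group_by_namespace(sample))
```

Grouping is only a starting point for the questions above. Understanding why each connection exists still requires reading the linking pages.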

As you explore these connections, you may want to file Phabricator tasks for the issues you find, so that you can clean them up once your initial doc survey is done. While it's important to understand and improve connections between your docs and other docs, it's easy to get overwhelmed as you discover how much content about your topic is scattered across the wikiverse. Stay calm, remember the goals of your documentation, and focus on what your readers need to know to complete their key tasks.

Assess information overload
For each of the pages in your collection (the docs you have listed in your tracking document): skim the content and keep track of your impressions for the following data points (the recommended spreadsheet from the previous step aligns with each of the following sections).


 * TODO: template
 * TODO: finish alignment of spreadsheet and the content here

Content duplication
Is there content that is already covered elsewhere? If so, consider: is there any reason to cover it in this doc or set of docs?
 * Example: if a user's goal is "create a bot", they may need to understand a large range of concepts, like MediaWiki APIs, Wikitext and the structure of wiki pages, bot accounts, and coding conventions. A tutorial about how to create a bot should not cover all those concepts, because they are relevant for many topics beyond just bot development.  Instead, a bot tutorial should link to documentation that covers those topics, and clearly state at the beginning that understanding them is a prerequisite for completing the tutorial.

Audience focus

 * Who is the audience or type of user this documentation should help?
 * For example, your audience might be "MediaWiki developers" or "new bot developers".
 * Are there docs or sections of docs that aren't appropriate for your audience? Move these to a separate collection, or delete them.

Don't combine documentation for different audiences. At the beginning of your docs and on your documentation landing page, declare who the intended audience of the docs is, and direct other audiences to the content that is made for them.

Doc relevance
Use page information or https://pageviews.wmcloud.org/ to assess the traffic on the docs you've gathered. There is no way to pick a general benchmark here. Some pages might receive traffic only as a result of bigger changes in the documented code, or at specific seasonal peaks.


 * Pages with close-to-zero numbers should be considered for archiving, especially if their content is volatile or outdated.
 * Pages with high numbers might benefit from being broken up, especially if they cover multiple subjects. Such pages can attract high traffic because they contain all possible information about a topic, but present it as a wall of text that offers readers no navigation guidance (Help:System_message is an example of this type of information overload).
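To compare traffic across many docs at once, you could read monthly counts from the Wikimedia pageviews REST API instead of checking pages one by one. The sketch below builds a per-article URL and summarizes a decoded response. The function names are hypothetical, the project string is passed through unchanged (check which identifier the API expects for your wiki), and no threshold is applied because there is no general benchmark.

```python
from urllib.parse import quote

BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def pageviews_url(project, title, start, end):
    """Build a monthly per-article pageviews URL; dates use YYYYMMDD."""
    article = quote(title.replace(" ", "_"), safe="")
    return f"{BASE}/{project}/all-access/user/{article}/monthly/{start}/{end}"


def monthly_views(data):
    """Map each month (YYYYMM) to its view count from a decoded response."""
    return {item["timestamp"][:6]: item["views"] for item in data.get("items", [])}


def average_views(data):
    """Return the mean monthly views, or 0.0 when there is no data."""
    views = list(monthly_views(data).values())
    return sum(views) / len(views) if views else 0.0


# Made-up response, for illustration only.
sample = {"items": [
    {"timestamp": "2024010100", "views": 10},
    {"timestamp": "2024020100", "views": 30},
]}
print(monthly_views(sample), average_views(sample))
```

Looking at the per-month values, not only the average, helps you spot the seasonal peaks and release-driven spikes mentioned above.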

Content freshness

 * Is the content outdated? Does it need to be updated, or can it just be deleted?
 * Do the names of components or technologies used in the doc reflect the current reality, or have the names (or dependencies) evolved?

Missing content

 * Are there any features that are undocumented?
 * Are there requests for clarification or additional content on Talk pages or in Phabricator tasks?
 * Are you including the necessary information for the full range of technical expertise or understanding that your audience may have or not have? If you look at a doc and imagine that you know nothing about the topic, the page should:
 * Either: indicate prerequisite knowledge or steps at the beginning of the doc, and link to where the reader can go learn or complete them (example),
 * Or: be a subpage of a more general document where the reader can back up to go learn what they need to know, especially if they landed on this page without context.
 * Is it clear who maintains these docs, or where readers should ask questions about the content? Indicate whether you expect doc requests to come through Talk pages, Phabricator tasks, or other mechanisms.

Assess maintenance status
The steps in this section help you understand the problems others have faced when trying to work on these docs. Reviewing past and current maintenance efforts helps you identify pitfalls and avoid redoing work that others have already considered.

Phabricator tasks

 * Review the last 20 active Phabricator tasks that mention your key pages and/or your topic.
 * Reviewing by general topic can help you identify high priority content to improve or work on.
 * Reading the task history can help you identify how or why the content is difficult to change, manage, or fix.
 * Review the last 10 Phabricator tasks that were closed. How were they closed? Tasks closed without a fix might indicate some problems in the project.

Revisions
Starting with the most important docs in your collection, and proceeding through as many pages as possible, assess:
 * When was the last edit made? If it was long ago, is that because the content is stable, or because it has been abandoned?
 * How often do edits typically happen? Did the regular editors stop maintaining content for some reason?

Evaluate these answers in context. When a project is stable, its documentation may rarely need updates. When a project is under active development, its documentation can become outdated much faster.
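The two revision questions can be turned into rough numbers with the Action API's `prop=revisions` module, which can return edit timestamps in ISO 8601 form. This is a minimal sketch: the function names and sample timestamps are hypothetical, sending the request is left out, and the numbers it produces still need the contextual judgment described above.

```python
from datetime import datetime, timezone


def revisions_query(title, limit=50):
    """Build Action API parameters for the most recent edit timestamps of a page."""
    return {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "timestamp",
        "rvlimit": limit,
        "format": "json",
        "formatversion": "2",
    }


def parse_timestamp(value):
    """Parse a timestamp such as '2024-01-31T00:00:00Z' as an aware UTC datetime."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def edit_summary(timestamps, now):
    """Return days since the last edit and the average gap between edits."""
    times = sorted(parse_timestamp(stamp) for stamp in timestamps)
    if not times:
        return None
    gaps = [(later - earlier).days for earlier, later in zip(times, times[1:])]
    return {
        "days_since_last_edit": (now - times[-1]).days,
        "average_days_between_edits": sum(gaps) / len(gaps) if gaps else None,
    }


# Made-up timestamps, for illustration only.
stamps = ["2024-01-31T00:00:00Z", "2024-01-11T00:00:00Z", "2024-01-01T00:00:00Z"]
print(edit_summary(stamps, datetime(2024, 3, 1, tzinfo=timezone.utc)))
```

A long gap since the last edit is not a verdict on its own. Compare it with the average gap and with how actively the documented project is changing.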