User:APaskulin (WMF)/API reference docs

The purpose of this page is to start a conversation about how to improve API reference documentation in the API Portal.

Problem statement
What is the problem?


 * API reference docs in the API Portal are untrustworthy. Because the docs are created and updated manually, their accuracy can't be guaranteed; even when they are accurate, readers have no way to verify it, which erodes trust in the Portal as a whole.

What does the future look like if we do this?


 * Better developer experience for creating and consuming APIs using the API Portal

What happens if we do nothing?


 * Developers lose trust in the platform; the API Portal is not adopted
 * Significant increase in information debt due to having two sets of manually maintained docs per service

Strategic alignment


 * Systems of empowerment
 * Platform evolution

Objectives
Increase confidence in the accuracy of API Portal reference documentation

Although there are many ways to establish guarantees around accuracy, the common outcome is increased confidence in the docs. This is critical for everyone interacting with the docs, including API producers, API consumers, tech writers, and product managers.

Integrate documentation into the WMF API development process

Our objective is not to build documentation tooling in isolation, but to use this tooling to improve the API development process as a whole. Practically, this means focusing on the needs of API developers in building workflows to create and update the docs.

Requirements
Integration with API Portal site (MediaWiki)

The API reference docs currently exist as wikitext within the API Portal. Regardless of the approach we take in the future, the resulting docs must be presentable in a way that integrates with the API Portal MediaWiki instance.

Internationalization of descriptive text

In order for the API Portal to be a global system of empowerment, we need the ability to translate the descriptive text within the API docs.

API sandbox

An API sandbox (or “Try it out” feature) is a powerful tool for getting started with an API. This can also be a starting point for building more interactive documentation capabilities, using tools like Jupyter Notebooks, in-browser shells, and user-aware code samples.

Opportunities
Contract-driven development

Contract-driven development is an API development process in which backend and frontend teams work together to create a description of how the API should work. Once the contract is in place, teams can develop their code in parallel. By combining this process with a machine-readable API specification, we can build tooling to validate the contract and maintain a trusted source of truth for API behavior.
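As a sketch of what such a machine-readable contract could look like, here is a minimal OpenAPI 3.0 document expressed as a Python dict, plus a toy structural check of the kind contract-validation tooling performs. The endpoint path and fields are illustrative, not a real WMF API.

```python
# A minimal OpenAPI 3.0 contract as a Python dict.
# The path and fields are invented for illustration.
contract = {
    "openapi": "3.0.0",
    "info": {"title": "Example service", "version": "1.0.0"},
    "paths": {
        "/v1/page/{title}": {
            "get": {
                "summary": "Get a page",
                "parameters": [
                    {
                        "name": "title",
                        "in": "path",
                        "required": True,
                        "schema": {"type": "string"},
                    }
                ],
                "responses": {
                    "200": {
                        "description": "Success",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "type": "object",
                                    "properties": {"title": {"type": "string"}},
                                }
                            }
                        },
                    }
                },
            }
        }
    },
}


def check_contract(spec):
    """Flag missing top-level fields required by OpenAPI 3.x,
    and operations that declare no responses."""
    problems = []
    for field in ("openapi", "info", "paths"):
        if field not in spec:
            problems.append("missing required field: " + field)
    for path, ops in spec.get("paths", {}).items():
        for method, op in ops.items():
            if "responses" not in op:
                problems.append(method.upper() + " " + path + ": no responses defined")
    return problems


print(check_contract(contract))  # → []
```

Real tooling (validators, linters) does far more than this, but the principle is the same: once the contract is data rather than prose, checks like this can run in CI.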

OpenAPI/swagger

OpenAPI has emerged as the leading API specification format for REST APIs. By using OpenAPI, we can leverage a growing set of open source tools including spec generators, validators, sandboxes, and HTML generators. Most importantly, OpenAPI is already being adopted organically within WMF. See the appendix for a list of interesting OpenAPI tools.

Integration with testing pipeline

Evaluating the accuracy of API reference docs and performing API integration tests can be very similar processes. As we look at how we can guarantee the accuracy of our docs, we should look at ways we can integrate docs with API tests.
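To illustrate that overlap: the same response schema that renders a "Response schema" table in the docs can double as an assertion in an integration test. A minimal sketch, checking a response against a small subset of JSON Schema; the schema and response are invented, not taken from a real WMF service.

```python
# An invented response schema, of the kind that could drive both a
# docs table and an integration-test assertion.
schema = {
    "type": "object",
    "required": ["id", "title"],
    "properties": {
        "id": {"type": "integer"},
        "title": {"type": "string"},
    },
}

# Map JSON Schema type names to Python types (subset only).
TYPES = {"object": dict, "string": str, "integer": int, "array": list}


def conforms(value, schema):
    """Check a value against a small subset of JSON Schema:
    type, required, and nested properties."""
    expected = TYPES.get(schema.get("type"))
    if expected and not isinstance(value, expected):
        return False
    if isinstance(value, dict):
        if any(key not in value for key in schema.get("required", [])):
            return False
        props = schema.get("properties", {})
        return all(conforms(value[k], props[k]) for k in props if k in value)
    return True


# In a real integration test, `response` would come from a live API call.
response = {"id": 42, "title": "Earth"}
assert conforms(response, schema)
assert not conforms({"id": "42"}, schema)  # wrong type, missing "title"
```

If this check fails in the test pipeline, either the API or the docs schema is wrong, and the discrepancy surfaces before readers find it.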

Unknowns

 * Can OpenAPI/swagger support internationalization? Could the internationalization patterns used by Toolhub help with this?
 * Can we programmatically verify that an OpenAPI/swagger spec matches the behavior of an API?
 * Can we leverage the integration testing pipeline to combine the process for tests and docs?

Background
The API Portal is a MediaWiki-based documentation wiki at api.wikimedia.org. It contains documentation for APIs that are served through the API gateway. This means that the APIs documented in the API Portal almost always have another source of docs at the service level. (For example, API:REST API and /core; this spec and /feed; this spec and /service/linkrecommendation) This often results in two sets of manually maintained API docs per service, an unsustainable situation to say the least.

Historically, we’ve had pretty great API doc tooling for the last 5+ years with the MediaWiki Action API. But with new services, new API styles (REST, GraphQL), and the API gateway, there’s a need for tooling that provides the features we’ve relied on in the past and works across services.

Terms
API Portal/Portal

The website located at api.wikimedia.org. Visit Wikitech for more information.

API docs/API reference docs

In this document, “API docs” refers to API reference docs: a list of endpoints, objects, parameters, responses, and other information that provides an index of available functionality in an API. This is in contrast to narrative documentation, such as guides and tutorials. See the appendix for a list of elements in the API Portal reference docs.

Accuracy

The degree to which a set of API docs can be guaranteed to correctly describe the behavior of an API. We can think about this in terms of the validity of the information present and the completeness of the information based on available versions of the API.

API contract

A description of the way an API is expected to work

API behavior

The way an API works in practice

API specification/spec

A machine-readable description of an API

Spec generation

The process of automatically creating an API spec from other sources (such as code comments)

HTML generation

The process of converting an API doc from its raw format to HTML. For example, swagger-ui is an HTML generator, not a spec generator.

Case studies
MediaWiki Action API docs (API:Main page)


 * Tools: MediaWiki, translatewiki
 * Accuracy: Information comes from metadata provided by the API module in MediaWiki Core. Example requests are included in individual API modules (example). This presentation by Brad J. contains more details about the Action API docs system, including lessons learned. These docs don’t include generated response schemas.
 * Translation: Translation is handled by MediaWiki’s standard i18n system via translatewiki
 * Presentation: Docs are available as HTML via the API itself (example) and on wiki via Special:ApiHelp and the associated template. Code samples on wiki are generated automatically using mediawiki-api-demos and manually copied into wiki pages.
 * Sandbox: A sandbox is provided as a single page on wiki at Special:ApiSandbox. Individual module docs link to pre-filled queries in the sandbox.
 * Versioning: MediaWiki serves the version of the docs for the current version running on the wiki. For example, these docs on mediawiki.org correspond to the version of MediaWiki running on mediawiki.org while these docs on enwiki correspond to the version of MediaWiki running on English Wikipedia.

RESTBase API docs (wikimedia.org/api/rest_v1)


 * Tools: OpenAPI, swagger-ui
 * Accuracy: The underlying OpenAPI spec is a combination of several different specs, each of which is maintained manually, so there’s no guaranteed accuracy when it comes to properties returned by the API.
 * Translation: No translation available
 * Presentation: Served by the API itself, swagger-ui provides a nice presentation that uses expanding and collapsing sections for navigation
 * Sandbox: swagger-ui provides a built-in sandbox embedded within the docs
 * Versioning: The spec is versioned as a whole, so if a v2 existed, I assume it would be available at a different URL.
 * Lessons learned: User:EEvans_(WMF)/Opinions/OpenAPI

Toolhub (toolhub-demo.wmcloud.org/api-docs, source)


 * Tools: OpenAPI, Django REST framework, drf-spectacular, RapiDoc, translatewiki
 * Accuracy: Docs are generated from the code. This actually happens whenever you load the page!
 * Translation: This project uses an innovative integration with translatewiki to translate doc strings.
 * Presentation: RapiDoc looks even nicer than swagger-ui, in this case using expanding and collapsing sections for navigation
 * Sandbox: RapiDoc provides a built-in sandbox embedded within the docs
 * Versioning: I’m not sure, but I assume the spec would be versioned as a whole

MediaWiki API integration tests (docs)


 * Tools: Mocha, SuperTest, Chai
 * This isn’t a documentation system, but I think it’s worth considering in our investigation. These tests are stored within the API source code repos, so they’re nicely decentralized. This tool is designed to provide a simple, flexible way to test APIs at the endpoint level. Because of this, tests contain a lot of the same information we need in the docs (example).

Wikifeeds language support (API Portal)


 * Tools: /availability endpoint, random Python script, MediaWiki
 * Accuracy: Currently depends on periodically testing the response from the /availability endpoint
 * Translation: Theoretically possible using translatewiki but not currently enabled
 * Presentation: Nice-looking, Wikimedia-branded API Portal skin
 * Sandbox: None, could probably be created as a gadget
 * Versioning: If needed, could be done through MediaWiki

Link recommendations (API Portal)


 * Tools: OpenAPI, swagger-ui, MediaWiki, flasgger
 * Accuracy: Based on copying the content from the spec into wikitext, plus some manual testing. (To do: Follow up with Kosta H. about how flasgger works.)
 * Translation: Theoretically possible using translatewiki but not currently enabled
 * Presentation: Nice-looking, Wikimedia-branded API Portal skin
 * Sandbox: The API Portal docs link to swagger-ui which provides a sandbox
 * Versioning: If needed, could be done through MediaWiki and separate OpenAPI specs

Wikimedia OCR (spec)


 * Tools: OpenAPI, NelmioApiDocBundle for Symfony (PHP)
 * Accuracy: Based on code comments in the source code
 * Team: Community Tech

Experiments with tools that generate docs from code comments


 * apiDoc
 * swagger-php

Elements of API Portal reference docs
The following pieces of information are used to describe REST API endpoints in the API Portal reference docs.

Namespace

For example: /core or /feed. Determined by API gateway config

Endpoint group

Endpoint groups are used to break up larger namespaces and make the sidebar easier to navigate. For example, the /core namespace is broken up into the search, pages, media files, and revisions groups based on the endpoint paths.

Version

For example: v1. There may be multiple versions of the same endpoint available.

Supported projects

Which Wikimedia projects the endpoint supports. For example: wikipedia and wiktionary

Supported domains

Which project subdomains the endpoint supports for each project. For example: an endpoint could be supported for only en, fr, and ru wikipedias. Note that multilingual projects do not accept a subdomain parameter.

Path

The full endpoint path, including the namespace, version, and parameter placeholders. For example: /core/v1/{project}/{language}/page/{title}.

Method

For example: GET. Conventionally, an endpoint that supports multiple methods is represented in the docs as multiple separate endpoints.

Endpoint name

A short phrase describing the endpoint’s essential function, used for navigation. For example: Get page history

Endpoint description

The text presented under the main endpoint heading, describing the use case for the endpoint and any helpful information. Ideally, we’d be able to add complex formatting to this section. For example: Returns information about the latest revisions to a wiki page, in segments of 20 revisions, starting with the latest revision… (including a link and a warning callout).

Request parameters

Includes the name, path/query/data, required/optional, an example, and a description. For example: title, path, required, My_Wiki_Page, Title of the wiki page being accessed.
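For reference, each piece of information listed here has a counterpart in an OpenAPI 3.x parameter object. A sketch using the `title` example above (values are the ones from the example; the mapping is illustrative):

```python
# The `title` request parameter from the example above, expressed as an
# OpenAPI 3.x parameter object: name, location ("in"), required/optional,
# an example value, a description, and a type.
title_param = {
    "name": "title",
    "in": "path",
    "required": True,
    "example": "My_Wiki_Page",
    "description": "Title of the wiki page being accessed.",
    "schema": {"type": "string"},
}
```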

Headers

Includes name, example, and description. For example: If-Modified-Since, If-Modified-Since: Wed, 21 Oct 2015 07:28:00 GMT, Returns the response only if the content has changed since the provided timestamp. Takes a timestamp in the HTTP-date format: <day-name>, <day> <month> <year> <hour>:<minute>:<second> GMT.

Responses

Information about possible responses returned by the endpoint. Includes the status code, a description, and an example or embedded sandbox. For example: 200 Success { "my json": "blob" }

Response schema

Includes the property name, type, required/optional, and a description. Example: latest, required, string, API route to get the 20 latest revisions.
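This element also maps onto an OpenAPI/JSON Schema fragment, with one wrinkle worth noting: required/optional is expressed as a list on the parent object rather than a flag on each property. A sketch using the `latest` example above:

```python
# The `latest` response property from the example above as an OpenAPI
# schema fragment. "required" lists property names on the parent object;
# it is not a per-property boolean.
page_schema = {
    "type": "object",
    "required": ["latest"],
    "properties": {
        "latest": {
            "type": "string",
            "description": "API route to get the 20 latest revisions.",
        }
    },
}
```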

Request examples (sample code)

Request examples in a few popular programming languages. The API Portal currently provides curl, Python, PHP, and JavaScript. These examples should encourage best practices and make use of important parameters where possible. For example:
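A sketch of what such a Python sample could look like, using the /core page path format shown earlier in this document; the exact URL and header conventions are assumptions, not copied from the Portal.

```python
import urllib.request

# Hypothetical request for the page endpoint whose path format appears
# earlier in this document: /core/v1/{project}/{language}/page/{title}.
# Setting a descriptive User-Agent is the kind of best practice these
# samples should model.
url = "https://api.wikimedia.org/core/v1/wikipedia/en/page/Earth"
req = urllib.request.Request(url, headers={"User-Agent": "my-app/1.0"})

# Building the Request without sending it keeps this sketch offline-safe;
# a real sample would call urllib.request.urlopen(req) and parse the JSON.
print(req.full_url)  # → https://api.wikimedia.org/core/v1/wikipedia/en/page/Earth
```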

OpenAPI/swagger tools
These are some interesting (mostly untested) tools that I’ve come across for OpenAPI/swagger.


 * RapiDoc: Nice-looking HTML generator, supports a sidebar-based navigation (example, docs)
 * swagger-i18n-extension: Spec translator
 * mocha-swagger: Generate a spec from mocha tests
 * express-openapi-validate: Validate requests based on a spec
 * oatts: Generate basic unit test scaffolding for a spec
 * dredd: API testing framework for multiple spec formats