Platform Engineering Team/API Value Stream/As-We-Go-Blog-Notes

A space to drop notes about our work along the way, including our steps and learnings, which we hope to turn into a blog post about the API Platform by the end of the project.

Creating a repository

 * Using Gerrit, I placed a request for the repository.
 * After the repository was created, I cloned it.
 * I then cloned service-template-node into the Gerrit repository that was created and pushed to master.

Discovering existing APIs

 * Mostly MediaWiki/Wikitech searches, where multiple lists of APIs/services exist
 * Codesearch, searching under "Wikimedia Services"
 * The distinction between "service" and "API" is ambiguous in much of our documentation
 * Extension APIs can be considered part of the Action API, although they are not available on all wikis. We may need a way to distinguish which wikis they are and are not enabled on.
 * Asking in the engineering-all Slack channel

Configuring CI Pipeline

 * Clone Integrations Repository
 * Create jobs in the project-pipeline
 * Define the jobs in the layout
 * Push for review

Testing CI Pipeline Configuration

 * Pull recent changes from example-node-api repository
 * Add a pipeline/config file to direct the jobs created to the main Blubber file
 * Make any small change
 * Push to repo for review to see if tests run.

Development/POC

 * Can be hosted on either Toolforge or Cloud VPS
 * Set up your API as a systemd service if running on Cloud VPS
 * You should create a separate Cloud VPS project for your API

Logging

 * We have a "staging" Logstash instance for projects under "deployment-prep".
 * Logs must be sent to Kafka; they cannot be sent directly to Logstash
 * Cloud VPS is blocked off from the production environment
 * Logs can be sent to production Logstash from deployment-prep Cloud VPS
 * To send logs from deployment-prep to Logstash, set up your API as a systemd service and ensure logs are sent either to stdout or to a supported log file. Then follow the instructions to add your service to the lookup table
 * Services running under Kubernetes (K8s) get logging automatically, but their logs must adhere to the common logging schema
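
As a minimal sketch of the stdout option above, the following Python sets up the standard `logging` module to write one JSON object per line to stdout, where systemd can capture it. The field names (`@timestamp`, `log.level`, `message`, `service.name`) and the helper names are illustrative assumptions, not the actual common logging schema; check the schema documentation for the required fields.

```python
import json
import logging
import sys
from datetime import datetime, timezone


class JsonLineFormatter(logging.Formatter):
    """Render each log record as a single JSON object on one line."""

    def __init__(self, service_name):
        super().__init__()
        self.service_name = service_name

    def format(self, record):
        entry = {
            # Field names below are placeholders, not the real schema.
            "@timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "log.level": record.levelname,
            "message": record.getMessage(),
            "service.name": self.service_name,
        }
        return json.dumps(entry)


def build_logger(service_name, stream=None):
    """Return a logger that writes JSON lines to `stream` (stdout by default)."""
    logger = logging.getLogger(service_name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()   # avoid duplicate handlers on repeated calls
    logger.propagate = False  # keep output out of the root logger
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(JsonLineFormatter(service_name))
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = build_logger("example-node-api")
    log.info("service started on port %d", 8080)
```

Writing to stdout keeps the service itself unaware of Kafka or Logstash; the shipping step stays in the host configuration.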

Monitoring

 * There is no central monitoring/alerting system for individual Cloud VPS/Toolforge-hosted projects, only VM-level monitoring collected by a Graphite service
 * Currently, Diamond is used to collect basic information about the Cloud VPS instances themselves. The metricsinfra Cloud VPS project will replace the current Diamond/Graphite data collection
 * A POC for a Prometheus monitoring system exists, but it is not scaled for all Cloud VPS projects. Work on a fully fleshed-out system is tracked in https://phabricator.wikimedia.org/T266050

Storage

 * If the API is experimental/prototype:
   * If relational storage is desired:
     * Cloud VPS relational DB storage. You can easily set up a Postgres, MariaDB, or MySQL DB instance through the Horizon interface. Docs: https://wikitech.wikimedia.org/wiki/Help:Adding_a_Database_to_a_Cloud_VPS_Project
   * If another type of storage is desired:
     * SQLite or another embedded or in-memory DB solution
     * Submit a ticket for a non-prod storage request tagged with "DBA" with this basic template: (I just copied this from previous tickets)
 * If intended for production:
   * Submit a ticket for a storage request tagged with "DBA" with this basic template: (I just copied this from previous tickets)
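
For the prototype case above, here is a minimal sketch of the SQLite option using Python's built-in `sqlite3` module. The table name `api_clients`, its columns, and the helper names are hypothetical and exist only for illustration. Passing `":memory:"` keeps the database in RAM, so the data is lost when the process exits; pass a file path instead if the data should persist.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (and initialise if needed) a small SQLite store."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS api_clients ("
        "  name TEXT PRIMARY KEY,"
        "  contact TEXT NOT NULL"
        ")"
    )
    return conn


def add_client(conn, name, contact):
    """Insert one row; the `with` block commits or rolls back."""
    with conn:
        conn.execute(
            "INSERT INTO api_clients (name, contact) VALUES (?, ?)",
            (name, contact),
        )


def get_contact(conn, name):
    """Return the contact for `name`, or None if it is not stored."""
    row = conn.execute(
        "SELECT contact FROM api_clients WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    store = open_store()
    add_client(store, "demo", "demo@example.org")
    print(get_contact(store, "demo"))
```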

DEI
There seems to be limited intersection between DEI and API production/consumption, but there is a bit we can do:
 * We can encourage people to consider Accessibility APIs as part of the APIs they produce: https://www.w3.org/TR/core-aam-1.2/
 * We can explore and maybe even contribute to solutions for translation support in API documentation. For example: https://github.com/svmk/swagger-i18n-extension
 * Our API Guidelines can provide recommendations on how to handle multiple languages in an API. For example, when is it best to use an Accept-Language header vs. a language path parameter?
 * API complexity may impact DEI. An overly complex API design may be less approachable for a developer (volunteer or otherwise) who did not have the opportunity for a formal technical education and/or who may be in the process of educating themselves on API usage without the benefit of paid training.
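
To make the Accept-Language option mentioned above more concrete, here is a rough sketch of how a server might pick a response language from that header. It is a simplified negotiation written for illustration, not a full implementation of the HTTP rules: the function names are hypothetical, tags are compared in lowercase, and a wildcard (`*`) simply falls through to the default.

```python
def parse_accept_language(header):
    """Return language tags from an Accept-Language header, best first."""
    weighted = []
    for position, part in enumerate(header.split(",")):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        quality = 1.0  # a missing q-value means full preference
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat malformed weights as "not acceptable"
        if quality > 0:
            # Sort by descending quality, then by original position.
            weighted.append((-quality, position, tag.strip().lower()))
    return [tag for _, _, tag in sorted(weighted)]


def choose_language(header, supported, default="en"):
    """Pick the first preferred language that the API supports."""
    supported_lower = {s.lower() for s in supported}
    for tag in parse_accept_language(header or ""):
        if tag in supported_lower:
            return tag
        primary = tag.split("-")[0]  # fall back from "pt-br" to "pt"
        if primary in supported_lower:
            return primary
    return default


if __name__ == "__main__":
    print(choose_language("fr-CH, fr;q=0.9, en;q=0.8", ["en", "de"]))
```

With a language path parameter the client states one language explicitly, so none of this negotiation is needed; the trade-off is that the language becomes part of every URL.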

Security
Some thoughts inspired by the API Specifications 2021 conference:
 * You can't secure what you don't know about
 * Specs are the map to the API catalog, which in turn tells us what APIs we have
 * From a security point of view, we might (1) require a spec, (2) require that spec to be discoverable, and (3) require the implementation to match the spec. Those seem like necessary prerequisites for an API security review.
 * Digging in code to see if an API implementation has security issues is probably a necessary step, but it should probably be toward the end of a security review and not the beginning. Spec review from a security perspective can likely reveal many issues for less reviewer effort.
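
As one possible illustration of requirement (3) above, the sketch below compares the operations declared in an OpenAPI-style spec with a list of routes an implementation actually serves. It assumes the spec has already been loaded into a Python dict and that the route list was extracted from the framework by some other means; both inputs and all function names are hypothetical, and path templates are compared as literal strings.

```python
# Keys of an OpenAPI path item that denote operations (others are metadata).
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def spec_operations(spec):
    """Collect (METHOD, path) pairs declared under the spec's `paths` object."""
    operations = set()
    for path, item in spec.get("paths", {}).items():
        for key in item:
            if key.lower() in HTTP_METHODS:
                operations.add((key.upper(), path))
    return operations


def compare_to_spec(spec, implemented):
    """Return operations that are undocumented or unimplemented."""
    declared = spec_operations(spec)
    served = {(method.upper(), path) for method, path in implemented}
    return {
        "undocumented": sorted(served - declared),   # served but not in spec
        "unimplemented": sorted(declared - served),  # in spec but not served
    }


if __name__ == "__main__":
    example_spec = {
        "paths": {
            "/items": {"get": {}, "post": {}},
            "/items/{id}": {"get": {}},
        }
    }
    example_routes = [("GET", "/items"), ("GET", "/items/{id}"), ("DELETE", "/items/{id}")]
    print(compare_to_spec(example_spec, example_routes))
```

An undocumented route is the more worrying result from a security angle, since it is an API nobody reviewing the spec would know about.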