Platform Engineering Team/API Value Stream/As-We-Go-Blog-Notes

A space to drop notes about our work along the way, including our steps and learnings, which we hope to turn into a blog post about the API Platform by the end of the project.

Creating a repository

  • Using Gerrit, I placed a request for the repository.
  • After the repository was created, I cloned it.
  • I then cloned service-template-node, copied its contents into the newly created Gerrit repository, and pushed to master (rough commands sketched below).
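
Roughly, the steps map to commands like the sketch below; <new-repo> is a placeholder for the repository that was requested, and the direct push to master is an assumption based on the notes above (service-template-node lives at mediawiki/services/service-template-node on Gerrit):

```
# Sketch only; <new-repo> is a placeholder.
git clone https://gerrit.wikimedia.org/r/mediawiki/services/service-template-node
git clone ssh://gerrit.wikimedia.org:29418/<new-repo>
# Copy the template's contents into the new repo's working tree, then:
git add .
git commit -m "Import service-template-node"
git push origin master   # direct push, per the note above; a review flow would use refs/for/master
```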


Discovering existing APIs

  • Mostly MediaWiki/Wikitech searches, where multiple lists of APIs/services exist
  • Searching Codesearch under "Wikimedia Services"
  • The distinction between "service" and "API" is ambiguous in a lot of our documentation
  • Extension APIs can be considered part of the Action API, although they are not available on all wikis; we may need a way to distinguish which wikis they are (and are not) enabled on
  • Asking in the engineering-all Slack channel

Configuring CI Pipeline

Testing CI Pipeline Configuration

  • Pull recent changes from the example-node-api repository
  • Add a pipeline config file to direct the generated CI jobs to the main Blubber file (a sketch follows this list)
  • Make any small change
  • Push to the repo for review to see whether the tests run
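
For illustration, a minimal pipeline config might look like the sketch below, assuming PipelineLib's `.pipeline/config.yaml` layout; the pipeline name, stage name, and the `test` Blubber variant are assumptions:

```yaml
# .pipeline/config.yaml — minimal sketch; names are assumptions
pipelines:
  test:                        # pipeline run for pre-merge checks
    blubberfile: blubber.yaml  # the main Blubber file mentioned above
    stages:
      - name: run-tests
        build: test            # Blubber variant to build
        run: true              # run the built image (i.e., execute the tests)
```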

Hosting

Development/POC

  • Can be hosted on either Toolforge or CloudVPS
  • Set up your API as a systemd service if running on CloudVPS (a sketch unit file follows this list)
    • Should create a separate CloudVPS project for your API
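
As a starting point, a minimal unit file might look like this sketch; the service name, paths, and user are assumptions for a Node-based POC:

```ini
# /etc/systemd/system/example-node-api.service — sketch; all names/paths assumed
[Unit]
Description=example-node-api (POC)
After=network.target

[Service]
WorkingDirectory=/srv/example-node-api
ExecStart=/usr/bin/node /srv/example-node-api/server.js
Restart=on-failure
User=www-data
# Send output to the journal/stdout so the logging setup below can pick it up
StandardOutput=journal
StandardError=journal

[Install]
WantedBy=multi-user.target
```

Then `sudo systemctl enable --now example-node-api` starts the service and keeps it running across reboots.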

Logging

  • We have a "staging" logstash instance for projects under "deployment-prep".
  • Logs must be sent via Kafka; they cannot be sent directly to Logstash
  • CloudVPS is blocked off from the production environment
  • Logs can be sent to the production Logstash from deployment-prep CloudVPS
  • To send logs from deployment-prep to Logstash, set up your API as a systemd service, ensure logs are sent either to stdout or to a supported logfile, and follow the instructions to add your service to the lookup table
  • Services under k8s get automatic logging, but logs must adhere to the common logging schema (see the sketch after this list)
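
As a rough illustration of stdout logging from a Node service, the sketch below emits one JSON object per line; the field names loosely follow an ECS-style schema, but the service name and exact fields are assumptions — check the common logging schema docs for the authoritative list:

```typescript
// Minimal structured-logging sketch: one JSON object per line on stdout,
// so systemd/journald (and from there the log shippers) can pick it up.
function log(level: string, message: string, extra: Record<string, unknown> = {}): void {
  const entry = {
    '@timestamp': new Date().toISOString(),
    'log.level': level,
    message,
    'service.name': 'example-node-api', // hypothetical service name
    ...extra,
  };
  process.stdout.write(JSON.stringify(entry) + '\n');
}

log('info', 'request handled', { 'http.response.status_code': 200 });
```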

Monitoring

  • There is no central monitoring/alerting system for individual CloudVPS/Toolforge-hosted projects, just VM monitoring collected by a Graphite service
    • Diamond is currently used to collect basic info about the CloudVPS instances themselves; the metricsinfra CloudVPS project will replace the current Diamond/Graphite data collection
    • A POC for a Prometheus monitoring system exists, but it is not scaled to all CloudVPS projects; work on a fully fleshed-out system is tracked in https://phabricator.wikimedia.org/T266050 (a sketch of exposing metrics from the service side follows this list)
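
On the service side, exposing a Prometheus scrape endpoint is straightforward; the sketch below assumes the prom-client npm package, and the metric name and port are invented:

```typescript
// Sketch of a /metrics endpoint for Prometheus scraping (prom-client assumed).
import http from 'http';
import client from 'prom-client';

client.collectDefaultMetrics(); // built-in process metrics (CPU, memory, ...)

const requests = new client.Counter({
  name: 'example_api_requests_total', // hypothetical metric name
  help: 'Total HTTP requests handled',
});

http.createServer(async (req, res) => {
  if (req.url === '/metrics') {
    res.setHeader('Content-Type', client.register.contentType);
    res.end(await client.register.metrics());
    return;
  }
  requests.inc();
  res.end('ok');
}).listen(9100); // hypothetical scrape port
```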

Storage

  • If another type of storage is desired (non-production):
    • SQLite or another in-memory DB solution
    • Submit a ticket for a non-prod storage request tagged with "DBA", using the template below
  • If intended for production:
    • Submit a ticket for a storage request tagged with "DBA", using the basic template below:

(Template copied from previous tickets)


QPS:

Size:

DB Name:

User:

Accessed from server(s):

Backup Policy:

Grants needed:
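
For illustration only, a hypothetical filled-in request for a small POC database might look like the example below; every value is invented:

```
QPS: ~10 at peak
Size: < 1 GB expected in the first year
DB Name: example_api
User: example_api_rw
Accessed from server(s): deployment-prep CloudVPS instances
Backup Policy: standard daily backups
Grants needed: SELECT, INSERT, UPDATE, DELETE on example_api.*
```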

DEI

There seems to be limited intersection between DEI and API production/consumption, but there is a bit we can do:

  • we can encourage people to consider Accessibility APIs as part of APIs they produce: https://www.w3.org/TR/core-aam-1.2/
  • we can explore and maybe even contribute to solutions for translation support in API documentation. For example: https://github.com/svmk/swagger-i18n-extension
  • our API Guidelines can provide recommendations on how to handle multiple languages in an API: for example, when it is best to use an Accept-Language header vs. a language path parameter (a small sketch contrasting the two follows this list)
  • API complexity may impact DEI. An overly complex API design may be less approachable for a developer (volunteer or otherwise) who did not have the opportunity for a formal technical education and/or who may be in the process of educating themselves on API usage without the benefit of paid training.
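
To make the contrast concrete, here is a small sketch (not from the guidelines); the supported-language list, route shapes, and helper names are all invented:

```typescript
// Two ways an API might pick a response language; both helpers are hypothetical.
import http from 'http';

const SUPPORTED = ['en', 'de', 'ar'];

// Option 1: negotiate from the Accept-Language header. Keeps one canonical
// URL per resource; the representation varies by request headers.
function langFromHeader(req: http.IncomingMessage): string {
  const header = String(req.headers['accept-language'] ?? '');
  // Take the first range and drop any ";q=..." weight: "de-DE,de;q=0.9" -> "de"
  const preferred = header.split(',')[0].split(';')[0].trim().toLowerCase().slice(0, 2);
  return SUPPORTED.includes(preferred) ? preferred : 'en';
}

// Option 2: an explicit language path segment, e.g. /de/v1/page/Foo. Each
// translation gets its own linkable, cacheable URL.
function langFromPath(url: string): { lang: string; rest: string } | null {
  const m = url.match(/^\/([a-z]{2})(\/.*)$/);
  return m && SUPPORTED.includes(m[1]) ? { lang: m[1], rest: m[2] } : null;
}
```

Roughly, the header approach suits content negotiation on a single resource, while the path approach suits separately addressable, cacheable translations; the guidelines would need to spell out when each applies.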

Security

Some thoughts inspired by the API Specifications 2021 conference:

  • You can't secure what you don't know about
  • Specs are the map to the API catalog, which in turn tells us what APIs we have
  • From a security point of view, we might (1) require a spec, (2) require that spec to be discoverable, and (3) require the implementation to match the spec. Those seem like necessary prerequisites for an API security review.
  • Digging into code to see if an API implementation has security issues is probably a necessary step, but it should come toward the end of a security review, not the beginning. Spec review from a security perspective can likely reveal many issues with less reviewer effort.

Documentation