Wikimedia Performance Team

From MediaWiki.org

As the Wikimedia Foundation’s Performance Team, we want to create value for readers and editors by making it possible to retrieve and render content at the speed of thought, from anywhere in the world, on the broadest range of devices and connection profiles.

Team

Focus

Outreach. Our team strives to develop a performance-first culture in the movement. Through communication, training, and embedding ourselves in the product lifecycle, we want to make performance a prime consideration in technological and product developments across the movement.

Monitoring. By developing better tooling, designing better metrics, and automatically tracking regressions, all in a way that anyone can reuse, we want to monitor the right metrics and surface issues that are otherwise hard to detect.

Improvement. Some performance gains require a very high level of expertise and complex work to happen before they are possible. We undertake large projects, often on our legacy code base, that can yield important performance gains in the long run.

Knowledge. We are the movement's reference on all things performance, which requires keeping up with rapid changes in technology across our entire stack. In order to disseminate correct information in our outreach, we aim to build the most comprehensive knowledge base about performance.

Current projects

Availability. Although the Wikimedia Foundation currently operates five data centers, MediaWiki runs from only one of them. If you are an editor in Jakarta, Indonesia, content has to travel over 15,000 kilometers to get from our servers to you (or vice versa). To run MediaWiki concurrently from multiple places across the globe, our code needs to be more resilient to the failure modes that can occur when different subsystems are geographically remote from one another.
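To get a feel for why that distance matters, here is a back-of-the-envelope estimate of the minimum network delay over 15,000 km. It is only a sketch: it assumes signals travel through optical fiber at roughly 200,000 km/s (about two thirds of the speed of light in a vacuum) and it ignores routing detours, queuing, and server time, so real-world latency will be higher. The function name and constant are illustrative and not taken from any Wikimedia code.

```python
# Rough lower bound on network delay over a long fiber path.
# Assumption: signals propagate in fiber at about 200,000 km/s.
FIBER_SPEED_KM_PER_S = 200_000


def min_delay_ms(distance_km: float, round_trips: int = 1) -> float:
    """Return the minimum delay in milliseconds for the given number of round trips."""
    one_way_s = distance_km / FIBER_SPEED_KM_PER_S
    # A round trip covers the distance twice; convert seconds to milliseconds.
    return one_way_s * 2 * round_trips * 1000


if __name__ == "__main__":
    # One round trip between the reader and a data center 15,000 km away.
    print(min_delay_ms(15_000))
    # Several round trips, as happens when a connection must be set up first.
    print(min_delay_ms(15_000, round_trips=3))
```

Under these assumptions a single round trip costs roughly 150 ms before any work is done, and every additional round trip adds the same amount again.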

Performance testing infrastructure. WebPageTest provides a stable reference for a set of browsers, devices, and connection types from different points in the world. It collects very detailed telemetry that we use to find regressions and pinpoint where problems are coming from. This is in addition to the more basic Navigation Timing metrics we gather from real users in production.
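As an illustration of what "finding a regression" can mean in practice, the sketch below compares the median of a recent set of timing samples against the median of a baseline set. This is not the team's actual alerting logic, which this page does not describe; the function name, the 10% threshold, and the sample values are all assumptions chosen for the example.

```python
from statistics import median


def is_regression(baseline_ms, current_ms, threshold=0.10):
    """Return True if the current median is more than `threshold` slower than the baseline median.

    Medians are used rather than means so that a few outlier runs
    do not trigger or hide a regression.
    """
    if not baseline_ms or not current_ms:
        raise ValueError("both sample lists must be non-empty")
    base = median(baseline_ms)
    cur = median(current_ms)
    return cur > base * (1 + threshold)


if __name__ == "__main__":
    # Hypothetical page load times in milliseconds from repeated synthetic runs.
    baseline = [1000, 1010, 990, 1005, 995]
    after_deploy = [1200, 1180, 1220, 1190, 1210]
    print(is_regression(baseline, after_deploy))
```

Synthetic runs suit this kind of comparison because the browser, device, and connection profile are held constant between the two sample sets.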

ResourceLoader. ResourceLoader is the MediaWiki subsystem that is responsible for loading JavaScript and CSS. Whereas much of MediaWiki's code executes only sparingly (in reaction to editors modifying content), ResourceLoader code runs over half a billion times a day on hundreds of millions of devices. Its contribution to how users experience our sites is very large. Our current focus is on improving ResourceLoader's cache efficiency by packaging and delivering JavaScript and CSS code in a way that allows it to be reused across page views without needing to be repeatedly downloaded.
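One common way to let browsers reuse a bundle across page views is to put a hash of the bundle's contents in its URL: as long as the contents are unchanged the URL stays the same and the cached copy can be reused, and any change produces a new URL. The sketch below illustrates that general technique only. It is not ResourceLoader's implementation, and the function names, URL layout, and module names are hypothetical.

```python
import hashlib
from urllib.parse import urlencode


def bundle_version(sources: dict) -> str:
    """Return a short hash over module names and contents, independent of dict order."""
    digest = hashlib.sha256()
    for name in sorted(sources):
        digest.update(name.encode("utf-8"))
        digest.update(b"\0")  # separator so name/content boundaries are unambiguous
        digest.update(sources[name].encode("utf-8"))
        digest.update(b"\0")
    return digest.hexdigest()[:8]


def bundle_url(base: str, sources: dict) -> str:
    """Build a URL that changes only when the bundled contents change."""
    query = urlencode({
        "modules": "|".join(sorted(sources)),
        "version": bundle_version(sources),
    })
    return f"{base}?{query}"


if __name__ == "__main__":
    # Hypothetical modules and endpoint, for illustration only.
    modules = {"example.ui": "function ui() {}", "example.util": "function util() {}"}
    print(bundle_url("https://example.org/load", modules))
```

Because the version is derived from the contents rather than from a deployment timestamp, redeploying unchanged code does not invalidate what visitors already have cached.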

Presentations and blog posts

Dashboards

A big part of our work is devoted to collecting and analyzing site performance data to ensure that we have a holistic and accurate understanding of what users experience when they access Wikimedia sites. You can discover our dashboards by visiting the Wikimedia performance portal. A selection of our dashboards is also provided here.

Tools

Below is an overview of the various applications, tools, and services we use for collecting, processing, and displaying our data.

Data collection

Maintained by Wikimedia:

  • wikimedia/arc-lamp - [PHP] Collect data from HHVM Xenon and send aggregated and sampled profiles from production requests to Redis. Used for flame graphs.
  • Navigation Timing (docs | GitHub) - [JS] MediaWiki plugin to collect Navigation Timing data.
  • WebPageTest runner (GitHub) - [JS] Collect data from WebPageTest API and send to Statsd or Graphite.
  • Jenkins configuration (GitHub) - Jenkins job that triggers WebPageTest runs.
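Several of the tools above forward measurements to Statsd. As a rough sketch of what that involves, the example below encodes a timing sample in the plain-text StatsD line format (`<name>:<value>|ms`) and sends it as a UDP datagram. The helper names, the metric name, and the host are hypothetical, and this is not code from the WebPageTest runner; port 8125 is the conventional StatsD default.

```python
import socket


def format_timing(name: str, value_ms: float) -> bytes:
    """Encode one timing sample in the StatsD line format: <name>:<value>|ms."""
    if ":" in name or "|" in name:
        raise ValueError("metric name must not contain ':' or '|'")
    return f"{name}:{round(value_ms)}|ms".encode("ascii")


def send_timing(name: str, value_ms: float, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Send one timing sample over UDP. Delivery is fire-and-forget."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(format_timing(name, value_ms), (host, port))


if __name__ == "__main__":
    # Hypothetical metric name; a real deployment defines its own hierarchy.
    print(format_timing("webpagetest.firstView.TTFB", 812.4))
```

UDP keeps the sender cheap: if the collector is down, samples are dropped rather than slowing down the process that reports them.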

We also use:

Processing and display

Maintained by Wikimedia:

  • navtiming (GitHub) – [Python] Process data from Navigation Timing beacons and send to Statsd/Graphite.
  • EventLogging – [Python] Platform for schema-based data.
  • coal (see | GitHub) - [Python] Custom Graphite writer and Web API. Frontend graphs made with D3.
  • PerformanceInspector (docs | GitHub) - [JS] MediaWiki plugin to profile the current page and find potential performance problems.
  • Statsv – [Python] Receiver for simple statistics over HTTP for statsd.
  • performance.wikimedia.org (see | GitHub) - Static website.
  • perflogbot (source) - [JS] An IRC bot tracking the behaviour of ResourceLoader in Wikimedia production (#wikimedia-perf-bots).
  • Xenon CLI tools [Python]
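To make the processing step more concrete, the sketch below turns a raw Navigation Timing beacon into a few durations measured from the start of navigation. The attribute names (`navigationStart`, `responseStart`, `domInteractive`, `loadEventEnd`) come from the W3C Navigation Timing API, where they are epoch timestamps in milliseconds and are 0 if the event has not happened. The function itself, the choice of marks, and the sample values are assumptions for illustration and do not reproduce the navtiming service.

```python
def derive_metrics(beacon: dict) -> dict:
    """Convert absolute Navigation Timing timestamps into durations from navigationStart."""
    start = beacon["navigationStart"]
    marks = ("responseStart", "domInteractive", "loadEventEnd")
    metrics = {}
    for mark in marks:
        value = beacon.get(mark)
        # Skip marks that are missing, still 0 (event not reached), or earlier than the start.
        if not value or value < start:
            continue
        metrics[mark] = value - start
    return metrics


if __name__ == "__main__":
    # Hypothetical beacon sent before the load event finished.
    beacon = {
        "navigationStart": 1000,
        "responseStart": 1250,
        "domInteractive": 1900,
        "loadEventEnd": 0,
    }
    print(derive_metrics(beacon))
```

Filtering out implausible values before aggregation matters with real-user data, since beacons arrive from a wide range of browsers and some report incomplete timings.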

We also use:

Data storage

Milestones

Workflow

Contact