Wikimedia Performance Team/Sprints

From mediawiki.org

2023

Outreach:

Insights:

Improvement:


Other goals that we considered but were postponed, cancelled, or incomplete:

2022

Insights:

Improvement:

  • Research opportunities in static.php traffic to identify simpler and longer-lasting caching policies. Reduce backend traffic to static.php by more than 70% and remove a custom WMF-specific endpoint in the process, in favour of standard MediaWiki routes that require less maintenance going forward. (T285232, T302465)

Other goals that we considered but were postponed, cancelled, or incomplete:

  • Multi-DC BagOStuff interfaces (Aaron)
  • Find someone to run user interviews (Larissa). Neither Desiree nor Marshal can help us at this time. Marshal suggested I run a couple of interviews on my own first, but we currently don't have the bandwidth to come up with a solid interview script and do the necessary pre-work.

2021

See also internal 2021-2022 roadmap and internal Jan-Mar 2022 achievements.

Outreach:

  • Support product development by Inuka Team (Wikipedia Preview), Reading Web (NearbyPages and RelatedArticles), CPT (WebAuthn), Design Systems Team (WVUI/Vue.js), and WMDE (Kartographer-revid).
  • Participate in SLO working group to help establish an SLO for MediaWiki Save Timing.
  • Participate in W3C WebPerf WG, provide feedback to Chrome team on Google Web Vitals and Chrome bugs.
  • Organise the Web Performance devroom for FOSDEM 2021 (recordings).
  • Speak at the We Love Speed conference (recording).
  • Organise four Web Perf Hero awards.

Insights:

  • Migrate our device lab to BitBar.
  • Evaluate and build proof-of-concept synthetic testing on bare metal instead of at AWS.
  • Write runbooks for investigating RUM alerts, WPT alerts, and WPR alerts.
  • Support SRE Observability in developing a new Prometheus-compatible MW-Stats client library.
  • On-going maintenance of WebPageTest, WebPageReplay, and Fresh-node.

Improvement:

  • Multi-DC: Deploy MainStash DB and migrate away from Redis-based MainStash (T212129).
  • Multi-DC: MariaDB-TLS tested and enabled for all wikis.
  • Multi-DC: CDN routing logic written and deployed to Beta and Prod behind feature flag.
  • ResourceLoader debug mode v2: reduce wait time on complex pages from ~1 minute to ~1 second.
  • Guidance and code review for DBA-led normalization of "templatelinks" MediaWiki database table, to reduce storage pressure and improve query performance. (T299417)
  • Support to SRE ServiceOps for MW-on-K8s project.
  • Develop precache-based GlobalUserEdit API for CentralAuth, following an incident.

2020

See also internal 2020-2021 roadmap.

Outreach:

Insights:

  • Expand navtiming RUM metrics pipeline with new Layout Shift metric.
  • Kobiton setup for our device lab, expand to include iOS in addition to Android.
  • Explore BitBar for our device lab.
  • Explore moving WPT/WPR infra away from AWS.

Improvement:

  • Multi-DC: Implement multi-dc strategy for ChronologyProtector (T254634).
  • Multi-DC: Determine and start implementing strategy for MainStash DB (T212129).

2019

See also 2019-20 Q1#Performance and internal 2019-2020 roadmap.

  • Outreach:
    • Design and implement the AS Report, to expand and formalize collaborations to leverage our influence with browser vendors and ISPs. (Announcement on Techblog).
    • Initiate and work on Wikimedia Foundation becoming an official W3C member organization. This expands the Performance Team's participation in web standards and moves us from an "invited expert" (individual) to a represented membership organisation. (Announcement on wikimediafoundation.org)
    • Support product launches by Parsing Team (Parsoid-PHP launch), Editing Team (DiscussionTools launch), Growth Team (GrowthExperiments launch), and Inuka Team (Wikipedia KaiOS app launch).
    • Support RelEng around establishing production error triage workflows and semi-automation thereof.
    • Organise WMF-wide frontend web performance training.
    • Provide performance expertise to Frontend Architecture Working Group (FAWG).
    • Get published in the Web Performance Calendar (2x: Measuring LT and FID, Big questions on RUM)
  • Insights:
    • Research, develop, and test new RUM metrics that better match user perception (T187299, Meta-Wiki, Rossi 2019 paper).
    • Organise and oversee implementation of First Paint metric in WebKit for Apple Safari (blog post).
    • Introduce automatic developer-facing performance metrics for specific chunks of MediaWiki code in core and extensions, powered by WANObjectCache (T197849).
    • Add more RUM metrics to the navtiming pipeline, including instrumentation for First Input Delay (T332012).
    • Participate in Chrome Origin trial for Element Timing and provide feedback on upcoming W3C standard (blog post).
    • Release WikimediaDebug v2 (blog post).
    • Create our own Mobile Device Lab.
    • On-going first response to synthetic testing alerts, including investigating regressions after Chrome/Firefox releases and communicating with upstream browser vendors.
    • On-going maintenance of WebPageTest and WebPageReplay.
    • On-going maintenance of XHGui, including dealing with MongoDB becoming non-free software by developing and upstreaming MySQL drivers for XHGui, and migrating our install from MongoDB to MySQL.
  • Improvements:
    • PHP7 Transition: Finish the transition from HHVM and support SRE with instrumentation, sampling, and benchmarking.
    • Multi-DC: Start work on MainStash DB.
    • Faster MediaWiki backend startup time to reclaim PHP7 latency increase in certain areas. (T233886, T189966).
    • Faster page load time, by reducing ResourceLoader startup cost (blog post).
    • Guidance, CR and testing for new AbuseFilter parser (development by Daimona) to improve Save Timing (T156095).

2018

See also 2018-19 Q1, 2018-19 Q2, and internal 2018-2019 roadmap.

Insights:

Outreach:

Improvement:

  • Annual Plans/FY2019/TEC1: Improve MediaWiki availability and reduce read-only impact from data center switchovers.
    • Multi-DC: Develop integration and support for Mcrouter service in MediaWiki's WANObjectCache, support SRE's rollout of mcrouter service. (T198239)
  • Annual Plans/FY2019/TEC4: PHP7 Migration: Guide the work and support other teams.
  • Introduce support for packageFiles to ResourceLoader (T133462).
  • Introduce support for WebP compression format to Thumbor.
  • Reduce page load time by refactoring the startup module to need only one roundtrip instead of two, effectively loading jQuery in parallel outside the critical path. (T192623).

2017

See also Annual Plan/2017-2018#Technology, 2017-18 Q3, 2017-18 Q4, and internal 2017-2018 roadmap.

Outreach:

Insights:

  • Program 1. Availability, performance, and maintenance.
    • All production sites and services maintain current levels of availability or better.
    • Maintain a comprehensive toolset to measure the performance of our platforms.
  • Research reverse proxy technologies with the objective of obtaining more stable metrics from the synthetic testing infrastructure, increasing confidence and reducing the minimum regression size that can be detected. Evaluated Mahimahi, WebPageReplay, and mitmproxy; selected WebPageReplay. Deployed WebPageReplay+Browsertime to complement and eventually replace WebPageTest (T153360).

Improvement:

  • Support for HHVM-PHP7 migration and upgrade.
  • Expand support in Thumbor to private wikis. Thumbor service replaces MediaWiki ImageHandler (3-part blog post series).
  • Program 8. Progress towards multi-datacenter support (wikitech:Performance/Multi-DC MediaWiki).
  • Faster Wikipedia time-to-logo. (blog post, T100999)
  • Faster edit save timing. (blog post)
  • Faster page load time. Reduce load time on 3G-Slow connections by one whole second, from 14s to 13s. (T164299#3572231)
  • Phase out "mediawiki.legacy.wikibits" module to reduce page view cost. (T122755)
  • Migrate MediaWiki core and all deployed extensions to jQuery 3, a multi-month cross-team effort. (T124742)

2016

See also Perf Matters at Wikipedia in 2016 (Blog post), and Annual Plan/2016-2017 Program 4: Improve site performance.

Insights:

  • Enhance performance testing infrastructure, including speeding up the infrastructure to achieve hourly testing instead of every 3 hours (T151197), and adding new metrics for DOM size (T159362).

Improvement:

2015

See also Perf Matters at Wikipedia in 2015 (Blog post).

See also