Reading/Web/Release Manager updates/Logstash Instructions

From mediawiki.org

Purpose of this document

To provide clear guidelines for Web team members on how to effectively monitor and report JavaScript errors in Logstash across different groups (Group 0, Group 1, and Group 2) based on deployment dates and affected sites.

Context

Every week, one team member is assigned to check Logstash for JavaScript errors. These errors are categorized into Group 0, Group 1, and Group 2, representing different sets of affected sites based on deployment dates.

Procedure

Understanding Groups and Deployment Dates:

Note: The following is a typical timeline using real dates for the 1.42.0-wmf.20 release.

  • Group 0 (Tuesday): Represents the sites affected by the MediaWiki version deployed on 27th February 2024, including mediawiki.org, test.wikipedia.org, and test.wikidata.org.
  • Group 1 (Wednesday): Represents the sites affected by the MediaWiki version deployed on 28th February 2024, including Catalan Wikipedia, Hebrew Wikipedia, Italian Wikipedia, test2.wikipedia.org, and all non-Wikipedia sites (Wiktionary, Wikisource, Wikinews, Wikibooks, Wikiquote, Wikiversity, Wikivoyage, Wikidata, and others).
  • Group 2 (Thursday): Represents the sites affected by the MediaWiki version deployed on 29th February 2024, including all other Wikipedias.
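The Tuesday/Wednesday/Thursday pattern above can be sketched as a small date calculation. This is an illustration only, not part of any Wikimedia tooling: the names `GROUP_OFFSETS` and `rollout_dates` are hypothetical, and the sketch assumes a typical week in which the rollout starts on Tuesday and advances one group per day.

```python
from datetime import date, timedelta

# Days after the Tuesday rollout start on which each group is deployed.
# (Assumed from the typical timeline described above.)
GROUP_OFFSETS = {"group0": 0, "group1": 1, "group2": 2}


def rollout_dates(tuesday: date) -> dict[str, date]:
    """Return each group's deployment date for the week starting on `tuesday`."""
    if tuesday.weekday() != 1:  # Monday is 0, so Tuesday is 1
        raise ValueError(f"{tuesday} is not a Tuesday")
    return {
        group: tuesday + timedelta(days=offset)
        for group, offset in GROUP_OFFSETS.items()
    }


# The 1.42.0-wmf.20 week used as the example above.
schedule = rollout_dates(date(2024, 2, 27))
for group, day in schedule.items():
    print(group, day.isoformat())
```

Passing any date that is not a Tuesday raises a `ValueError`, which guards against an off-by-one day when looking up which group received new code.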
Monitoring Approach:
  • Team members should pay attention to errors in all groups but particularly focus on any significant changes or spikes in error rates on the specified days of the week.
  • Group 1 should be checked on Wednesday, the day the new version is deployed there.
  • Any unusually high spike in errors in Group 1 should be immediately investigated and acted upon.
  • A spike in errors in Group 1 usually indicates a potential issue that might impact Group 2. However, spikes in Group 1 can also indicate Commons-specific or Wikidata-specific bugs, so they require careful analysis.
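This page does not define what counts as an "unusually high spike", so the following is only a rough sketch of one way to make the judgment concrete. It assumes the daily error counts for Group 1 have been read off the Logstash dashboard by hand. The function name `is_spike` and the default factor of 3 are hypothetical choices for illustration, not a team standard.

```python
from statistics import mean


def is_spike(previous_days: list[int], today: int, factor: float = 3.0) -> bool:
    """Flag `today` if it exceeds `factor` times the mean of `previous_days`.

    `factor` is an arbitrary illustrative threshold, not an agreed value.
    """
    if not previous_days:
        return False  # no baseline to compare against
    # Treat a baseline below 1 as 1 so that a handful of errors after
    # several quiet days is not flagged automatically.
    baseline = max(mean(previous_days), 1)
    return today > factor * baseline


# Hypothetical counts: about 110 errors per day, then 500 after the deployment.
print(is_spike([100, 120, 110], 500))
print(is_spike([100, 120, 110], 150))
```

A flagged day is only a prompt to look closer. As noted above, the cause may be specific to Commons or Wikidata rather than something that will affect Group 2.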
Release Day Procedures:
  • On Thursdays, code rolls out to Group 2. Before that deployment, assess whether any issue seen in Group 1 is still unresolved.
Other Chore Duty Guidelines
  • When on chore duty, team members should check errors in Group 1 every day for any red flags.
  • The goal is to identify and resolve issues before they reach Group 2, allowing for smoother deployments and minimizing disruptions.