Scrum of scrums/2015-02-11

Facilitating: Grace Gellerman

Apps

 * We'll be meeting soon to go over the RESTBase / Node.js service content spike

Parsoid

 * We spent part of last week figuring out load issues on the Parsoid cluster
 * The deploy of https://gerrit.wikimedia.org/r/#/c/173834/ on Jan 28th exposed a few latent bugs in Parsoid on a tiny subset of pages. https://phabricator.wikimedia.org/T88864 in particular caused those pages to send Parsoid into an infinite loop, and our timeout handling wasn't properly killing the stuck processes in all cases. This caused load spikes on Thursday and Friday because of repeated retries on 2 enwiki and 3 plwiki pages. The repeated retries from the job queue are tracked in https://phabricator.wikimedia.org/T85939. The infinite loop bug has since been fixed and the fix deployed, and load has been back to normal since Saturday.
 * Continued focus on VE goals
 * Will deploy code today to reduce size of parsed HTML (by stripping some private attributes that are no longer necessary).
 * Among others, https://phabricator.wikimedia.org/T88495 is close to being resolved.
 * Marc has started work on dependent tasks that are required to reduce size of references HTML ( https://phabricator.wikimedia.org/T88290 .. heads-up to the Content Translation team about upcoming changes to )
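
For readers unfamiliar with the failure mode in the load-issue item above (a stuck worker that is not killed on timeout, combined with unbounded retries), here is a minimal sketch of the general pattern. It is not Parsoid's or the job queue's actual code: Parsoid is a Node.js service, Python is used only for illustration, and the names run_with_timeout and process_job, the timeout values, and the attempt cap are all hypothetical.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd as a child process (hypothetical helper).

    Returns True if it exits with status 0 within timeout_s seconds,
    False if it fails or has to be killed because it ran too long.
    """
    proc = subprocess.Popen(cmd)
    try:
        return proc.wait(timeout=timeout_s) == 0
    except subprocess.TimeoutExpired:
        proc.kill()  # hard-kill the stuck process rather than leaving it running
        proc.wait()  # reap the child so it does not linger as a zombie
        return False


def process_job(cmd, timeout_s=1.0, max_attempts=3):
    """Retry a job a bounded number of times (hypothetical helper).

    Returns the attempt number that succeeded, or None if every attempt
    failed or timed out. Capping attempts keeps a permanently broken
    page from being retried forever.
    """
    for attempt in range(1, max_attempts + 1):
        if run_with_timeout(cmd, timeout_s):
            return attempt
    return None


if __name__ == "__main__":
    # Assumption for the demo: a child that never returns stands in for
    # a page that triggers an infinite loop.
    stuck = [sys.executable, "-c", "while True: pass"]
    print(process_job(stuck, timeout_s=0.5, max_attempts=2))
```

The two pieces are independent: the kill-on-timeout bounds how long a single attempt can hold a CPU, and the attempt cap bounds how many times the same bad input can do so.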

RelEng/QA

 * Browser tests found a lot of bugs in several repos this week. Nothing blocking that we know of.
 * Search in beta labs has been broken since Monday, sorry
 * We plan this week to remove browser test builds that target test2wiki. Please let us know if you object.
 * Upcoming: errors and fatals in production logs need more attention, and we'll be working on getting them the attention they deserve: https://phabricator.wikimedia.org/T89049

Security

 * Security release soonish
 * Gerrit 187728 (mobile) and T78730 (ops) - in progress
 * Review for SMTP errors (ops), Capiunto (wikidata) starting soon

MW core

 * Bryan working on T88732 (Decouple logging infrastructure failures from MediaWiki logging)
 * Ori started discussion about paying attention to logs and fixing problems
 * Wikidata Query Service continuing to review alternatives to Titan
 * Documenting authorization use-cases/user stories to flesh out AuthStack RfC
 * Draft RfC for Multi-Datacenter concerns 
 * Preparing for security release with fixes for issues found by iSec security review

Fundraising Tech

 * Working on language variant support for upcoming China campaigns
 * More DonationInterface cleanup
 * More internal dash customization & a/b testing widget planning
 * Deploying some CentralNotice performance improvements

Services

 * RESTBase deployment - early next week
 * HW config still pending (should be completed early next week) - T76986
 * public API endpoint - T78194

Analytics

 * We should collectively find a way to communicate about changes like the one that caused a drop in mobile pageviews to Commons: https://lists.wikimedia.org/pipermail/analytics/2015-February/003318.html
 * EL (EventLogging) is suffering serious problems: data since February 4th is unreliable, and we have not backfilled it yet
 * Would love some help from design with a small contained problem (need to add a timeseries graph around http://tools-static.wmflabs.org/wikimetrics/ve.html )

Mobile Web

 * Fixed CentralAuth bug by putting all the 1x1 images in a div
 * Auth sharing on wikimedia.org mobile domains still broken (https://phabricator.wikimedia.org/T88860)
 * Still working on server-side HTML templating in core

Language

 * Cleaning up bugs
 * Expanding language set
 * Working on Yandex integration
 * Statistics page
 * Working on API versioning
 * Extension registration for CX

Ops

 * Otto has been out for a week, so we don't know of any other ops updates.
 * T76986 RESTBase production hardware - in progress. Should be able to rack them next week.

Editing

 * Regression with auto-height TextInputWidgets in OOjs UI 0.6.6, fixed in master, backport for production coming
 * Ops now responding on https://phabricator.wikimedia.org/T76308, seems to be moving along