Beta Cluster/status

Last update on: 2012-07-16

2012-05-10
Chris, Sam, Antoine, Faidon, and Ryan met in San Francisco the week of May 7 to bootstrap work on this project. Current focus is getting media handling working smoothly.

2012-05-15
As of May 15th: 
 * Apache instances have been built entirely using Puppet classes; the old one will be removed. All queries (thumbs/regular text/bits) hit the application Apaches, with upload.beta.wmflabs.org pointing to the IP address shared by all wikis.
 * MediaWiki logging is fine.
 * Blocker: /home/wikipedia needs a proper location with lots of disk space to host MediaWiki checkouts, MediaWiki logs and syslogs.
 * Blocker: no syslog-server yet, since it conflicts with a base class which is always installed.
 * MediaWiki configuration files are in the process of being merged from prod to labs.

2012-05-20
The project is now a bit closer to parity with production.
 * A job runner has been set up and is currently catching up with all the pending jobs. Apparently, that includes some video resizing for TimedMediaHandler.
 * All code has been updated to a recent version and all databases have been upgraded.
 * Uploading files should work again (as of May 17th).

2012-05-monthly
Chris McMahon, Sam Reed, Antoine Musso, Faidon Liambotis, and Ryan Lane met in San Francisco the week of May 7 to bootstrap work on this project, kickstarting a process of aligning the configuration with our production cluster. Apache web server instances are now configured entirely automatically using Puppet classes. A few key Wikimedia configuration files that were previously managed via a private Subversion repository are now managed in a public Git repository. Much work remains to make this a stable testing environment; that work will continue in June.

2012-06-25
TimedMediaHandler has been set up, though transcoding is not operational yet, since that would require a fully functional job queue. We discovered that the version of Ubuntu currently used in production (Lucid) won’t work with TimedMediaHandler. As a result, Antoine and Faidon updated the Puppet configurations for the Apache web servers to run on the next-generation Ubuntu (Precise).

Administrative tools have been set up, closely following the way it is done in production. As an example, beta uses the exact same workflow to update the l10n cache. We will work on fetching l10n updates from translatewiki.

2012-06-monthly
The primary focus of Beta cluster work in June was in service to TimedMediaHandler (TMH). TMH has been set up, though transcoding is not operational yet, since that would require a fully functional job queue. The team discovered that the version of Ubuntu currently used in production (Lucid) won’t work with TimedMediaHandler. As a result, Antoine and Faidon updated the Puppet configurations for the Apache web servers to run on the next-generation Ubuntu (Precise).

Administrative tools have been set up, closely following the way it is done in production. For example, the Beta Cluster now uses the exact same workflow to update the l10n cache as we do in production. The team plans to further improve this by fetching l10n updates from translatewiki.

2012-07-16
At the beginning of July, the labs instances were migrated to new, more powerful hardware, improving performance by an order of magnitude. Unfortunately, some instances were corrupted in the process, but thanks to our extensive use of Puppet, replacing them was pretty fast.

Antoine has written an overview of the beta cluster; it still needs to be amended with sections about how to update code and how to debug issues.