User:ArielGlenn

I am one of those dreaded "ops devs" doing "devops", whatever that means. But basically I write code. I work primarily on the XML dumps infrastructure. I'm very interested in anything that touches on translation of content or interfaces, and anything that impacts the multilingual reach of the Wikimedia projects or facilitates communication between the various language communities of the projects, but this is outside the scope of my WMF work.

If you want to reach me quickly, look for me on IRC in #wikitech-l under the nick atglenn or apergos. Timezone: EET. If you want to reach me the slow way, send an email to user name ariel with domain name wikimedia.org. (Grrr, spammers!)

All things dump-related that I'd love to see move forward:


 * Excerpts of the dumps in various formats from specific projects. Wiktionary is a popular request.
 * Repackaging the dumps as multiple bz2 streams concatenated together, with a few pages per stream ("multistream bz2").
 * Re-use of the above multi-stream files with some clever scripting for off-line viewing.
 * Process equivalent to the current rsync of our image server, after images move to Swift.
 * A maintained, cross-platform, easy-to-use tool for converting XML dumps to MySQL for import (e.g. mwimport).
 * ...? Other cool dump-related ideas?
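
To make the "multistream bz2" idea above concrete, here is a minimal sketch in Python. It is an illustration only, not the actual dump tooling: the function names `write_multistream` and `read_stream`, the `pages_per_stream` parameter, and the in-memory offset list are all assumptions made up for this example. It relies on the fact that independently compressed bz2 streams can be concatenated and still decompress as a whole, while a reader that knows a stream's byte offset can decompress just that stream.

```python
import bz2


def write_multistream(pages, pages_per_stream=2):
    """Compress pages in small groups, one bz2 stream per group.

    Returns the concatenated bytes and the byte offset at which each
    stream starts (hypothetical index format, for illustration only).
    """
    chunks = []
    offsets = []
    position = 0
    for start in range(0, len(pages), pages_per_stream):
        group = "".join(pages[start:start + pages_per_stream])
        stream = bz2.compress(group.encode("utf-8"))
        offsets.append(position)
        chunks.append(stream)
        position += len(stream)
    return b"".join(chunks), offsets


def read_stream(data, offset):
    """Decompress only the single bz2 stream that begins at offset."""
    decompressor = bz2.BZ2Decompressor()
    # The decompressor stops at the end of the first stream; any bytes
    # after it are left in decompressor.unused_data.
    return decompressor.decompress(data[offset:]).decode("utf-8")


if __name__ == "__main__":
    pages = ["<page>A</page>\n", "<page>B</page>\n", "<page>C</page>\n"]
    data, offsets = write_multistream(pages, pages_per_stream=2)
    # Fetch the second stream without touching the first one.
    print(read_stream(data, offsets[1]), end="")
```

An off-line viewer along the lines of the next list item could keep such an offset index next to the file, seek to the stream it needs, and decompress only that small piece rather than the whole dump.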

---

Links:


 * Dumps/Development_2012
 * Research Data Proposals (WikiSym 2010)
 * Quality Assessment Tools for WP Readers (WikiSym 2010)