Wikimedia Developer Summit/2016/T114019

T114019 - This is the session pad for Dumps 2.0 for realz (planning/architecture), slated to begin at 11:30am on January 5

Purpose
Make headway on the question: "What should the XML/SQL/other dumps infrastructure look like in order to meet current and future user needs, and how can we get there?"

Agenda

 * 10 minutes - introductory presentation
   * things we dump, how we dump them, known user complaints, 1 minute of known maintainer complaints (since I'm doing the session!)
 * 70 minutes - open discussion
   * use cases for the dumps, known and desired
   * where we currently fall short or are expected to fall short in the future
   * an ideal architecture for dumps that would address the main issues would look like... what?
   * example: if we want to run true incremental dumps that ask MW only for changes, rather than dumping the entire history of page content, what would we need from MW core, and what tools would we need to give users so they can update a previous dump from the incremental data?
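
To make the incremental-dump example above concrete for discussion, here is a minimal sketch of what a user-side "update a previous dump" tool might do. Everything in it is an assumption for illustration: the `Incremental` structure, its field names, and `apply_incremental` are hypothetical and do not correspond to any existing MW core or dumps API, and a dump is modeled simply as a mapping of page id to revisions.

```python
from dataclasses import dataclass, field


@dataclass
class Incremental:
    """Hypothetical change set covering one dump interval."""
    # page_id -> {rev_id: text} for revisions added since the previous dump
    new_revisions: dict = field(default_factory=dict)
    # ids of pages deleted since the previous dump
    deleted_pages: set = field(default_factory=set)
    # (page_id, rev_id) pairs removed since the previous dump
    deleted_revisions: set = field(default_factory=set)


def apply_incremental(previous, inc):
    """Return a new full dump: the previous state plus the changes in inc.

    The previous dump is left unmodified.
    """
    # Copy every page that was not deleted during the interval.
    updated = {pid: dict(revs) for pid, revs in previous.items()
               if pid not in inc.deleted_pages}
    # Drop individual revisions that were removed.
    for pid, rid in inc.deleted_revisions:
        if pid in updated:
            updated[pid].pop(rid, None)
    # Add new revisions, creating pages that did not exist before.
    for pid, revs in inc.new_revisions.items():
        updated.setdefault(pid, {}).update(revs)
    return updated


# Toy data: page 1 gains a revision, page 2 is deleted, page 3 is new.
previous = {1: {10: "first"}, 2: {20: "old page"}}
inc = Incremental(
    new_revisions={1: {11: "second"}, 3: {30: "new page"}},
    deleted_pages={2},
)
current = apply_incremental(previous, inc)
```

The sketch also hints at what MW core would have to supply under these assumptions: a reliable list of new revisions, page deletions, and revision removals for a given interval.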

Etherpad
etherpad.wikimedia.org/p/WikiDev16-T114019

Goals
''Please prepopulate this section with the goals of the meeting, and anticipate collaborative editing around fulfillment of those goals. This is a great place to capture action items from the conversation.''

Chronology
''This section is where an attempt is made to capture the gist of who said what, in what order. A transcript isn't necessary, but it's useful to capture the important points made by speakers as they happen.''