Talk:Wikimedia Platform Engineering/MediaWiki Core Team/Backlog/Improve dumps

From mediawiki.org

There is a certain risk that this page becomes a mere duplicate of Research Data Proposals. I certainly hope it achieves more! Things haven't changed much since 2010; the same goals apply. --Nemo 19:29, 17 February 2015 (UTC)

There is also a certain risk that a Grand Redesign will halt incremental improvements, and lead to stagnation for another few years. Erik Zachte (WMF) (talk) 16:12, 21 February 2015 (UTC)

Authorship and text age annotations

Regarding "Should we supply tools for requesting and processing the dumps?", it would be great to have a way to find the original author(s) and age(s) of arbitrary passages, as WikiTrust used to do. This is necessary for keeping old content up to date. Whether this is supplied as a tool to be applied to downloaded dumps, as a set of annotations available for download in addition to the dumps, and/or as a service that computes the information from dumps hosted on WMF servers, it would be fantastic to have it more widely available.

Apparently [1] and [2] are implemented in [3], but the authors of [4] prefer the old [5] library. A good implementation will correctly handle deleted passages that are later restored, as well as text moved unchanged to a different part of the page in question. Jsalsman (talk) 19:07, 22 February 2015 (UTC)
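
To make the two requirements above concrete (restored deletions and moved text), here is a minimal sketch. It is not WikiTrust's algorithm or the method of any of the libraries linked above. It assumes a deliberately naive model: a "passage" is a sentence, and a passage is credited to the earliest revision in which that exact sentence appeared anywhere on the page. The names `Revision`, `split_passages` and `attribute_passages` are hypothetical and exist only for this illustration.

```python
import re
from typing import NamedTuple


class Revision(NamedTuple):
    author: str
    timestamp: str  # ISO 8601 in UTC, so string order equals chronological order
    text: str


def split_passages(text: str) -> list[str]:
    # Naive sentence splitter (an assumption): break after ., ! or ? plus whitespace.
    return [p.strip() for p in re.split(r"(?<=[.!?])\s+", text) if p.strip()]


def attribute_passages(revisions: list[Revision]) -> list[tuple[str, str, str]]:
    """Return (passage, original author, first-seen timestamp) for the newest revision."""
    ordered = sorted(revisions, key=lambda r: r.timestamp)
    if not ordered:
        return []
    first_seen: dict[str, Revision] = {}
    for rev in ordered:
        for passage in split_passages(rev.text):
            # setdefault keeps the earliest revision, so a passage that is
            # deleted and later restored stays credited to its first author.
            first_seen.setdefault(passage, rev)
    latest = ordered[-1]
    # The lookup ignores position, so moved-but-unchanged text keeps its author.
    return [
        (p, first_seen[p].author, first_seen[p].timestamp)
        for p in split_passages(latest.text)
    ]
```

Exact sentence matching means that any edit inside a sentence, even a typo fix, reassigns the whole sentence to the editor. A real tool would need finer-grained diffing to avoid that.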

Incremental dumps

I am very glad that somebody is at last paying attention to this topic. IMHO the best thing you can do is introduce incremental dumps that contain only the last month's (week's?) changes. Alexdruk (talk)
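
As a rough illustration of what "only the last month's (week's?) changes" could mean in practice, the sketch below filters a list of revision records down to those inside a time window. It assumes each record is a dictionary with a `timestamp` field in the form `YYYY-MM-DDTHH:MM:SSZ`. The function names `parse_ts` and `incremental_slice` and the record layout are hypothetical, not part of any existing dump tool.

```python
from datetime import datetime, timedelta, timezone


def parse_ts(ts: str) -> datetime:
    # Assumed timestamp format: UTC with a trailing "Z".
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def incremental_slice(revisions: list[dict], now: datetime, window: timedelta) -> list[dict]:
    """Keep only revisions whose timestamp falls within [now - window, now]."""
    cutoff = now - window
    return [r for r in revisions if cutoff <= parse_ts(r["timestamp"]) <= now]


# Usage: a weekly slice taken on 22 February 2015.
revisions = [
    {"id": 1, "timestamp": "2015-01-10T08:00:00Z"},
    {"id": 2, "timestamp": "2015-02-17T19:29:00Z"},
    {"id": 3, "timestamp": "2015-02-21T16:12:00Z"},
]
now = datetime(2015, 2, 22, tzinfo=timezone.utc)
weekly = incremental_slice(revisions, now, timedelta(days=7))
```

A filter on revision timestamps only captures new edits. Deletions, page moves and suppressed revisions would need separate handling, which is part of what makes true incremental dumps harder than this sketch suggests.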