Talk:Multi-Content Revisions/Dumps

Tgr (WMF) (talk | contribs)

Nitpick, but <content role="filemetadata">... seems more convenient than <content><slot>filemetadata</slot>...: it is easier to filter with XPath, process with SAX, etc.

(Also, <content> should really be called <slot>, no?)
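
To illustrate the point about XPath and SAX, here is a minimal sketch using only the Python standard library. The sample document, the `<revision>` wrapper, and the helper names (`slot_text_xpath`, `SlotTextHandler`, `slot_text_sax`) are assumptions made up for this example, not part of either proposal. With the role as an attribute, an XPath predicate selects the slot directly, and a SAX handler already knows at `startElement` time whether the text that follows is wanted.

```python
import xml.etree.ElementTree as ET
import xml.sax

# Hypothetical dump fragment using the attribute form suggested above.
SAMPLE = (
    "<revision>"
    '<content role="main">Some wikitext</content>'
    '<content role="filemetadata">{"width": 640}</content>'
    "</revision>"
)


def slot_text_xpath(xml_text, role):
    """Return the text of the slot with the given role, or None."""
    root = ET.fromstring(xml_text)
    # One attribute predicate is enough to pick the slot.
    node = root.find(f"content[@role='{role}']")
    return node.text if node is not None else None


class SlotTextHandler(xml.sax.ContentHandler):
    """Collect the text of one slot while streaming."""

    def __init__(self, role):
        super().__init__()
        self.role = role
        self.collecting = False
        self.chunks = []

    def startElement(self, name, attrs):
        # The role is known as soon as the element opens.
        if name == "content" and attrs.get("role") == self.role:
            self.collecting = True

    def characters(self, data):
        if self.collecting:
            self.chunks.append(data)

    def endElement(self, name):
        if name == "content":
            self.collecting = False


def slot_text_sax(xml_text, role):
    handler = SlotTextHandler(role)
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return "".join(handler.chunks)
```

With the nested `<slot>` form, a streaming handler would instead have to read the child element first before deciding whether to keep or discard the rest of the `<content>` element.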

Reply to "role"
Tgr (WMF) (talk | contribs)

The first proposal does not have storage addresses and the second does; is that intentional? It makes sense to consider them an internal detail, I'm just not sure why there is a discrepancy.

Reply to "address"
Tgr (WMF) (talk | contribs)

The thorniest issue here, with respect to backward compatibility, seems to be the sha1 field. Its semantics cannot stay the same: if we use the revision sha1, it no longer equals sha1(text), and the assumption that revisions with the same main slot have the same sha1 does not necessarily hold. If we use the main slot sha1, the assumption that the same sha1 means the same content does not hold. The first approach seems less bad, but this is something people running old dump processing scripts will have to be mindful of. (And the main slot sha1 should probably be in there, somewhere.)
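
A small sketch of the two options, to make the broken assumptions concrete. The combining scheme in `revision_sha1` (folding the per-slot hashes in role order) is purely an assumption for illustration, as are the sample slot contents and the use of hex digests; the actual way a revision-level hash would be derived is not specified here.

```python
import hashlib


def sha1_hex(text):
    """Hex SHA-1 of a text, i.e. what old scripts expect as sha1(text)."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def revision_sha1(slots):
    """Hypothetical revision-level hash: fold slot hashes in role order.

    This combining scheme is an assumption made for this example only.
    """
    combined = None
    for role in sorted(slots):
        slot_hash = sha1_hex(slots[role])
        if combined is None:
            combined = slot_hash
        else:
            combined = sha1_hex(combined + slot_hash)
    return combined


# Two revisions with identical main slots but different extra slots.
rev_a = {"main": "Same wikitext", "filemetadata": '{"width": 640}'}
rev_b = {"main": "Same wikitext", "filemetadata": '{"width": 800}'}

# Option 1: revision sha1. It differs between rev_a and rev_b even though
# the main slot is the same, and it is no longer sha1(main text).
same_revision_hash = revision_sha1(rev_a) == revision_sha1(rev_b)
equals_text_hash = revision_sha1(rev_a) == sha1_hex(rev_a["main"])

# Option 2: main slot sha1. It is identical for rev_a and rev_b even
# though the revisions as a whole do not have the same content.
same_main_hash = sha1_hex(rev_a["main"]) == sha1_hex(rev_b["main"])
```

Under this particular scheme a single-slot revision would still hash to sha1(text), so only revisions with extra slots would surprise old scripts; whether that holds in practice depends on how the revision hash is really computed.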

Reply to "sha1"