Offline content generator/Bundle format

We're still working on documenting this.
 * metabook.json
 * Contains some version of the "metabook" data. For multi-wiki zips, the per-wiki file contains only the "items" key, listing the articles for that wiki. This file and nfo.json are essentially the input to the spidering process.


 * nfo.json
 * JSON object with three key-value pairs: "format", which is always "nuwiki", plus "base_url" and "script_extension", both copied from the data posted to mw-serve. This is the metadata the spider needs to make API requests to the appropriate wiki.
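
A minimal sketch of how a consumer might read nfo.json and check the three keys described above. The function names are hypothetical, and the way `api_url` combines "base_url" and "script_extension" is an assumption about how the spider locates the API, not something this page confirms.

```python
import json

REQUIRED_KEYS = ("format", "base_url", "script_extension")

def load_nfo(text):
    """Parse the contents of nfo.json and check the documented keys."""
    nfo = json.loads(text)
    missing = [k for k in REQUIRED_KEYS if k not in nfo]
    if missing:
        raise ValueError("nfo.json is missing keys: %s" % ", ".join(missing))
    if nfo["format"] != "nuwiki":
        raise ValueError("unexpected bundle format: %r" % nfo["format"])
    return nfo

def api_url(nfo):
    # Assumption: the API entry point is "api" plus the script extension,
    # located directly under base_url.
    return nfo["base_url"].rstrip("/") + "/api" + nfo["script_extension"]
```

For example, a base_url of "https://example.org/w/" with a script_extension of ".php" would give "https://example.org/w/api.php" under this assumption.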


 * siteinfo.json
 * The output from the API's action=query&meta=siteinfo&siprop=general|namespaces|interwikimap|namespacealiases|magicwords|rightsinfo
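
As an illustration, the query string above could be assembled as follows. This is only a sketch: the helper name is hypothetical, and the `format=json` parameter is an assumption (the text above lists only action, meta and siprop).

```python
from urllib.parse import urlencode

SIPROP = ["general", "namespaces", "interwikimap",
          "namespacealiases", "magicwords", "rightsinfo"]

def siteinfo_query(fmt="json"):
    """Build the URL-encoded siteinfo query string."""
    # Assumption: the response format is requested explicitly; the bundle
    # documentation does not say which format parameter was used.
    params = [
        ("action", "query"),
        ("meta", "siteinfo"),
        ("siprop", "|".join(SIPROP)),
        ("format", fmt),
    ]
    return urlencode(params)
```

Note that `urlencode` percent-encodes the "|" separators as "%7C".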


 * licenses.json
 * Contains JSON license data for articles. Is this redundant with rightsinfo in siteinfo.json?


 * redirects.json
 * Contains information on redirects. Possibly used to resolve internal links?


 * authors.db
 * sqlite database containing author info. Keys are mediawiki titles and each value is a JSON-encoded array of mediawiki usernames. Note the presence of "ANONIPEDITS:<number>" entries, which record how many anonymous editors' IP addresses have been elided from the list.
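
A small sketch of how one value from authors.db might be split into named users and the anonymous-editor count. The function name and the sample usernames are hypothetical. Summing the counts when more than one "ANONIPEDITS:" entry appears is an assumption.

```python
import json

ANON_PREFIX = "ANONIPEDITS:"

def split_authors(val):
    """Split a JSON-encoded author list into (usernames, anonymous_count)."""
    names, anon = [], 0
    for entry in json.loads(val):
        if entry.startswith(ANON_PREFIX):
            # The text after the prefix is the number of elided IP editors.
            anon += int(entry[len(ANON_PREFIX):])
        else:
            names.append(entry)
    return names, anon
```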


 * html.db
 * sqlite database containing output from action=parse. Keys are revision ids.  Values are the output as a JSON structure.


 * parsoid.db
 * Experimental addition: parsoid parser output, equivalent to html.db


 * imageinfo.db
 * sqlite database containing image info. Keys are mediawiki titles. Values are JSON-encoded objects (xxx: what API call generates this?).


 * revisions-1.txt
 * File containing multiple records of json data. Includes the output of action=expandtemplates for all pages in the book, some other API queries for pages in the book, and image pages for images in the book, possibly among other things. There appears to be no indication of the original queries, just the data.


 * images
 * Directory containing images. Filenames are from MediaWiki with localized "File:" prefix, with tildes replaced with "" and all non-ASCII characters plus slash and backslash replaced with "~%d~" where %d is the Unicode codepoint for the character.
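
The filename escaping described for the images directory can be sketched roughly as below. The function name is hypothetical. Because the replacement string for a literal tilde is not legible in the description above, it is left as a required parameter rather than guessed. Treating "non-ASCII" as any code point above 127 is also an assumption.

```python
def escape_image_name(name, tilde_replacement):
    """Escape a MediaWiki image title for use as a filename in images/."""
    out = []
    for ch in name:
        if ch == "~":
            # The caller supplies the replacement; it is not documented here.
            out.append(tilde_replacement)
        elif ord(ch) > 127 or ch in "/\\":
            # Non-ASCII characters, slash and backslash become ~<codepoint>~.
            out.append("~%d~" % ord(ch))
        else:
            out.append(ch)
    return "".join(out)
```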

Sqlite DB format
All sqlite databases have a single table named kv_table with the following schema: CREATE TABLE kv_table (key TEXT PRIMARY KEY, val TEXT); That is, they are simple key/value maps. The keys and values are described above.
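
Given that schema, reading any of the bundle databases might look like the following sketch, which uses Python's standard sqlite3 module. The function name is hypothetical, and the code assumes every value is JSON-encoded, as described for authors.db, html.db and imageinfo.db.

```python
import json
import sqlite3

def read_kv(path):
    """Yield (key, decoded_value) pairs from a bundle database file."""
    conn = sqlite3.connect(path)
    try:
        for key, val in conn.execute("SELECT key, val FROM kv_table"):
            # Assumption: each value is a JSON document.
            yield key, json.loads(val)
    finally:
        conn.close()
```

For instance, `dict(read_kv("authors.db"))` would map each title to its list of authors.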