Requests for comment/Json Config pages in wiki

Rationale
This extension offers an alternative way to store configuration data: as JSON blobs on a wiki, as opposed to code files in Git. This is not a proposal to abolish Git storage, only an alternative for cases where it makes more sense. So far I know of two teams, Zero and Logging, that have used this approach very successfully, e.g. m:Zero:250-99 and m:Schema:Echo. Both have developed custom code to edit, validate, store, and visualize that data. Because the Zero code was based on the event logging code, the two share code that should be factored out. I propose extracting the common parts into a separate extension, JsonConfig, which would reduce duplication and let other extensions reuse it for their own needs.

A typical file-based configuration workflow involves
 * Editing in a text editor/IDE
 * Git and Gerrit commands to submit that file for review
 * Gerrit review with feedback, revisions, and finally +2
 * Manual deployment steps such as ssh tin && git pull && sync-file

Replacing Git with wiki-based storage has a number of pros and some cons, and should not be done in all cases.
 * Pros
   * In-browser editing removes the need to do any git operations (everything from setup, cloning, and pulling to adding, committing, and submitting)
   * Configuration is interactively validated by the same code that will use it later. There is no need to set up an independent Jenkins task.
   * Configuration is easily accessible by PHP and JavaScript code without any additional steps, as well as through the MediaWiki API.
   * Configuration can be visualized in a custom, extension-defined way
   * Optionally, the review process could be done by the flagged revisions extension
   * Configuration becomes active the moment it is saved / marked as reviewed
   * Reverts in production are as fast as reverting a wiki page to an older revision
   * A generic or a custom form-based editing interface would simplify some editing workflows

 * Cons
   * Edit linearity: Git is much better suited to several people editing the same file independently and later merging their changes into a common master. Wiki page history is linear (although I have heard of some proposals to change that). Since most changes to settings are fairly minor, I do not think this would be a significant problem.
   * Review: Gerrit offers a much better review system, with per-line and overall comments, multiple reviewers, better email notifications, -1s, etc. Wiki config only offers watchlist notifications and a talk page, which might suffice for some cases, but not all.
   * Searching and bulk editing: the most complex configs would be broken up into multiple wiki pages, which makes complex (e.g. regex) searches harder and forces editing one page at a time. This could be solved fairly easily with a simple script or a wiki mass-editing tool such as AWB, but searching and editing multiple files is surely easier.

Usage Example
Let's say we decide to store the trusted proxies' IPs as Proxy: pages on the Meta wiki. For instance, the page Proxy:Opera, describing the Opera Mini caching proxy, could be:

Configuration
Our implementation of the Proxy storage would write a custom content class ProxyContent and set these global vars:

Content class
All JSON pages are handled by content classes, which are responsible for parsing, validation, and visualization. We may choose to have free-form JSON, in which case we don't have to write any code and can let JsonConfig use the default JCContent class for storage. JCContent does not offer any validation beyond JSON parsing, but extensions may override the validate($data) function for custom validation and getHtml for custom rendering. Alternatively, there is a JCKeyValueContent class that offers a number of useful validation primitives.

JCKeyValueContent treats JSON as a single-level key-value store, with each value validated by a callback function. The class supports defaults, so consuming code does not need to check whether a given value was supplied by the user. Page rendering shows the JSON with all defaults as grayed-out values, but only the values actually entered by the user are stored. User values that equal the defaults are highlighted in a different color. When saving, the JSON is always reformatted to keep the order of keys consistent, which makes version diffs easier to read. All unrecognized keys are placed at the end and highlighted.
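
The behavior described above can be modeled with a short sketch. This is not the JCKeyValueContent code, only an illustration in Python of the same idea: a table of keys with defaults and validator callbacks, a stable key order on save, and unrecognized keys moved to the end. The field names, defaults, and the choice to drop invalid values are all assumptions made for the example.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical field table: key -> (default value, validator callback).
# The keys and rules are illustrative only, not taken from JsonConfig.
FIELDS: Dict[str, Tuple[Any, Callable[[Any], bool]]] = {
    "enabled": (True, lambda v: isinstance(v, bool)),
    "name": ("", lambda v: isinstance(v, str)),
    "ttl": (3600, lambda v: isinstance(v, int) and not isinstance(v, bool) and v > 0),
}


def process(user_data: Dict[str, Any]):
    """Return (effective, to_store, unknown, errors) for one config page."""
    effective: Dict[str, Any] = {}  # what consuming code reads, defaults filled in
    to_store: Dict[str, Any] = {}   # what is saved: only values the user entered
    errors: List[str] = []          # keys whose value failed its validator

    # Walk known keys in declaration order so the saved JSON has a stable order.
    for key, (default, is_valid) in FIELDS.items():
        effective[key] = default
        if key in user_data:
            if is_valid(user_data[key]):
                effective[key] = user_data[key]
                to_store[key] = user_data[key]
            else:
                # Assumption: a real implementation would refuse to save here.
                errors.append(key)

    # Unrecognized keys are kept, but placed after all known keys.
    unknown = [key for key in user_data if key not in FIELDS]
    for key in unknown:
        to_store[key] = user_data[key]

    return effective, to_store, unknown, errors
```

Calling `process({"zzz": 1, "ttl": 60})` yields an effective config with all three known keys, while the stored form contains only `ttl` followed by the unrecognized `zzz`.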

Data Access
There are several data access scenarios: by code that runs on the same wiki where the data is stored (local), by code that runs on another wiki project but shares all the settings with the storage wiki (cluster, e.g. Wikimedia), and by code that resides somewhere else (JavaScript, another site). The local and cluster scenarios have the benefit of sharing settings and code between the storage and accessor sites.

Local & Cluster Access
$content is an instance of our custom Content class, and could have specific functions to work with the data.

External Access
The stored configuration data may frequently be needed by an external agent such as JavaScript, a bot, or some other program. JavaScript could use JSONP to access the data, or we could develop a forwarding service if CORS is unavailable. Extension authors may choose to add their own API modules to provide domain-specific information. Lastly, the rvprop=jcddata Query API parameter would return the JSON data as part of the API result, not as the text blob that rvprop=content would return.
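
As a rough illustration of the external case, the sketch below builds the query URL an external client might request. It assumes the data lives on the Meta wiki, that the page is the Proxy:Opera example from earlier, and that the rvprop=jcddata value proposed above is available. It only constructs the URL and makes no claim about the shape of the response.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Assumption: the storage wiki is Meta; adjust for another wiki.
API_ENDPOINT = "https://meta.wikimedia.org/w/api.php"


def build_config_query(title: str) -> str:
    """Build a Query API URL asking for a config page's parsed JSON data."""
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "jcddata",  # value proposed in this RFC, assumed to exist
        "titles": title,
        "format": "json",
    }
    # urlencode percent-encodes the ':' in the title.
    return API_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    url = build_config_query("Proxy:Opera")
    print(url)
    # Parse the query string back to confirm the title round-trips.
    print(parse_qs(urlparse(url).query)["titles"])
```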

Configuration
Extensions that use the JsonConfig extension may choose among several usage patterns:

Single page free-form configuration in the default namespace allows users to create just one page called Config:MyExtSettings. As long as the page is a non-empty JSON object, it will be accepted.

If your extension needs multiple similar settings pages, a sub-namespace can be used. This configuration allows any pages named Config:MyExt:...:

In some cases, an extension may choose to have its own top-level namespace instead of using a sub-namespace. Here we create namespaces called Zero:... and Zero praise:...:

Of course, at a certain point you would want a custom content class with its own defaults, validation, and HTML rendering. To set it up, specify a model ID and a class that derives from \JsonConfig\JCValidatedContent:

$wgJsonConfigStorage
This variable defines whether configs are stored on this wiki or retrieved from another wiki. To store all defined config profiles on this wiki, set this variable to true. If only some of the profiles should be stored here, add their keys to the array. By default, an empty array or false results in no configs being stored. The only exceptions are profiles with 'islocal' set to true, as they are designed to be stored on every wiki in a cluster.
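
The rules in this paragraph amount to a small decision function. The following Python sketch restates them for clarity. It is not the extension's code, and the function name and argument shapes are assumptions.

```python
from typing import Any, Dict, List, Union


def is_stored_here(
    storage: Union[bool, List[str], None],
    profile_key: str,
    profile: Dict[str, Any],
) -> bool:
    """Decide whether a config profile is stored on the current wiki."""
    # Profiles marked 'islocal' are stored on every wiki, whatever the setting.
    if profile.get("islocal") is True:
        return True
    # True means every defined profile is stored on this wiki.
    if storage is True:
        return True
    # False, None, or an empty list: nothing is stored here.
    if not storage:
        return False
    # Otherwise only the listed profile keys are stored here.
    return profile_key in storage
```

For example, `is_stored_here(["Proxy"], "Zero", {})` is false, while the same call with `{"islocal": True}` as the profile is true.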

$wgJsonConfigs
This variable defines a profile for each type of configuration page. $wgJsonConfigs is an associative array of arrays, with each sub-array having zero or more of the following parameters. By default, the string key is treated as the model ID that this profile represents, but if you need more than one profile for the same model ID, you can override it with the 'model' parameter.
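
A minimal sketch of the key-versus-'model' rule, written in Python with a dict standing in for the PHP array. The profile keys shown are hypothetical.

```python
from typing import Any, Dict


def profile_model_ids(json_configs: Dict[str, Dict[str, Any]]) -> Dict[str, str]:
    """Map each profile key to the model ID it represents."""
    # The key doubles as the model ID unless the profile sets 'model'.
    return {key: profile.get("model", key) for key, profile in json_configs.items()}


# Two hypothetical profiles sharing one model ID.
example = {
    "Proxy": {},
    "ProxyStaging": {"model": "Proxy"},
}
```

With these two profiles, both keys resolve to the model ID `Proxy`.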

$wgJsonConfigModels
This variable defines which custom content class handles which model ID. More than one model ID may be handled by the same content class. All content classes must derive from the \JsonConfig\JCContent class. If the model ID is mapped to null, the default JCContent class will be used.
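
The lookup described here can be pictured as follows. This Python sketch uses stand-in classes rather than the real PHP ones. `ProxyContent` and the function name are hypothetical, and raising on an unmapped model ID is an assumption.

```python
from typing import Dict, Optional, Type


class JCContent:
    """Stand-in for the default content class (JSON parsing only)."""


class ProxyContent(JCContent):
    """Hypothetical custom content class with its own validation."""


def content_class_for(
    model_id: str, models: Dict[str, Optional[Type[JCContent]]]
) -> Type[JCContent]:
    """Resolve a model ID to the content class that handles it."""
    if model_id not in models:
        raise KeyError(f"no content class registered for model {model_id!r}")
    cls = models[model_id]
    if cls is None:
        # A null mapping falls back to the default class.
        return JCContent
    if not issubclass(cls, JCContent):
        raise TypeError(f"{cls.__name__} must derive from JCContent")
    return cls
```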

Example:

Implementation details
JsonConfig is implemented in two parts: storage/parsing and editing/visualizing. The editor/visualizer is only available when JsonConfig runs on the Meta wiki, and allows complex presentation of the wiki pages in the Config namespace. The storage/parsing part is available on all wikis, allowing quick access to the cache as well as validation and parsing of the cached JSON blob.

Implemented Features
These features have already been implemented in Zero and/or logging, and might be useful for other extensions:
 * Visualization shows JSON as an easy-to-read table rather than code, with some extra highlighting. For example, a value that is not provided and falls back to its default is shown in gray, and a value that is the same as the default is shown in purple.
 * Code Editor simplifies JSON editing
 * Custom Validation performs complex checks such as checking that the value is in the proper format or that a user ID exists.
 * Memcached caching stores JSON blobs in memcached under custom keys and expiration policies, and resets them on save.
 * Flagged Revisions support allows configurations to be marked as "reviewed" before going into production
 * Localization of the most basic interface elements has already been done in many languages; translating the most common messages once, in one place, would reduce translation work.
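
The caching behavior in the list above (custom keys, an expiration policy, and a reset on save) can be sketched as follows. This is an in-memory Python model and not the memcached-backed code used by Zero or Logging. The class name, the loader callback, and the injectable clock are assumptions made so the example is self-contained.

```python
import time
from typing import Callable, Dict, Tuple


class ConfigCache:
    """In-memory model of a config cache with expiry and reset-on-save."""

    def __init__(
        self,
        load: Callable[[str], str],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load = load      # fetches the JSON blob from the storage wiki
        self._ttl = ttl_seconds
        self._clock = clock    # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> str:
        """Return the cached blob, reloading it if missing or expired."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        blob = self._load(key)
        self._entries[key] = (now + self._ttl, blob)
        return blob

    def on_save(self, key: str) -> None:
        """Drop the cached blob when its page is saved."""
        self._entries.pop(key, None)
```

The next `get` after `on_save` goes back to the loader, which is how a saved edit becomes visible without waiting for the entry to expire.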

Unimplemented Nice-To-Haves
These features would be desirable to more than one type of configs:
 * Schema validator - Validate against JSON Schema, as most extensions might not need complex validation rules, or might want to combine schema plus extra validation.
 * Custom editor - Zero team has been thinking about implementing a more complex editor, possibly based on JSON Schema.
 * API query support - Allow config pages to be returned as regular API results in all formats (json/xml/...) instead of text blobs:
   * api.php ? action=query & titles=Config:Proxy:Opera & prop=jsonconfig
 * Localization - it would be good to be able to show localized descriptions for each configuration key

Suggested Use Cases

 * Wikipedia Zero - this extension is basically a refactoring of our internal code, so it's a natural migration
 * Logging Schemas - again, most of the concepts were originally borrowed from there, hence it's a natural user

 Please add other suggested use cases here