Extension:MirrorTools/Design decisions

Permanently stationary revisions vs. moving and being recalled revisions
Under the "permanently stationary" system, any revisions made on LocalWiki keep their page ID, page title, and page namespace until they are moved, undeleted, etc. on Wikipedia. Under the "moving and being recalled" system, once a page is deleted on RemoteWiki, its revisions fall under LocalWiki control and can be moved about; if the page is later undeleted, the revisions are recalled to the page ID, page title, and page namespace they have on Wikipedia.
 * Argument
 * Decision: MirrorTools will use the "permanently stationary" system, following a principle analogous to "Cool URIs don't change".

Deleting revisions vs. not deleting revisions when pages are mirrormoved onto them
We shouldn't delete anything! On the other hand, a redirect that points back to the source page carries no information we don't already have when that page is moved onto it.
 * Argument
 * Decision: If the only remotely live revision is a redirect back to the page being moved onto it, go ahead and delete that revision; merge everything else.
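
The rule above can be sketched roughly as follows. This is only an illustration in Python, not the extension's actual code: the function names, the returned labels, and the simplified redirect pattern (default English "#REDIRECT" syntax only, no localized aliases or first-letter case folding) are all assumptions.

```python
import re

# Assumption: only MediaWiki's default English redirect syntax is recognized.
REDIRECT_RE = re.compile(r"^\s*#REDIRECT\s*:?\s*\[\[([^\]|#]+)", re.IGNORECASE)


def normalize_title(title):
    """Treat underscores and spaces alike and trim surrounding whitespace."""
    return title.replace("_", " ").strip()


def redirect_target(wikitext):
    """Return the title a redirect points to, or None if the text is not a redirect."""
    match = REDIRECT_RE.match(wikitext)
    return normalize_title(match.group(1)) if match else None


def plan_mirrormove(target_live_revisions, source_title):
    """Decide what to do with revisions already sitting at the mirrormove target.

    target_live_revisions: wikitext of each remotely live revision at the target.
    source_title: title of the page being moved onto the target.
    The returned labels are hypothetical names for the three outcomes.
    """
    if not target_live_revisions:
        return "move"
    only_one = len(target_live_revisions) == 1
    if only_one and redirect_target(target_live_revisions[0]) == normalize_title(source_title):
        # A lone redirect back to the source holds nothing we don't already have.
        return "delete-then-move"
    return "merge"
```

For example, a target whose only live revision is "#REDIRECT [[Old title]]" would be deleted when "Old_title" is moved onto it, while a target holding any other content would be merged.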

Keep rev_parent_id value vs. change rev_parent_id value
Should the rev_parent_id values of imported revisions be set to their local or remote parent IDs? Probably the remote, with the local ones left as they are, because we don't want to change the old and new lengths (lens). On the other hand, what happens when we add a new local edit? Its difference in page length would then be based on the imported revision's length. Let's let the users sort this out.
 * Argument

Pros to "keep":
 * We don't have a bunch of wacky len changes
 * Performance, e.g. in mirrormoves we don't have to change all the parent_ids

Pros to "change":
 * Things could theoretically get ugly with the page lengths, e.g. if we have revisions getting moved. But not really, because the length difference will still be based on that revision, even if it's in a different page.

There will be a config setting, $wgMirrorToolsDynamicParentIDs, that defaults to false (i.e. static).
 * Decision
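
To make the trade-off concrete, here is a small Python sketch of how the displayed length change differs between the two modes. It assumes the byte change for a revision is its length minus the length of the revision named by its parent ID; the dictionary layout, the function names, and the sample numbers are made up for illustration.

```python
def size_delta(revisions, rev_id):
    """Byte change shown for a revision: its length minus its parent's length."""
    rev = revisions[rev_id]
    parent = revisions.get(rev["parent_id"])
    return rev["len"] - (parent["len"] if parent else 0)


def relink_parents(revisions, history_order):
    """Dynamic mode: make each revision's parent its predecessor in the page history."""
    previous_id = 0  # 0 means "no parent", as for a page's first revision
    for rev_id in history_order:
        revisions[rev_id]["parent_id"] = previous_id
        previous_id = rev_id


# Revisions 10 and 12 are mirrored; revision 11 is a local edit that sorts between them.
revisions = {
    10: {"len": 100, "parent_id": 0},
    11: {"len": 400, "parent_id": 10},
    12: {"len": 150, "parent_id": 10},
}

static_delta = size_delta(revisions, 12)   # static: parent stays 10
relink_parents(revisions, [10, 11, 12])
dynamic_delta = size_delta(revisions, 12)  # dynamic: parent is now 11
```

In static mode (the default), revision 12 keeps the length change it has remotely. In dynamic mode, its change is measured against the local edit, and every parent ID in the history has to be rewritten.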

Add a null revision for deletion events
Deletion and restoration are at least as relevant as Wikipedia protection and unprotection events. Before going that route, maybe we should try to find out why some log events have null revisions and others don't. See Manual:Null revision.
 * Argument

For now, just imitate what Wikipedia does. See User:Leucosticte/ApiMirrorDelete.php for the code that did null revisions.
 * Decision

Assume vs. don't assume that MirrorPuxxBot has access to the backend
It would be more efficient if we could operate under the assumption that MirrorPuxxBot has access to the backend. But that's not where the main concern about expense is; rather, it's with the pulling from RemoteWiki. Also, who knows where we'll need to operate this bot from. On the other hand, maybe it would impose a delay sometimes to pull from LocalWiki via API; but then again, we'll probably be pulling 500 records at a time.
 * Argument

Don't assume. Operate as if it doesn't have access. Use the API.
 * Decision
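
As a rough illustration of pulling through the API in batches of 500, the sketch below pages through a MediaWiki Action API list query by feeding each response's "continue" values back into the next request. Which list the bot actually reads is not stated here; list=recentchanges, the function names, and the injectable fetch parameter are assumptions.

```python
import json
import urllib.parse
import urllib.request


def http_get_json(api_url, params):
    """Fetch one API response over HTTP and decode it as JSON."""
    url = api_url + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def pull_recent_changes(api_url, fetch=http_get_json, batch_size=500):
    """Yield recent-changes rows, batch_size per request, following continuation."""
    params = {
        "action": "query",
        "list": "recentchanges",  # assumption: the list being mirrored
        "rclimit": batch_size,
        "format": "json",
    }
    while True:
        data = fetch(api_url, params)
        yield from data.get("query", {}).get("recentchanges", [])
        if "continue" not in data:
            break
        # Merge the continuation values into the next request.
        params = {**params, **data["continue"]}
```

Because the fetch function is passed in, the same loop works whether or not the bot runs on the same host as the wiki, which fits the decision not to assume backend access.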

Have vs. don't have a page_mt_former
page_mt_former promotes page_id stability, so that the page_id doesn't change every time someone mirrormoves, mirrorcreates, mirrorpagerestores, etc. and then mirrormoves or mirrorpagedeletes. Thing is, we have page_id instability anyway, because of all these merges, so who really cares about changing it back to what it was before the merge? It also makes the developer's life more complicated, since it is challenging to keep track of mentally. It does, however, have a certain coolness factor. It could be a way to keep page IDs of mirrored pages below one quadrillion. Also, if the page ID changes, it could be a way to find the page whose page ID changed (although it would only be helpful for the merge, not the unmerge). It could also be a way to keep page_id equal to the rev_ar_page_id. On the other hand, there could be more than one rev_ar_page_id for a given page, so which one should be used? It also makes the page table less lightweight. Then again, who knows what that data might be useful for? We might need to know, at some point, what the former page_id was. Also, this legacy code (e.g. ApiMirrorMove.php) is challenging to rewrite.
 * Argument

Dump it.
 * Decision

Have vs. don't have mbq_rc_id2
Something just feels messy about having rc_id set to the log event rc_id value rather than the rc_id for the revision row. It doesn't matter, though, because we can just select using multiple columns in the WHERE clause when we need to differentiate. There's a performance hit for having yet another indexed column, and we'll rarely use it. Thing is, we could have a new name for the action to add these mirroredits. Call it mirrornorcedit or something. Then it will map accordingly.
 * Argument

Dump it; use mirrornorcedit-needsrev and mirrornorcedit-readytopush.
 * Decision
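
A minimal sketch of differentiating queue rows by several columns instead of a second rc_id column, using an in-memory SQLite database from Python. The table name mb_queue, the columns mbq_id and mbq_action, and the idea of storing the action names as plain strings are assumptions for illustration; only the mbq_ prefix and the action names come from the discussion above.

```python
import sqlite3


def make_queue():
    """Create a hypothetical queue table with two rows sharing one rc_id."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mb_queue ("
        " mbq_id INTEGER PRIMARY KEY,"
        " mbq_action TEXT NOT NULL,"
        " mbq_rc_id INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO mb_queue (mbq_action, mbq_rc_id) VALUES (?, ?)",
        [
            ("mirrormove", 7),               # the log event itself
            ("mirrornorcedit-needsrev", 7),  # the edit tied to the same rc_id
            ("mirrornorcedit-readytopush", 9),
        ],
    )
    return conn


def queue_ids(conn, rc_id, action):
    """Select on rc_id and action together, so no second rc_id column is needed."""
    cursor = conn.execute(
        "SELECT mbq_id FROM mb_queue WHERE mbq_rc_id = ? AND mbq_action = ?",
        (rc_id, action),
    )
    return [row[0] for row in cursor]
```

Here queue_ids(conn, 7, "mirrornorcedit-needsrev") picks out only the edit row, even though the log event shares its rc_id.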

Have vs. don't have mbq_extra_params
We have all this stuff like mbq_comment2, mbq_rev_id2, etc. that we don't need. It's just a matter of time before we need to put all of it in a parameters field, unless you actually want to run queries on those fields. I can't imagine you would for stuff like mbq_comment2. On the other hand, a blob is a lot of space; what would the effect on performance be?
 * Argument

Keep procrastinating making this change. Maybe do it later, when we have like five of these fields.
 * Decision