Topic on Project:Support desk

Missing pages, but data appear to be intact

15
Jonahgreenthal (talkcontribs)

My wiki is missing a bunch of pages. I think it's specifically the pages that only had one revision. Possibly only if that revision was very old.

Here's an example page: https://www.qbwiki.com/wiki/St._Anne%27s . It says there's no text in this page, but the page, revision, and text tables all seem to have the right data.

Other observations:

The revision #0 of the page named "St. Anne's" does not exist.
This is usually caused by following an outdated history link to a page that has been deleted. Details can be found in the deletion log.

(but the deletion log doesn't show anything)

  • When I try to save an edit, I am told there is an edit conflict ("Someone else has changed this page since you started editing it.")
  • I can delete the page and then re-create it
  • I fiddled around with the PHP code to see where the error text was coming from. It's Article.php line 436, in the fetchRevisionRecord() method:
// $this->mRevision might already be fetched by getOldIDFromRequest()
if ( !$this->mRevision ) {
	if ( !$oldid ) {
		$this->mRevision = $this->mPage->getRevision();

		if ( !$this->mRevision ) {
			wfDebug( __METHOD__ . " failed to find page data for title " .
				$this->getTitle()->getPrefixedText() . "\n" );

			// Just for sanity, output for this case is done by showMissingArticle().
			$this->fetchResult = Status::newFatal( 'noarticletext' );
			$this->applyContentOverride( $this->makeFetchErrorContent() );
			return null;
		}
	…

But I don't know enough about how MediaWiki works (or PHP in general, really) to figure out what's going wrong.

This problem likely started after an upgrade that I struggled with. Unfortunately, I didn't notice this problem until long after the upgrade, and when I finished the upgrade I had thought everything turned out okay, so I don't remember exactly what went wrong.

The upshot is that there are a bunch of pages (this was just one example) whose contents exist in the database but can't be accessed through the website. I'm not sure even how to systematically identify such pages. I suspect some rows or values are just missing from some table(s), but I have no clue which or how to find out.

Thoughts?

Nikerabbit (talkcontribs)

Do you have backups from before the upgrade?

It is most likely that either the actor or comments migration has gone wrong. MediaWiki does a LEFT JOIN on those tables so missing entries in there will cause those revisions/pages to appear as missing.

If it is about comments, see https://phabricator.wikimedia.org/T249904.

If it is about actors, I had the following trick:

  1. identify the user names for those revisions
  2. Create proper users for them, e.g. `User::newSystemUser( '...', [ 'steal' => true ] );`
  3. Run database queries UPDATE revision SET rev_user = 0 where rev_user_name = '...'; (and similar for all affected tables, mostly logging, archive and recentchanges)
  4. Run php maintenance/cleanupUsersWithNoId.php


But you need backups to do this in case the rev_user and equivalent fields are already dropped. I guess it's possible to do it afterwards by updating the rev_actor and equivalent fields too, but I have not done that myself.

Jonahgreenthal (talkcontribs)

I do have backups from before the upgrade, but the upgrade was about a year ago so restoring from the backup isn't viable.

Thanks for pointing me at revision_comment_temp and revision_actor_temp. It looks like the problem is the latter—this query returns 164 rows:

SELECT * FROM revision WHERE rev_id NOT IN (SELECT revactor_rev FROM revision_actor_temp)

Do you agree with that reasoning? (Some of the corresponding pages do exist, but the revisions seem to be missing when I view the history through the web interface.)

The rev_user_text column contains the username, so that should address your step 1, right? Are you able to elaborate on step 2 (how do I do that? what's the steal thing?) and 3 (which tables are affected?)? Thanks so much!

Nikerabbit (talkcontribs)

I'd suggest running `php maintenance/migrateActors.php --force` to observe if there are errors. If there is, you should get list of usernames that match the rev_user_text of those rows. You could try running cleanupUsersWithNoId.php first or maybe even findMissingActors.php (if you have it) to see if is sufficient.

But if they don't work, my step 2 basically creates and user and actor for the name. The issue may be that there is no used account for the name, so actor cannot be created. Step 3 removes broken references to user ids which do not exist, so that cleanupUsersWithNoId can process it. The relevant tables and names should be printed out by the migrateActors script.

Jonahgreenthal (talkcontribs)

Thanks. migrateActors produced a bunch of messages like this:

User name "X" is usable, cannot create an anonymous actor for it. Run maintenance/cleanupUsersWithNoId.php to fix this situation.

cleanupUsersWithNoId produced a bunch of output but didn't seem to actually do anything.

I don't know how to actually do your step 2. It looks like PHP code I'm supposed to run, but I don't know how to run custom code within the MediaWiki environment.

I don't have findMissingActors.

Nikerabbit (talkcontribs)

There is shell.php and eval.php under maintenance, both allow you to run that code interactively.

Jonahgreenthal (talkcontribs)

Thanks. I had to fight with shell.php pretty hard to get PsySH to work, but I think everything works now, including the solution of my original problem. I appreciate your help.

81.174.133.236 (talkcontribs)

I had a very similar problem and this thread was very helpful, thank you. I was able to resolve with this SQL:

INSERT INTO revision_actor_temp (revactor_rev, revactor_actor, revactor_timestamp, revactor_page) SELECT rev_id, 1 as actor, rev_timestamp, rev_page FROM revision WHERE rev_id NOT IN (SELECT revactor_rev FROM revision_actor_temp);

I was not concerned with correctly matching up the actors (and indeed think this data is lost), so replaced all of them with actor 1 (Administrator on my wiki). All of the page revisions are now accessible again. The missing actors were old now non-existent users that must have been lost in the 1.31 to 1.35 migration somewhere.

81.174.133.236 (talkcontribs)

I should add I needed to run php maintenance/update.php afterwards as well.

Krabina (talkcontribs)

thank you for this! the sql statement saved my wiki :-)

YOUR1 (talkcontribs)

The SQL statement above also fixed issues on our wiki's.

Kghbln (talkcontribs)

The solution posted by 81.174.133.236 only works, however, on file pages the connection of the files to their pages is not reinstated. This means the files are not shown on the wiki. Reuploading with the same name is not possible either.

Note that I went from 1.25 to 1.31, to 1.32, to 1.33, 1.34 and now to 1.35. Needless to say that migrateActorswithNoId.php was useless to run. It detected the actors but did not clean them.

Going the rebuildImages.php path does not help, and neither the cleanupUsersWithNoId.php/migrateActors.php path.

I am clueless as to how to mitigate this. Looks to me like I may have run into this which I will still have to investigate. Edit: No, I do not think this is the issue.

Ciencia Al Poder (talkcontribs)
Kghbln (talkcontribs)

Thanks for the pointer to your patched version of the script "cleanupUsersWithNoId.php." You are a hero! In the case of the current wiki I worked on, it was a lifesaver. Let me share my experience:

Going directly from MW 1.31 to MW 1.35 and applying the script on MW 1.35 after running "update.php" did not work. The wiki was in a disastrous condition after the upgrade and is no longer usable. This outcome is expected since the script had nothing to work on in the MW 1.35 database.

Going from MW 1.31 to MW 1.32, then to MW 1.33, using the patched script you linked to on 1.33 after running "update.php," did work. For some reason, the wiki was still unusable in version MW 1.33; however, upgrading to MW 1.34 with "update.php" and from there to MW 1.35 with "update.php" mitigated the issues emerging on the wiki. As a result, the wiki is working fine as it appears to me (still pending user feedback). This way, I could prevent massive issues, including this one, from occurring if I used core software. I am still determining why the wiki is broken in version 1.33, but it is probably another story.

Directly going from MW 1.31 to MW 1.35 is not recommended for wikis, with issues surfacing after the upgrade. Do it branch by branch and apply the patched script to MW 1.33. Going from MW 1.31 via MW 1.32, MW 1.33, and MW 1.34 to MW 1.35 using the script "cleanupUsersWithNoId.php" provided by core will also not work. The result will be a disaster. On the way, "update.php" will complain for MW 1.33 to MW 1.35 that you need to run "cleanupUsersWithNoId.php," however, it will ultimately do nothing.

185.104.138.31 (talkcontribs)

Hi, I would like to share my experience here since I've been struggling with updating my MW from 1.31 onwards. Originally, I wanted to go from MW 1.31 to 1.32 and then 1.33 some time ago. As 1.33 broke my wiki due to the known actor nightmare (cleanupUsersWithNoId.php does not help), I decided to postpone the update.

Now, I gave everything another shot and followed the instructions posted by @Kghbln.

1) MW 1.31 to MW 1.32 using update.php

2) MW 1.32 to MW 1.33 using the enhanced version of cleanupUsersWithNoId.php by @Ciencia Al Poder first and only then executing update.php

3) MW 1.33 to MW 1.34 using update.php

4) MW 1.34 to MW 1.35 using update.php

Afterwards, my wiki was working fine on MW 1.35. As 1.39 is already out, I decided to continue the update process. To be on the safe side, I performed the update for each version separately (1.36 -> 1.37 -> 1.38 -> 1.39). Everything works fine now under PHP 7.4, my next step is going to push the PHP version to 8.1 but that's not related to this issue.

Btw, the query posted by 81.174.133.236 was not necessary as the enhanced script took care of everything.

Another issue I had facing this update was the removal of Manual:$wgDBmysql5. My database was still using latin1 collations and everything was working fine with $wgDBmysql5 = true; until this setting was removed in MW 1.33. Therefore, I had to adjust my DB accordingly to avoid encoding issues with special characters (some hints are given on the talk page of the setting, also see Topic:Wqktznc6b8nyc29g).

I really would like to thank @Ciencia Al Poder and @Kghbln and everyone involved for sharing the script and their paths for the update!

Reply to "Missing pages, but data appear to be intact"