Topic on Project:Support desk

Cannot delete an image, file history seems wrong/corrupt.

Justin C Lloyd (talkcontribs)

One of our wikis has an image that a user reported fails to delete with the message: Error deleting file: An unknown error occurred in storage backend "local-backend". The file's history shows a missing entry, but in the database the corresponding entry looks identical, including the timestamp, to its parent entry right below the missing one. Strangely, I have a dev version of the wiki that is a snapshot of the database from a few months ago, but the dev wiki version of the page shows no change history, and its image is the original from 2007, which is different from the 2010 image showing in the live link provided. I have no idea why the dev wiki is so completely different from the live one, or whether that is related to this issue.

All deletion attempts to date have failed, including after changing the image as the last revision of the page shows. It seems that the invalid entry is somehow preventing the deletion. Any recommendations/suggestions on how to correct this page so it can be updated with a new image (which is how this got discovered in the first place)?

MarkAHershberger (talkcontribs)

Could you enable your debug log and post the contents of it for the deletion request? That will give us more information about what is happening.

Justin C Lloyd (talkcontribs)

Yes, but it's a bit ugly to do so. There are 4 load-balanced web servers and a ton of traffic to the wiki, so I'd need to enable it across all of the servers. I also need to coordinate with our editor who is attempting the deletion (I generally just support the platform; our users do all content-related work) and have him test again, timing the debugging with his attempt so the logs stay small and manageable. Unless there's a better approach of which I'm unaware...

MarkAHershberger (talkcontribs)

To debug it, you could take a DB dump and load it onto a development server so that you can handle the debugging there. Is that possible?

Justin C Lloyd (talkcontribs)

Possible but will take time. The database is an AWS Aurora database and about 54 GB, and I have a process to "backport" my live wiki databases to my dev environment using snapshots. What's strange is that the current dev wiki databases (I manage 5 GW1 & GW2 wikis) are backports from late March (when I upgraded from MW 1.30 to 1.34) so, as I originally stated, I'm baffled by why the dev history is so different from live.

However, I also have to sync my EFS (NFS) filesystems from live to dev at the same time, part of the backporting, to get the images in sync with the databases. So this takes time and I'm currently in the process of preparing for a complete redesign of the wikis' ALB architecture on Tuesday. So I can do this but probably not until later next week.

MarkAHershberger (talkcontribs)

Sounds like any debugging is going to take time. I'm sympathetic, but I think that is your best shot right now -- figuring out what is happening with that request by using debug logs.

Justin C Lloyd (talkcontribs)

Ok, thanks. My scheduling just changed so I may be able to do this next week. I appreciate the feedback. I'll report back when I've had a chance to try this debugging.

Bawolff (talkcontribs)

Try enabling the debug log only for requests that carry a certain cookie (checked via $_COOKIE), so you can enable it for just the user you want.
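
A minimal sketch of that idea as a LocalSettings.php fragment. The cookie name mw_debug_token, the secret value, and the log path are all hypothetical placeholders, not anything from this thread; $wgDebugLogFile is the standard MediaWiki debug log setting.

```php
// LocalSettings.php (fragment)
// Assumption: "mw_debug_token" is a made-up cookie name; the secret and the
// log path are placeholders to replace with your own values.
$debugCookieSecret = 'replace-with-a-long-random-string';

if (
    isset( $_COOKIE['mw_debug_token'] )
    && is_string( $_COOKIE['mw_debug_token'] )
    && hash_equals( $debugCookieSecret, $_COOKIE['mw_debug_token'] )
) {
    // Only requests carrying the matching cookie get a debug log,
    // so normal traffic does not write to the file.
    $wgDebugLogFile = '/var/log/mediawiki/debug-delete.log';
}
```

The editor attempting the deletion would set that cookie in their browser before retrying. With several load-balanced web servers, the log is written on whichever server handles the request, so the fragment needs to be on all of them and each one checked afterwards. Debug logs can contain sensitive data, so the path should be outside the web root.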

Justin C Lloyd (talkcontribs)

I've backported my live wikis to dev (through mysqldump and import, plus NFS filesystem rsyncs) so they should now be identical. However, the page in question shows inconsistent file history pages for the image in question. See this album: the top image is the dev version and the bottom image is live, which you can directly view here. Note that the only real difference is in the current/top row of the pages' file history sections. And while the image in live looks correct (the user tried to change the image to see if that would help deleting it), the dev version shows the same image but with the image dimensions of the dev page's current image (203x718 vs 442x577). FWIW I did verify the image has the same checksum value on both the dev and live EFS filesystems.

Further, since the dev and live databases are now identical, I'm unclear on where the incorrect current dev page information is getting that wrong entry.

EDIT: I just noticed the mention of image metadata at the bottom of the live image page but not on dev, despite the image files being identical. Not sure why. exif metadata and debug info

The top thumbnail on the dev page is referencing an image file that doesn't exist but I'm not clear on where in the database that file is referenced since the databases and all images are identical.

UPDATE: Just discovered at least part of the problem. To synchronize images from live to dev, I have to do live EFS -> live backup disk -> dev backup disk -> dev EFS. However, when I did the rsync from dev disk to EFS, I didn't use the --delete flag so there are old images there that are causing the wrong latest thumbnail to be shown in the file history. Renaming the one it's using started showing a difference, so I'm running the rsync --delete now to fully update it and then I can go from there.

Justin C Lloyd (talkcontribs)

This seems to be the culprit, not sure what's failing here since things seem okay at the OS level.

[FileOperation] FileBackendStore::ingestFreshFileStats: Could not stat file mwstore://local-backend/local-public/archive/6/69/

[DBQuery] FileDeleteForm::doDelete [0.029s] mw-gw1-db-cluster: ROLLBACK

Justin C Lloyd (talkcontribs)

Ok, it does look like the corrupt entry in the file history page for this image, where there is no thumbnail and no filename, is causing the problem. The above error is really saying that it couldn't stat a file named .../6/69/ because there is a blank filename value in the two queries reported in the debug log right before the above error.

Here's a gist with the details: https://gist.github.com/justinclloyd/bb11437a93082e9e585facdd97486f0c

So I'm not sure how to fix the corrupt database/missing file here.

Peculiar Investor (talkcontribs)

@Justin C Lloyd I've just run into this problem as well. I've got a small number of images that cannot be deleted and result in the error: Error deleting file: An unknown error occurred in storage backend "local-backend". On the file history page, the problematic image has no thumbnail and the Date/Time field is plain text, not a link.

The debug log matches the above report and the database query shows a blank oi_archive_name. I've confirmed there is a .gz file present in the filesystem.

Has a fix or workaround been identified?

Justin C Lloyd (talkcontribs)

@Peculiar Investor I'm not aware of any fix to MediaWiki itself, but ultimately my solution was to delete the bad row from the database, i.e. delete from oldimage where oi_name = 'filename.jpg' and oi_archive_name = '' and oi_timestamp = '2010...', which allowed the page in question to be deleted. The page deletion threw an error that was safe to ignore: Error deleting file: A non-identical file already exists at "mwstore://local-backend/local-deleted/path/to/filename.jpg". Hope this helps!
