Topic on Project:Support desk

Rerun PageContentSaveComplete hook

Jer Hughes

I set up a PageContentSaveComplete hook, but I have some existing pages that I would like to run the hook against. Is it possible to "resave" my whole site in an automated way, so the hook runs against all the articles even though I won't be making any changes to them?

Bawolff

Can I ask what you are doing with the hook?

It sounds a little bit like one of the LinksUpdate/SecondaryDataUpdate hooks might be better for your needs, but it's hard to say.

Jer Hughes

The hook captures the contents of each page and saves them into an external database. I'd like to capture this data for some of the pages that were already created, but I don't need to change any of those pages' content.

Bawolff

So normally in MediaWiki, we use LinksUpdate/RevisionDataUpdates for this sort of thing, where you want to update some sort of secondary data store based on the content of the page (and assuming you can deduplicate the results). Pages in MediaWiki can change for reasons other than an edit (for example, if a template changes), and LinksUpdate should run whenever that happens. It also runs if people make a null edit, do a purge with forcelinkupdate from the API, or run the refreshLinks.php script.

If you want to stick with the PageContentSaveComplete hook, I guess I would suggest making a dummy edit to all the pages (e.g. adding <!-- foo--> to all pages). But seriously consider whether LinksUpdate could be a better fit, because then it's much easier to force a links update of all pages.
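For reference, the API route mentioned above could be scripted roughly like this. It is only a minimal sketch, assuming a standard api.php endpoint: the function name buildPurgeRequest, the endpoint URL, and the page titles are placeholders and do not come from this thread. The request has to be sent as a POST, and depending on the wiki's permissions it may need a logged-in session.

```php
<?php
/**
 * Build the POST body for one action=purge request that also forces a
 * links update. Several titles can go in one request, separated by "|".
 * (buildPurgeRequest is a hypothetical helper name.)
 */
function buildPurgeRequest( array $titles ): string {
	return http_build_query( [
		'action' => 'purge',
		'forcelinkupdate' => 1,
		'titles' => implode( '|', $titles ),
		'format' => 'json',
	] );
}

// Usage sketch; the URL and page names below are placeholders.
// $apiUrl = 'https://wiki.example.org/w/api.php';
// $context = stream_context_create( [ 'http' => [
// 	'method' => 'POST',
// 	'header' => 'Content-Type: application/x-www-form-urlencoded',
// 	'content' => buildPurgeRequest( [ 'Main Page', 'Help:Contents' ] ),
// ] ] );
// echo file_get_contents( $apiUrl, false, $context );
```

For a whole site, running refreshLinks.php on the server is usually simpler than batching titles through the API.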

Jer Hughes

Thank you so much, this seems perfect. So my understanding is, if I use either the `LinksUpdate` hook or the `RevisionDataUpdates` hook, then I can reprocess all my pages using the `refreshLinks.php` script. And that the difference between the two hooks is the first one runs synchronously, whereas the second one happens asynchronously. Is that correct?

Bawolff

Note: This system has changed a bit recently. I hope everything I say is accurate, but I'm more familiar with the old version than the new version.

Yes, LinksUpdate does run asynchronously (via the refreshLinks job). It also provides access to the LinksUpdate object, which is important in some use cases (but probably not yours). LinksUpdate will also only be triggered by normal wiki pages. (You can use ContentHandler to, for example, replace the MediaWiki parser with something totally different. If you do that, LinksUpdate won't happen, because it's tied to the parser output.)

RevisionDataUpdates can be either, I think. If you want it to be asynchronous, you can have your hook insert a job to do the update later. Often people want their update to be in a separate job from the refreshLinks job, because then you can set up separate job queues for it.

There are also several other hooks, and I'm honestly not sure what the difference between them all is.

https://github.com/wikimedia/mediawiki/blob/master/docs/pageupdater.md may be a helpful page.

Jer Hughes

So I have my hook built using <code>$wgHooks['RevisionDataUpdates'][] = 'onRevisionDataUpdates';</code>. It works when I save my pages, but now I want to "resave" all my existing pages. I tried running refreshLinks.php on its own and together with runJobs.php, but it's not working as expected. Should I use a different maintenance script?

Bawolff

No, refreshLinks.php should do it. refreshLinks.php calls WikiPage::doSecondaryDataUpdates, which calls DerivedDataPageUpdater::doSecondaryDataUpdates, which calls DerivedDataPageUpdater::getSecondaryDataUpdates, which calls all RevisionDataUpdates hooks. refreshLinks.php also calls DeferredUpdates::doUpdates(), which should trigger all updates to be done immediately.

Jer Hughes

Thank you very much. I have verified that refreshLinks.php is working. I didn't realize my hook failed on some of the special pages, where an element it expected was not being provided. When the hook tried to save such a page, it stopped refreshLinks.php from completing. Once I fixed my issue, the script worked great.
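For anyone hitting the same problem, a defensive version of such a handler might look like the sketch below. It assumes the registration line quoted earlier in the thread and a LocalSettings.php-style setup. getMainSlotTextOrNull and saveToExternalStore are hypothetical names introduced here for illustration; the actual export code from this thread was not posted, so the database write is only a stand-in.

```php
<?php
// LocalSettings.php fragment (sketch). Registration line as quoted in the thread.
$wgHooks['RevisionDataUpdates'][] = 'onRevisionDataUpdates';

/**
 * Hypothetical helper: return the main-slot text of a revision, or null
 * when the content is not accessible or is not text-based.
 */
function getMainSlotTextOrNull( $renderedRevision ): ?string {
	// 'main' is the value of MediaWiki\Revision\SlotRecord::MAIN.
	$content = $renderedRevision->getRevision()->getContent( 'main' );
	// Wikitext, CSS, JS and JSON content all extend TextContent.
	if ( !( $content instanceof TextContent ) ) {
		return null;
	}
	return $content->getText();
}

/** Hypothetical stand-in for the code that writes to the external database. */
function saveToExternalStore( string $pageName, string $text ): void {
	// Replace with a real database write; logging keeps the sketch runnable.
	error_log( 'Would store ' . strlen( $text ) . " bytes for $pageName" );
}

function onRevisionDataUpdates( $title, $renderedRevision, &$updates ) {
	try {
		$text = getMainSlotTextOrNull( $renderedRevision );
		if ( $text === null ) {
			return; // Nothing to export for this page; let the run continue.
		}
		saveToExternalStore( $title->getPrefixedText(), $text );
	} catch ( Throwable $e ) {
		// One bad page should not abort a full refreshLinks.php run.
		error_log( 'External export failed: ' . $e->getMessage() );
	}
}
```

Catching and logging the failure means a single unexpected page is skipped instead of halting the whole run. Whether that trade-off is acceptable depends on how complete the external database needs to be.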