Topic on Talk:MediaWiki 1.28

Categories not automatically updated

Summary last edited by Ciencia Al Poder 13:06, 3 July 2017 6 years ago
DarkFeather (talkcontribs)

I have a wiki running 1.28.

My "fix" is to run the following when I make large changes.

<pre>

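# Rebuild the link tables (including category membership) after large changes.
# && stops the chain if cd or an earlier script fails.
cd /usr/share/webapps/mediawiki/maintenance/ && php rebuildall.php && php refreshLinks.php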

</pre>

A note: I currently have runJobs.php firing every 5 minutes from cron to clear out the job queue, because it is a small, low-traffic wiki. I don't think I'm running into a collision issue: even when I make a "large" number of changes and fire it manually, it exits within a couple of seconds. I'm slightly worried, though, that I'm running into a race condition.
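On the race-condition worry: one common way to keep a five-minute cron run from overlapping a manual or still-running one is to take a lock before starting `runJobs.php`. The following is a minimal sketch, assuming `flock` from util-linux is installed; the function name `run_exclusive` and the lock file path are made up for illustration and are not part of MediaWiki.

```shell
#!/bin/sh
# Run a command only if nobody else holds the lock (hypothetical helper).
# $1 = lock file path, remaining arguments = command to run.
run_exclusive() {
    lockfile=$1
    shift
    (
        # -n: fail immediately instead of waiting if the lock is taken.
        if ! flock -n 9; then
            echo "another run holds $lockfile, skipping" >&2
            exit 0
        fi
        "$@"
    ) 9>"$lockfile"
}

# Example call (path taken from the post above, lock file name assumed;
# uncomment to use):
# run_exclusive /tmp/mw-runjobs.lock \
#     php /usr/share/webapps/mediawiki/maintenance/runJobs.php --maxtime 240
```

With this wrapper a second invocation simply skips its turn while the first is still running, so the cron run and a manual run cannot process the queue at the same time. The `--maxtime 240` value is only an example chosen to stay under the five-minute interval.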

I'd like to not hackily rebuild my categories nightly, and Manual:How_to_debug didn't generate anything useful in logs. Does anyone have recommendations?

Kghbln (talkcontribs)

Just this week I reported the same problem and investigated a bit further. See here. Basically, 1.28.0 is broken in a way that prevents the respective jobs from being run. So you will either have to wait for 1.28.1, which fixes the issue, or switch to the head of the REL1_28 branch. Cheers

Waanders (talkcontribs)
Verdy p (talkcontribs)

This may be caused by this change: (task T8948) Numeric sorting in categories is now supported by setting $wgCategoryCollation to 'uca-default-u-kn' or 'uca-<langcode>-u-kn'. If you can't use UCA collations, a 'numeric' collation is also available. If migrating from another collation, you will need to run the updateCollation.php maintenance script.

Apparently this affects wikis that still use MySQL (most other wikis) rather than MariaDB (as Wikimedia wikis do). It seems that you need to run the updateCollation.php maintenance script even if the ICU library was NOT updated on the server (whether in its native PHP plugin, in the MediaWiki extensions, or in the MySQL backend).

So the background job is simply not running at all; it stops immediately. Please consider explaining why the background worker is not working and is disabled by default.

Example of a wiki affected since the update to 1.28.0: https://wiki.openstreetmap.org/. Here again it is impossible to update the contents of categories: the parser immediately and correctly displays the categories in which the page should be listed, but these categories are NEVER updated if you add, remove, or change the categorization of a page, or add any new page.

Verdy p (talkcontribs)

Or possibly "updateCollation.php" does not work with MySQL or with backends other than MariaDB. It may also not run correctly with some MySQL installations, notably those whose internal encoding is MySQL's "utf8", which is in fact limited to UCS-2 and breaks on any supplementary character, causing edited pages to be **silently** truncated when stored even if their preview was correct. Maybe the new "numeric collation" implementation or the update script depends on supplementary characters being supported by the backend. Or this is caused by MediaWiki now using ASCII U+001F (UNIT SEPARATOR control) in some of its APIs, and MySQL rejecting this character as invalid if the backend uses the **broken** MySQL "utf8" encoding instead of "binary".

Really, MediaWiki should add some workaround support for MySQL backends using a non-conforming "utf8" limited to UCS-2, which truncates any page containing valid supplementary characters (such as emojis or extended sinograms) that were correctly UTF-8 encoded as 4-byte sequences. The alternative is to internally re-encode UTF-8 text content into CESU-8 before sending it to the MySQL backing store, i.e. supplementary characters would be encoded as two 3-byte sequences representing a pair of surrogates (6 bytes instead of 4).

Ciencia Al Poder (talkcontribs)

I highly doubt it has anything to do with collations at all. If what you say is true, you should file a bug report, but please take the time to test it locally (with a collation that works and a collation that causes it to break) to rule out other issues, like a job queue problem.

Verdy p (talkcontribs)

Well, apparently this is another bug, related to unsatisfied preconditions: pending writes are not committed before the background worker thread can run. There are logged messages showing a fatal assertion error in those worker threads, and MW 1.28.1 (or 1.28.2) has a pending change to perform a commit and wait before proceeding with the background jobs, but it is still not fully tested.

You should document what to do on wikis that are now broken: updateCollation.php will not work, and we will probably need to perform "null edits" with an external bot or some other maintenance tool to force all pages that were created, modified, or deleted under MW 1.28.0 to be parsed again (using a range of page version numbers). But it's not clear what we can do for pages that have been deleted or hidden and that are still listed in categories from which they should have been removed. You should also better document that the worker thread must have its priority increased to avoid filling the job queue too fast, because more actions are now asynchronous.

I see that the 1.28.2 update now restores the default to **not** use background updates in the job queue (in 1.28.0, background workers were enabled for many other things than just categories). I suspect that this bug could affect various other things, including many MW extensions (such as the Translate extension, which also uses asynchronous background workers), and that wiki pages are now ending up in an incoherent state in many places with MW 1.28.0 (and with MW 1.28.1 or 1.28.2, where the bugfix is still not deployed but only being tried out on Wikitech for now).

Ciencia Al Poder (talkcontribs)
Verdy p (talkcontribs)

This is visibly worse than expected. It was reported in MW 1.28.0 and supposed to be fixed in 1.28.1, then 1.28.2, and now it still shows in 1.30.0! How could these fatal flaws in background workers pass the tests for the latest "stable" release? It all looks like 1.28 should have remained in beta on Wikitech and not been deployed at all. Now many wikis are in trouble because they were updated without any warning that it could fail, and it was only tested in the specific, complex configuration of the Wikimedia farms, with many cooperating servers, fast responses, and a lot of specific tuning (where many tasks that take long elsewhere complete almost instantly). Wikimedia also uses a better SQL engine (MariaDB, complex to deploy and migrate to) that responds faster than typical deployments originally made with MySQL (which is difficult to migrate to MariaDB). Deploying a large server farm is also difficult for most sites: Wikimedia uses a lot of specific tools, tunings, and various tricks to make it work, not all of which are in fact part of MediaWiki itself.

Can't Wikitech have a few servers to test some more typical configurations: a single "LAMP" server, servers without Lua module support, servers without namespaces shared with Commons, other typical backends (normally supported: MariaDB, MySQL, Oracle, MS/Sybase SQL Server, PostgreSQL), with and without Memcached, non-dedicated hosting (using restricted PHP configurations, typically used by hosting companies for personal websites), a few supported PHP versions, and a few web server engines (Apache, IIS...) and integrations (FastCGI, process pools, PHP plugin...)?

For now there's still no warning on the official page of this version stating that there is such a severe bug and where to follow the progress on this issue. Please create a tracking bug so we won't have to follow various threads for tested patches, for a few version branches of MW, or for specific WMF installations.

For now I can only conclude that all branches from 1.28.0 onwards are unstable. 1.28 should no longer be recommended, and 1.27 should be restored as the current stable version.

Verdy p (talkcontribs)

Note: the 1.28.1 and 1.28.2 releases still have the bug. It is also still in the 1.30 beta. This bug is blocking many administrative tools and projects that rely on categorizing pages with tracked problems; those problems are no longer tracked, meaning that maintenance is stalled.

The bug was first reported 3 months ago and is still unsolved? This is becoming a major problem: why don't you ask us to revert to 1.27 and link the current stable download to 1.27 only? Or just tell us that background async jobs must be disabled, and how to do that? But even if this workaround is applied (at the price of performance on servers with a lot of concurrent accesses), does it solve the problem?

And what can we do on wikis where ALL background jobs are simply not working, notably those using shared installations of PHP in secure mode, where there is no direct access to create custom processes or to create and schedule any "cron" task that runs a PHP script to process jobs in the MW job queue?

Ciencia Al Poder (talkcontribs)

There have been fixes merged recently about those problems. It would be helpful if you could Download from Git the REL1_28 branch and see if those problems are now solved, before it's released.

Verdy p (talkcontribs)

Is that branch released as a point version? (1.28.3?)

Ciencia Al Poder (talkcontribs)

The REL1_28 branch contains the latest changes to 1.28, including unreleased changes that are going to land in 1.28.3.

Verdy p (talkcontribs)

OK, so it is still in test and not available for deployment on already working wikis like the OSM wiki, where the fix for this bug is now urgently needed. It is probably needed as well on several Wikimedia wikis, possibly also on Wikia, and on many other documentation wikis that this release has broken. There, category maintenance is now severely broken: various people are deleting "empty" content categories that should not have been empty, and new or edited pages added to maintenance categories are no longer detected, including the categories normally populated internally by the MediaWiki parser or its extensions.

Kghbln (talkcontribs)

I just found out that the current master of the upcoming 1.29 release branch has this problem again. Pages are no longer propagated to the respective category. What the heck? I cannot check whether this is working after every upgrade of a wiki. This is utterly exhausting, and I am now feeling ennui and fatigue.

Ciencia Al Poder (talkcontribs)

On a new, empty 1.29.rc0 install, I create a page and edit it again to add it to a category. After that I go to the category and the page is not there. I refresh the category page and then the page appears in the category.

When I create the page, 3 jobs are added: refreshLinksPrioritized, recentChangesUpdate and htmlCacheUpdate. Each one is picked up on a separate request, so I guess it takes a bit to get to the job that adds the link in categorylinks and updates the cache. On wikis with lots of edits and a large job queue it may take longer than normal, but it should work if you use a cron job to process the job queue every few minutes.

Kghbln (talkcontribs)

I cannot confirm your observation. After creating a new page I do indeed get 3 jobs: 2 x htmlCacheUpdate and 1 x recentChangesUpdate. There is no refreshLinksPrioritized, which is presumably why the propagation is not happening. To get the propagation I have to run e.g. rebuildall.php. Admittedly I am a bit further down the road at (37c22fc). I guess this is painful.

Ciencia Al Poder (talkcontribs)

Tim Starling has confirmed the issue in task T168723. I hope this gets fixed soon!

Kghbln (talkcontribs)

Cool. I guess now that he has acknowledged the problem, something is most likely going to be done about it. At least I hope so, too.

Wschroedter (talkcontribs)

task T168723 is tagged "done"!? Is that true for 1.30, too? Where can I download the relevant patch?

Ciencia Al Poder (talkcontribs)

This is not an issue in the latest versions of 1.29 and 1.30. 1.28 is no longer supported.

Wschroedter (talkcontribs)

It is an issue in my recent 1.30 installation, www.vexilli.net/w/, but maybe it has to be fixed in another way.

Ciencia Al Poder (talkcontribs)

Categorization is done through the job queue (see Manual:Job queue). I can see from the API of your site that it has 1259 pending jobs. If your configuration picks up jobs only when a page is loaded, it may take a while to clear them, depending on how many page views you have. If you have shell and root access, try to set up the simple job runner service described on that manual page.
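For anyone wanting to check this on their own wiki: the pending-job figure comes from the site statistics that the MediaWiki API exposes (`action=query&meta=siteinfo&siprop=statistics`). Below is a rough sketch of reading it from the shell, assuming the JSON response contains a numeric `jobs` field; the helper name `extract_jobs` and the wiki.example.org URL are placeholders, not anything from this thread.

```shell
#!/bin/sh
# Print the value of the "jobs" field from a siteinfo statistics JSON
# response read on stdin (hypothetical helper; plain sed, no JSON parser).
extract_jobs() {
    # Join the input into one line so pretty-printed JSON also matches.
    { tr -d '\n'; echo; } |
        sed -n 's/.*"jobs":[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Example call (placeholder URL; uncomment to use):
# curl -s 'https://wiki.example.org/w/api.php?action=query&meta=siteinfo&siprop=statistics&format=json' | extract_jobs
```

If the number stays high between checks, the queue is not being drained fast enough by page views alone. With shell access, `php maintenance/showJobs.php` should report the queue size as well.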
