Manual:Job queue/For developers

Jobs are non-urgent tasks. For a general introduction and management of job queues, see Manual:Job queue.

Deferred updates
Deferred updates (or deferrable updates) are a useful way to postpone time-consuming tasks in order to speed up the main MediaWiki response. Refer to class API and Database transactions for how to use these.

Deferred updates are represented as callable functions that we queue in an array and then call at the end of the MediaWiki PHP process. Typically the call takes place after the response to a web request has been finished (e.g. everything has been echoed and flushed to the browser), but before we actually exit or return to the web server. This is internally powered by in.

Deferrable updates are executed at the end of the current process. They are held in memory only for the duration of that web request (or other process, such as a CLI maintenance script).
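The ordering described above can be pictured with a small, language-neutral model. The Python sketch below is not MediaWiki code: `DeferredQueue` and `handle_request` are hypothetical names chosen for illustration, and the only thing the sketch assumes is what the text states, namely that callables are collected in an in-memory list and run after the response has been sent but before the process ends.

```python
from typing import Callable, List


class DeferredQueue:
    """Toy model: holds callables in memory for a single request."""

    def __init__(self) -> None:
        self._updates: List[Callable[[], None]] = []

    def add(self, update: Callable[[], None]) -> None:
        self._updates.append(update)

    def run_all(self) -> None:
        # An update may queue further updates while it runs,
        # so keep going until the list is empty.
        while self._updates:
            update = self._updates.pop(0)
            update()


def handle_request(log: List[str]) -> str:
    """Hypothetical request handler that records the order of events."""
    deferred = DeferredQueue()
    log.append("save edit")
    # Queue the cheap follow-up work instead of doing it inline.
    deferred.add(lambda: log.append("update edit count"))
    response = "200 OK"
    log.append("response sent")
    # Only now, with the client already served, run the deferred work.
    deferred.run_all()
    log.append("process exits")
    return response
```

Nothing in this model survives the end of `handle_request`: if the process dies before `run_all()`, the queued callables are simply gone.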

This is unlike jobs, which are scheduled via a persistent storage backend and then run some minutes or hours in the future, independently of and after the original request that queued the job. The job queue in MediaWiki is a pluggable service. The default backend adds jobs to the job table in the wiki's main database. The default job runner executes up to one job at the end of random page views.

More information:


 * How to configure an independent job runner, Manual:Job queue.
 * How to configure alternate storage backends.
 * MediaWiki-Docker/Configuration recipes/Jobrunner

Which one to use?
Deferrable updates should be used for tasks that generally take only a few milliseconds to complete, as a way to speed up the web response. Because they run after the response has already been sent, any failure is hidden from the client.

Examples of critical tasks that we don't run via deferred updates are listed below. Failure of these must be made known to users; more generally, people should know how and when their action was completed, so that they can act further knowing that the change is in place, e.g. make further edits that depend on previous ones, possibly scripted or batched through some automation.


 * Database write that creates a page or saves an edit.
 * Create account, change password.
 * Explicit "send email" feature.

Examples of "urgent" tasks that we run via post-response deferred updates after saving an edit. These small transactions are expected to be reflected if the client looks for it afterward, but the result of these is not needed to render the response to the edit itself.


 * Metadata update that adds the article to a certain category listing.
 * Publishing the edit event to the recent changes feed.
 * Updating the account's edit count field.

Examples of "non-urgent" tasks that we run via the job queue:


 * After saving an edit to a template, iterate through potentially millions of affected pages to re-parse and purge (known as "Refresh links" or ).
 * Periodically prune old rows from the recent changes table.
 * After uploading a photo, pre-render common thumbnail sizes.
 * After saving an edit to an article, send emails to the accounts that watch this page with email notifications enabled.

Fallback
Deferrable updates can choose to implement the interface. Such updates can be automatically converted to a job as needed. For example, if the update fails, MediaWiki will convert it to a job and queue it to try again later. There are also other situations in which we improve reliability or optimise throughput by proactively converting updates to jobs where possible.

Since any MediaWiki code can queue deferred updates, it is also possible for a CLI maintenance script or job to implicitly build up a list of deferred updates. If these batch operations end up queuing a lot of updates, MediaWiki will proactively convert tasks to jobs where possible (handled by the DeferredUpdates class internally).
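The fallback on failure can be illustrated with a toy model. The Python sketch below is an assumption-laden illustration, not MediaWiki's implementation: `Update`, `job_spec` and `run_deferred` are hypothetical names, and an update counts as convertible simply when it carries a job description.

```python
from typing import Callable, Dict, List, Optional


class Update:
    """Toy deferred update; job_spec is set only if it can become a job."""

    def __init__(
        self,
        name: str,
        work: Callable[[], None],
        job_spec: Optional[Dict[str, str]] = None,
    ) -> None:
        self.name = name
        self.work = work
        self.job_spec = job_spec


def run_deferred(updates: List[Update], job_queue: List[Dict[str, str]]) -> List[str]:
    """Run every update; return the names of updates that were lost."""
    lost: List[str] = []
    for update in updates:
        try:
            update.work()
        except Exception:
            if update.job_spec is not None:
                # Convertible update: queue a job so the work is retried later.
                job_queue.append(update.job_spec)
            else:
                # Not convertible: the client never sees this failure.
                lost.append(update.name)
    return lost
```

In this model a failing update without a job description is silently dropped, which is the situation the fallback is meant to avoid.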

Use jobs if you need to save data in the context of a GET request
For scalability and performance reasons, MediaWiki developers should generally not perform database writes during page views or other GET requests. If this becomes difficult to avoid, check the Backend performance guidelines first and consider seeking advice from other developers or the Performance Team for how to approach the problem in a different way.

Note that large wiki farms (such as Wikimedia) may operate from multiple data centers and thus run GET requests (which don't expect database writes) from a secondary data center, which should be able to respond to such requests without relying on communicating to the primary DC.

If you're reasonably certain that your feature will need a database write during a GET request only rarely, and if the write is not urgent, then one option is to queue a job during the GET request. Job queues can be buffered and synced across data centers asynchronously and thus do not require immediate cross-DC communication. You can then rely on the job eventually being transmitted to the primary DC, where it will execute at some point in the future.

Deferred updates should not be used to perform database writes after a GET request. Attempting this will log a DBPerformance warning message.

Registering a job
To use the job queue to do your non-urgent jobs, you need to do these things:

Create a Job subclass
You need to create a class that extends Job and performs your non-urgent task.

Add your Job class to the global list
Add the Job class to the global array. In extensions, this is done in the file, and in core it's done in. The key must be unique and match the value in the job's constructor, and the value is the class name.

How to queue a job
There is another function to push jobs,, which will be executed at the very end, hence after jobs pushed with.

Job queue type
A job queue type is the command name you give to the parent::__construct method of your job class; e.g., using the example above, that would be synchroniseThreadArticleData.
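The relationship between the job class, the key it is registered under, and the job queue type can be sketched as follows. This is a Python model of the idea rather than MediaWiki PHP: the `Job` base class, the `JOB_CLASSES` dictionary, `job_from_spec` and the class name `SynchroniseThreadArticleDataJob` are all hypothetical stand-ins, and only the command name `synchroniseThreadArticleData` comes from the text above.

```python
from typing import Any, Dict, Type


class Job:
    """Toy base class: remembers the command name (the job queue type)."""

    def __init__(self, command: str, params: Dict[str, Any]) -> None:
        self.command = command
        self.params = params

    def run(self) -> bool:
        raise NotImplementedError


class SynchroniseThreadArticleDataJob(Job):
    def __init__(self, params: Dict[str, Any]) -> None:
        # The string handed to the parent constructor is the job queue type.
        super().__init__("synchroniseThreadArticleData", params)

    def run(self) -> bool:
        # A real job would do its slow work here; True signals success.
        return True


# Registry: key is the command name, value is the class.
JOB_CLASSES: Dict[str, Type[Job]] = {
    "synchroniseThreadArticleData": SynchroniseThreadArticleDataJob,
}


def job_from_spec(command: str, params: Dict[str, Any]) -> Job:
    """Build a job from its stored type, checking that key and command agree."""
    job = JOB_CLASSES[command](params)
    if job.command != command:
        raise ValueError(f"registry key {command!r} does not match {job.command!r}")
    return job
```

The check in `job_from_spec` shows why the registry key has to match the value used in the constructor: a runner only knows the stored type string and uses it to find the class.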

getQueueSizes
will return an array of all job queue types and their sizes.

getSize
While  is handy for analysing the entire job queue, for performance reasons, it’s best to use   when analysing a specific job type, which will only return the job queue size of that specific job type.
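The performance difference between asking for all sizes and asking for one can be pictured with a small Python model. It is only a sketch under assumptions: `QueueGroup`, `get_queue_sizes`, `get_size` and the `lookups` counter are hypothetical, and the counter stands in for whatever a real storage backend has to do per queue.

```python
from collections import deque
from typing import Any, Deque, Dict


class QueueGroup:
    """Toy model: one queue per job type, counting backend lookups."""

    def __init__(self) -> None:
        self._queues: Dict[str, Deque[Any]] = {}
        self.lookups = 0  # stands in for queries against the storage backend

    def push(self, job_type: str, params: Any) -> None:
        self._queues.setdefault(job_type, deque()).append(params)

    def _size_of(self, job_type: str) -> int:
        self.lookups += 1
        return len(self._queues.get(job_type, ()))

    def get_queue_sizes(self) -> Dict[str, int]:
        # Touches every known job type.
        return {job_type: self._size_of(job_type) for job_type in list(self._queues)}

    def get_size(self, job_type: str) -> int:
        # Touches only the requested job type.
        return self._size_of(job_type)
```

With many job types registered, the all-types call does work proportional to the number of types, while the single-type call does not.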

Pushing jobs
The primary function is. It selects the job queue corresponding to the job type and, depending on the job queue implementation (database or Redis), pushes the job either through a Redis connection (Redis case) or as a deferrable update (database case).

The lazy push function keeps the jobs in memory. At the end of the current execution (end of the MediaWiki request or end of the current job execution), the jobs kept in memory are pushed as the last deferrable update (of type ). As a deferrable update, the jobs are pushed at the end of the current execution, and as an  the jobs are pushed as a single database transaction. See  and   for details.

In CLI, note that deferrable updates (either from  (JobQueueDB implementation) or from  ) are directly executed if the database transaction flag  is free. See  and   for details.

When jobs are buffered through  but never actually pushed (and hence lost), usually because an unhandled exception was thrown, the destructor of JobQueueGroup writes a warning to the debug log:

PHP Notice: JobQueueGroup::__destruct: 1 buffered job(s) never inserted

See for an example of such a warning. Before the MediaWiki 1.29 release this affected Web-executed jobs: when a job internally lazy-pushed another job and the former was executed in the shutdown part of a MediaWiki request, the latter was never pushed (because   had already been called). The fix for this specific bug was to call   in   so that lazily-pushed jobs are always pushed after the execution of each job.
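The buffering behaviour and the "never inserted" warning can be modelled in a few lines. The Python sketch below is an illustration under assumptions, not the JobQueueGroup implementation: `BufferingQueueGroup`, `lazy_push`, `flush` and `close` are hypothetical names, a plain list stands in for the storage backend, and an explicit `close()` stands in for the PHP destructor.

```python
from typing import Any, List, Optional


class BufferingQueueGroup:
    """Toy model of immediate versus lazy pushing."""

    def __init__(self, storage: List[Any]) -> None:
        self.storage = storage          # stands in for the persistent backend
        self._buffer: List[Any] = []    # lazily pushed jobs, held in memory

    def push(self, job: Any) -> None:
        # Immediate push: the job reaches storage right away.
        self.storage.append(job)

    def lazy_push(self, job: Any) -> None:
        # Lazy push: only remembered until flush() is called.
        self._buffer.append(job)

    def flush(self) -> None:
        # All buffered jobs go in together, standing in for one transaction.
        self.storage.extend(self._buffer)
        self._buffer.clear()

    def close(self) -> Optional[str]:
        # Stand-in for the destructor check: report jobs that were lost.
        if self._buffer:
            return f"{len(self._buffer)} buffered job(s) never inserted"
        return None
```

If an exception prevents `flush()` from running, `close()` is the only place left that can notice the loss, which is why the warning is emitted there.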

Execution of jobs
Jobs are ordinarily executed at the end of a web request, at the rate of per request. If, no jobs are run at the end of a web request. The default value of is 1.

All enqueued jobs can be executed at any time by running. This is particularly important when.
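The two ways of draining the queue can be contrasted with a short Python model. It is a sketch only: `run_jobs_after_request`, `run_all_jobs` and the `rate` parameter are hypothetical names, the rate is assumed to be a whole number, and the default of 1 and the "zero means none" behaviour are the only details taken from the text.

```python
from collections import deque
from typing import Any, Deque, List


def run_jobs_after_request(queue: Deque[Any], rate: int = 1) -> List[Any]:
    """Run at most `rate` jobs at the end of one web request."""
    executed: List[Any] = []
    for _ in range(rate):
        if not queue:
            break
        executed.append(queue.popleft())
    return executed


def run_all_jobs(queue: Deque[Any]) -> List[Any]:
    """Run every enqueued job, as a dedicated runner would."""
    executed: List[Any] = []
    while queue:
        executed.append(queue.popleft())
    return executed
```

With a rate of zero the first function never touches the queue, so in that configuration jobs only make progress when something like the second function is run separately.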

The jobs are run by the  class. Each job is given its own database transaction.

At the end of the job execution, deferrable updates are executed. Since MediaWiki 1.28.3/1.29 lazily-pushed jobs are pushed through a deferrable update in order to use a dedicated database transaction (with ).