Toolserver:Replication lag

Replication lag, or replag, is the delay between a change (such as an edit) being saved on the Wikimedia servers and that change appearing in the Toolserver databases. This delay occurs because Toolserver tools do not access the Wikimedia databases directly; instead, they access copies of those databases, which are replicated in real time. Each update to the live database is logged, and the Toolserver databases follow this log to apply the same updates.

Replication lag can be dramatically worsened by database crashes, expensive queries, and software or hardware issues.

Determine current lag
Although there is a MySQL API for determining replication lag, it does not work correctly. Instead, estimate the lag from the timestamp of the most recent edit on a frequently edited wiki on the cluster.
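The estimate described above can be sketched in a few lines of Python. This is only an illustration, not the code any Toolserver tool actually uses: the function name replag_seconds is made up for this example, and it assumes you have already fetched the newest edit timestamp from the replicated database (for example with a query such as SELECT MAX(rc_timestamp) FROM recentchanges) in MediaWiki's 14-digit UTC format, YYYYMMDDHHMMSS.

```python
from datetime import datetime, timezone


def replag_seconds(latest_timestamp, now=None):
    """Estimate replication lag from the newest edit seen in a replica.

    latest_timestamp: 14-digit MediaWiki-style UTC timestamp (YYYYMMDDHHMMSS).
    now: optional timezone-aware datetime; defaults to the current UTC time.
    Hypothetical helper for illustration, not part of any Toolserver API.
    """
    latest = datetime.strptime(latest_timestamp, "%Y%m%d%H%M%S")
    latest = latest.replace(tzinfo=timezone.utc)
    if now is None:
        now = datetime.now(timezone.utc)
    # Clamp at zero so small clock differences never give a negative lag.
    return max(0.0, (now - latest).total_seconds())


if __name__ == "__main__":
    # The newest edit in the replica is 7 minutes old, so the lag is 420 s.
    now = datetime(2011, 1, 1, 12, 7, 0, tzinfo=timezone.utc)
    print(replag_seconds("20110101120000", now))  # 420.0
```

The result is only meaningful on a wiki that is edited frequently. On a quiet wiki, an old "latest edit" may simply mean that nobody has edited recently.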

You can view the Replication lag graphs tool to see current lag and trends.

You can also type @replag in the #wikimedia-toolserver IRC channel on freenode. The replag bot will reply with something like this:

jsmith: s1-sec-c: 13s [-0.01 s/s]; s2/s5-pri-c: 14m 44s [+0.00 s/s]; s3-rr: 60s [+0.00 s/s]; s3-user: 60s [+0.00 s/s]; s4-rr: 14m 44s [+0.00 s/s]; s4-user: 13s [-0.02 s/s]

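This output gives the lag for each database server, followed in brackets by the rate at which that lag is changing, in seconds of lag per second; a negative rate means the server is catching up. (The sx-c entries are copies of the Commons database on each server.)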
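If you want to process the bot's reply in a script, it can be parsed with a regular expression. The following Python sketch is based only on the sample reply shown on this page; the function name parse_replag is invented for the example, and the "h" and "d" units are assumptions, since the sample contains only minutes and seconds. The real bot's output may differ in ways this pattern does not cover.

```python
import re

# Seconds per unit. Only "m" and "s" appear in the sample reply;
# "h" and "d" are assumptions about longer lags.
UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}

# One entry looks like "s2/s5-pri-c: 14m 44s [+0.00 s/s]".
ENTRY = re.compile(
    r"(?P<server>[\w/-]+): "
    r"(?P<lag>(?:\d+[dhms]\s*)+)"
    r"\[(?P<rate>[+-]\d+(?:\.\d+)?) s/s\]"
)


def parse_replag(line):
    """Return {server: (lag_in_seconds, rate_in_seconds_per_second)}."""
    result = {}
    for match in ENTRY.finditer(line):
        parts = re.findall(r"(\d+)([dhms])", match.group("lag"))
        lag = sum(int(number) * UNIT_SECONDS[unit] for number, unit in parts)
        result[match.group("server")] = (lag, float(match.group("rate")))
    return result


if __name__ == "__main__":
    reply = (
        "jsmith: s1-sec-c: 13s [-0.01 s/s]; s2/s5-pri-c: 14m 44s [+0.00 s/s]; "
        "s3-rr: 60s [+0.00 s/s]; s3-user: 60s [+0.00 s/s]; "
        "s4-rr: 14m 44s [+0.00 s/s]; s4-user: 13s [-0.02 s/s]"
    )
    for server, (lag, rate) in parse_replag(reply).items():
        print(server, lag, rate)
```

The leading nickname ("jsmith:") is skipped automatically, because it is not followed by a duration such as "13s".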

The bot checks the most active wikis on each server to determine replication lag. (For example, if the last edit it sees to the English Wikipedia is 7 minutes old, it assumes a 7-minute lag on server 1.) The following databases are checked:

Determining lag by wiki
To determine how much lag is affecting a specific wiki's database, look up which server hosts it in the wiki server assignments table, then use the methods above to check the lag on that server.