Deployment tooling/Notes/What does scap do
This page documents a completed project. It is being retained for archival purposes. It may include information that is no longer accurate. This documentation describes scap prior to it being ported to python in 2014.
Scap ("sync-common-all-php") is a collection of shell scripts used to publish code and configuration to the WMF production web servers.
scap
scap is the driver script for syncing the MW versions and configuration files currently staged on tin.eqiad.wmnet to the rest of the MW servers in the production cluster.
- Usage: scap [--versions=<versions>] [<message>]
- Acquire lock on /var/lock/scap
- Record start timestamp
- Ensure that SSH_AUTH_SOCK is available (needed for dsh to remote hosts)
- Check for command line flag to limit activities to a particular MW version
- Export MW_VERSIONS_SYNC variable describing software versions to push with sync scripts. Either:
  - A specific version given with the --versions command line argument (eg 1.23wmf12)
  - The output of mwversionsinuse --home
- Lint files in $MW_COMMON_SOURCE/wmf-config and $MW_COMMON_SOURCE/multiversion
- Runs sync-common
  - Copies files from tin.eqiad.wmnet:/a/common to tin.eqiad.wmnet:/usr/local/apache/common-local via rsync
- Runs mw-update-l10n
- Runs dologmsg to announce that scap is starting
- Runs scap-1 via dsh on scap-proxies group
- Randomizes list of hosts to update (all hosts listed in /etc/dsh/group/mediawiki-installation)
- Runs scap-1 via dsh
- Runs scap-rebuild-cdbs via dsh
- Runs sync-wikiversions
- Compute elapsed runtime
- Runs dologmsg to log runtime
- Runs deploy2graphite to log scap run completion
- Deletes temp files
- Releases lock on /var/lock/scap
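
The version-selection step in the list above can be sketched roughly as follows. This is an illustration rather than the original script: the helper name resolve_versions and the MWVERSIONSINUSE override variable are hypothetical, and argument parsing is reduced to the --versions=<versions> form shown in the usage line.

```bash
#!/usr/bin/env bash
# Sketch: decide which MW versions a scap run should push.
# resolve_versions is a hypothetical helper, not part of the original scripts.
resolve_versions() {
    local arg
    for arg in "$@"; do
        case "$arg" in
            --versions=*)
                # An explicitly requested version (eg 1.23wmf12) wins.
                printf '%s\n' "${arg#--versions=}"
                return 0
                ;;
        esac
    done
    # Otherwise fall back to every version currently in use.
    # MWVERSIONSINUSE is an assumed override so the sketch can be tried
    # on a machine without the real mwversionsinuse command.
    "${MWVERSIONSINUSE:-mwversionsinuse}" --home
}

# Intended use inside the driver (assumption):
#   MW_VERSIONS_SYNC="$(resolve_versions "$@")"
#   export MW_VERSIONS_SYNC
```

Exporting the result is what lets the later sync scripts see the same version list without re-parsing the command line.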
sync-common
sync-common is really just an alias for scap-1 in shell script form.
- Runs scap-1
scap-1
scap-1 sets up the local host to receive files via rsync, chooses an rsync server to fetch files from and delegates to scap-2 to actually fetch the files.
- Sources /usr/local/lib/mw-deployment-vars.sh
- If $MW_COMMON directory is not found:
  - Creates $MW_COMMON via install -d -o mwdeploy -g mwdeploy "${MW_COMMON}"
- If /usr/local/apache/uncommon directory is not found:
  - Creates /usr/local/apache/uncommon via install -d -o mwdeploy -g mwdeploy /usr/local/apache/uncommon
- Initialize RSYNC_SERVERS variable to first command line argument (could be empty string)
- Initialize SERVER as an empty variable
- If $RSYNC_SERVERS is not an empty string:
  - Set SERVER via sudo /usr/local/bin/find-nearest-rsync $RSYNC_SERVERS
- If $SERVER is still empty:
  - Set SERVER to $MW_RSYNC_HOST
- Run scap-2 "$SERVER" as the user mwdeploy
  - MW_VERSIONS_SYNC and MW_SCAP_BETA from the current execution context are forwarded to the environment of the scap-2 invocation
- Echo "Done"
- Exit 0
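
The server-selection fallback described above might look something like the sketch below. The function name choose_rsync_server and the FIND_NEAREST override are hypothetical, and the sudo call is left out so the logic can be tried in isolation.

```bash
#!/usr/bin/env bash
# Sketch of scap-1's server selection.
# choose_rsync_server and FIND_NEAREST are hypothetical names for illustration.
choose_rsync_server() {
    local rsync_servers="$1"   # space-separated candidates, may be empty
    local default_host="$2"    # stands in for $MW_RSYNC_HOST
    local server=""

    if [ -n "$rsync_servers" ]; then
        # The original runs find-nearest-rsync under sudo; sudo is omitted here.
        # $rsync_servers is deliberately unquoted so each host is one argument.
        server="$("${FIND_NEAREST:-/usr/local/bin/find-nearest-rsync}" $rsync_servers)" || server=""
    fi

    # Fall back to the default host when no candidate was chosen.
    if [ -z "$server" ]; then
        server="$default_host"
    fi
    printf '%s\n' "$server"
}
```

Treating a failed lookup the same as an empty result means a host can always fall back to the default rsync server.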
scap-2
scap-2 copies files from the common module of an rsync server to the MW_COMMON directory on the local host
- Usage: scap-2 [<host>]
- Sources /usr/local/lib/mw-deployment-vars.sh
- Initialize SERVER as $1
- If $SERVER is still empty:
  - Set SERVER to $MW_RSYNC_HOST
- Initialize RSYNC_ARGS as an array containing MW_RSYNC_ARGS[@]
- If $MW_VERSIONS_SYNC is not an empty string:
  - Add --include=php-$v/ to RSYNC_ARGS for each $v in $MW_VERSIONS_SYNC[@]
  - Add --exclude=php-*/ to RSYNC_ARGS
- Echo that hostname -s is copying from $SERVER
- Run rsync "${RSYNC_ARGS[@]}" "$SERVER"::common/ "${MW_COMMON}"
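
As a rough illustration of the include/exclude step above, the sketch below builds the argument array. It assumes MW_VERSIONS_SYNC arrives as a whitespace-separated string (exported environment variables cannot carry bash arrays), uses an abbreviated MW_RSYNC_ARGS, and wraps the logic in a hypothetical build_rsync_args function.

```bash
#!/usr/bin/env bash
# Sketch of how scap-2 narrows the rsync to selected versions.
# Assumption: MW_VERSIONS_SYNC is a whitespace-separated string.
# MW_RSYNC_ARGS is abbreviated here; see mw-deployment-vars.sh for the full list.
MW_RSYNC_ARGS=('-a' '--delete-delay' '--delay-updates' '--compress' '--no-perms')

# build_rsync_args is a hypothetical helper; it fills the global RSYNC_ARGS.
build_rsync_args() {
    RSYNC_ARGS=("${MW_RSYNC_ARGS[@]}")
    if [ -n "${MW_VERSIONS_SYNC:-}" ]; then
        local v
        for v in $MW_VERSIONS_SYNC; do
            # Include rules must precede the catch-all exclude:
            # rsync applies the first filter rule that matches.
            RSYNC_ARGS+=("--include=php-$v/")
        done
        RSYNC_ARGS+=("--exclude=php-*/")
    fi
}

MW_VERSIONS_SYNC="1.23wmf11 1.23wmf12"
build_rsync_args
printf '%s\n' "${RSYNC_ARGS[@]}"
```

Keeping the arguments in an array rather than a flat string avoids the shell expanding patterns such as php-*/ before rsync sees them.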
mw-update-l10n
mw-update-l10n generates l10n cdb files and exports their contents as a series of json files that have better rsync compression properties for transfer to cluster hosts.
- Usage: mw-update-l10n [--verbose]
- Sources /usr/local/lib/mw-deployment-vars.sh
- Asserts that the local host is running some variant of Linux
- Checks for a --verbose command line argument and toggles off the QUIET setting if present
- Sets CPUS to the number of cores on the local host (includes hyperthreading cores)
- Sets THREADS to CPUS - 2
- Sets mwExtVerDbSets to the output of mwversionsinuse --extended --withdb
  - (eg 1.23wmf11=aawikibooks 1.23wmf12=mediawikiwiki)
- For each version in $mwExtVerDbSets:
  - Split version string into mwVerNum (eg 1.23wmf11) and mwDbName (eg aawikibooks)
  - If MW_VERSIONS_SYNC is set and mwVerNum isn't a version being synced: continue
  - Make a new temp file and track as mwTempDest
  - Run mergeMessageFileList.php for the wiki mwDbName outputting to mwTempDest
  - Copy mwTempDest to $MW_COMMON_SOURCE/wmf-config/ExtensionMessages-"$mwVerNum".php
  - Copy $MW_COMMON_SOURCE/wmf-config/ExtensionMessages-"$mwVerNum".php to $MW_COMMON/wmf-config/ unless they are the same location
  - Run rebuildLocalisationCache.php using THREADS threads
  - Run refreshCdbJsonFiles using THREADS threads
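
The split-and-filter part of the loop above can be sketched as follows. The helper name versions_to_build is hypothetical, MW_VERSIONS_SYNC is assumed to be a whitespace-separated string, and the PHP maintenance scripts are not invoked.

```bash
#!/usr/bin/env bash
# Sketch of the per-version loop in mw-update-l10n.
# versions_to_build is a hypothetical helper; it prints "version dbname"
# for each pair that would be processed.
versions_to_build() {
    local mwExtVerDbSets="$1"   # eg "1.23wmf11=aawikibooks 1.23wmf12=mediawikiwiki"
    local sync="${2:-}"         # stands in for $MW_VERSIONS_SYNC, may be empty
    local pair mwVerNum mwDbName
    for pair in $mwExtVerDbSets; do
        mwVerNum="${pair%%=*}"  # text before the first "="
        mwDbName="${pair#*=}"   # text after the first "="
        if [ -n "$sync" ]; then
            # Skip versions that are not part of this sync.
            case " $sync " in
                *" $mwVerNum "*) ;;
                *) continue ;;
            esac
        fi
        printf '%s %s\n' "$mwVerNum" "$mwDbName"
    done
}

versions_to_build "1.23wmf11=aawikibooks 1.23wmf12=mediawikiwiki" "1.23wmf12"
```

With the arguments shown, only the line `1.23wmf12 mediawikiwiki` is printed, because 1.23wmf11 is not in the sync list.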
refreshCdbJsonFiles
refreshCdbJsonFiles generates JSON data files and MD5 checksums from CDB databases.
- Usage: refreshCdbJsonFiles --directory <DIR> [--threads <N>]
- Validate command line arguments
- Create list of .cdb files in target directory
- Split list in N parts (N == number of parallel threads requested)
- For each sublist of CDB files:
  - Fork a child process
  - For each file:
    - Compute md5 checksum of file
    - If md5(file) === last md5 recorded: continue
    - Generate JSON file of key:value pairs found in CDB file to temporary file
    - Write md5(file) to $file.MD5
    - Move JSON temp file to $file.json
- Wait for children to finish
- Echo status message if any files were updated
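
The checksum test that lets unchanged files be skipped can be illustrated with a small shell sketch. It is not the original implementation: refresh_if_changed is a hypothetical helper, md5sum from GNU coreutils is assumed to be available, and the JSON export itself is left out because it needs a CDB reader.

```bash
#!/usr/bin/env bash
# Sketch of the per-file check in refreshCdbJsonFiles.
# refresh_if_changed is a hypothetical helper name.
refresh_if_changed() {
    local file="$1" sum last=""
    sum="$(md5sum < "$file" | cut -d' ' -f1)"
    if [ -f "$file.MD5" ]; then
        last="$(cat "$file.MD5")"
    fi
    if [ "$sum" = "$last" ]; then
        # Same checksum as last time: nothing to regenerate.
        echo "unchanged"
        return 0
    fi
    # (The real tool would write the JSON export to a temporary file here.)
    printf '%s' "$sum" > "$file.MD5"
    # (... and then move the temporary file to $file.json.)
    echo "updated"
}
```

Recording the checksum next to each file is what makes repeated runs cheap when only a few CDB files changed.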
scap-rebuild-cdbs
scap-rebuild-cdbs rebuilds l10n cache CDB database from JSON files
- Sources /usr/local/lib/mw-deployment-vars.sh
- Sets CPUS to the number of cores on the local host (includes hyperthreading cores)
- Sets THREADS to CPUS / 2
- Sets mwVersions to either MW_VERSIONS_SYNC or the output of mwversionsinuse
- For each version in mwVersions:
mergeCdbFileUpdates
mergeCdbFileUpdates updates l10n CDB files from JSON data files
- Usage: mergeCdbFileUpdates --directory <DIRECTORY> [--threads <N>] [--trustmtime]
- Validate command line arguments
- Create list of .json files in target directory
- Split list in N parts (N == number of parallel threads requested)
- For each sublist of JSON files:
  - Fork a child process
  - For each file:
    - Continue unless JSON newer than CDB / md5 checksums don't match
    - Load JSON data from file
    - Create a new CDB file with JSON key:value data
    - Rename temporary CDB file over .cdb file
- Wait for children to finish
- Echo status message if any files were updated
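
Two of the steps above, the freshness test and the rename over the existing file, can be sketched in shell. Only the modification-time comparison is shown and the md5 comparison is omitted. The helper names needs_rebuild and replace_cdb are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch of the staleness test and the rename-over-target step.
# needs_rebuild and replace_cdb are hypothetical helper names.
needs_rebuild() {
    local json="$1" cdb="$2"
    # Rebuild when the CDB is missing or the JSON file is newer than it.
    [ ! -e "$cdb" ] || [ "$json" -nt "$cdb" ]
}

replace_cdb() {
    local tmp="$1" cdb="$2"
    # Within one filesystem mv is a rename, so readers see either the
    # old file or the complete new one, never a partially written file.
    mv -f "$tmp" "$cdb"
}
```

Writing to a temporary file first and renaming it afterwards matters on hosts that keep serving requests while the cache is rebuilt.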
sync-wikiversions
sync-wikiversions copies wikiversions files to hosts in the mediawiki-installation dsh group.
- Sources /usr/local/lib/mw-deployment-vars.sh
- Ensure that SSH_AUTH_SOCK is available (needed for dsh to remote hosts)
- Run multiversion/refreshWikiversionsCDB
- Ensure that dsh is available locally
- Run rsync $MW_RSYNC_HOST::common/wikiversions.{dat,cdb} $MW_COMMON via dsh on mediawiki-installation hosts
- Runs dologmsg to log completion
- Runs deploy2graphite to log sync-wikiversions completion
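
A guard for the SSH_AUTH_SOCK requirement could look like the sketch below. The function name require_ssh_agent is hypothetical, and the additional check that the path is a socket is an assumption; the original may only test that the variable is set.

```bash
#!/usr/bin/env bash
# Sketch: refuse to continue without a usable SSH agent socket.
# require_ssh_agent is a hypothetical helper name.
require_ssh_agent() {
    if [ -z "${SSH_AUTH_SOCK:-}" ]; then
        echo "SSH_AUTH_SOCK is not set; dsh cannot reach remote hosts" >&2
        return 1
    fi
    # Assumption: also verify that the path really is a socket.
    if [ ! -S "$SSH_AUTH_SOCK" ]; then
        echo "SSH_AUTH_SOCK does not point at a socket: $SSH_AUTH_SOCK" >&2
        return 1
    fi
}

# Intended use near the top of a script (assumption):
#   require_ssh_agent || exit 1
```

Failing early here avoids a run in which every remote dsh command fails on authentication.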
mw-deployment-vars.sh
mw-deployment-vars.sh is a Puppet-generated shell script that sets several MW-related environment variables.
The values of these variables change based on the deployment system in use and the realm of the server. For the sake of this analysis we are only concerned with the values configured for the scap deployment system in the production realm.
- MW_COMMON
  - varies by deployment system
  - scap: /usr/local/apache/common-local
- MW_COMMON_SOURCE
  - varies by deployment system
  - scap: /a/common
- MW_DBLISTS
  - varies by deployment system
  - scap: /usr/local/apache/common-local
- MW_DBLISTS_SOURCE
  - varies by deployment system
  - scap: /a/common
- MW_CRON_LOGS
  - /home/wikipedia/logs/norotate
- MW_RSYNC_HOST
  - varies by realm
  - production: tin.eqiad.wmnet
- MW_DSH_ARGS
  - ('-cM' '-g' 'mediawiki-installation' '-o' '-oSetupTimeout=30' '-F30')
- MW_RSYNC_ARGS
  - ('-a' '--delete-delay' '--delay-updates' '--compress' '--delete' '--exclude=**/.svn/lock' '--exclude=**/.git/objects' '--exclude=**/.git/**/objects' '--exclude=**/cache/l10n/*.cdb' '--no-perms')
- MW_CARBON_HOST
  - varies by realm
  - production: statsd.eqiad.wmnet
- MW_CARBON_PORT
  - 2003
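
Put together, the file as rendered for the scap deployment system in the production realm might look roughly like this. The values are copied from this section. The layout is a guess, since the real file is generated by Puppet and may differ.

```bash
#!/usr/bin/env bash
# Sketch of mw-deployment-vars.sh for scap in the production realm.
# Values come from the list in this section; the file layout is assumed.
MW_COMMON=/usr/local/apache/common-local
MW_COMMON_SOURCE=/a/common
MW_DBLISTS=/usr/local/apache/common-local
MW_DBLISTS_SOURCE=/a/common
MW_CRON_LOGS=/home/wikipedia/logs/norotate
MW_RSYNC_HOST=tin.eqiad.wmnet
MW_DSH_ARGS=('-cM' '-g' 'mediawiki-installation' '-o' '-oSetupTimeout=30' '-F30')
MW_RSYNC_ARGS=('-a' '--delete-delay' '--delay-updates' '--compress' '--delete'
               '--exclude=**/.svn/lock' '--exclude=**/.git/objects'
               '--exclude=**/.git/**/objects' '--exclude=**/cache/l10n/*.cdb'
               '--no-perms')
MW_CARBON_HOST=statsd.eqiad.wmnet
MW_CARBON_PORT=2003
```

The other scripts pick these up by sourcing the file, which is why MW_DSH_ARGS and MW_RSYNC_ARGS can be bash arrays rather than plain strings.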
find-nearest-rsync
find-nearest-rsync is a perl script that attempts to determine the host with the lowest ICMP ping round trip time (rtt) from a given list of hosts.
- Usage: find-nearest-rsync [--verbose] <host> [<host> ...]
The host with the lowest rtt will be printed to stdout.
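
The selection step can be illustrated in shell, although the original is written in perl. The sketch below assumes the rtt values have already been measured and arrive as "host rtt" lines on stdin. The function name lowest_rtt_host is hypothetical and the ping measurement itself is not shown.

```bash
#!/usr/bin/env bash
# Sketch: choose the host with the lowest round trip time.
# Input: one "host rtt_ms" pair per line on stdin.
# lowest_rtt_host is a hypothetical name; measuring the rtt is not shown.
lowest_rtt_host() {
    awk 'NR == 1 || $2 + 0 < best { best = $2 + 0; host = $1 }
         END { if (NR) print host }'
}

printf '%s\n' "a.example 1.9" "b.example 0.4" "c.example 12" | lowest_rtt_host
```

Adding 0 to the second field forces a numeric comparison, so 12 is not treated as smaller than 1.9.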
mwversionsinuse
mwversionsinuse is a shell script to call the local version of multiversion/activeMWVersions
- Sources /usr/local/lib/mw-deployment-vars.sh
- Runs "${MW_COMMON}/multiversion/activeMWVersions" "$@"
dologmsg
dologmsg appends a message to an IRC buffer
- Usage: dologmsg [MESSAGE]
