MediaWiki Job Queue
I wrote this article while migrating this wiki from MediaWiki version 1.38 to version 1.39. According to the CirrusSearch extension's page, this upgrade also required migrating from Elasticsearch version 6.8.23 to version 7.10.2.
After installing Elasticsearch version 7.10.2, instead of following the Upgrade manual I rebuilt the Elasticsearch data from scratch, which left me with a seemingly endless MediaWiki job queue. For this reason I needed to become much more familiar with the MediaWiki maintenance scripts related to the job queue.
Envvars
The environment variables used in the following commands.
IP="/var/www/wiki.example.com" # The DocumentRoot directory of the wiki
OWNER="www-data" # The user that owns the $IP directory
Note that in the examples below, ${IP##*/} is used in place of the name of the particular wiki.
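As a small illustration of that parameter expansion: ${IP##*/} removes the longest prefix that ends in a slash, so only the last path component remains. The variable name WIKI_NAME below is hypothetical and is used only for this sketch.

```shell
IP="/var/www/wiki.example.com"   # The DocumentRoot directory of the wiki

# Strip everything up to and including the last "/" in $IP.
WIKI_NAME="${IP##*/}"            # WIKI_NAME is a hypothetical helper variable

echo "${WIKI_NAME}"
echo "/usr/local/bin/mlw-maintenance-runJobs-${WIKI_NAME}.sh"
```

With the value of IP shown above, the first echo prints wiki.example.com.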
Show Jobs
sudo -u ${OWNER} php "${IP}/maintenance/showJobs.php"
15 # The count of all pending jobs
sudo -u ${OWNER} php "${IP}/maintenance/showJobs.php" --group
cirrusSearchLinksUpdatePrioritized: 2 queued; 0 claimed (0 active, 0 abandoned); 0 delayed
htmlCacheUpdate: 4 queued; 0 claimed (0 active, 0 abandoned); 0 delayed
recentChangesUpdate: 1 queued; 0 claimed (0 active, 0 abandoned); 0 delayed
refreshLinksDynamic: 7 queued; 0 claimed (0 active, 0 abandoned); 0 delayed
sudo -u ${OWNER} php "${IP}/maintenance/showJobs.php" --list
cirrusSearchLinksUpdatePrioritized MediaWiki_Job_Queue addedLinks=[] removedLinks=[] prioritize=1 cluster= namespace=0 title=MediaWiki_Job_Queue requestId=ZASOXkzdk5n0fxsD5JVbLgAAQRA (id=6794955,timestamp=20230305124311) status=unclaimed
...
htmlCacheUpdate MediaWiki_Job_Queue table=templatelinks recursive=1 rootJobIsSelf=1 rootJobSignature=fcd541151cba4e3ac5b300da797a3163c93407bc rootJobTimestamp=20230305124311 causeAction=page-edit namespace=0 title=MediaWiki_Job_Queue requestId=ZASOXkzdk5n0fxsD5JVbLgAAQRA causeAgent=unknown (id=6794956,timestamp=20230305124311) status=unclaimed
...
recentChangesUpdate Special:RecentChanges type=cacheUpdate namespace=-1 title=RecentChanges requestId=ZASOUUzdk5n0fxsD5JVa-QAAUQs (id=6794954,timestamp=20230305124307) status=unclaimed
refreshLinksDynamic Kali_Linux_Install_GUFW_(gui-ufw) isOpportunistic=1 rootJobTimestamp=20230305121652 namespace=0 title=Kali_Linux_Install_GUFW_(gui-ufw) requestId=ZASIM9BBdr0qUp1CuMirLwAAABA causeAction=unknown causeAgent=unknown (id=6794945,timestamp=20230305121653) status=unclaimed
...
sudo -u ${OWNER} php "${IP}/maintenance/showJobs.php" --type cirrusSearchLinksUpdatePrioritized
2 # The count of the pending jobs for the specified --type
If the wiki is part of a wiki family, you may need to specify the $wikiId, as shown below.
sudo -u ${OWNER} php "${IP}/maintenance/showJobs.php" --wiki="${wikiId}"
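When watching a long queue drain, it can help to reduce the --group output to a single number. The following is a minimal sketch, assuming the output keeps the "type: N queued; ..." line format shown above. The function name sum_queued is hypothetical.

```shell
# Sum the "queued" counts from `showJobs.php --group` output read on stdin.
# Assumes each line looks like:
#   <type>: <N> queued; <N> claimed (<N> active, <N> abandoned); <N> delayed
sum_queued() {
    awk -F': ' '
        NF >= 2 {
            # The first word after "<type>: " is the queued count.
            split($2, parts, " ")
            total += parts[1]
        }
        END { print total + 0 }
    '
}
```

It could be used like this: sudo -u ${OWNER} php "${IP}/maintenance/showJobs.php" --group | sum_queued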
Manage Jobs
Re-push abandoned jobs of a certain type.
sudo -u ${OWNER} php "${IP}/maintenance/manageJobs.php" --type typeName --action "repush-abandoned"
Delete jobs of a certain type.
sudo -u ${OWNER} php "${IP}/maintenance/manageJobs.php" --type typeName --action "delete"
Use showJobs.php --group, as shown above, to find the job types available in the current job queue. Note that the quotation marks around the action values in the examples above are used only for better highlighting.
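To decide which types are worth re-pushing, the --group output can be filtered for types that report abandoned jobs. This is a minimal sketch under the assumption that the output keeps the "(N active, N abandoned)" format shown earlier. The function name abandoned_types is hypothetical.

```shell
# Print the job types whose "abandoned" count is greater than zero.
# Reads `showJobs.php --group` output on stdin, one type per line.
abandoned_types() {
    awk -F': ' '
        NF >= 2 && match($2, /[0-9]+ abandoned/) {
            # Extract the number that precedes the word "abandoned".
            n = substr($2, RSTART, RLENGTH)
            sub(/ abandoned/, "", n)
            if (n + 0 > 0) print $1
        }
    '
}
```

Each printed name can then be passed as the --type value to manageJobs.php with the repush-abandoned action.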
Job Queue Service
Job Queue Cron Job
Before I did the necessary investigation and wrote this article, I dealt with the job queue using the following script, triggered by a cron job.
/usr/local/bin/mlw-maintenance-runJobs-${IP##*/}.sh
#!/bin/bash
# @author Spas Z. Spasov <spas.z.spasov@metalevel.tech>
# @copyright 2022 Spas Z. Spasov
# @license https://www.gnu.org/licenses/gpl-3.0.html GNU General Public License, version 3 (or later)
#
# @name: /usr/local/bin/mlw-maintenance-runJobs-${IP##*/}.sh
# @desc Run the job queue: $IP/maintenance/runJobs.php
#
# Crontab:
# * * * * * sudo -u www-data /usr/local/bin/mlw-maintenance-runJobs-${IP##*/}.sh >/var/log/cron.mlw-maintenance-runJobs-${IP##*/}.sh.log 2>&1
: ${IP:="/var/www/wiki.metalevel.tech"} # The DocumentRoot directory of the wiki
: ${OWNER:="www-data"} # The user that owns the $IP directory
# Pass any argument to activate CLI mode (TEST_DEPTH=2);
# otherwise the script falls back to cron mode (TEST_DEPTH=4)
if [[ -z $1 ]]; then
    TEST_DEPTH=4
else
    TEST_DEPTH=2
fi
echo ''
echo '*'
echo "* week_$(date +%W.%Y-%m-%d_%Hh:%Mm)"
echo '*'
if ps aux | grep -v 'grep' | grep -oq 'mlw-maintenance-rebuild'; then
    echo "Some of our 'mlw-maintenance-rebuild-*.sh' is running, try again later..."
    exit
fi
if [[ "$(ps aux | grep -v 'grep' | grep -c "$0")" -eq "${TEST_DEPTH}" ]]; then
    printf -- '\n*\n*\n* WIKI: %s - RunJobs begin. ------------\n\n' "${IP##*/}"
    #sudo chown -R www-data:www-data "$IP/cache"
    sudo -u "$OWNER" /usr/bin/php -dmemory_limit=-1 "$IP/maintenance/runJobs.php" --conf "$IP/LocalSettings.php"
else
    echo "Another instance of ${0} is running... Skip."
    echo "Use 'cli' as argument to activate the CLI mode, otherwise the script falls back to cron mode."
    echo "Test for other instances: ${TEST_DEPTH} ?= $(ps aux | grep -v 'grep' | grep -c "$0")"
fi
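The script above detects a running instance by counting matching lines in the ps output, which is why it needs a different TEST_DEPTH for cron and CLI use. A different technique, not the one used in the script, is a lock file held with flock from util-linux. The following is a minimal sketch of that alternative. The LOCK_FILE path and the run_locked function name are assumptions made for illustration.

```shell
# Hypothetical lock file path; any location writable by the calling user works.
LOCK_FILE="${LOCK_FILE:-/tmp/mlw-maintenance-runJobs.lock}"

# Run the given command only if no other process holds the lock.
run_locked() {
    (
        # Try to take an exclusive lock on file descriptor 9 without waiting.
        flock -n 9 || { echo "Another instance is running... Skip."; exit 1; }
        "$@"
    ) 9>"$LOCK_FILE"   # The lock is released when the subshell exits.
}
```

With this helper the runJobs.php call could be wrapped as run_locked sudo -u "$OWNER" /usr/bin/php "$IP/maintenance/runJobs.php", and the same invocation would behave identically from cron and from the command line.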
Examples of the Crontab entries – with and without logging.
sudo crontab -e
*/5 * * * * /usr/local/bin/mlw-maintenance-runJobs-${IP##*/}.sh >/dev/null 2>&1
#*/5 * * * * /usr/local/bin/mlw-maintenance-runJobs-${IP##*/}.sh >/var/log/cron.mw-maintenance-runJobs.log 2>&1
Example of CLI usage.
mlw-maintenance-runJobs-${IP##*/}.sh cli
References
- Continuous service examples:
- MediaWiki: Manual:Job queue
- Semantic MediaWiki: Help:Job queue
- MediaWiki: Manual:runJobs.php
- MediaWiki: Manual:showJobs.php
- MediaWiki: Manual:manageJobs.php
- MediaWiki Docs: JobQueue | JobQueue Architecture | Manual:Job queue/For developers
- MediaWiki Support desk: How to remove abandoned jobs from job queue?