User:AGreen (WMF)/Draft:Dev setup for centralnotice changes monitor

From mediawiki.org

Overview

centralnotice_changes_monitor checks the CentralNotice API and consumes Kafka messages to follow changes in banners and transcluded pages. Main requirements are: Python 3, MariaDB, a MediaWiki installation with the CentralNotice and EventBus extensions installed, and Kafka.

The way I've achieved this setup is to start with Vagrant with the eventbus role enabled, and install CentralNotice manually there. That role provides a Kafka setup similar to WMF production. I've then tweaked Vagrant to allow consumption of Kafka streams on the host machine, which is where I've run centralnotice_changes_monitor itself. Other setups, including running the script from within Vagrant, are possible, too.

Vagrant

eventbus role

Enable the eventbus role and provision.

Notes:

  1. This didn't work for me with the fundraising role enabled at the same time. Probably no harm in trying it, though, if anyone would prefer not to disable that role.
  2. When I first enabled this role, a minor bug caused provisioning to fail, and I had to tweak a file to fix it. I don't remember which file it was. The bug may be fixed by now. If provisioning does fail for you, please ping me. IIRC the fix was easy to find based on the error message.

CentralNotice

Note: If you already have CentralNotice installed in Vagrant in the main MediaWiki instance, skip this step.

Clone CentralNotice in /vagrant/mediawiki/extensions.

Add the following lines to /vagrant/mediawiki/LocalSettings.php. (Note: There might be a better place to put this.)

wfLoadExtension( 'CentralNotice' );
$wgNoticeInfrastructure = true;
$wgNoticeProjects = array( 'vagrantwiki' );
$wgNoticeProject = 'vagrantwiki';
$wgCentralHost = 'localhost:8080';
$wgCentralSelectedBannerDispatcher = 'http://localhost:8080/w/index.php?title=Special:BannerLoader';
$wgCentralDBname = 'wiki';

In the CentralNotice directory, from within the Vagrant box, run composer update. Then, also from within the Vagrant box, from /vagrant/mediawiki, run php maintenance/update.php --quick.

EventBus service

From inside the Vagrant box, check that the eventlogging-service-eventbus service is running:

sudo service eventlogging-service-eventbus status

If an error is shown, restart the service:

sudo service eventlogging-service-eventbus restart

Wait 5-10 seconds, then check the status again. Note: In my current setup, I have to do this every time I reboot the Vagrant box. I don't know why.
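
Since this check has to be repeated after every reboot, it could be scripted. The following is only a sketch of the steps above, not part of centralnotice_changes_monitor. It assumes that the status command exits with a non-zero code when the service is not running; the function name ensure_running and its run and wait_seconds parameters are made up for illustration.

```python
import subprocess
import time

SERVICE = "eventlogging-service-eventbus"


def ensure_running(service=SERVICE, run=subprocess.run, wait_seconds=10):
    """Check the service and restart it once if the status check fails.

    Assumption: `service <name> status` exits non-zero when the service
    is not running. Returns True if the final status check succeeds.
    """
    status_cmd = ["sudo", "service", service, "status"]
    if run(status_cmd).returncode == 0:
        return True
    # Status reported a problem: restart, wait, then check again.
    run(["sudo", "service", service, "restart"])
    time.sleep(wait_seconds)
    return run(status_cmd).returncode == 0
```

Call ensure_running() from inside the Vagrant box. The run parameter exists only so that the command runner can be swapped out when trying the function without a real service.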

Consuming Kafka streams from the host machine

To run on the host machine, centralnotice_changes_monitor needs to consume Kafka streams produced from the Vagrant box. (If you'll run the script from within Vagrant, this step is not necessary.)

Copy /vagrant/support/Vagrantfile-extra.rb to /vagrant and add the following lines:

mwv = MediaWikiVagrant::Environment.new(File.expand_path('..', __FILE__))
settings = mwv.load_settings

Vagrant.configure('2') do |config|
	config.vm.network :forwarded_port,
		guest: 9092, host: 9092, id: 'kafka', host_ip: settings[:host_ip]
end

Then restart the Vagrant box. (It might be necessary to re-provision, too; I'm not sure.)

On the host machine, add the following to /etc/hosts:

127.0.0.1       vagrant.mediawiki-vagrant.dev
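
To double-check that the entry is in place, the hosts file can be parsed with a few lines of Python. This is a minimal sketch. The function name hosts_maps is hypothetical, and the sample text stands in for the real contents of /etc/hosts. (My guess is that the entry is needed because the Kafka broker advertises itself under that hostname, but I haven't verified this.)

```python
def hosts_maps(hosts_text, hostname, ip):
    """Return True if hosts-file text maps hostname to ip."""
    for line in hosts_text.splitlines():
        # Ignore anything after '#', then split into whitespace-separated fields.
        fields = line.split("#", 1)[0].split()
        # First field is the address; the remaining fields are hostnames.
        if len(fields) >= 2 and fields[0] == ip and hostname in fields[1:]:
            return True
    return False


if __name__ == "__main__":
    # Sample text for illustration; read /etc/hosts instead for a real check.
    sample = "127.0.0.1       vagrant.mediawiki-vagrant.dev\n"
    print(hosts_maps(sample, "vagrant.mediawiki-vagrant.dev", "127.0.0.1"))
```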

Checking Kafka streams

You can use kafkacat to check that Kafka streams are being produced.

kafkacat -C -b localhost:9092 -t datacenter1.mediawiki.recentchange

You should see an event for any edit to any page. Here is the list of other topics you can connect to.

Note: If this command fails, just try it again!
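
If the command keeps failing, it may help to rule out the port forwarding first. The sketch below only tests whether a TCP connection to the forwarded Kafka port can be opened from the host. It says nothing about whether Kafka itself is healthy. The function name port_is_open is hypothetical.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


if __name__ == "__main__":
    # 9092 is the port forwarded in Vagrantfile-extra.rb.
    print("Kafka port reachable:", port_is_open("localhost", 9092))
```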

Database

Create a database and a database user, grant the user rights on the database, then run the following command (substituting database, user and password as appropriate):

mariadb -u cn_changes_monitor --password=cn_changes_monitor_pw cn_changes_monitor < sql/create_tables.sql

For development purposes, the SQL to drop all tables is also provided. To use it, copy sql/drop_tables_example.sql to sql/drop_tables.sql and uncomment the last two lines.

Then, you can reset the database like this:

cat sql/drop_tables.sql sql/create_tables.sql | mariadb -u cn_changes_monitor --password=cn_changes_monitor_pw cn_changes_monitor

(Do not deploy an uncommented version of the drop tables file to production. Using the filename sql/drop_tables.sql for the uncommented version will prevent it from being added to the Git repository.)
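
If you reset the database often during development, the same pipeline can be wrapped in a small Python helper. This is a sketch, not something shipped in the repository. The function names are made up, and it assumes the mariadb client is on your PATH and that you run it from the repository root.

```python
import subprocess
from pathlib import Path


def combined_sql(paths):
    """Concatenate SQL files in order, like `cat a.sql b.sql`."""
    return "".join(Path(p).read_text(encoding="utf-8") for p in paths)


def mariadb_command(user, password, database):
    """Build the same client invocation used in the commands above."""
    return ["mariadb", "-u", user, f"--password={password}", database]


def reset_database(user, password, database, sql_dir="sql"):
    """Drop and re-create all tables. For development use only."""
    sql = combined_sql([
        Path(sql_dir) / "drop_tables.sql",
        Path(sql_dir) / "create_tables.sql",
    ])
    # Feed the combined SQL to the client on stdin; raise if it fails.
    subprocess.run(mariadb_command(user, password, database),
                   input=sql, text=True, check=True)
```

For example: reset_database("cn_changes_monitor", "cn_changes_monitor_pw", "cn_changes_monitor"), substituting your own user, password and database.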

centralnotice_changes_monitor

Installation

Clone from the temporary GitHub repo.

Run the following command from the repository root directory to install with dependencies. (This will create a symlink from a normal Python library location to wherever you actually put the repo.)

pip3 install -e .

Configuration

Copy example-config.yaml to config.yaml. Replace database settings with whatever you used when you created the database and database user.

Execution

From the repository root directory, run:

centralnotice_changes_monitor/bin/centralnotice_changes_monitor

Optionally, you can add the --debug flag, or use --config to point to a different configuration file.

Most changes that come down the pipe should modify the contents of the pages_to_monitor table in the database, so keeping an eye on that can be useful for debugging.

Possible issues

It seems that editing banners via the banner editor interface sometimes does not trigger a recent change event and does not update core's listings of what is transcluded where. It might be that the event and the changes are just not triggered right away. This may be an issue in core or in CentralNotice. For the purposes of testing the script, I'd recommend modifying banners as wiki pages, using the "Edit banner on-wiki" link at the top of the CentralNotice banner editor.
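
To see what core currently lists as transcluded in a banner page, you can query the core API with prop=templates. The sketch below only builds the request URL. It is not the monitor's own code, and I'm not claiming the monitor uses this exact query. The API URL is an assumption based on the localhost:8080 settings above, ExampleBanner is a made-up banner name, and the MediaWiki:Centralnotice-template- title prefix is my understanding of where CentralNotice stores banner content.

```python
from urllib.parse import urlencode

# Assumption: the Vagrant wiki's API endpoint, matching the settings above.
API_URL = "http://localhost:8080/w/api.php"


def transclusions_query_url(page_title, api_url=API_URL):
    """Build a core API URL listing the pages transcluded in page_title."""
    params = {
        "action": "query",
        "prop": "templates",
        "titles": page_title,
        "tllimit": "max",
        "format": "json",
    }
    return f"{api_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical banner name, for illustration only.
    print(transclusions_query_url("MediaWiki:Centralnotice-template-ExampleBanner"))
```

Open the printed URL in a browser before and after an edit to compare the listings.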