Talk with Dan Andreescu 2014-11-05
There's one dashboard server in labs, limn1, and it's easy to add another site to it.
The Mobile Web team's report card is more automated than the editor-engagement hacks described on Flow/Analytics. Their repo, analytics-limn-mobile-data, defines both the data generation and the presentation of the report card website. We'll clone this for Flow. Some files here:
- config.yaml names the graphs and their SQL files
- If we just point to a CSV it's easy; if we want to tweak, we have to point to a datasource.
- edits-monthly-new-active.sql uses Jinja templating so the query is parameterized
- generate.py is run by a cron job on stat1003 to actually generate the stats.
- the SQL queries run against
  - databases hosted on analytics-store.eqiad.wmnet (replicated DB stuff, not just EventLogging but e.g. enwiki revision tables)
  - databases hosted on x1-analytics-slave
  - We need to make sure that Flow's special DB cluster extension1, with flowdb on it, is also accessible to this.
- operations/puppet has limn config in modules/limn and manifests/misc/limn.pp
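The pieces listed above fit together roughly as follows. This is a minimal sketch and not the actual generate.py: it uses Python's string.Template as a stand-in for Jinja and an in-memory SQLite database as a stand-in for the MySQL replicas. The config keys, the graph name, the table name, and the columns are all hypothetical.

```python
import csv
import io
import sqlite3
from string import Template

# Hypothetical stand-in for config.yaml: each graph names its SQL template.
CONFIG = {
    "graphs": {
        "edits-monthly": {
            "sql": "SELECT month, COUNT(*) AS edits "
                   "FROM ${wiki_db}_revision GROUP BY month ORDER BY month",
        },
    },
}


def render_query(sql_template, **params):
    """Fill in the template parameters (string.Template standing in for Jinja)."""
    return Template(sql_template).substitute(**params)


def generate_csv(conn, sql):
    """Run one query and return the result as CSV text with a header row."""
    cur = conn.execute(sql)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([col[0] for col in cur.description])
    writer.writerows(cur.fetchall())
    return out.getvalue()


def generate_all(conn, config, **params):
    """Render and run every graph's query; return {graph name: CSV text}."""
    return {
        name: generate_csv(conn, render_query(graph["sql"], **params))
        for name, graph in config["graphs"].items()
    }


def demo():
    """Build a tiny made-up table and generate the CSV for it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE enwiki_revision (month TEXT, rev_id INTEGER)")
    conn.executemany(
        "INSERT INTO enwiki_revision VALUES (?, ?)",
        [("2014-09", 1), ("2014-10", 2), ("2014-10", 3)],
    )
    return generate_all(conn, CONFIG, wiki_db="enwiki")


if __name__ == "__main__":
    for name, text in demo().items():
        print(name)
        print(text)
```

Because the template parameter is spliced into the SQL text rather than bound, a sketch like this is only safe when the parameters come from our own config, never from user input.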
There's no way to replicate the whole limn setup locally. Annoyingly, we just run the SQL on stat1003 and, if it works, commit it to the flow-analytics repo and hope.
Working on stat1003
Set up access to stat1003 (through bast1001.wikimedia.org). For mysql, borrow ~milimetric/.my.cnf.one-box
$ ssh stat1003.wikimedia.org
$ mysql --defaults-file=/home/milimetric/.my.cnf.one-box
mysql:firstname.lastname@example.org [(none)]> show databases
This has replication of all the wiki databases like
enwiki (but not
The log database has all the EventLogging tables, each corresponding to a SchemaName_revision on metawiki.
- we'll need to add flowdb here, see ToDo.
mysql:email@example.com [(none)]> use log
Database changed
mysql:firstname.lastname@example.org [log]> show tables like 'echo%';
+-------------------------+
| Tables_in_log (echo%)   |
+-------------------------+
| EchoInteraction_5539940 |
| EchoInteraction_5782287 |
| EchoMail_5467650        |
| EchoPrefUpdate_5488876  |
| Echo_5285750            |
| Echo_5364744            |
| Echo_5423520            |
| Echo_6081131            |
| Echo_7572295            |
| Echo_7731316            |
+-------------------------+
10 rows in set (0.00 sec)
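Since each table in the log database is named SchemaName_revision, a query usually has to pick one revision per schema. The sketch below splits the table names from the listing above and picks the highest revision number for each schema. It assumes that a higher revision number means a newer schema revision; the function names are made up for illustration.

```python
# Table names copied from the `show tables like 'echo%'` listing.
TABLES = [
    "EchoInteraction_5539940",
    "EchoInteraction_5782287",
    "EchoMail_5467650",
    "EchoPrefUpdate_5488876",
    "Echo_5285750",
    "Echo_5364744",
    "Echo_5423520",
    "Echo_6081131",
    "Echo_7572295",
    "Echo_7731316",
]


def split_table_name(table):
    """Split 'SchemaName_revision' into ('SchemaName', revision as int)."""
    schema, _, revision = table.rpartition("_")
    return schema, int(revision)


def latest_revisions(tables):
    """Map each schema name to its highest revision number (assumed newest)."""
    latest = {}
    for table in tables:
        schema, revision = split_table_name(table)
        if revision > latest.get(schema, -1):
            latest[schema] = revision
    return latest


if __name__ == "__main__":
    for schema, revision in sorted(latest_revisions(TABLES).items()):
        print(f"{schema}_{revision}")
```

Older revisions may still hold events logged before the schema changed, so a query covering a long time range might need to combine several of these tables rather than only the latest one.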
- make changes to our repo
- +2 them
- puppet runs, updates stat1003
- next cron job should pick up the changes
Limn data generation
Limn data generation also runs on stat1003:
- The repo code is checked out to /a/limn-mobile-data
- /a/limn-mobile-data/generate.py creates files in /a/limn-public-data/mobile/datafiles
- the Limn log is /var/log/limn-mobile-data.log (we aren't in the stats group so we can't see it)
- if anything goes wrong, bother Dan.
- whatever's in /a/limn-public-data/ gets rsync'd to http://datasets.wikimedia.org/limn-public-data
So we see the change a few hours later.
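To check whether a generated file has been published yet, it helps to know where it should appear. The sketch below maps a path under /a/limn-public-data to its URL, assuming the rsync keeps the directory layout unchanged (the notes do not say so explicitly). The function name and the example file name are hypothetical.

```python
from pathlib import PurePosixPath

# Both roots are taken from the notes; the one-to-one layout is an assumption.
LOCAL_ROOT = PurePosixPath("/a/limn-public-data")
PUBLIC_ROOT = "http://datasets.wikimedia.org/limn-public-data"


def public_url(local_path):
    """Return the URL where a file under LOCAL_ROOT should appear after rsync.

    Raises ValueError if the path is not under LOCAL_ROOT.
    """
    relative = PurePosixPath(local_path).relative_to(LOCAL_ROOT)
    return f"{PUBLIC_ROOT}/{relative.as_posix()}"


if __name__ == "__main__":
    # example.csv is a made-up file name.
    print(public_url("/a/limn-public-data/mobile/datafiles/example.csv"))
```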
- milimetric on connect, also mforns and nuria
- Dan: create a new repo cloned from mobile analytics, but with a flow folder in place of mobile.
- We think we would put our query definitions and so on in here.
- Dan set up a separate repository for our Flow reportcard that points to our virtual host and runs Flow's generate.py from a cron job
- Dan set up a puppet change to set up the new flow-reportcard.wmflabs.org (a reportcard can have multiple dashboards). Limn1 has a
- ErikB will give Dan the DB details for Flow: flowdb on extension1
- everyone make RT request for stat1003
- Dan Andreescu will give us access to limn1.
- Add your labs login here (the thing in wikitech instance)