Talk:Flow/Analytics

Talk with Dan Andreescu 2014-11-05
There's one dashboard server in labs, limn1, easy to add another site to it.

Mobile Web team's report card is more automated than the editor-engagement hacks described on. Their repo analytics-limn-mobile-data defines both the data generation and the report card web site presentation. We'll clone this for Flow. Some files here:
 * config.yaml names the graphs and their SQL file
 * If we just point to a CSV it's easy; if we want to tweak, we have to point to a datasource.
 * edits-monthly-new-active.sql uses Jinja templating so the query is parameterized
 * generate.py is run by a cronjob on stat1003 to actually generate stats.
 * the SQL queries run on the DB host analytics-store.eqiad.wmnet, which has access to replicated DB stuff, not just EventLogging but e.g. enwiki revision tables.
 * We need to make sure that Flow's special DB cluster extension1 with flowdb on it is also accessible to this.
 * operations/puppet has limn config in modules/limn and manifests/misc/limn.pp
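As a rough illustration of the config-to-CSV flow described in the list above, here is a minimal sketch. It is not the actual generate.py: the `GRAPHS` dict is a hypothetical stand-in for config.yaml, `string.Template` stands in for Jinja, an in-memory SQLite database stands in for analytics-store.eqiad.wmnet, and the query, table contents and `from_timestamp` parameter are invented for the example.

```python
import csv
import io
import sqlite3
from string import Template

# Hypothetical stand-in for config.yaml: graph name -> parameterized SQL.
# In the real repo each graph names a .sql file; here the SQL is inline.
GRAPHS = {
    "edits-monthly": Template(
        "SELECT substr(rev_timestamp, 1, 6) AS month, COUNT(*) AS edits "
        "FROM revision WHERE rev_timestamp >= '$from_timestamp' "
        "GROUP BY month ORDER BY month"
    ),
}


def render_csv(conn, template, params):
    """Fill in the template, run the query, return the result as CSV text."""
    # Plain text substitution, as a template engine would do; the parameters
    # are trusted values from our own config, not user input.
    sql = template.substitute(params)
    cursor = conn.execute(sql)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([col[0] for col in cursor.description])  # header row
    writer.writerows(cursor.fetchall())
    return out.getvalue()


def generate(conn, graphs, params):
    """Produce one CSV string per configured graph."""
    return {name: render_csv(conn, tpl, params) for name, tpl in graphs.items()}


def demo_connection():
    """Build a tiny in-memory table shaped loosely like a revision table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE revision (rev_id INTEGER, rev_timestamp TEXT)")
    conn.executemany(
        "INSERT INTO revision VALUES (?, ?)",
        [
            (1, "20141001120000"),
            (2, "20141015080000"),
            (3, "20141102093000"),
        ],
    )
    return conn


if __name__ == "__main__":
    csvs = generate(demo_connection(), GRAPHS, {"from_timestamp": "20141001000000"})
    print(csvs["edits-monthly"])
```

Keeping one query per graph means adding a Flow graph should only need a new config entry plus its SQL, assuming the cloned repo keeps the same layout.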

In development
There's no way to replicate the whole limn setup locally. Annoying. Just run the SQL on stat1003, and if it works, commit it to the flow-analytics repo and hope.

Working on stat1003
Set up access to stat1003 (through bast1001.wikimedia.org). For mysql, borrow ~milimetric/my.cnf.one-box

This has replication of all the wiki databases like  (but not   yet). Also   database has all the event logging tables corresponding to SchemaName_revision on metawiki.
 * we'll need to add flowdb here, see ToDo.

In production

 * make changes to our repo
 * +2 them
 * puppet runs, updates stat1003
 * next cron job should pick up the changes

Limn data generation
Limn data generation also runs on stat1003, so we see the change a few hours later.
 * The repo code is checked out to /a/limn-mobile-data
 * /a/limn-mobile-data/generate.py creates files in /a/limn-public-data/mobile/datafiles
 * the Limn log is /var/log/limn-mobile-data.log (we aren't in the  group so we can't see it)
 * if anything goes wrong, bother Dan.
 * whatever's in /a/limn-public-data/ gets rsync'd to http://datasets.wikimedia.org/limn-public-data
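To check where a generated file should show up publicly, the path mapping in the list above can be sketched as follows. This assumes the rsync keeps the directory layout under /a/limn-public-data/ unchanged; `public_url` and the example file name are hypothetical.

```python
import posixpath

# Both roots are taken from the notes above.
LOCAL_ROOT = "/a/limn-public-data"
PUBLIC_ROOT = "http://datasets.wikimedia.org/limn-public-data"


def public_url(local_path):
    """Map a file under LOCAL_ROOT to the URL it should be published at.

    Assumes the rsync mirrors the tree as-is.
    """
    rel = posixpath.relpath(local_path, LOCAL_ROOT)
    if rel in (".", "..") or rel.startswith("../"):
        raise ValueError("not a file under %s: %s" % (LOCAL_ROOT, local_path))
    return PUBLIC_ROOT + "/" + rel


if __name__ == "__main__":
    # "example.csv" is a made-up file name for illustration.
    print(public_url("/a/limn-public-data/mobile/datafiles/example.csv"))
```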

Help

 * milimetric on, also mforns and nuria

To do

 * Dan create a new repo cloned from mobile analytics, but with a  folder in place of mobile.
 * So we think we would put our query definitions and stuff in a here.
 * Dan set up a separate repository for our Flow reportcard that points to our virtual host and runs Flow's generate.py from a cron job
 * Dan set up a puppet change to set up the new flow-reportcard.wmflabs.org (a reportcard can have multiple dashboards). Limn1 has a
 * ErikB will give Dan the DB details for Flow: flowdb on extension1
 * everyone make RT request for stat1003
 * Dan Andreescu will give us access to limn1.
 * Mattflaschen
 * spage
 * Add your labs login here (the thing in wikitech instance)