Talk:Flow/Analytics

Talk with Dan Andreescu 2014-11-05
There's one dashboard server in labs, limn1, and it's easy to add another site to it.

Mobile Web team's report card is more automated than the editor-engagement hacks described on. Their repo analytics-limn-mobile-data defines both the data generation and the report card web site presentation. We'll clone this for Flow. Some files here:
 * config.yaml names the graphs and their SQL file
 * If we just point to a CSV it's easy; if we want to tweak, we have to point to a datasource.
 * edits-monthly-new-active.sql uses Jinja templating so the query is parameterized
 * generate.py is run by a cronjob on stat1003 to actually generate stats.
 * the SQL queries run against:
 * databases hosted on  (replicated DB stuff, not just EventLogging but e.g. enwiki revision tables)
 * databases hosted on x1-analytics-slave
 * We need to make sure that Flow's special DB cluster extension1 with flowdb on it is also accessible to this.
 * operations/puppet has limn config in  and
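The layout described above (config.yaml naming each graph and its SQL file, with the SQL itself templated) can be sketched roughly as follows. This is only an illustration, not the actual analytics-limn-mobile-data code: the config keys, the `wiki_db` and `start` parameters, and the query text are all assumptions, and Python's `string.Template` stands in for Jinja so the sketch needs only the standard library.

```python
from string import Template

# Hypothetical stand-in for config.yaml: graph name -> SQL file.
CONFIG = {
    "graphs": {
        "edits-monthly-new-active": "edits-monthly-new-active.sql",
    }
}

# Hypothetical stand-in for the contents of the SQL files on disk.
SQL_FILES = {
    "edits-monthly-new-active.sql": (
        "SELECT COUNT(*) AS edits FROM ${wiki_db}.revision "
        "WHERE rev_timestamp >= '${start}'"
    ),
}


def render_query(graph_name, **params):
    """Look up a graph's SQL file and fill in its template parameters."""
    sql_file = CONFIG["graphs"][graph_name]
    # substitute() raises KeyError if a template parameter is missing.
    return Template(SQL_FILES[sql_file]).substitute(**params)


query = render_query(
    "edits-monthly-new-active", wiki_db="enwiki", start="20141001000000"
)
print(query)
```

Keeping the graph-to-file mapping in config and the parameters in the template means one query file could be reused per wiki or per date range without copying the SQL.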

In development
There's no way to replicate the whole Limn setup locally. Annoying, but the workaround is to run the SQL on stat1003 and, if it works, commit it to the flow-analytics repo and hope.

Working on stat1003
Set up access to stat1003 (through bast1001.wikimedia.org). For mysql borrow

This has replication of all the wiki databases like  (but not   yet). Also   database has all the event logging tables corresponding to SchemaName_revision on metawiki.
 * we'll need to add flowdb here, see ToDo.
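The SchemaName_revision naming mentioned above can be illustrated with a tiny helper. This is a sketch only: the function name, the schema name and the revision number are made up for illustration.

```python
def event_table_name(schema_name, revision_id):
    """Build the table name for one revision of an event logging schema."""
    return "{}_{}".format(schema_name, revision_id)


# "ExampleSchema" and 123456 are hypothetical values.
table = event_table_name("ExampleSchema", 123456)
print(table)
```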

In production

 * make changes to our repo
 * +2 them
 * puppet runs, updates stat1003
 * next cron job should pick up the changes

Limn data generation
Limn data generation also runs on stat1003, so we see the change a few hours later.
 * The repo code is checked out to /a/limn-mobile-data
 * create stuff in
 * the Limn log is  (we aren't in the   group so we can't see it)
 * if anything goes wrong, bother Dan.
 * whatever's in  gets rsync'd to http://datasets.wikimedia.org/limn-public-data
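The generation step above (run each query, write the result as a datafile that is later rsync'd) might look roughly like this. It is a sketch under assumptions, not the real generate.py: the function name, table layout and output directory are hypothetical, and an in-memory SQLite database stands in for the replicated databases so the example runs anywhere.

```python
import csv
import os
import sqlite3
import tempfile


def generate_datafile(conn, name, sql, out_dir):
    """Run one query and write its result as <name>.csv under out_dir."""
    cursor = conn.execute(sql)
    header = [col[0] for col in cursor.description]
    path = os.path.join(out_dir, name + ".csv")
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(cursor.fetchall())
    return path


# In-memory SQLite stands in for the replicated databases (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE revision (rev_id INTEGER, rev_month TEXT)")
conn.executemany(
    "INSERT INTO revision VALUES (?, ?)",
    [(1, "2014-10"), (2, "2014-10"), (3, "2014-11")],
)

out_dir = tempfile.mkdtemp()  # hypothetical output directory
csv_path = generate_datafile(
    conn,
    "edits-monthly",
    "SELECT rev_month AS month, COUNT(*) AS edits FROM revision "
    "GROUP BY rev_month ORDER BY rev_month",
    out_dir,
)
```

Writing one CSV per graph keeps the dashboard side simple, since a graph can then just point at its file.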

Help

 * on, also  and

To do

 * Dan create a new repo cloned from mobile analytics, but with a  folder in place of mobile.
 * So we think we would put our query definitions and stuff in a here.
 * Dan set up a separate repository for our Flow reportcard that points to our virtual host and runs Flow's generate.py from a cron job
 * Dan set up a puppet change to set up the new flow-reportcard.wmflabs.org (a reportcard can have multiple dashboards). Limn1 has a
 * ErikB will give Dan the DB details for Flow: flowdb on extension1
 * everyone make RT request for stat1003
 * Dan Andreescu will give us access to limn1.
 * Mattflaschen
 * spage
 * Add your labs login here (the thing in wikitech instance)