Wikimedia Release Engineering Team/DataDataData Sync Up/2019-06-04

= 2019-06-04 =

Phab task

 * https://phabricator.wikimedia.org/T216085

Last time

 * 2019-05-21

Today's Agenda

 * Intro email was sent to Analytics. Nuria responded, asking for a more formal use case document.
 * Dan is on leave starting next week.

What data we have currently or are planning to collect

 * Schema
 * Data samples

How we might want to query that data

 * Our data is highly structured (see schemas)
 * Is Hadoop or Elasticsearch (ES) more appropriate for that? Would we lose structure by putting it in Hadoop?
 * How much do we have to know about the data's structure before we put it in ES?
 * Can relationships/schema be changed after data is stored?

TODOs (by next meeting)

 * Let's start a document for existing use cases and point out the need for open-ended logging and reporting (JR will do this)
 * Taskify this. (JR)
 * Jenkins Build Duration report could be one potential use case
 * Instrumenting Quibble (et al) and reporting on test suite/case durations over time and per project
 * Doc: https://docs.google.com/document/d/1FRM1UMaPxrviBFquHFhWFct8R4KmcaoNlh3t2suKDmA/edit
 * Keep communication going with Analytics
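
Illustration for the use case document

A rough sketch of what the build/test duration report mentioned in the TODOs might compute, independent of whether the data ends up in Hadoop or ES. The record fields (project, finished, duration_s), the sample values, and the function name are all hypothetical and are not taken from our schemas.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample records. Field names and values are assumptions
# made for illustration only; they do not reflect the real schema.
BUILDS = [
    {"project": "project-a", "finished": "2019-05-03", "duration_s": 600},
    {"project": "project-a", "finished": "2019-05-20", "duration_s": 900},
    {"project": "project-a", "finished": "2019-06-01", "duration_s": 800},
    {"project": "project-b", "finished": "2019-05-11", "duration_s": 300},
]


def mean_duration_by_project_and_month(builds):
    """Group builds by (project, YYYY-MM) and return the mean duration in seconds."""
    buckets = defaultdict(list)
    for build in builds:
        month = build["finished"][:7]  # "2019-05-03" -> "2019-05"
        buckets[(build["project"], month)].append(build["duration_s"])
    # Sort the keys so the report order is stable.
    return {key: mean(values) for key, values in sorted(buckets.items())}


if __name__ == "__main__":
    report = mean_duration_by_project_and_month(BUILDS)
    for (project, month), avg in report.items():
        print(f"{project} {month}: {avg:.0f}s")
```

The same grouping (per project, per time bucket) is the kind of query either backend would need to answer, so it could serve as a concrete example in the use case document.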