Wikimedia Release Engineering Team/DataDataData Sync Up/2019-05-21

= 2019-05-21 =

Phab task

 * https://phabricator.wikimedia.org/T216085

Last time

 * Previous meeting was long ago...

Today's Agenda

 * Vacation and not much movement
 * Review requirements
 * Go over email draft

What data we have currently or are planning to collect

 * Schema
 * Data samples

How we might want to query that data

 * Our data is highly structured (see schemas)
 * Is Hadoop or ES more appropriate for that? Would we lose structure by putting it in Hadoop?
 * How much do we have to know about the data's structure before we put it in ES?
 * Can relationships/schema be changed after data is stored?

TODOs (by next meeting)

 * Dan to send email to Analytics and set up meeting