Wikimedia Technology/Annual Plans/FY2019/CDP1: Privacy, Security, and Data Management/CDP Budget Segment 3/Goals

=Program Goals and Status for FY18/19=

Segment 3 - Analytics
 * Goal Owner: NRuiz (WMF)
 * Program Goals for FY18/19: Develop, maintain and mature our privacy, security, and data management practices in order to protect Wikimedia community member and donor information, comply with applicable privacy and data protection regulations, and ensure safe and secure connection to Wikimedia projects and sites in accordance with the values of the movement.
 * Annual Plan: Segment 3 - Analytics





= Q2 Goals =

Outcome / Output
Ensure the high-quality protection and security of our infrastructure and data.


 * Make systems compliant with security best practices, as vetted and recommended by Security.

Dependencies on: Security team

Goals

 * More restrictive firewall rules for Kafka.
 * Review the requirements for a service implementing a stronger user authentication scheme for the Analytics Hadoop cluster and possibly for other related tools (like Zookeeper). ✅
 * STRETCH GOAL: implement a prototype in labs that the Analytics team can test and evaluate. ✅

Status
October 19, 2018


 * We are working with SRE to evaluate the requirements for Kerberos.

November 14, 2018


 * Testing kerberos in labs cluster is now

December 6, 2018


 * We have a prototype running in labs that allows us to test Kerberos. We still need to decide which use cases to tackle first when moving this work to production, but the goals for this quarter are ✅.



= Q3 Goals =

Outcome / Output (Analytics)
Ensure the high-quality protection and security of our infrastructure and data.


 * Make systems compliant with security best practices, as vetted and recommended by Security.

Dependencies on: Security team, SRE

Goal:

 * Set up an Analytics Hadoop test cluster in production that runs a configuration as close as possible to the current one ✅
 * Set up a Kerberos KDC service in production with minimal puppet automation
 * Run critical Analytics Hadoop jobs and make sure that they work with the new auth settings
 * Create test Kerberos identities/accounts for some selected users from Analytics

Status
February 14, 2019
 * We have set up a shadow test cluster to which we are adding Kerberos. We are on track to be able to test critical jobs.

March 14, 2019


 * These two work items are ❌ until next quarter:
 ** Run critical Analytics Hadoop jobs and make sure that they work with the new auth settings
 ** Create test Kerberos identities/accounts for some selected users from Analytics



= Q4 Goals =

Outcome / Output (Analytics)
Ensure the high-quality protection and security of our infrastructure and data.


 * Make systems compliant with security best practices, as vetted and recommended by Security.

Dependencies on: Security Team, SRE

Goal(s)

 * Run critical Analytics Hadoop jobs and make sure that they work with the new auth settings
 * Create test Kerberos identities/accounts for some selected users from Analytics

Status
May 2019


 * We delayed this work due to 1) Superset upgrades that took much longer than planned and 2) limited availability from SRE to troubleshoot the current Kerberos setup, which has some issues. Still, we hope to be mostly done by the end of the quarter.

June 2019


 * Discussed...