Wikimedia Technology/Annual Plans/FY2019/CDP1: Privacy, Security, and Data Management/CDP Budget Segment 3/Goals


Program Goals and Status for FY18/19

  • Goal Owner: NRuiz (WMF)
  • Program Goals for FY18/19: Develop, maintain and mature our privacy, security, and data management practices in order to protect Wikimedia community member and donor information, comply with applicable privacy and data protection regulations, and ensure safe and secure connection to Wikimedia projects and sites in accordance with the values of the movement.
  • Annual Plan: Segment 3 - Analytics



Outcome / Output

Ensure the high-quality protection and security of our infrastructure and data.

Make systems compliant with security best practices, as vetted and recommended by Security.

Dependencies on: Security team

Goals

  • Apply more restrictive firewall rules for Kafka (task T204957).
  • Review the requirements for a service implementing a stronger user authentication scheme for the Analytics Hadoop cluster and possibly for other related tools (like Zookeeper). Done.
  • STRETCH GOAL: Implement a prototype in labs that the Analytics team can test and evaluate (task T198227). Done.

Status

Note: October 19, 2018

We are working with SRE to evaluate the requirements for Kerberos.

Note: November 14, 2018

Testing Kerberos in the labs cluster is now in progress.

Note: December 6, 2018

We have a prototype running in labs that allows us to test Kerberos. We still need to decide which use cases to address first when moving this work to production, but the goals for this quarter are done.



Outcome / Output (Analytics)

Ensure the high-quality protection and security of our infrastructure and data.

Make systems compliant with security best practices, as vetted and recommended by Security.

Dependencies on: Security team, SRE

Goals

  • Set up an Analytics Hadoop test cluster in production that runs a configuration as close as possible to the current one (task T212256). Done.
  • Set up a Kerberos KDC service in production with minimal Puppet automation (task T212257). Partially done.
  • Run critical Analytics Hadoop jobs and make sure that they work with the new auth settings (task T212259).
  • Create test Kerberos identities/accounts for some selected users from Analytics (task T212258).

Status

Note: February 14, 2019

  • We have set up a shadow test cluster to which we are adding Kerberos; we are on track to be able to test critical jobs.

Note: March 14, 2019

These two work items are postponed until next quarter:
  • Run critical Analytics Hadoop jobs and make sure that they work with the new auth settings (task T212259).
  • Create test Kerberos identities/accounts for some selected users from Analytics (task T212258).



Outcome / Output (Analytics)

Ensure the high-quality protection and security of our infrastructure and data.

Make systems compliant with security best practices, as vetted and recommended by Security.

Dependencies on: Security Team, SRE

Goals

  • Run critical Analytics Hadoop jobs and make sure that they work with the new auth settings (task T212259).
  • Create test Kerberos identities/accounts for some selected users from Analytics (task T212258).

Status

May 2019

We delayed this work because (1) the Superset upgrades took much longer than planned and (2) SRE had limited availability to troubleshoot the current Kerberos setup, which has some issues. Still, we hope to be mostly done by the end of the quarter.

To do: June 2019

Discussed...