Platform Engineering Team/Data Value Stream

Mission / Objective
The Data Platform team serves data producers and data consumers by providing a stack of software solutions to support the following capabilities:


 * Scheduled Dataset Creation
 * Data Persistence
 * Data Gateway
 * Data Events
 * Event Driven Dataset Creation
 * Data Discovery

The team's primary focus is building out these capabilities and writing clear, comprehensive documentation so that teams can use these services to build their own data pipelines. The Data Platform team will partner with data producers and consumers to understand their use cases, provide design recommendations, review code, and deploy code to the stack.

The Data Platform team's ultimate goal is to centralize data pipeline creation and to encourage good software development standards throughout the process.

What isn't the Data Platform team?
The Data Platform team is not scaled to build and own data pipelines for other teams unless there is an explicit need or the requesting team lacks the technical expertise. Such requests will be handled on a case-by-case basis as support is needed.

The team's focus is on building capabilities for the platform to support dataset producer and consumer use cases.

What is generated data?
The Data Platform team defines generated data as any dataset produced by a data pipeline that requires persistence for use in a process that serves knowledge content and knowledge experiences. The team considers generated data whose primary use is analytics or machine learning to be out of scope.

NOTE: The team will continue to support AQS, but our main focus will be on our capabilities. We will review ownership of AQS as we progress with our platform services.

What is a data pipeline?
A data pipeline is a series of data processing steps. It typically involves ingestion of data from a source, one or many steps to transform, enrich or aggregate that data and a final step to output the resulting dataset to a data store.
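
The three stages described above can be sketched as follows. This is a minimal illustration, not the team's implementation: the sample events, the function names (ingest, aggregate, output, run_pipeline) and the in-memory dict standing in for a data store are all assumptions made for the example.

```python
import json
from collections import Counter
from typing import Iterable, Iterator

# Hypothetical sample records; a real pipeline would read from a source system.
RAW_EVENTS = [
    '{"page": "Earth", "views": 3}',
    '{"page": "Moon", "views": 1}',
    '{"page": "Earth", "views": 2}',
]


def ingest(lines: Iterable[str]) -> Iterator[dict]:
    """Ingestion step: parse raw JSON records from the source."""
    for line in lines:
        yield json.loads(line)


def aggregate(events: Iterable[dict]) -> dict:
    """Transformation step: sum the view counts per page."""
    totals = Counter()
    for event in events:
        totals[event["page"]] += event["views"]
    return dict(totals)


def output(dataset: dict, store: dict) -> None:
    """Output step: persist the dataset (a dict stands in for a data store)."""
    store["views_per_page"] = dataset


def run_pipeline(lines: Iterable[str], store: dict) -> None:
    """Chain the steps: ingest, then aggregate, then output."""
    output(aggregate(ingest(lines)), store)


store = {}
run_pipeline(RAW_EVENTS, store)
print(store)  # {'views_per_page': {'Earth': 5, 'Moon': 1}}
```

Keeping each step as a separate function means a single step can be tested or replaced without touching the others, for example swapping the in-memory store for a real one.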

Work Intake Process
For significant projects, you can follow the Platform Engineering team's "how to work with us" process.

For work related to bugs, features, or support in the Data Platform's areas of responsibility, you can contact Data Product Manager Luke Bowmaker directly or create a task on our Phabricator board, tagged with #generated_data_platform and assigned to lbowmaker (Luke Bowmaker). The team meets throughout the week and triages on a rolling basis.

Data Platform Value Stream Demo
On a regular cadence, the Data Platform Value Stream team will post demos of its developments and works in progress here to provide transparency and gather feedback.

Demo Sessions
Note: You'll need to be signed in with your WMF account to view these videos.