Data Platform Engineering/Data Platform SRE/Priorities
Here are the high-level priorities of the DPE SRE team. The detailed backlog can be found on our main Phabricator board. Our current work can be followed on our "milestone" Phabricator board (there is no stable link to the current milestone, but it is linked from the menu of our main board).
Current main projects
Archiva is our current solution for hosting artifacts for Java / Scala projects and for mirroring external Maven repositories. It is unsupported and, as a critical piece of our development and deployment infrastructure, needs to be replaced. GitLab provides the functionality that we need and is already deployed in our infrastructure, so it is the obvious solution.
This project is driven by DPE SRE, but most of the implementation work is done by Search Platform, Data Engineering and Data Products. It comes in addition to the usual work of those teams and is therefore slow-moving.
Links
- Main phab task: T367315
- Decision brief
Hadoop upgrade:
Links:
- Main phab task: T379385
- Project plan: Hadoop 3 upgrade: high-level plan
Kubernetes upgrade
We need to make sure that the dse-k8s-eqiad cluster is using the new version of Kubernetes. ServiceOps has largely prepared the upgrade and has begun rolling it out to wikikube. We need to follow suit with our own cluster. Some plugins and operators are specific to the dse-k8s cluster, so we need to be very careful with these. We have already completed some preparatory work in T369492 and T377875.
The plan is to test the upgrade on the new dse-k8s-codfw cluster and make sure that all operators and plugins work before applying the upgrade to the eqiad cluster as well.
Links:
- Parent epic: T341984 Update Kubernetes clusters to 1.31
Mutualized OpenSearch cluster:
Links:
- Main phab task: T362105
- Design doc
Migrate Current-Generation Dumps to Airflow
Links:
- Main phab task: T352650 Migrate current-generation dumps to run on kubernetes
- Design doc: Migrating Dumps 1.0 to Airflow and K8S
Recently Completed Projects
To simplify operations and increase availability, we are migrating Airflow to k8s.
Links
- Main phab task: T362788
- Design doc
To support the deprecation and removal of Graphite.
Links
- Main phab task: T359033
To support work by the Search Platform team. In particular, DPE SRE is focused on migration of the internal WDQS clients and the operational support of the underlying servers / platform.
Links
Migration of the Search cluster from Elasticsearch to OpenSearch:
Links:
- Main phab task: T370147
Usual operational work
- Incidents
- Various minor software upgrades
- Access requests
- SPARQL Federation requests
High-level backlog of projects
- Kafka upgrade: design doc
- Spark upgrade
- Migration of additional services to k8s
- Presto
- JupyterHub