Flink SIG/Meetings/2025-02-20

From mediawiki.org

Agenda

  • Greetings
  • State of Flink at Wikimedia
  • Flink 1.20 deployment


Action items

AH: Start a Google Doc / wiki page to collect existing and future use cases.

BK: Document operator dev and maintenance on wikitech. https://phabricator.wikimedia.org/T386950

DC: Research what's coming with Flink 2.

GM: wrap up the work on ListState and EXACTLY_ONCE semantics.

Notes

AO: is this internal to DPE?

GM: Most users in DPE, stakeholders in other teams

BK: other teams interested if we socialize, observability. Maybe community.

GM: AO is working with community members on CDC.

DC: Research or ML are looking into Flink for research.

GM: Research has a PoC and needs k8s to deploy it. ML maybe for model scoring?

AH: document and compile list of use cases.

GM: 1.20 update. How do we manage operator CRDs?

BK: nothing major

DC: next steps would be for teams to be autonomous in deploying 1.20. We need SREs to deploy operators. Would be nice to have 1-2 SREs that are Flink experts. AO wrote most of the operator. Is there doc?

DC: three clusters to upgrade, then migrate to 1.20. This group should make sure we have the time budget to implement changes and roll out.

GM: What do we need? Libraries are in place. Docker?

DC: we have images for 1.20, but need base images for 1.20.1.

DC: we need to prepare for Flink 2. https://flink.apache.org/2024/10/23/preview-release-of-apache-flink-2.0/ . E.g. Scala APIs have been removed. GraalVM?

DC: how about Python?

GM: Async I/O support in Python. Should we work with upstream?

AH: on the technical side, what would we like? Throughput / availability?

GM: some have uncertainty around SLOs.

GM: would it be useful to have demo?

DC: research has a nice event app: trends on all calls made to our site. Real-time aggregation from webrequest? Could be a way to show how Flink can do high throughput. But we need a serving layer, needs a way to show the data.
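
Note: to give a sense of what "real-time aggregation" means here, the following is a minimal plain-Python sketch of a tumbling-window count, the kind of per-window grouping a Flink job would do at scale. It is not the research app and does not use the Flink API. The function name, the (timestamp, path) event shape, the 60-second window, and the sample data are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def tumbling_window_counts(events, window_seconds=60):
    """Count events per key inside fixed, non-overlapping time windows.

    `events` is an iterable of (epoch_seconds, key) pairs. The event shape
    and the 60-second default are assumptions made for this sketch.
    """
    counts = defaultdict(Counter)
    for timestamp, key in events:
        # Align each event to the start of the window that contains it.
        window_start = timestamp - (timestamp % window_seconds)
        counts[window_start][key] += 1
    # Return plain dicts, ordered by window start, for easy inspection.
    return {start: dict(per_key) for start, per_key in sorted(counts.items())}


# Hypothetical sample: (epoch seconds, requested path)
sample = [
    (0, "/wiki/A"),
    (10, "/wiki/B"),
    (59, "/wiki/A"),
    (60, "/wiki/A"),
    (125, "/wiki/B"),
]
result = tumbling_window_counts(sample)
```

In this sketch everything is held in memory and the input is finite. A streaming job would additionally have to handle late events, state checkpointing, and a sink, which is where the serving-layer question comes in.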

DC: we could explain the jobs we do to give a sense of what it is capable of. Document capabilities.

AH: What is the audience for demos?

DC: need to find docs that AO started with use cases.

DC: maybe ML?

AH: https://gitlab.wikimedia.org/repos/data-engineering/mediawiki-event-enrichment/-/merge_requests/86 might come back in Q4 as an official request.

GM: work is pretty fa. Needs k8s support an