Kubernetes SIG/Meetings/2023-04-25


 * Introductions for new members (if any)
 * SIG administrivia:
   * Nothing to report
 * Topic: Kubernetes versioning discussion (continuation)
   * Action items from previous meeting:
     * Speak to our respective managers and get sign-off for pre-planning this upgrade.
     * Get acquainted with reading changelogs and security bulletins.
     * Read https://wikitech.wikimedia.org/wiki/Kubernetes/Kubernetes_Infrastructure_upgrade_policy
   * Rough summary from previous meeting:
     * Production use cases should converge to the same policies.
     * Testing is a use case that might be hampered by overly strict convergence; it could also be served in different ways.
     * Overall agreement to support at most 2 versions (the one currently running and the one to upgrade to).
     * We need management buy-in to make sure we all allocate time to upgrade our clusters, and plan to upgrade each year.
     * For point releases (specifically security): shared understanding that things won’t break when executing such upgrades in place; we will do them on an as-needed basis (https://kubernetes.io/releases/version-skew-policy/).
   * Parts not discussed in the previous meeting:
     * Do we need to support an update method other than re-init (for clusters that are not multi-DC and can’t easily be depooled)?
     * Should we try to update components “off-band” (e.g. updating Calico without updating Kubernetes, to unclutter the Kubernetes update)?
 * Off the predefined topics:
   * How far have we got with Spark and user-generated workloads on DSE?
   * How can we manage secrets better for users sharing namespaces? Or should we not?
   * A draft document detailing progress with Spark/Hadoop/Hive is already open for comments.
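The point-release reasoning above leans on the upstream version-skew policy. As a rough illustration (not an official tool), the core rule, assuming the policy as of early 2023 that a kubelet may trail the kube-apiserver by at most two minor versions and may never be newer, can be sketched as:

```python
def skew_ok(apiserver: str, kubelet: str, max_skew: int = 2) -> bool:
    """Illustrative check of the Kubernetes version-skew rule.

    Assumes the pre-1.28 policy: the kubelet may be at most `max_skew`
    minor versions behind the kube-apiserver, and never newer.
    """
    a_major, a_minor = (int(x) for x in apiserver.lstrip("v").split(".")[:2])
    k_major, k_minor = (int(x) for x in kubelet.lstrip("v").split(".")[:2])
    if (k_major, k_minor) > (a_major, a_minor):
        return False  # kubelet newer than the apiserver is unsupported
    return a_major == k_major and a_minor - k_minor <= max_skew

# e.g. apiserver v1.23 with kubelet v1.21 is within skew; v1.20 is not
```

This is why in-place point-release upgrades are considered safe: patch versions never change the minor-version skew between components.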

Notes

[AO] We can do downtime for DSE, I believe; we might want to check with Luca and the ML team.

[CD] We can also do downtime for aux-k8s.

[SD] If we could create a business case for upgrading more often, would that lead to less work overall than the big, invasive upgrades?

[AK] That hasn’t been possible so far. Upgrading once a quarter hasn’t happened up to now because we never managed to do so.

[CD] That doesn’t answer the question; it’s orthogonal. It hasn’t happened, but we still could.

[JM] I think we might get to a point where we could do in-place upgrades, if we can prove we’re able to keep up with Kubernetes releases (and don’t have to do big version jumps).

[AK] We did in-place upgrades in the past (which worked), but mostly without workloads.

[JM] “Off-band” updates of Kubernetes components might help keep the actual Kubernetes upgrades smaller and less scary (and might even make in-place Kubernetes updates feasible).

[AK] Maintaining the compatibility matrix on an ongoing basis will be hard because of all the interdependencies, but overall it has the potential to pay off by letting us do part of the work piecemeal.

[AK] Action item: Create a PoC for a Compatibility Matrix of Kubernetes versions vs Cluster components.
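A minimal sketch of what such a compatibility-matrix PoC could look like (component names and version ranges below are illustrative placeholders, not the actual matrix):

```python
# Hypothetical compatibility matrix: for each (component, release),
# the set of Kubernetes minor versions it is known to support.
# The entries below are examples only, not verified data.
MATRIX: dict[tuple[str, str], set[str]] = {
    ("calico", "3.23"): {"1.21", "1.22", "1.23"},
    ("coredns", "1.8"): {"1.21", "1.22"},
}

def compatible(component: str, release: str, k8s_minor: str) -> bool:
    """Check whether a component release supports a Kubernetes minor version."""
    return k8s_minor in MATRIX.get((component, release), set())
```

Even a flat lookup table like this would let us answer “which component upgrades can we do off-band before the next Kubernetes jump?” mechanically.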


 * Big topics at KubeCon + CloudNativeCon 2023:
   * Kubernetes Gateway API and Cluster API
   * Developer portals and service catalogs, like Backstage and Crossplane
   * Code sandboxing, like WebAssembly and eBPF
   * CI (mostly Argo)
   * And of course security, AI, and energy efficiency
   * I can link relevant talks when they are uploaded

[All] Action Item: Read (and edit) Kubernetes_Infrastructure_upgrade_policy

[All] Action Item: Talk with managers about cluster upgrade resources (roughly one engineer-quarter per year)