Kubernetes SIG/Meetings/2023-04-25

  • Introductions for new members (if any)
  • SIG administrivia:
    • Nothing to report
  • Topic: Kubernetes versioning discussion (continuation)
    • Action items from previous meeting:
    • Rough summary from previous meeting
      • Production use cases should converge to the same policies.
      • Testing is a use case that might be hampered by overly strict convergence; that use case could also be served in different ways.
      • Overall agreement to support at most 2 versions (the one currently in use and the one to upgrade to).
      • We need management buy-in to make sure we all allocate time to upgrade our clusters, and plan to upgrade each year.
      • For point releases (specifically security): shared understanding that things won’t break when executing such upgrades in place; we will do them on an as-needed basis (https://kubernetes.io/releases/version-skew-policy/).
    • Parts not discussed in the previous meeting
      • Do we need to support an update method other than re-init (for clusters that are not multi-DC and can’t easily be depooled)?
      • Should we try to update components “off-band” (e.g. updating Calico without updating Kubernetes, to unclutter the Kubernetes update)?
    • Off the predefined topics
      • How far have we got with Spark and user-generated workloads on DSE?
        • How can we manage secrets better for users sharing namespaces? Or should we not?
        • A draft document detailing progress with Spark/Hadoop/Hive is already open for comments.

Notes

[AO] We can do downtime for DSE, I believe; we might want to check with Luca and the ML team.

[CD] We can also do downtime for aux-k8s.

[SD] If we could create a business case for upgrading more often, would it lead to less work overall compared to the big, invasive upgrades?

[AK] So far this hasn’t been possible. Upgrading once a quarter hasn’t happened up to now because we never managed to do so.

[CD] This doesn’t answer the question. It’s orthogonal: it hasn’t happened yet, but we still could.

[JM] I think we might get into a situation where we could do in-place upgrades, if we prove that we’re able to keep up with Kubernetes upgrades (and don’t have to do big version jumps).

[AK] We did in-place upgrades in the past (and they worked), but mostly without workloads.

[JM] “Off-band” updates of Kubernetes components might help keep the actual Kubernetes upgrades smaller and less scary (and might even help with in-place Kubernetes updates).

[AK] Maintaining the compatibility matrix in an ongoing fashion will be hard because of all the interdependencies, but overall it has the potential to pay off by letting us do part of the work piecemeal.

[AK] Action item: Create a PoC for a Compatibility Matrix of Kubernetes versions vs Cluster components.
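
A minimal sketch of what such a PoC could look like, assuming a hand-maintained matrix kept as data; the component names and version ranges below are hypothetical placeholders, not the real cluster inventory:

 # compat_matrix_poc.py -- illustrative sketch only; all versions and components are made up.
 # Map each Kubernetes minor version to the component version ranges believed compatible with it.
 COMPATIBILITY_MATRIX = {
     "1.23": {"calico": (">=3.21", "<3.24"), "coredns": (">=1.8", "<1.10")},
     "1.26": {"calico": (">=3.24", "<3.27"), "coredns": (">=1.9", "<1.11")},
 }

 def compatible_components(k8s_version):
     """Return the recorded component version ranges for a given Kubernetes version."""
     if k8s_version not in COMPATIBILITY_MATRIX:
         raise ValueError(f"no compatibility data recorded for Kubernetes {k8s_version}")
     return COMPATIBILITY_MATRIX[k8s_version]

 if __name__ == "__main__":
     # Example: list what would need checking before an upgrade to 1.26.
     for component, (lower, upper) in compatible_components("1.26").items():
         print(f"{component}: {lower}, {upper}")

Keeping the matrix as data rather than prose could make it easy to diff against what is actually deployed and to flag which components would block an “off-band” or in-place upgrade.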

  • Big topics at KubeCon + CloudNativeCon 2023:

[All] Action Item: Read (and edit) Kubernetes_Infrastructure_upgrade_policy

[All] Action Item: Talk with managers about cluster upgrade resources (roughly one engineer for one quarter, each year)