Wikimedia Release Engineering Team/Book club/Accelerate

Starting June 2021, Release Engineering will be reading Accelerate: the science behind DevOps: building and scaling high performing technology organizations (phew! What a subtitle).

Meta

 * etherpad

Schedule

 * 2021-06-08: Foreword, Preface, Chapters 1-3 (pg 78, Kindle edition)
 * 2021-06-22: Chapters 5-8 (pg 128, Kindle edition)
 * 2021-07-13: Chapters 9-11 (pg 182, Kindle edition)
 * 2021-07-20: Part II: Chapters 12-15 (pg 233, Kindle edition)
 * 2021-08-03: Chapter 16, conclusion (pg 261, Kindle edition)

Purpose

 * The purpose of this book club is to ensure that the Release Engineering Team are subject matter experts about DevOps and, in particular, about release/deployment.
 * This is meant to clarify the structure of these meetings.

Chapter 9

 * Deployment pain and burnout
 * Deployment pain resonated but devops as a cure for burnout did not
 * We don't do devops so that's a problem
 * Deployment automation - copy paste errors
 * Human error and burnout
 * Like that as a way to avoid blame
 * We don't hide anything in our documentation, but we don't help people
 * Runnable runbooks

Chapter 10

 * Net Promoter Score as some kind of statistical worse-is-better
 * Discussion about working on job descriptions

Chapter 11

 * Yak days came from here!
 * Value alignment protecting from burnout vs value alignment leading to burnout from overwork

Foreword/Preface

 * G: Use the word quibble
 * B: The prose is excruciating
 * A: software development is half tech work, half political discussion. This is a recurring discussion in France: we have software people who are working to drive the business forward. Lots of management terms
 * G: Foreword from Fowler. They didn't survey everyone, only people who thought they were doing devops
 * B: Book makes a lot of strong claims about its rigor

Chapter 1: Accelerate

 * T: want to understand how they collect this
 * A: it is apparently a yearly thing. Latest: https://puppet.com/resources/report/2020-state-of-devops-report/
 * T: Did they ever define the term "capabilities"?
 * G: 24 capabilities...
 * B: Maturity vs capability seems like a bigger discussion in management
 * Maturity model use here: https://www.mediawiki.org/wiki/Architecture_Repository/Architecture_practice/Maturity_model
 * A: maturity model—ITIL framework

Chapter 2: Measuring Performance

 * Dancy: Agree with book: Make it fast, make it easy to recover from mistakes
 * Hashar: Change failure rate: disagree, since that will depend on the number of things that get pushed. How often we deploy or roll back.
 * Greg: short lead times are important. Time it takes to go from code committed to successfully running in production. It doesn't include. There's a contradiction, but they've termed this the Fuzzy Frontend. Lead time is just when you start work on something; i.e., when you open Emacs or vi.
 * Tyler: Bitergia? (yes)
 * Brennen: what lead time do we care about?
 * Jeena: how long is not that bad?
 * G: multiple times per day vs once per day. With all kinds of metrics it's important to know where you are.
 * Antoine: If we want to act, we need metrics.
 * Brennen: we say we need to measure before we know what we need to do. If we're talking about lead-time, we know it's about a week unless we get rid of the train.
 * Can be up to two weeks (patch written on Tuesday, holiday the week after with no deploy)
 * Backports can be delayed because of Friday/weekend
 * G: Delivery performance combines these 4 metrics, but change failure rate was not valid. It's one of the hardest to reason through and the least valid.
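
To make the lead time and change failure rate discussion above concrete, here is a minimal sketch of how the two numbers could be computed from deployment records. The `Deploy` record, its field names, and the sample dates are all hypothetical (this is not a real Wikimedia data source); the sample simply mimics a weekly train where one week is skipped.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median


@dataclass
class Deploy:
    """One hypothetical record per change that reached production."""
    committed_at: datetime  # when the change was committed
    deployed_at: datetime   # when it was running in production
    failed: bool            # True if it needed a rollback or hotfix


def median_lead_time(deploys):
    """Median time from code committed to running in production."""
    seconds = [(d.deployed_at - d.committed_at).total_seconds() for d in deploys]
    return timedelta(seconds=median(seconds))


def change_failure_rate(deploys):
    """Fraction of deployed changes that were marked as failed."""
    return sum(d.failed for d in deploys) / len(deploys)


# Made-up sample: two changes ride one train, two wait through a skipped week.
deploys = [
    Deploy(datetime(2021, 6, 1, 10), datetime(2021, 6, 8, 10), failed=False),
    Deploy(datetime(2021, 6, 2, 10), datetime(2021, 6, 8, 10), failed=True),
    Deploy(datetime(2021, 6, 8, 12), datetime(2021, 6, 22, 12), failed=False),
    Deploy(datetime(2021, 6, 10, 12), datetime(2021, 6, 22, 12), failed=False),
]

print(median_lead_time(deploys))     # 9 days, 12:00:00
print(change_failure_rate(deploys))  # 0.25
```

Note that the failure rate here is counted per change, so the result depends on how many changes get pushed and how they are batched, which is the concern raised above.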

Chapter 3: Measuring and Changing Culture

 * Some/all survey questions could have been added to our developer satisfaction survey.
 * Brennen: we're pathological in many ways, but not by the book's definition
 * Post office: coffee machine made things generative. We have a way we can reach out to one another
 * Tyler: Pathological on "shared responsibility" but not on others
 * Tyler: code review culture is healthy -- 21% of people disagreed
 * Antoine: the people who don't think it's healthy are volunteers. Review might not find any benefit (aka nitpicking when one expects insights about the overall architecture/design)

Overall thoughts

 * Greg: lots of words, but little content. Management books say things in multiple different ways so you can have a quote that fits your situation. I think it makes sense for us to try to improve these metrics.
 * Ahmon: agree with Greg. Lots of words to say something we're already trying to do.
 * Antoine: non-native speaker, I found it easy to read. It's big picture; there is no barrier to entry. This book is from 2017 and it already looks dated. The industry has already moved past what's in this book; containers changed a lot of things. I feel like Wikimedia does a lot of what's in this book, which makes me proud. Love the claim of the scientific backing for the research. We just need the raw data.
 * Antoine:
 * looks dated already
 * feel like we invented or at least were very early adopters of what is presented (one click button deployment idea comes from 2013)
 * love the claim of a scientific approach, but it seems to be a promise and lacks data / details to realize it.
 * Mukunda: seems to align with my expectations. I think CD and DevOps are extremely important. The difference in morale is important.
 * Jeena: got this book yesterday. I'm interested in comparing this with the CD book.
 * Brennen: reading this book so that I know what to expect in terms of bullshit from the management structure
 * Jeena: what are the most relevant parts of this book that we can claim to be expert in? What about this book is something we as experts need to read?
 * Greg: book is making the rounds. Someone read it, suggested it to Grant, and all managers are reading this book (+ our team)

Chapter 5: Architecture

 * "low performers were more likely to say that software they were building was custom software developed by another company" and "in the rest of the cases there was no significant correlation between system type and delivery performance."
 * Cross-team communication and cross-functional team stuff: We're bad at comms and we need a lot of it to do most things
 * List of bullets of team capabilities that in turn make good results.
 * "We don't actually work in a wiki-like way"
 * "more embedding would be a good thing" "more fluid collaboration instead of rigid collab through quarterly goals"
 * "rotating through other teams" would provide a lot of benefits that embedding would without the organizational changes required of embedding
 * Can we use onboarding as a means for more cross-team collaboration by using rotations through sponsor teams?
 * Let people choose their own tooling, BUT, others need to be able to debug the service easily so standardization is also important.

Chapter 6: Integrating Infosec into the Delivery Lifecycle

 * Discussion of https://security.googleblog.com/2021/06/introducing-slsa-end-to-end-framework.html
 * Also https://about.gitlab.com/blog/2021/05/27/deep-dive-investigation-of-gitlab-packages/
 * Shifting left includes "security review for all features" - but security teams are tiny, this is something that's underresourced everywhere.
 * We make fun of The Rugged Movement

Chapter 7: Management Practices for Software

 * Agile, Lean, Limiting WIP, using dashboards for quality and productivity metrics
 * Use app perf data to make business decisions
 * Data^3 and Phab improvements here for WIP limit/throughput visualization
 * Is the discussion around Change Approval Boards an argument against the "deployment approval council"?
 * needs to be light-weight ^
 * Deployers need to be the developers of the code
 * Would empower people but also connect them to the results more, drive a sense of responsibility.
 * It doesn't make sense that people who have no idea what's being deployed are the deployers.
 * G: Is it feasible to have people holding the deployment conch?
 * J: Maybe each team could have their own time?
 * M: Could batch things. Closest thing I could see is having backport windows that last most of the day.
 * J: Maybe instead of time of day, teams could have a day of their own...
 * G: don't forget volunteer hour :)
 * B: Needs to be automated/easy for anyone to do it.
 * Discussion of deployment commands tool.
 * Needs to be a single action - be it a command, pushing a button, whatever.

Chapter 8: Product Development

 * Customer feedback, can we do that more for our tools? Get people using them (devs doing their own deploys) :) We have stockholm syndrome.
 * Deployment training has been good for this.
 * Lean dev practices
 * less than a week iterations
 * understanding of the flow of work
 * seek feedback
 * authority to create and change specifications as part of the dev process without approval
 * M: Need more feature flag usage
 * Discussion of needing a unified, easy feature flag system that doesn't take editing a giant PHP settings file
 * some GUI for PMs with dropdowns and checkboxes only :)
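
As a rough illustration of the feature flag idea discussed above, here is a minimal sketch of a flag store that is loaded from data instead of being edited in code. Everything in it is hypothetical: the `FeatureFlags` class, the `new_editor` flag name, and the JSON document (standing in for whatever a checkbox GUI would write) are assumptions for the example, not an existing system.

```python
import json


class FeatureFlags:
    """Hypothetical in-memory flag store; unknown flags default to off."""

    def __init__(self, flags=None):
        self._flags = dict(flags or {})

    @classmethod
    def from_json(cls, text):
        """Build the store from a JSON object mapping flag name to true/false."""
        return cls(json.loads(text))

    def set(self, name, enabled):
        """Turn a flag on or off (what a GUI checkbox would do)."""
        self._flags[name] = bool(enabled)

    def is_enabled(self, name):
        """Return True only if the flag exists and is on."""
        return bool(self._flags.get(name, False))


def render_page(flags):
    """Pick a code path based on a flag instead of on a deploy."""
    return "new editor" if flags.is_enabled("new_editor") else "old editor"


# Made-up flag data, as a GUI might have saved it.
flags = FeatureFlags.from_json('{"new_editor": true}')
print(render_page(flags))  # new editor

flags.set("new_editor", False)
print(render_page(flags))  # old editor
```

Defaulting unknown flags to off is a design choice in this sketch: code guarded by a flag nobody has configured yet stays dark until someone explicitly enables it.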