Talk:Packaging

WMF code
"support tracking of the latest WMF code (pretty much in sync with WMF deploys)" Why is this a goal? Why would anybody else want to run the code that WMF runs (as opposed to master or latest point of master which has passed integration tests as well). Also, WMF runs multiple versions simultaneously.--Nikerabbit (talk) 12:45, 19 March 2014 (UTC)


 * Projects like Parsoid with fairly good tests are following a continuous-deploy system (twice a week in the case of Parsoid). Before each deploy, we currently ensure that slow round-trip testing on 160k pages looks good (it takes a few hours), so the deployed code gets more thorough testing than master, which has passed only the CI tests (still something like 12k test cases). Eventually we'd like to perform the slow-test vetting automatically, so that we can automatically upload nightlies if the test run was successful.
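
The automated vetting described above could be gated with something like the following sketch. The threshold, repo name, and dput target are all made up for illustration; only the gating logic is shown.

```shell
# Sketch: only upload a nightly if the slow round-trip test run met a
# minimum pass rate. Pass rate and threshold are percentages.
vet_and_upload() {
    pass_rate=$1
    threshold=${2:-99.5}
    if awk -v p="$pass_rate" -v t="$threshold" 'BEGIN { exit !(p >= t) }'; then
        echo "pass rate ${pass_rate}% >= ${threshold}%: uploading nightly"
        # dput wikimedia-nightly ../parsoid_*.changes   # hypothetical repo target
    else
        echo "pass rate ${pass_rate}% < ${threshold}%: skipping upload"
        return 1
    fi
}

vet_and_upload 99.8
```

The actual numbers reported by the round-trip harness would feed the first argument; everything else stays out of the decision.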


 * Multiple versions can be supported in many ways. The main mechanisms I see for this are discussed in the wiki page. Do you see a case where that won't work? -- Gabriel Wicke (GWicke) (talk) 00:44, 20 March 2014 (UTC)

Developer upload
"normal developers can upload new packages"

I am highly skeptical of this. Packages install things directly on the system, and the installation process is run directly by root. So apart from all the problems that pre- or post-installation/removal scripts can cause (and someone might argue that those can be solved by keeping them in different repos with different procedures, vetting and so on), the files installed directly by the packages can also cause problems (suid binaries, backdoors etc). Again, you can argue that procedures, vetting etc. can avoid such issues, but it only takes one error and here's your full system compromise.
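
To make the concern concrete, here is a minimal sketch of a debian/postinst maintainer script; dpkg runs it as root at install time, so any command in it executes with full system privileges. The helper name is hypothetical.

```shell
# Sketch of a debian/postinst maintainer script. dpkg invokes this as
# root during "apt-get install", so every command here -- chown,
# chmod u+s, edits under /etc -- runs with full privileges.
postinst() {
    case "$1" in
        configure)
            echo "postinst configure: commands here run as root"
            # chmod u+s /usr/bin/some-helper   # one bad line = setuid backdoor
            ;;
        *)
            echo "postinst: ignoring action $1"
            ;;
    esac
}
```

Nothing in the packaging toolchain itself prevents a malicious or buggy line in the `configure` branch; only review does.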

I also suspect that it will be requested at some point (maybe not at first) that such a repo override vendor ones (because of some software being backported/forwardported or something of that nature). It then becomes possible to cause even more mayhem by overriding a package provided by the system (yes, a full system compromise is still the worst possible scenario, but others exist that will cause problems - say, a library that was updated in good faith, but...)
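
One technical mitigation for the override concern would be apt pinning. A sketch, assuming the repository ends up at the hypothetical origin below: with a pin priority of 100 (below the default 500 of vendor archives), apt will only install a package from this repo when no vendor version exists, and will never silently replace an installed vendor package with it.

```
# /etc/apt/preferences.d/wikimedia (sketch; the origin host is an
# assumption based on the proposed release.wikimedia.org/debian/ URL)
Package: *
Pin: origin "release.wikimedia.org"
Pin-Priority: 100
```

Per apt_preferences(5) semantics, a priority below 500 makes the third-party repo strictly additive rather than an override channel.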

So while I understand the need to allow normal developers to update the repo so that many blockers are lifted, I am not sure this is the sanest approach. -- User:Akosiaris (unsigned)


 * Promoting code review and wider use and testing of standard packages can only improve the situation over the current status quo of init scripts copied from random git repositories or even wiki pages. For this to scale we need to expand the number of developers who can upload new packages, for example to folks who currently have deploy access. This makes sure that those packages pass through the hands of engineers we already trust with deploys, and promotes a basic level of code review.


 * Not arguing there. -- User:Akosiaris (unsigned)


 * The highest level of assurance is needed for packages used on the WMF cluster. As discussed before on the ops list, those packages should be automatically built using debian directories controlled by ops, mirroring the current split we have between puppet and code deploys. Code builds should be controlled by an ops-controlled autobuilder script, which also automatically uploads to both an internal and the external repository. This should also be the only way those packages can be uploaded to the public 'untrusted' repo, so that only WMF-vetted versions of important packages are published there.
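
The autobuilder flow described above might look roughly like this. The repo paths and dput targets are assumptions for illustration; the point is that the ops-owned debian/ directory is fetched separately from the upstream tree, and both uploads happen from one ops-controlled script.

```shell
# Sketch of an ops-controlled autobuilder (hypothetical repo layout):
# check out the upstream code plus the ops-owned debian/ dir, build,
# and upload the result to both the internal and the public repo.
autobuild() {
    pkg=$1 tag=$2
    # In dry-run mode, print the planned commands instead of running them.
    run() { if [ -n "$DRY_RUN" ]; then echo "$*"; else "$@"; fi; }

    run git clone "https://gerrit.wikimedia.org/r/$pkg" "$pkg"
    run git -C "$pkg" checkout "$tag"
    # debian/ comes from an ops-controlled repo, not from the upstream tree
    run git clone "https://gerrit.wikimedia.org/r/operations/debs/$pkg" "$pkg/debian"
    run dpkg-buildpackage -b -us -uc
    run dput internal "../${pkg}_"*.changes
    run dput public "../${pkg}_"*.changes
}

DRY_RUN=1 autobuild parsoid v0.1.0
```

Because the same script does the public upload, no package version reaches the 'untrusted' repo without having gone through the ops-controlled build.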


 * The current split between puppet and code deploys is not mirrored in this procedure. This is what I am pointing out. A technical (as opposed to procedural) safety net is being removed: the low privileges of users doing code deploys versus the full privileges of root. -- User:Akosiaris (unsigned)


 * I believe it is. Essentially everything that's in puppet now will either remain in puppet (config) or be in the ops-controlled debian dir (init scripts etc). If you see a reason why something like the script sketch I posted on the mailing list won't enforce this, then I'd be interested to hear about it. In any case, while I love discussing longer-term ideas, let's focus on the practical issue of getting a public Debian repo for third-party users ready for now. -- Gabriel Wicke (GWicke) (talk) 19:19, 1 April 2014 (UTC)


 * Easy enough. /usr/share/ /backdoor (a C program, compiled and all, that passed code review for some reason - insider threat, bad code review, your pick - with rwsr-xr-x permissions and root ownership). The above path is quite obviously not going to be in the ops-controlled debian dir you mention. After package installation the program makes it to all of our systems. Now use your imagination and make it even worse (yes, the possibilities are infinite). It is going to be noticed for sure, but does it matter? The damage has been done, and it is total. Again: the full privileges of root doing the installation versus the limited privileges of users doing code deploys. Does the removal of the safety net I mentioned make sense now? That is at least one reason - a grave one IMHO - that the current split between puppet and code deploys is not mirrored in the sketched procedure. And again, I am not against Debian packages/repos for third-party users. It is the WMF I am concerned about.
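
A technical safety net of the kind being asked for could, for example, reject packages whose contents contain setuid/setgid files. A sketch, scanning an unpacked package tree (the `-perm /6000` form is GNU find):

```shell
# Sketch: scan an unpacked package tree for setuid/setgid files and
# reject the package if any are found, before it ever reaches a repo.
check_setuid() {
    tree=$1
    suspicious=$(find "$tree" -type f -perm /6000 2>/dev/null)
    if [ -n "$suspicious" ]; then
        echo "REJECT: setuid/setgid files found:"
        echo "$suspicious"
        return 1
    fi
    echo "OK: no setuid/setgid files in $tree"
}
```

A check like this would have to run on the build/upload host itself, outside the uploader's control, to count as a technical rather than procedural control.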


 * So overall we'll need different levels of trust and assurance, and a path for packages to graduate from low levels of trust to higher ones. We want to raise the level of code review and trustworthiness above that of a random git repo or wiki page. In my opinion the best way to get started in this direction is to set up a shared public repository which our current deployers can start using to publish their packages for third parties to use. A trusted internal repository for use by WMF can then be added in a future iteration. -- Gabriel Wicke (GWicke) (talk) 16:38, 1 April 2014 (UTC)


 * What you argue is that, with enough procedures and trust and assurance, no big problems should arise. What I am arguing is that a single misstep/mishap in these procedures is enough for irrevocable damage to happen (and given enough time, such a misstep/mishap is bound to happen). Please do note, though, that I do not oppose building packages for third parties to use. But while I favor the "eat your own dogfood" approach, we should be very careful in implementing this specific one within WMF as well. Technical (and not merely procedural) safety nets should be put in place. -- User:Akosiaris (unsigned)

Scope
I think we need to differentiate between the different goals that you have in mind, because some of these have entirely different requirements and some are more controversial than others. I see at least the following:
 * Deploying in the Wikimedia production cluster
 * Provide packages to developers who want to work on only one component, as a means of easy installation, especially in light of SOA
 * Provide an easy way for third parties to install nightlies
 * Have proper packages aligned with Wikimedia's plans in Debian & Ubuntu, in collaboration with Debian packagers

As an example of differing requirements, I and others in ops have serious reservations about deploying to production using packages (see above). Another example is that becoming a proper Debian upstream means adhering to the Debian Policy and collaborating with the existing maintainers.

I propose to start small: define the scope of this proposal, seek comments on that and that alone, and iterate on expansions later. Faidon Liambotis (WMF) (talk) 17:06, 1 April 2014 (UTC)


 * See the goals section for the scope of this proposal.


 * I see coordinating with Debian proper as something we should strive for as well, but in case of conflicts we should focus on the main goals first.


 * deb-based deployments beyond the current ops-controlled repo are something we can discuss in a future iteration, and are explicitly not part of this discussion. Hope that clarifies things. -- Gabriel Wicke (GWicke) (talk) 17:19, 1 April 2014 (UTC)

Meeting notes 2014-04-04
Present: Faidon, Chris, Matt, Bryan, Kartik, Gabriel

Discussion

 * Started with pros/cons of handling major versions via repo sections vs. versioned package names
   * Faidon sees issues with config files when using versioned package names; leans towards versioned repository sections that contain snapshots of all associated software
   * Gabriel proposes to have a 'stable' catch-all section for everything except (maybe) the PHP code, to keep software discoverable
   * should take the wider release-management discussion into account
 * Decided to postpone the decision on stable, and start with a catch-all 'unstable' section for tested but frequent updates
   * Example: Parsoid uploading on deployment, typically twice per week
 * General support for:
   * supporting stable releases + security upgrades with unattended-upgrades
   * being pragmatic about Debian guidelines in new packages; can iterate later to make a package acceptable for Debian
   * aiming to have a mediawiki-full metapackage that installs and configures everything, including caching etc.
     * great if this is Debian-compliant, but not required
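
The unattended-upgrades point could be wired up roughly as follows; the origin label and archive name are assumptions, since the repo's metadata fields were not decided at this meeting.

```
// /etc/apt/apt.conf.d/51wikimedia (sketch, hypothetical origin values):
// let unattended-upgrades pull security fixes from the stable section
// of the Wikimedia repository.
Unattended-Upgrade::Origins-Pattern {
        "origin=Wikimedia,archive=stable";
};
```

Whatever Origin/Label the repo finally publishes in its Release file would need to match this pattern for automatic upgrades to apply.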

Decisions / Action items

 * Ops (Faidon or ??) to set up a public repository at release.wikimedia.org/debian/ or the like
   * investigate support for multiple versions per package in the same section, potentially using mini-dinstall
     * motivation: support for reverting to the last working version; useful in the unstable section, and provides testing for a potential later private deploy repo
   * pick a URL scheme that we can keep stable even if we decide to switch repo managers later
 * Package build and upload to be triggered by tags in gerrit
   * likely collaboration with Antoine & ops
   * need to think about controlling uploads to sections; maybe some tag scheme like unstable-0.3.3 ?
   * Chris: need to keep the build infrastructure secure; maybe a dedicated Jenkins slave that's not running normal tests?
   * can switch to a different trigger later; keep build & upload scripts relatively independent
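
A tag scheme like unstable-0.3.3 could be parsed by the trigger along these lines. This is a sketch of one possible convention, not a decided format; the allowed section names are assumed.

```shell
# Sketch: split a gerrit tag like "unstable-0.3.3" into a target repo
# section and a package version. Assumes the section name contains no
# dash, so everything after the first dash is the version (which may
# itself contain dashes, e.g. a Debian revision like 1.0.0-1).
parse_tag() {
    tag=$1
    section=${tag%%-*}   # text before the first dash
    version=${tag#*-}    # everything after the first dash
    case "$section" in
        unstable|stable)
            echo "section=$section version=$version" ;;
        *)
            echo "unknown section in tag: $tag" >&2
            return 1 ;;
    esac
}

parse_tag unstable-0.3.3   # -> section=unstable version=0.3.3
```

Rejecting unknown section names in the trigger is one way to control which sections a tag can upload to.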