Kask

Kask is an opaque key-value data store with a RESTful (HTTP) interface. It uses Apache Cassandra for persistence, which makes it suitable for very large or high-volume data sets, and for applications that require geographically aware master-master replication and high availability.
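To make "opaque key-value store with a RESTful interface" concrete, here is a minimal Go sketch of the general pattern: HTTP verbs mapped onto key-value operations, with values stored as uninterpreted bytes. Everything in it is an assumption made for illustration. The in-memory `store` type is a stand-in rather than Kask, and the `/v1/` route prefix, status codes, and helper names are hypothetical and do not describe Kask's actual API.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync"
)

// store is a hypothetical in-memory stand-in for a key-value service.
// It is NOT Kask and does not use Cassandra.
type store struct {
	mu   sync.Mutex
	data map[string][]byte
}

// ServeHTTP maps HTTP verbs onto key-value operations. The "/v1/" prefix
// is an assumption made for this sketch only.
func (s *store) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	key := strings.TrimPrefix(r.URL.Path, "/v1/")
	s.mu.Lock()
	defer s.mu.Unlock()
	switch r.Method {
	case http.MethodPut:
		body, err := io.ReadAll(r.Body)
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		s.data[key] = body // stored as-is; the value is never interpreted
		w.WriteHeader(http.StatusCreated)
	case http.MethodGet:
		value, ok := s.data[key]
		if !ok {
			http.NotFound(w, r)
			return
		}
		w.Write(value)
	case http.MethodDelete:
		delete(s.data, key)
		w.WriteHeader(http.StatusNoContent)
	default:
		w.WriteHeader(http.StatusMethodNotAllowed)
	}
}

// roundTrip stores value under key on the server at baseURL, then reads it back.
func roundTrip(baseURL, key, value string) (string, error) {
	req, err := http.NewRequest(http.MethodPut, baseURL+"/v1/"+key, strings.NewReader(value))
	if err != nil {
		return "", err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", err
	}
	resp.Body.Close()

	resp, err = http.Get(baseURL + "/v1/" + key)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

// demo starts a throwaway test server and performs one write/read round trip.
func demo() (string, error) {
	srv := httptest.NewServer(&store{data: make(map[string][]byte)})
	defer srv.Close()
	return roundTrip(srv.URL, "example-key", "opaque bytes")
}

func main() {
	value, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("read back %q\n", value)
}
```

The point of the sketch is only that the server never inspects the value it is given; a real deployment would replace the map with a persistent backend.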

Some of its features include:


 * Support for Transport Layer Security (TLS) to secure end-to-end communications, providing both encryption and authentication with public-key cryptography
 * Expiration of values through the use of a service-wide time-to-live (TTL)
 * Simplified consistency model: reads and writes use Cassandra's data-center local quorum, while deletes block for a quorum in each data center
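The consistency model in the last item can be made concrete with a little arithmetic. In Cassandra, a quorum for a replication factor RF is floor(RF/2) + 1 replicas; a data-center local quorum (LOCAL_QUORUM) needs that many acknowledgements in the coordinator's data center only, while a quorum in each data center (EACH_QUORUM) needs it in every one. The Go sketch below computes both. The data-center names and replication factors are hypothetical and are not taken from any Kask deployment.

```go
package main

import "fmt"

// quorum returns the number of replicas that must acknowledge an operation
// for the given replication factor: floor(rf/2) + 1.
func quorum(rf int) int {
	return rf/2 + 1
}

// localQuorumAcks returns the acknowledgements needed for an operation at
// data-center local quorum, coordinated in the data center named local.
// The caller is assumed to pass a data center that exists in rfByDC.
func localQuorumAcks(rfByDC map[string]int, local string) int {
	return quorum(rfByDC[local])
}

// eachQuorumAcks returns, per data center, the acknowledgements needed for
// an operation that blocks for a quorum in every data center.
func eachQuorumAcks(rfByDC map[string]int) map[string]int {
	acks := make(map[string]int, len(rfByDC))
	for dc, rf := range rfByDC {
		acks[dc] = quorum(rf)
	}
	return acks
}

func main() {
	// Hypothetical topology: two data centers with three replicas each.
	rfByDC := map[string]int{"dc1": 3, "dc2": 3}

	fmt.Println("read/write acks needed in dc1:", localQuorumAcks(rfByDC, "dc1"))
	fmt.Println("delete acks needed per data center:", eachQuorumAcks(rfByDC))
}
```

Under these assumptions a read or write waits only on replicas in the local data center, whereas a delete cannot complete until every data center has reached its own quorum, which is why deletes are the slower and less partition-tolerant operation.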

Dependency Management
The libraries an application depends on are as much a part of the final product as the code we write ourselves, and yet it is all too common for us to choose them indiscriminately, retrieve them from untrusted sources, and treat them (and the entire graph of transitive dependencies) as black boxes. Often this pattern is deeply ingrained in our tools and the culture surrounding them. Case in point: Kask is written in Go, where traditionally little emphasis has been placed on release management. Applications import external dependencies by referencing their remote Git repository, typically the HEAD of the master branch, and the result is statically compiled (requiring recompilation to link against any updated dependencies). This "run the latest of everything, and hope for the best" mentality is antithetical to quality software. It makes reproducibility prohibitively difficult, and the complete lack of environmental stability makes tracking defects (including those impacting security) and their interactions intractable.

Tooling notwithstanding, proper dependency management is difficult and labor intensive. It requires that each node in the dependency graph be release managed, and that compatibility between nodes be established to properly inform the edges. Change of any kind is as likely to introduce new bugs as it is to fix existing ones, and changes that alter existing functionality or introduce new functionality disproportionately so. Sound judgement is required to balance the value of an update against its risks. When changes are made, careful testing is needed to ensure continued compatibility and to flag any new regressions. This is a tremendous amount of work. Fortunately, there is an alternative to doing it ourselves.

Debian is a Linux distribution founded in 1993, with a long-standing reputation for quality control. Software that is packaged for Debian has been carefully curated. Packagers ensure that an active and responsive upstream exists, but accept responsibility for the duration of a release if an upstream becomes unwilling or unable to address issues. Care is taken to select the most appropriate version for release, and its transitive dependencies are satisfied by dependency relationships with other packages. Changes to a package during a stable release are made only on an as-needed basis (crippling bugs, security vulnerabilities, etc.), and are as minimally invasive as possible. Additionally, PGP signatures are used to establish a strong chain of trust between the developers who upload packages and the machines where those packages are ultimately installed. It would be difficult to overstate the amount of software life-cycle management work that goes into a distribution like Debian, work we do not have to do if we satisfy our dependencies using packaged software.

TL;DR: Kask's code dependencies are sourced entirely from what is available in Debian GNU/Linux (Stretch/9.8 at the time of writing).

Setup
Clone Kask's source code repository.

Builds at the Wikimedia Foundation are created using a Docker image generated by Blubber; using Blubber with Kask's deployment pipeline configuration is the easiest way to create a container for development use. Prebuilt, statically linked binaries for most platforms can be obtained from the Blubber download page.

Blubber outputs a Dockerfile based on Kask's pipeline configuration, and building that Dockerfile with Docker creates the corresponding image.

Building
The following can be copied to a file and invoked as a script to issue commands inside the development container.

Releasing
To release a new version of Kask, create an annotated tag, and push it to Gerrit.

Running
The Wikimedia Foundation runs Kask in production using Kubernetes; the easiest way to get the service up and running is to use a Wikimedia Foundation Docker image.

Setup
Since the Foundation's registry does not implement the latest tag, the first step is to browse the list of available image tags and select an appropriate one. The examples that follow use the tag you select. Once you've selected a Docker image, pull a copy locally with Docker, then list your local images to verify success. It may prove useful to create an alias for the chosen tag, both to have something more descriptive and to have a stable reference when starting containers. This step is entirely optional, though.

Starting a container
The container expects Kask's configuration file to exist at a specific path inside the container. To accomplish this, we'll mount a local directory containing the configuration file at the expected location inside the container. The following assumes the configuration file is in the current working directory.