Continuous integration/Docker/Dockerfiles

Dockerfile syntax is simple, but that simplicity hides many pitfalls for the unwary. This page outlines some best practices for writing Dockerfiles and building Docker images.

Other useful guides

 * Dockerfile Best Practices guide

Keep images lean
Don't add packages to an image unless they are needed to run a container. If you need to troubleshoot a container, you can add troubleshooting tools at container runtime. For instance, adding a text editor to a base image would be a bad idea.

Minimize image layers
In general, each instruction in your Dockerfile creates a layer in the image cache, which increases the VirtualSize of your image and of the resulting containers.

A subsequent layer cannot reclaim space that was added in a previous layer. It is best practice to keep the number of layers to a minimum and to reclaim resources within the layer in which they were created.

Consider the following examples:
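
A first variant might look like the sketch below. The `debian:stable` base image and the `git` package are assumptions chosen for illustration. Installation and cleanup are spread over three `RUN` instructions, so the apt package lists removed in the last step still exist in an earlier layer:

```dockerfile
# Hypothetical example: three RUN instructions, three layers.
FROM debian:stable
RUN apt-get update
RUN apt-get install -y git
# Too late to save space: the package lists were already committed
# by the layer that the first RUN instruction created.
RUN rm -rf /var/lib/apt/lists/*
```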

vs
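
A variant that keeps everything in one layer might look like the sketch below, again assuming a `debian:stable` base image and the `git` package purely for illustration. Update, installation, and cleanup run in a single `RUN` instruction, so the package lists are removed before the layer is committed:

```dockerfile
# Hypothetical example: one RUN instruction, one layer.
FROM debian:stable
RUN apt-get update \
 && apt-get install -y git \
 && rm -rf /var/lib/apt/lists/*
```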

When you inspect the sizes of the images resulting from these Dockerfiles, you see that the image with more layers is larger by about 12 MB:

The size discrepancy is due to the image with more layers having an intermediate layer in which all the apt information still exists. This can be seen easily with the dockviz tool:

The image layer cache is not your friend
Using the layer cache leads to a good deal of unintuitive behavior. In general, Docker steps through each instruction in your Dockerfile and searches the layer cache for a layer created by the same instruction. For ADD and COPY instructions, Docker also compares the contents of the files being copied with the files in the cached layer.
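
The cache lookup can be traced on a small Dockerfile. This is a sketch only: the base image, the package, and the file `app.conf` are hypothetical, and the comments describe the cache behavior to expect for each instruction:

```dockerfile
FROM debian:stable

# Cache hit whenever a layer built from this exact instruction text
# on top of the same parent layer already exists.
RUN apt-get update && apt-get install -y git

# For COPY, the contents of app.conf are also compared (by checksum)
# with the file in the cached layer; an edited file is a cache miss.
COPY app.conf /etc/app.conf

# After a cache miss, every later instruction is executed again
# rather than served from the cache.
RUN chmod 0644 /etc/app.conf
```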

The consequences of this behavior are not always immediately evident. For instance, if a security fix were available in Debian for the latest version of git, rebuilding an image from a Dockerfile that installs git would not be sufficient to ensure that the resulting image contains the fixed version of the package: the text of the instruction has not changed, so Docker reuses the cached layer. Running docker build with the --no-cache option is the easiest way to ensure that the layer cache on a machine does not cause unintended consequences.
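
The stale-cache scenario can be made concrete with the sketch below, which assumes a `debian:stable` base image and a hypothetical image tag `example`. The build command appears as a comment so that the example stays a single Dockerfile:

```dockerfile
# Build while ignoring the layer cache (the tag "example" is hypothetical):
#   docker build --no-cache -t example .
FROM debian:stable

# The text of this instruction does not change when Debian publishes a
# fixed git package, so a cached build reuses the old layer and the old
# package. With --no-cache the instruction is executed again.
RUN apt-get update && apt-get install -y git
```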

Prefer COPY to ADD
From the Dockerfile Best Practices guide:

Although ADD and COPY are functionally similar, generally speaking, COPY is preferred. That’s because it’s more transparent than ADD. COPY only supports the basic copying of local files into the container, while ADD has some features (like local-only tar extraction and remote URL support) that are not immediately obvious. Consequently, the best use for ADD is local tar file auto-extraction into the image, as in ADD rootfs.tar.xz /.
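
The difference can be sketched in a short Dockerfile. It assumes a `rootfs.tar.xz` archive (the name used in the quoted guide) and a hypothetical `app.conf` file, both present in the build context:

```dockerfile
# Start from an empty image and unpack a root filesystem into it.
FROM scratch

# ADD auto-extracts a local tar archive into the destination directory.
ADD rootfs.tar.xz /

# COPY copies the file as-is: no extraction and no URL handling.
COPY app.conf /etc/app.conf
```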