Optimize with Multi-Stage Dockerfile

Nádasdi Balázs
Head of Engineering
Cloud
Oct 20, 2018

Docker 17.05, released in May 2017, included a feature that is extremely useful but often left unused: the multi-stage Dockerfile. It lets you merge separate Dockerfiles into a single file. Before that, with single-stage Dockerfiles, we had to maintain multiple files to keep the production image clean. The alternative was to ship a huge deployment image containing everything needed to build the application.

 

Let's take a look at a simple example. We have a single-page website: a static index.html, one JavaScript file, and one CSS file. The example is clean and short, but a real project built this way can grow huge.

 

Here is our project directory tree:

 

    docroot
    |-- Gemfile
    |-- Gemfile.lock
    |-- Rakefile
    |-- public
    |   `-- index.html
    `-- src
        |-- css
        |   |-- main.scss
        |   `-- modules
        |       `-- article.scss
        `-- js
            |-- boot.js
            |-- init.js
            `-- modules
                `-- article.js

 

Our HTML is just a simple empty document:

 

    <!DOCTYPE html>
    <html>
        <head>
            <meta charset="utf-8" />
            <meta name="viewport" content="width=device-width" />
            <title>Sample Project for Multi-Stage Docker build</title>
            <link rel="stylesheet" href="/app.css" />
            <script type="text/javascript" charset="utf-8" src="/app.js"></script>
        </head>
        <body>
            <div id='mainContent'></div>
        </body>
    </html>

 

It's nice and clean. For a large project, we don't want to write a single CSS file and a single JavaScript file by hand, so we create a tidy directory structure and merge the files before we deploy the site. For CSS we use SCSS, because it has a lot of features that can come in handy later. To build the project we use simple rake tasks, so we need at least Ruby. For SCSS we also have to install extra packages such as libffi.
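The Rakefile itself is not shown here, so as a rough illustration of what "merging" means, here is a minimal shell sketch of the JavaScript half of the job. The function name merge_js and the sorted concatenation order are assumptions made for this example, not necessarily what the project's rake task does:

```shell
# Hypothetical stand-in for the JavaScript part of the rake build.
# Concatenates every .js file under a source directory into one bundle.
merge_js() {
    local src_dir="$1" out_file="$2"
    mkdir -p "$(dirname "$out_file")"
    # Sorted path order keeps the bundle reproducible between builds.
    find "$src_dir" -type f -name '*.js' | LC_ALL=C sort | while IFS= read -r file; do
        cat "$file"
        # Extra newline so files without a trailing newline do not run together.
        echo
    done > "$out_file"
}
```

With the directory tree above, it could be called as merge_js src/js public/app.js.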

 

The old way

 

What did we do before multi-stage Dockerfiles?
Of course, we want the deploy image to be as small as possible, so we create two Dockerfiles: one to build the project and one to deploy it.
First, the build:

 

    # Dockerfile.build
    FROM ruby:2.5.0-alpine3.7

    # Install dependencies for native extensions
    RUN apk add --no-cache build-base libffi-dev

    # This will be our application root folder
    WORKDIR /application

    # Copy all the content from docroot
    COPY docroot /application

    # Build our application
    RUN bundle install
    RUN bundle exec rake

 

Now we can create a much simpler image with pre-built artifacts:

 

    # Dockerfile
    FROM nginx:latest

    COPY ./build-artifact /usr/share/nginx/html

 

Wow! Nice and clean, but how do we deploy our site? What do we need to do to get the final image? Let's create a script so we can eliminate human error from the process:

 

    #!/bin/bash
    # build.sh

    # Build the "build-image"
    docker build -t yitsushi/myshinyproject:build -f ./Dockerfile.build .

    # Create a temporary container
    docker create --name temp_container yitsushi/myshinyproject:build

    # Extract build artifacts
    docker cp temp_container:/application/public ./build-artifact

    # Delete the temporary container
    docker rm -f temp_container

    # Build the final image
    docker build --no-cache -t yitsushi/myshinyproject:latest -f ./Dockerfile .

    # Delete the temporary build-artifact directory
    rm -rf ./build-artifact

 

It's ugly. We need a temporary image and a temporary container, and then a local temporary directory as well. We also must not forget to clean up afterwards, for example by removing the build-artifact directory.
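The script also leaves the temporary container and directory behind whenever one of the steps fails halfway. One possible way around that, sketched here under the assumption that the same image and container names are used, is to wrap the steps in a function that always runs the cleanup. The function name build_with_cleanup is made up for this example:

```shell
# Sketch: the build.sh steps, with cleanup that runs even when a step fails.
build_with_cleanup() {
    local status=0
    {
        docker build -t yitsushi/myshinyproject:build -f ./Dockerfile.build . &&
        docker create --name temp_container yitsushi/myshinyproject:build &&
        docker cp temp_container:/application/public ./build-artifact &&
        docker build --no-cache -t yitsushi/myshinyproject:latest -f ./Dockerfile .
    } || status=$?
    # Always remove the temporary container and directory, whatever happened above.
    docker rm -f temp_container >/dev/null 2>&1 || true
    rm -rf ./build-artifact
    # Report the exit code of the first failing step (0 on success).
    return "$status"
}
```

It works, but it is even more code to maintain for something that should be simple.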

 

From here we simply execute our shell script:

 

    ❯ ./build.sh
    Sending build context to Docker daemon  9.665MB
    Step 1/6 : FROM ruby:2.5.0-alpine3.7
     ---> 308418a1844f
    Step 2/6 : RUN apk add --no-cache build-base libffi-dev
     ---> Using cache
     ---> 677a75453610
    Step 3/6 : WORKDIR /application
     ---> Using cache
     ---> 1ba87d6eae13
    Step 4/6 : COPY docroot /application
     ---> f415c072262a
    Step 5/6 : RUN bundle install
     ---> Running in ba26aed32021
    Fetching gem metadata from https://rubygems.org/...........
    Fetching rake 12.3.0
    Installing rake 12.3.0
    Using bundler 1.16.1
    Fetching ffi 1.9.18
    Installing ffi 1.9.18 with native extensions
    Fetching rb-fsevent 0.10.2
    Installing rb-fsevent 0.10.2
    Fetching rb-inotify 0.9.10
    Installing rb-inotify 0.9.10
    Fetching sass-listen 4.0.0
    Installing sass-listen 4.0.0
    Fetching sass 3.5.5
    Installing sass 3.5.5
    Bundle complete! 2 Gemfile dependencies, 7 gems now installed.
    Bundled gems are installed into `/usr/local/bundle`
    Removing intermediate container ba26aed32021
     ---> 5e60801a48d2
    Step 6/6 : RUN bundle exec rake
     ---> Running in c685fe9c164a
    Removing intermediate container c685fe9c164a
     ---> cb561d54cd0a
    Successfully built cb561d54cd0a
    Successfully tagged yitsushi/myshinyproject:build
    a82359e3b871dcbfa7acb1f4ba0f0a3d5b576e33d8ff6b56c1317d7799ef2148
    temp_container
    Sending build context to Docker daemon  10.75kB
    Step 1/2 : FROM nginx:latest
     ---> 3f8a4339aadd
    Step 2/2 : COPY ./build-artifact /usr/share/nginx/html
     ---> f1fab6d81151
    Successfully built f1fab6d81151
    Successfully tagged yitsushi/myshinyproject:latest

    ❯ docker images
    yitsushi/myshinyproject   latest              f1fab6d81151        9 minutes ago       108MB
    yitsushi/myshinyproject   build               cb561d54cd0a        9 minutes ago       247MB

    ❯ docker run --rm -p 8888:80 yitsushi/myshinyproject

 

With Multi-Stage Dockerfile

 

Where can multi-stage Dockerfiles help us? We can eliminate all the unnecessary temporary files and containers. But how? We create a single Dockerfile with multiple FROM statements:

 

    # Dockerfile, the new one
    FROM ruby:2.5.0-alpine3.7 as builder

    # Install dependencies for native extensions
    RUN apk add --no-cache build-base libffi-dev

    # This will be our application root folder
    WORKDIR /application

    # Copy all the content from docroot
    COPY docroot /application

    # Build our application
    RUN bundle install
    RUN bundle exec rake

    # Here we start a new stage
    FROM nginx:latest

    COPY --from=builder /application/public /usr/share/nginx/html

 

What?! Yes, that's right: we can write a Dockerfile like this. We can define as many stages as we want, but keep in mind that this is not a "build several images at the same time" approach. In the end, only the last stage is available as the final image.
How do we build with this? We no longer need build.sh, because it's that simple:

 

    ❯ docker build -t yitsushi/myshinyproject .

    ❯ docker images
    yitsushi/myshinyproject   latest              5ee4b57eb9e9        9 minutes ago       108MB

    ❯ docker run --rm -p 8888:80 yitsushi/myshinyproject
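Only the last stage ends up in the final image, but an earlier stage can still be built on its own, which is handy when the build step itself needs debugging. The docker build command has a --target flag for this. Below is a small sketch; the helper name build_stage and the :build tag are just examples:

```shell
# Build only one named stage of a multi-stage Dockerfile.
# Usage: build_stage <stage> <tag> [context]
build_stage() {
    local stage="$1" tag="$2" context="${3:-.}"
    # --target stops the build at the named stage; later stages are not built.
    docker build --target "$stage" -t "$tag" "$context"
}
```

For the Dockerfile above, build_stage builder yitsushi/myshinyproject:build would produce an image of the Ruby build stage only.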

 

Hidden secret keys

 

What else can we do with multi-stage Dockerfiles? We can hand a private SSH key to our build stage and use it there, while the production image contains no private SSH key at all.
Imagine you have a private Go repository on GitHub, and one of its dependencies is private as well. go get cannot fetch the private dependency without credentials, so there are two alternatives:

 

  • Add a deployment key to your Docker image;
  • Pre-fetch specific dependencies (or all).

 

Now we can add our key in one stage, fetch all the repositories, and voilà, problem solved!

 

    # Dockerfile with private ssh key
    FROM golang:alpine as build

    # git and ssh are needed to fetch private repositories
    RUN apk add --no-cache git openssh-client

    # Just add the key from the build parameter. It is saved as the default
    # identity file, so no ssh-agent (ssh-add) is needed during the build.
    ARG SSH_KEY
    RUN mkdir -p /root/.ssh
    RUN echo "${SSH_KEY}" > /root/.ssh/id_rsa
    RUN chmod 0600 /root/.ssh/id_rsa
    RUN ssh-keyscan github.com >> /root/.ssh/known_hosts

    # Fetch GitHub repositories over SSH instead of HTTPS
    RUN git config --global url."git@github.com:".insteadOf "https://github.com/"

    ADD . /go/src/github.com/Yitsushi/myshinyproject
    RUN go install github.com/Yitsushi/myshinyproject

    # Final image
    FROM alpine:latest
    COPY --from=build /go/bin/myshinyproject /usr/local/bin/myshinyproject
    CMD ["/usr/local/bin/myshinyproject"]

 

The image we’ll receive at the end will be as clean and small as possible. The build is still short and clear.

 

 
    ❯ docker build \
        --build-arg SSH_KEY="$(cat ~/.ssh/cheppers_rsa)" \
        -t yitsushi/myshinyproject \
        -f Dockerfile .

 

Now, after a local build, we can deploy our image anywhere without exposing our private key.
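If you want to double-check that, docker history lists the command that created each layer of an image. The sketch below wraps it in a small helper; the name history_mentions is made up for this example. For the final image it should find no trace of SSH_KEY. The layers of the build stage do record it, so the cache on the build machine should still be treated as sensitive:

```shell
# Succeeds (exit code 0) if any layer of the image was created by a command
# containing the given text, e.g. the name of a secret build argument.
history_mentions() {
    local image="$1" needle="$2"
    # --no-trunc prints the full command; CreatedBy is the command of each layer.
    docker history --no-trunc --format '{{.CreatedBy}}' "$image" | grep -qF -- "$needle"
}
```

A possible check after the build: history_mentions yitsushi/myshinyproject SSH_KEY && echo "key leaked into the final image".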

 

Update: As Kaji Bikash pointed out, Docker can't see files outside the build context during the build, so pass the key itself as SSH_KEY instead of a path as SSH_KEY_PATH.

 

Conclusion

 

Docker gives us the opportunity to match our testing environment to our production environment as closely as possible. But does this mean that our production environment must contain all the dependencies needed to build our project? It does not, thanks to multi-stage builds. The feature is already there, but as I see it, it's undeservedly underutilized. Perhaps the reason is that few resources are available on the topic, or perhaps it wasn't announced as loudly as some of the other features.

 

The source code is available on GitHub.