Docker multi-stage builds are a powerful feature that allows you to create smaller, more secure container images. By separating the build environment from the runtime environment, you can ensure your production containers only contain what's necessary to run your application. In this guide, we'll explore multi-stage builds with practical examples for different programming languages.

Table of Contents #
- Introduction to Multi-Stage Builds
- The Problem with Single-Stage Builds
- How Multi-Stage Builds Work
- Example 1: Go Application
- Example 2: Node.js Application
- Example 3: Python Application
- Example 4: Java Spring Boot Application
- Best Practices for Multi-Stage Builds
- Measuring Image Size Improvements
- Security Benefits of Multi-Stage Builds
- Cleanup
- Conclusion
Introduction to Multi-Stage Builds #
Multi-stage builds were introduced in Docker 17.05 to solve a common problem: how to create small, efficient container images without giving up the build tools and dependencies needed to produce them. This guide is based on our Lab4 Multi-Stage Build Example from the Docker Practical Guide repository.
Before multi-stage builds, developers had to choose between:
- Single Dockerfile: Creating large images containing build tools and dependencies
- Builder Pattern: Using multiple Dockerfiles with complex shell scripts to coordinate them
Multi-stage builds elegantly solve this problem by allowing multiple FROM statements in a single Dockerfile. Each FROM statement begins a new stage, and you can selectively copy artifacts from one stage to another, leaving behind everything you don't need in the final image.
The Problem with Single-Stage Builds #
Let's first understand why single-stage builds can be problematic:
# Single-stage example for a Go application
FROM golang:1.17
WORKDIR /app
COPY . .
RUN go mod download
RUN go build -o /app/server .
EXPOSE 8080
CMD ["/app/server"]
This approach works but has significant drawbacks:
- Large Images: The final image includes the entire Go toolchain and build dependencies
- Security Risks: More tools and libraries mean a larger attack surface
- Inefficient Caching: Changes to source code invalidate the cache for all subsequent layers
- Higher Transfer Costs: Larger images take longer to push/pull from registries
Let's see how multi-stage builds solve these issues.
How Multi-Stage Builds Work #
A multi-stage build Dockerfile contains multiple FROM instructions, with each creating a new build stage:
┌────────────────────────────────────────────────────────────┐
│                  Multi-Stage Build Process                 │
│                                                            │
│   ┌─────────────────┐            ┌─────────────────┐       │
│   │   Build Stage   │            │  Runtime Stage  │       │
│   │   (with all     │───────────►│    (minimal     │       │
│   │   build tools)  │    COPY    │    runtime)     │       │
│   └─────────────────┘            └─────────────────┘       │
│                                                            │
└────────────────────────────────────────────────────────────┘
The key features of multi-stage builds:
- Multiple FROM Instructions: Each one starts a new build stage
- Named Stages: You can name stages for clarity using AS <name>
- Selective Copying: Use COPY --from=<stage> to copy only what you need
- Discarded Stages: Anything not explicitly copied is discarded
- Multiple Final Images: You can build different final images from the same Dockerfile
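That last point deserves a quick illustration: because each stage is addressable by name, you can also stop the build at any stage with the --target flag, which is handy for producing a test or debug image from the same Dockerfile. A short sketch (the image and stage names here are illustrative):
# Build only up to the stage named "build", e.g. to run tests with the full toolchain
docker build --target build -t myapp:build .
# Build the whole Dockerfile to produce the final runtime image
docker build -t myapp:latest .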
Now let's look at practical examples for different programming languages.
Example 1: Go Application #
Go applications are perfect candidates for multi-stage builds because they compile to a single binary:
# Build stage
FROM golang:1.17 AS build
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o /app/server .
# Run stage
FROM alpine:3.15
RUN apk --no-cache add ca-certificates
WORKDIR /app
COPY --from=build /app/server /app/
EXPOSE 8080
CMD ["/app/server"]
This approach has several advantages:
- Minimal Final Image: The runtime stage only contains the compiled binary and necessary certificates
- No Build Tools: The Go toolchain is only present in the build stage
- Smaller Attack Surface: Fewer packages mean fewer potential vulnerabilities
- Static Binary: Using CGO_ENABLED=0 creates a static binary that doesn't depend on libc
For even smaller images, you can use the scratch base image:
# Run stage
FROM scratch
COPY --from=build /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=build /app/server /server
EXPOSE 8080
CMD ["/server"]
The scratch image is completely empty, resulting in the smallest possible container size, often under 10 MB for a Go application.
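If you want to confirm the numbers on your own machine, build the scratch variant and check the reported size (the tag and Dockerfile name here are illustrative):
# Build the scratch-based image from a dedicated Dockerfile
docker build -f Dockerfile.scratch -t go-app:scratch .
# Show the final image size
docker images go-app:scratch
Keep in mind that scratch contains no shell or package manager, so you can't docker exec into a running container for debugging; it can be worth keeping an Alpine-based variant around for that.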
Example 2: Node.js Application #
For Node.js applications, we can separate the build environment from the runtime:
# Build stage
FROM node:16 AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
# Run stage
FROM node:16-alpine
WORKDIR /app
# Copy package manifests from the build stage and install only production dependencies
COPY --from=build /app/package*.json ./
RUN npm ci --only=production
# Copy the built application from the build stage
COPY --from=build /app/dist ./dist
EXPOSE 3000
CMD ["node", "dist/server.js"]
Key benefits for Node.js applications:
- No Development Dependencies: The final image contains only production dependencies
- Smaller Node.js Base Image: Using Alpine Linux reduces the base image size
- Clean Build Environment: The build stage provides a consistent environment for transpiling or bundling
For front-end applications that generate static files, you can use an even smaller runtime:
# Build stage
FROM node:16 AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
# Run stage
FROM nginx:alpine
COPY --from=build /app/build /usr/share/nginx/html
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
This approach is ideal for React, Vue.js, or Angular applications that build to static assets.
Example 3: Python Application #
Python applications can also benefit from multi-stage builds, especially when using tools like Poetry:
# Build stage
FROM python:3.10-slim AS build
WORKDIR /app
RUN pip install poetry
COPY pyproject.toml poetry.lock* ./
RUN poetry export -f requirements.txt > requirements.txt
# Run stage
FROM python:3.10-slim
WORKDIR /app
COPY --from=build /app/requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["python", "-m", "uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
In this example:
- The build stage uses Poetry to generate a requirements.txt file
- The runtime stage installs only the required packages
- The final image doesn't contain Poetry or any development dependencies
For Python applications with compiled C extensions, this approach can significantly reduce image size by leaving out compilers and build headers.
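One common way to apply that is to build wheels in the first stage and install them in the second, so compilers never reach the runtime image. A sketch of the pattern, assuming a plain requirements.txt at the project root rather than Poetry:
# Build stage: full image with the compilers needed for C extensions
FROM python:3.10 AS build
WORKDIR /app
COPY requirements.txt .
RUN pip wheel --no-cache-dir --wheel-dir /wheels -r requirements.txt
# Run stage: slim image installs the prebuilt wheels, no compiler needed
FROM python:3.10-slim
WORKDIR /app
COPY --from=build /wheels /wheels
RUN pip install --no-cache-dir /wheels/* && rm -rf /wheels
COPY . .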
Example 4: Java Spring Boot Application #
Java applications typically have a build stage that includes the JDK and a runtime stage with just the JRE:
# Build stage
FROM maven:3.8-openjdk-17 AS build
WORKDIR /app
COPY pom.xml .
RUN mvn dependency:go-offline
COPY src ./src
RUN mvn package -DskipTests
# Run stage
FROM eclipse-temurin:17-jre-alpine
WORKDIR /app
COPY --from=build /app/target/*.jar app.jar
EXPOSE 8080
CMD ["java", "-jar", "app.jar"]
The benefits for Java applications:
- No Build Tools: The final image doesn't include Maven or the JDK
- Smaller Base Image: JRE-only images are significantly smaller than JDK images
- Efficient Layer Caching: Dependencies are downloaded separately from compilation
- Alpine Base: Further reduces the image size
For even smaller Java images, you can create a custom JRE with jlink:
# Build stage
FROM maven:3.8-openjdk-17 AS build
WORKDIR /app
COPY pom.xml .
RUN mvn dependency:go-offline
COPY src ./src
RUN mvn package -DskipTests
# JRE creation stage (Alpine-based so the custom runtime matches the musl libc of the final image)
FROM eclipse-temurin:17-jdk-alpine AS jre-build
RUN jlink \
--add-modules java.base,java.logging,java.sql,java.desktop,java.management,java.naming,java.security.jgss,java.instrument \
--strip-debug \
--no-man-pages \
--no-header-files \
--compress=2 \
--output /javaruntime
# Run stage
FROM alpine:3.15
RUN apk --no-cache add ca-certificates
WORKDIR /app
COPY --from=jre-build /javaruntime /opt/java
COPY --from=build /app/target/*.jar app.jar
ENV PATH="${PATH}:/opt/java/bin"
EXPOSE 8080
CMD ["java", "-jar", "app.jar"]
This approach creates a minimal custom JRE with only the modules your application needs, resulting in a much smaller image.
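Rather than hand-picking modules, you can ask jdeps to compute the list from your built jar. A sketch (Spring Boot fat jars may need to be unpacked before jdeps can analyze the nested jars, and reflective dependencies may require manual additions):
# Print the minimal set of modules the application jar depends on
jdeps --ignore-missing-deps --print-module-deps --multi-release 17 app.jar
The printed list can then be passed directly to jlink's --add-modules flag.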
Best Practices for Multi-Stage Builds #
To get the most out of multi-stage builds, follow these best practices:
1. Order Layers by Frequency of Change #
Place infrequently changed operations first to maximize caching:
FROM node:16 AS build
WORKDIR /app
# Rarely changes
COPY package*.json ./
RUN npm ci
# Changes more frequently
COPY . .
RUN npm run build
2. Use Explicit Image Tags #
Avoid the latest tag to ensure build reproducibility:
# Good
FROM node:16.14.0-alpine3.15 AS build
# Avoid
FROM node:latest AS build
3. Name Your Stages #
Named stages improve readability and maintenance:
FROM golang:1.17 AS builder
# ...
FROM alpine:3.15 AS final
# ...
4. Use Small Base Images #
For runtime stages, prioritize smaller base images:
- Alpine Linux: alpine:3.15
- Distroless: gcr.io/distroless/static
- Scratch: scratch
5. Use Build Arguments for Flexibility #
FROM node:16-alpine AS build
# Declare the ARG inside the stage so RUN instructions can see it
ARG NODE_ENV=production
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build:${NODE_ENV}
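The default can then be overridden at build time with --build-arg (the tag name here is illustrative):
docker build --build-arg NODE_ENV=staging -t myapp:staging .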
6. Multi-architecture Builds #
Support multiple CPU architectures with build arguments:
ARG ARCH=amd64
FROM golang:1.17 AS build
# Redeclare the ARG inside the stage to make it visible to RUN instructions
ARG ARCH
RUN CGO_ENABLED=0 GOOS=linux GOARCH=${ARCH} go build -o /app/server .
# ...
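If you use BuildKit, docker buildx can build for several platforms in one command, and the TARGETARCH build argument is populated automatically inside the Dockerfile (this sketch requires the buildx plugin and a registry to push to; the image name is illustrative):
# Build and push images for both amd64 and arm64 in one go
docker buildx build --platform linux/amd64,linux/arm64 -t registry.example.com/myapp:latest --push .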
Measuring Image Size Improvements #
Let's compare image sizes for our Go application example:
┌───────────────────────────────────────────────────────────┐
│                     Docker Image Sizes                    │
│                                                           │
│  Single-Stage Go (golang:1.17)  ████████████████  1.07 GB │
│  Multi-Stage Go (alpine)        ▌                 15.6 MB │
│  Multi-Stage Go (scratch)       ▎                  7.2 MB │
└───────────────────────────────────────────────────────────┘
The results are dramatic:
- Single-stage Go image: ~1.07 GB
- Multi-stage Go image with Alpine: ~15.6 MB
- Multi-stage Go image with scratch: ~7.2 MB
That's a 99% reduction in image size!
Similar improvements can be seen with other languages:
- Node.js: 50-70% reduction
- Python: 40-60% reduction
- Java: 60-80% reduction
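You can reproduce comparisons like these yourself: docker images shows the final sizes, and docker history breaks a tag down layer by layer (the tag names here are illustrative):
# Compare the final sizes of the single-stage and multi-stage tags
docker images | grep go-app
# See how much each layer contributes to a given image
docker history go-app:multi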
Security Benefits of Multi-Stage Builds #
Beyond size optimization, multi-stage builds significantly improve security:
- Reduced Attack Surface: Fewer packages mean fewer potential vulnerabilities
- No Build Tools in Production: Compilers, build tools, and development dependencies can be exploited if present
- Minimal Runtime: Only the exact runtime dependencies needed to execute your application
- Separation of Concerns: Build secrets (like API keys for private package repositories) don't leak into the final image (see the sketch after this list)
- Regular Base Image Updates: Smaller images are easier to rebuild and update regularly
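On the point about build secrets: with BuildKit you can go further and mount a secret only for the RUN step that needs it, so it never lands in any layer of any stage. A sketch, assuming a private registry token stored in an .npmrc file:
# syntax=docker/dockerfile:1
FROM node:16 AS build
WORKDIR /app
COPY package*.json ./
# The secret is mounted only for this instruction and is not stored in the image
RUN --mount=type=secret,id=npmrc,target=/root/.npmrc npm ci
Build it with: DOCKER_BUILDKIT=1 docker build --secret id=npmrc,src=$HOME/.npmrc -t myapp .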
For even better security, combine multi-stage builds with non-root users:
FROM node:16-alpine
# Create app directory and non-root user
RUN mkdir -p /app && \
    addgroup -g 1001 appgroup && \
    adduser -u 1001 -G appgroup -h /app -D appuser
WORKDIR /app
# Copy artifacts from the build stage (assumes dev dependencies were pruned there, e.g. with npm prune --production)
COPY --from=build /app/dist ./dist
COPY --from=build /app/node_modules ./node_modules
USER appuser
EXPOSE 3000
CMD ["node", "dist/server.js"]
Cleanup #
After working with multi-stage builds and experimenting with different approaches, it's important to clean up your Docker environment to free up disk space and maintain a well-organized system. Multi-stage builds can create multiple intermediary images that consume storage space.
Removing Unused Images #
The most important cleanup task after experimenting with multi-stage builds is to remove unused images, especially those large builder images:
# List all images to see what's consuming space
docker images
# Remove specific images
docker rmi go-app:latest node-app:latest python-app:latest java-app:latest
# Remove intermediary images (those with <none> tags)
docker rmi $(docker images -f "dangling=true" -q)
Removing Containers #
If you've been testing your images, you may also have stopped containers that are still consuming resources:
# Remove all stopped containers
docker container prune
# Or remove specific containers
docker rm go-app-container node-app-container
Cleaning Up the Build Cache #
Multi-stage builds can accumulate build cache that takes up disk space:
# Clear the dangling build cache
docker builder prune
# Remove all unused build cache
docker builder prune --all
# Force removal without prompt
docker builder prune --force
Comprehensive Cleanup #
For a complete cleanup after your multi-stage build experiments:
# Remove all unused containers, networks, images (both dangling and unreferenced), and build cache
docker system prune -a
# Include volumes in the cleanup
docker system prune -a --volumes
Tracking Image Size Improvements #
If you want to keep track of the size improvements you've achieved with multi-stage builds:
# Create a report of image sizes before cleanup
docker images --format "{{.Repository}}:{{.Tag}} - {{.Size}}" > image-sizes.txt
Cleanup for Lab Examples #
If you've been following the examples from our lab:
# Remove the example images
docker rmi go-example node-example python-example java-example
# Clean up resources from docker-compose examples
cd /path/to/lab4_multi_stage_build_example
docker-compose down --rmi all
Regular cleanup after experimenting with multi-stage builds ensures your system remains efficient and prevents wasted disk space on unused builder images.
Conclusion #
Multi-stage builds are an essential technique for creating efficient, secure Docker images. By separating the build environment from the runtime environment, you can dramatically reduce image size, improve security, and streamline your CI/CD pipelines.
In this guide, we've explored:
- The fundamentals of multi-stage builds
- Practical examples for Go, Node.js, Python, and Java applications
- Best practices for optimizing your builds
- Quantifiable benefits in terms of image size
- Security improvements from using multi-stage builds
For any containerized application, multi-stage builds should be considered the default approach. The benefits in terms of size, security, and efficiency make them a critical part of any Docker workflow.
In the next article in our Docker Practical Guide series, we'll explore image management best practices, including tagging strategies, registry interactions, and image optimization techniques. Stay tuned!
Are you using multi-stage builds in your Docker workflow? Share your experiences or ask questions in the comments below!