Multi-stage builds

🧱 You can only use one base image

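A build stage has exactly one base image: you cannot merge two base images into a single stage.

Every stage begins with a FROM instruction (only comments, parser directives, and ARG may come before the first one). A Dockerfile can contain several FROM instructions, but each one starts a separate build stage rather than adding to the previous one.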

Every Docker image starts from a base image, which forms the initial filesystem layer. You can think of it as the “root disk” for the image.

So a Dockerfile like:


FROM ubuntu:22.04
RUN echo "Hello"

Means: start from a clean Ubuntu 22.04 OS image → add my stuff on top.

You can’t do this:


FROM ubuntu:22.04
FROM node:20

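This does not merge the two images. The second FROM starts a new stage, and the final image is built from the last stage, so the result is based on node:20 and nothing from the ubuntu:22.04 stage ends up in it.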

Multi-stage builds, however, give you a controlled way to combine the output of several base images.

✅ How to Combine Multiple Base Images: Multi-Stage Builds

To draw on multiple images, write one stage per image, each with its own FROM, and copy files from earlier stages into later ones.

🔧 Example: Compile in one image, run in another


# Stage 1: Build Go binary using golang image
FROM golang:1.21 AS builder
WORKDIR /app
COPY . .
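# CGO_ENABLED=0 produces a static binary that does not depend on the builder image's glibc
RUN CGO_ENABLED=0 go build -o myapp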
 
# Stage 2: Run using slim image
FROM debian:bullseye-slim
COPY --from=builder /app/myapp /usr/local/bin/myapp
ENTRYPOINT ["myapp"]

⚙️ What’s Happening?

You start from golang:1.21 to build the binary — this image has all the compilers/tools.
Then you switch to a clean debian:bullseye-slim image for the final runtime.
You copy just the result (the myapp binary), not the entire environment.

Because each stage has its own filesystem, the build tools and the runtime never have to coexist in one image.
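
If the binary is statically linked, the runtime stage can shrink further. The following is a minimal sketch, assuming the program is pure Go (no cgo) and needs nothing from the OS at runtime, such as CA certificates or timezone data. scratch is Docker's empty base image.

```dockerfile
# Stage 1: build a statically linked binary
FROM golang:1.21 AS builder
WORKDIR /app
COPY . .
# CGO_ENABLED=0 avoids linking against the builder image's C library
RUN CGO_ENABLED=0 go build -o myapp

# Stage 2: empty base image, so the final image holds only the binary
FROM scratch
COPY --from=builder /app/myapp /myapp
ENTRYPOINT ["/myapp"]
```

The trade-off is that the final image has no shell or package manager, which makes it small but harder to debug interactively.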

Multi-stage vs. layer

Concept	What it means	When it’s used
Layer	A snapshot of the filesystem changes made by one instruction (`RUN`, `COPY`, `ADD`); other instructions add only metadata	Always, for every Dockerfile
Multi-stage build	A build process that uses multiple `FROM` instructions, each starting a new, isolated build stage	Used only if you need to separate concerns (e.g., build vs runtime)

🧱 Docker Layers

Layers are the core building blocks of an image.

Every instruction like this:


RUN apt-get update
COPY . /app

…creates a new layer. These layers are:

Immutable (can’t be changed later)
Stacked to form an image
Cached (to speed up builds)

🔍 Use docker history <image> to inspect layers.

🧰 Multi-Stage Builds

Multi-stage builds are a design pattern that uses multiple FROM instructions to create isolated sets of layers, typically to:

Separate build-time dependencies from runtime code
Reduce image size
Avoid shipping compilers, debuggers, temp files, etc.


# Stage 1: build the frontend
FROM node:20 AS builder
WORKDIR /app
COPY . .
# Each RUN below creates a layer in stage 1
RUN npm install
RUN npm run build

# Stage 2: final image
FROM nginx:alpine
COPY --from=builder /app/dist /usr/share/nginx/html

👆 Here:

You still have layers within each stage.
Each stage has its own set of layers.
Only the final stage is shipped in the final image.

Imagine a multi-stage build as a project with multiple folders, each representing a stage (build, test, deploy).

Inside each folder, you’re still creating files (layers). But when you’re done, you only zip up one folder (the final stage) and deploy it.

✅ When to Use What?

Use layers smartly for efficient builds (cache, order, combining commands).
Use multi-stage builds when:
- You need to build something but don’t want the tools in the final image.
- You want clean separation between environments (e.g., frontend/backend).
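
As a sketch of what "combining commands" means for layers: because layers are immutable, files deleted in a later instruction still take up space in the earlier layer. The package names here are only placeholders.

```dockerfile
FROM debian:bookworm-slim

# One RUN, one layer: the package index is fetched, used, and deleted
# within the same instruction, so it never persists in any layer.
RUN apt-get update \
 && apt-get install -y --no-install-recommends curl ca-certificates \
 && rm -rf /var/lib/apt/lists/*

# Had the cleanup been a separate "RUN rm -rf /var/lib/apt/lists/*",
# the index files would still occupy space in the layer above it.
```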

🔄 Multi-Stage Build: How It Works

✅ Basic Mechanics

Each FROM starts a new build stage.
You can name a stage with AS <name> (AS builder, AS base, etc.).
Later stages can copy artifacts from earlier stages via COPY --from=<stage>.

This lets you build in one stage (which might be large and heavy), and then copy only needed output into a smaller, clean runtime stage.

📁 Example


# Stage 1: Build
FROM node:20 AS builder
WORKDIR /app
# Copy only the package manifests first so the dependency layer can be cached
COPY package*.json ./
# Install deps (reused from cache while the manifests are unchanged)
RUN npm install
# Copy the rest of the project (source code, etc.)
COPY . .
# Build the project using the full source
RUN npm run build
 
# Stage 2: Runtime
FROM nginx:alpine
COPY --from=builder /app/dist /usr/share/nginx/html

🔍 What’s Happening in Stage 1

Builds the frontend app inside a Node image.

When the stage finishes, /app in its filesystem holds the source, node_modules, and the build output in /app/dist, and later stages can copy any of it.

More about what `COPY . .` means

General Form:


COPY <src> <dest>

So:


COPY . .

means:

Copy everything from the root of the build context (the directory you pass to docker build, often . on your local machine)
Into the current working directory of the image being built, which is /app in this case (because of WORKDIR /app earlier)

📁 What Is “Build Context”?

The build context is the directory you pass to docker build, usually the current directory:


docker build -t my-app .

That . at the end is the build context.

So COPY . . copies everything (except what’s in .dockerignore) from that directory into the image.

Why split the two `COPY` steps?

Layer caching optimization:
- Docker reuses the cached npm install layer as long as the files copied before it (package.json and any lock file) haven’t changed.
- If you had used COPY . . from the start, then changing any source file would invalidate the cache and force a full reinstall.
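
For contrast, here is a sketch of the unsplit variant of stage 1, where any source edit invalidates the install layer:

```dockerfile
FROM node:20 AS builder
WORKDIR /app
# Everything is copied at once, so editing any file in the context
# changes this layer's cache key...
COPY . .
# ...and every instruction after it, including the install, runs again.
RUN npm install
RUN npm run build
```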

🔐 What Actually Gets Copied?

Whatever isn’t excluded by .dockerignore. Example:

.dockerignore


node_modules
*.log
.git
dist

That prevents unnecessary files from bloating your image or leaking secrets.

Stage 2 starts fresh from the nginx:alpine image; nothing from stage 1 is present unless you copy it.

It uses COPY --from=builder to pull /app/dist out of stage 1’s filesystem.

🧱 How Are the Stages Connected?

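Every stage produces an intermediate filesystem, whether or not it is named. AS builder only gives the stage a name that later instructions in the same Dockerfile can refer to; it does not tag an image.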

Later stages can use:


COPY --from=builder /path/in/builder /path/in/current

You can also refer to a stage by its index, counting from 0 in the order the FROM instructions appear:


COPY --from=0 /some/path /dest

🗂️ Path Resolution Rules

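Write source paths as absolute paths in the source stage’s filesystem (/app/dist, /usr/local/bin, etc.); relative destination paths are resolved against the current stage’s WORKDIR.
COPY --from gives you only the other stage’s filesystem, not its ENV or ARG values and not the build context.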

📌 If /app/dist was never created in the first stage, the build will fail.
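
A minimal sketch of the "filesystem only" rule. The names APP_VERSION, BUILD_MODE, and /out/version.txt are made up for illustration.

```dockerfile
# Declared before the first FROM: not visible inside a stage until redeclared there
ARG APP_VERSION=1.0.0

FROM alpine:3.19 AS builder
# Redeclare without a value to bring the global ARG into this stage
ARG APP_VERSION
ENV BUILD_MODE=release
RUN mkdir -p /out && echo "$APP_VERSION ($BUILD_MODE)" > /out/version.txt

FROM alpine:3.19
# Neither APP_VERSION nor BUILD_MODE is set in this stage;
# only the file written above crosses the stage boundary.
COPY --from=builder /out/version.txt /version.txt
CMD ["cat", "/version.txt"]
```

If a later stage needs a value from an earlier one, either redeclare the ARG in that stage or write the value to a file and copy it, as done here.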

💾 Does Docker Cache Stage Results?

Yes. Docker caches the steps of each stage, keyed on:

Base image digest (FROM)
The text of each instruction (RUN, COPY, etc.)
Checksums of the build-context files that COPY and ADD bring in

If nothing that feeds an earlier stage has changed, Docker reuses its cached layers and skips rebuilding that stage.

You can verify this in the build log, where BuildKit marks reused steps as CACHED:


docker build --progress=plain .

Or inspect image layers with:


docker history <image-name>

🔄 Advanced: Reusing Stages Selectively

You can build up to a specific stage (e.g. for debugging):


docker build --target builder -t debug-image .
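
A stage can also start FROM an earlier stage, which pairs well with --target. The following is a sketch with hypothetical stage names (base, test, build, runtime), assuming the project defines npm test and npm run build scripts.

```dockerfile
# Shared foundation: dependencies plus source
FROM node:20 AS base
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .

# Builds on base; select it with: docker build --target test .
FROM base AS test
RUN npm test

# Also builds on base, independently of the test stage
FROM base AS build
RUN npm run build

# Final image: only the build output is copied in
FROM nginx:alpine AS runtime
COPY --from=build /app/dist /usr/share/nginx/html
```

With BuildKit, a plain docker build . skips the test stage because the final stage does not depend on it, so tests only run when you target that stage explicitly.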

BuildKit builds independent stages in parallel. You can also externalize a shared stage into its own Dockerfile, push the resulting image, and copy from it by image reference instead of by stage name:


COPY --from=registry.example.com/my-shared-builder:1.0 /opt/tool /opt/tool

🧨 What If You Need Conflicting Tools?

If two base images offer different environments (e.g. node:20 vs python:3.12), you have three choices:

Find or build a combined base image that includes both:


FROM python:3.12
RUN curl -sL https://deb.nodesource.com/setup_20.x | bash - && \
    apt-get install -y nodejs

Use multi-stage to build Node and Python parts separately:


FROM node:20 AS node_builder
WORKDIR /frontend
COPY frontend .
RUN npm install && npm run build

FROM python:3.12-slim
WORKDIR /app
COPY backend .
COPY --from=node_builder /frontend/dist ./static

Split into separate containers in a multi-container setup, e.g., using Docker Compose or Kubernetes, with each container running a different stack.