This post is Part 2 of the Docker for Web Devs series. If you haven’t read the previous part, check it out first:
Docker for Web Devs: Part 1 - Getting Started
Understanding Docker Layer Caching
In the previous example, we created a working Dockerfile, but it is not optimized. Docker builds images in layers, with each instruction creating a new layer. When you rebuild an image, Docker can reuse unchanged layers from its cache.
Think of Docker images like a LEGO structure, where each instruction in your Dockerfile adds a new brick to the stack:
- Base Image Layer: The `FROM` instruction provides the foundation plate.
- Filesystem Changes: Instructions like `RUN`, `COPY`, and `ADD` stack new LEGO bricks on top.
- Metadata Layers: Instructions like `EXPOSE`, `ENV`, and `CMD` are like decorative pieces that don’t change the structure but add details.
The crucial part is that Docker can only replace bricks from the top down. If you change a brick in the middle of your structure, Docker must rebuild that brick and everything above it, even if the higher bricks haven’t changed themselves.
graph BT
    A["RUN npm ci"]
    B["COPY . ."]
    C["WORKDIR"]
    D["FROM"]
    D --> C --> B --> A
    style A fill:#e2e8f0,stroke:#64748b
    style B fill:#e2e8f0,stroke:#64748b
    style C fill:#e2e8f0,stroke:#64748b
    style D fill:#e2e8f0,stroke:#64748b
Figure 1: Docker layers build from bottom to top
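To make the "rebuild everything above" rule concrete, here is a minimal Python sketch of the idea. It is a toy model, not how Docker actually computes its cache keys: the helper names `layer_keys` and `reusable_layers` are made up for this example, and the only assumption is that each layer's key depends on its own instruction and on the key of the layer below it.

```python
import hashlib


def layer_keys(instructions):
    """Return one toy cache key per instruction.

    Each key hashes the instruction together with the parent layer's key,
    so changing one instruction changes every key stacked on top of it.
    """
    keys = []
    parent = ""
    for instruction in instructions:
        parent = hashlib.sha256((parent + "\n" + instruction).encode()).hexdigest()
        keys.append(parent)
    return keys


def reusable_layers(old, new):
    """Count how many layers, starting at the bottom, still match the old build."""
    count = 0
    for old_key, new_key in zip(layer_keys(old), layer_keys(new)):
        if old_key != new_key:
            break
        count += 1
    return count


before = ["FROM node:22-alpine", "WORKDIR /app", "COPY . .", "RUN npm ci"]
# Hypothetical edit: only the second instruction changes.
after = ["FROM node:22-alpine", "WORKDIR /srv", "COPY . .", "RUN npm ci"]

print(reusable_layers(before, after))  # 1: only the FROM layer is reused
```

Even though the `COPY` and `RUN` lines are identical in both lists, their keys differ because the brick underneath them changed.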
Cache Invalidation
Let’s look at our previous Node.js Dockerfile from Part 1:
FROM node:22-alpine
WORKDIR /app
# We copy everything at once
COPY . .
# Then install dependencies
RUN npm ci
What happens if we make a code change?
graph BT
    A["RUN npm ci"]
    B["COPY . ."]
    C["WORKDIR"]
    D["FROM"]
    D --> C --> B --> A
    style A fill:#fee2e2,stroke:#ef4444
    style B fill:#fee2e2,stroke:#ef4444
    style C fill:#e2e8f0,stroke:#64748b
    style D fill:#e2e8f0,stroke:#64748b
Figure 2: Code changes invalidate the COPY layer and above
After a source code change, the cache for our `COPY` instruction is invalidated. That in turn invalidates the `RUN` instruction that follows it, so all the packages are installed all over again.
This is wasteful: a one-line code change triggers a full dependency install.
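Why does a small edit invalidate the whole `COPY . .` step? For `COPY`, Docker decides whether the cache is still valid based on the contents of the files being copied. The sketch below imitates that with an in-memory "project". The `copy_checksum` helper and the file contents are hypothetical, and the real checksum algorithm is different, so treat it as an illustration only.

```python
import hashlib


def copy_checksum(files):
    """Toy digest over the paths and contents of everything a COPY would bring in."""
    digest = hashlib.sha256()
    for path in sorted(files):
        digest.update(path.encode())
        digest.update(b"\0")
        digest.update(hashlib.sha256(files[path]).digest())
    return digest.hexdigest()


# Hypothetical project contents, keyed by file name.
project = {
    "package.json": b'{"name": "demo"}',
    "package-lock.json": b'{"lockfileVersion": 3}',
    "app.js": b"console.log('v1');",
}

# Same project after a small edit to app.js.
edited = {**project, "app.js": b"console.log('v2');"}

copy_changed = copy_checksum(project) != copy_checksum(edited)
print(copy_changed)  # True: one edited file changes the digest for COPY . .
```

Because `COPY . .` takes in every file in the project, any edit at all is enough to change its digest, and with it every layer above.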
Build Optimization
How can we optimize our image?
In our case, we can copy the dependency files and install the dependencies before copying the rest of the application code:
FROM node:22-alpine
WORKDIR /app
# Copy dependency files first
COPY package.json package-lock.json ./
# Install dependencies
RUN npm ci
# Then copy application code
COPY . .
With the changes above, this is how Docker processes its layer cache after a source code change:
graph BT
    A["COPY . ."]
    B["RUN npm ci"]
    C["COPY deps"]
    D["WORKDIR"]
    E["FROM"]
    E --> D --> C --> B --> A
    style A fill:#fee2e2,stroke:#ef4444
    style B fill:#e2e8f0,stroke:#64748b
    style C fill:#e2e8f0,stroke:#64748b
    style D fill:#e2e8f0,stroke:#64748b
    style E fill:#e2e8f0,stroke:#64748b
Figure 3: Code changes invalidate only the final COPY layer; only dependency changes trigger a reinstall
This is much better: unless your dependencies change, Docker reuses the cached layers for them and the build is much faster!
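If you want to convince yourself of the difference, here is a small Python simulation of both instruction orders. It is a simplified model under stated assumptions: a step is a cache hit when its instruction, its copied file contents, and its parent step are all unchanged. The helpers `step_key` and `build`, and the file contents, are hypothetical and only loosely mirror what Docker does.

```python
import hashlib


def step_key(parent_key, instruction, inputs, files):
    """Toy cache key: parent key + instruction text + contents of copied files."""
    digest = hashlib.sha256(parent_key.encode())
    digest.update(instruction.encode())
    for name in sorted(inputs):
        digest.update(name.encode() + b"\0" + files[name])
    return digest.hexdigest()


def build(steps, files, cache):
    """Simulate a build and return the instructions that missed the cache."""
    rebuilt = []
    key = ""
    for instruction, inputs in steps:
        key = step_key(key, instruction, inputs, files)
        if key not in cache:
            rebuilt.append(instruction)
            cache.add(key)
    return rebuilt


MANIFESTS = ["package.json", "package-lock.json"]
ALL_FILES = MANIFESTS + ["app.js"]

original = [
    ("FROM node:22-alpine", []),
    ("WORKDIR /app", []),
    ("COPY . .", ALL_FILES),
    ("RUN npm ci", []),
]
optimized = [
    ("FROM node:22-alpine", []),
    ("WORKDIR /app", []),
    ("COPY package.json package-lock.json ./", MANIFESTS),
    ("RUN npm ci", []),
    ("COPY . .", ALL_FILES),
]

# Hypothetical file contents before and after editing app.js.
files_v1 = {
    "package.json": b'{"name": "demo"}',
    "package-lock.json": b'{"lockfileVersion": 3}',
    "app.js": b"console.log('v1');",
}
files_v2 = {**files_v1, "app.js": b"console.log('v2');"}

cache_original = set()
build(original, files_v1, cache_original)  # first build fills the cache
rebuilt_original = build(original, files_v2, cache_original)

cache_optimized = set()
build(optimized, files_v1, cache_optimized)  # first build fills the cache
rebuilt_optimized = build(optimized, files_v2, cache_optimized)

print(rebuilt_original)   # ['COPY . .', 'RUN npm ci']
print(rebuilt_optimized)  # ['COPY . .']
```

In the optimized order, `RUN npm ci` sits on top of a `COPY` that only sees the two dependency files, so editing `app.js` never reaches it.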
Visualizing the Difference
| Action | Original Dockerfile | Optimized Dockerfile |
|---|---|---|
| First build (fresh install) | 45s, full build | 45s, full build |
| Modify source code (change app.js) | 45s, reinstalls everything | 5s ✨ uses cached dependencies |
Want to make your Docker images even smaller and more secure? Learn about multi-stage builds in Part 3.