Your Docker images often weigh ten times more than your actual application, and docker images doesn't tell you why.
A Docker image works like a mille-feuille pastry. A layer is what each Dockerfile instruction adds: a RUN, a COPY, an ADD. Each layer sits on top of the previous one, and everything you write into a layer stays there, even if the next layer deletes it. That's why an image ends up at 1.5 GB when your application is only 80 MB: you're shipping caches, build files, and dependencies that should never have left your dev machine.
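Want to see that in action? A throwaway two-instruction Dockerfile makes the point, a minimal sketch you can build locally:

```dockerfile
FROM alpine:3.19
# This layer writes 100 MB of zeros and commits them forever.
RUN dd if=/dev/zero of=/big.bin bs=1M count=100
# This layer only records a deletion; the 100 MB above still ships.
RUN rm /big.bin
```

Build it and run docker images: the result stays roughly 100 MB heavier than bare alpine, even though the file no longer exists in the final filesystem.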
Dive is the tool that opens an image and shows you, layer by layer, what's inside. It computes an efficiency score, the percentage of files that are actually useful, and points to files that were added then deleted, the ones weighing the image down for nothing. By the end, you'll have analyzed a real image, identified the three most common sources of waste, and applied the fixes that shrink images by 30–70 % without changing a line of your application.
Dive ships as a binary, a single executable file, no system-level installation. On macOS via Homebrew:
```bash
brew install dive
```

On Linux (Debian/Ubuntu), via the official .deb package:

```bash
DIVE_VERSION=$(curl -sL "https://api.github.com/repos/wagoodman/dive/releases/latest" | grep '"tag_name":' | sed -E 's/.*"v([^"]+)".*/\1/')
curl -OL https://github.com/wagoodman/dive/releases/download/v${DIVE_VERSION}/dive_${DIVE_VERSION}_linux_amd64.deb
sudo apt install ./dive_${DIVE_VERSION}_linux_amd64.deb
```

If you don't want to install anything, run Dive directly via Docker:

```bash
docker run --rm -it \
    -v /var/run/docker.sock:/var/run/docker.sock \
    wagoodman/dive:latest <your-image>:<tag>
```

Once installed, pick any local image and run:

```bash
dive node:20
```

Dive pulls the image if needed, inspects it, then opens a two-pane interface. Left pane: the list of layers with their size. Right pane: the file tree of the selected layer. At the bottom, a panel summarizes the whole image:

Two metrics matter here: the efficiency score (the percentage of bytes in the image that are actually useful in the final filesystem) and the potential wasted space (the cumulative size of files duplicated, overwritten, or deleted across layers).
A first surprise: Dive only responds to the keyboard, not the mouse. Navigate with the arrow keys. Tab switches panes. Ctrl+Space collapses the file tree. The most useful key is Ctrl+U: it filters the tree to show only the files added or modified by the selected layer. You immediately see what each Dockerfile instruction actually cost you.
Takeaway: run dive <image>, read the score at the bottom, open the heaviest layers with Ctrl+U. That's the basic move, to repeat on every image.
Before you look at size, you can also lint your Dockerfile with Dockle to catch structural anti-patterns first. Then open one of your own images. Be honest with yourself: on up-to-date official images (node:20-slim, python:3.12-slim), you'll gain little; they're already optimized. The real wins are on your images, where waste piles up across PRs. Here are the three sources Dive surfaces most often.
On a Debian-based image, an apt-get install without cleanup leaves /var/lib/apt/lists/ behind (often 30 to 50 MB). In Dive, those files show up in yellow inside the RUN apt-get layer. The fix fits in one line:
```dockerfile
RUN apt-get update && apt-get install -y --no-install-recommends \
    curl ca-certificates \
    && rm -rf /var/lib/apt/lists/*
```

The critical part is the &&: everything must happen in the same RUN instruction. If you clean up in a separate RUN, the previous layer still keeps the files; Docker can't retroactively remove what's already committed to a layer. That's the trap that makes a lot of "Dockerfiles with cleanup" actually clean up nothing at all.
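For contrast, here's a minimal sketch of the broken variant, the one where the cleanup arrives one layer too late:

```dockerfile
# Anti-pattern: this layer commits /var/lib/apt/lists/ (30 to 50 MB).
RUN apt-get update && apt-get install -y --no-install-recommends curl ca-certificates
# This layer only masks those files; the bytes above still ship.
RUN rm -rf /var/lib/apt/lists/*
```

Dive flags exactly these added-then-deleted files and counts them in the potential wasted space.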
Look at the COPY . . layer: if you see node_modules, .git, a local dist/ build, or .env files, you're copying your entire working directory into the image. The fix is a .dockerignore file at the project root:
```
.git
node_modules
dist
*.log
.env
.env.*
coverage
```

Rebuild, run dive again: the COPY layer should be noticeably lighter, and the score climbs back up. On a typical Node project, this is often the fastest win: a few hundred megabytes saved for two minutes of work.
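To check that the ignore file is doing its job, one trick (a sketch; the context-check tag is arbitrary) is to copy the build context into a throwaway image and measure it:

```bash
# Build a minimal image whose only content is the build context,
# then measure what COPY . . would actually pull in.
docker build -f- -t context-check . <<'EOF'
FROM busybox
COPY . /context
EOF
docker run --rm context-check du -sh /context
```

If the number is still suspiciously large, something is slipping past your .dockerignore.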
If your image contains gcc, make, your sources and the compiled binary, you're shipping the toolchain to production. Dive makes it obvious: the layer that installs the build tools often weighs more than the application itself. The fix is a multi-stage build: a Dockerfile with multiple FROM stages, where only the last one ends up in the published image.
```dockerfile
FROM node:20 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# final image
FROM node:20-slim
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/package*.json ./
RUN npm ci --omit=dev
CMD ["node", "dist/server.js"]
```

Only the FROM node:20-slim stage ends up in the published image. gcc, sources, dev dependencies, and the builder's npm cache stay in an intermediate stage that's never shipped. Switching the final base from node:20 (~1.1 GB) to node:20-slim (~240 MB) saves several hundred more megabytes.
To validate every change without re-running Dive by hand, the tool has a CI mode built for this:
```bash
CI=true dive <your-image>:<tag> --highestUserWastedPercent 0.05
```

This command exits non-zero if more than 5 % of the image is waste. It's what you wire into your CI/CD pipeline, the automated chain of tests and deployments, on GitLab CI or GitHub Actions (the two most common platforms), to prevent an image from regressing on size without anyone noticing.
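As a sketch, the corresponding job step could look like this (the image name is a placeholder, and CI_COMMIT_SHORT_SHA assumes GitLab CI; Dive can also read these thresholds from a .dive-ci file):

```bash
#!/bin/sh
set -e
# Build the image, then let Dive's CI mode fail the job
# when wasted space exceeds the agreed threshold.
IMAGE="my-app:${CI_COMMIT_SHORT_SHA:-dev}"
docker build -t "$IMAGE" .
CI=true dive "$IMAGE" --highestUserWastedPercent 0.05
```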
Once the three patterns above are in place, the next level is the distroless image. A distroless image only contains your application and the runtime needed to execute it, no shell, no package manager, not even ls or cat. Google maintains the main ones (gcr.io/distroless/...), and the goal is simple: anything that isn't your code shouldn't be there.
The multi-stage build from above becomes:
```dockerfile
FROM node:20 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
# the final stage has no npm, so drop dev dependencies here,
# before node_modules gets copied across
RUN npm prune --omit=dev

# final image
FROM gcr.io/distroless/nodejs20-debian12
WORKDIR /app
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/dist ./dist
CMD ["dist/server.js"]
```

On a Node project, the final image typically drops from ~240 MB (node:20-slim) to ~150 MB (distroless/nodejs20). Run dive on both versions to measure the real gain on your own application.
Be honest about the trade-off: debugging a distroless container is harder. You can't run docker exec -it <container> bash to poke around: there is no bash. No curl, no ps, no vi. If your production crashes at 3 a.m. and your only reflex is to jump into the container interactively, this is going to hurt. Two workarounds exist:
- the :debug image variant (e.g. gcr.io/distroless/nodejs20-debian12:debug), which includes a minimal busybox shell;
- ephemeral debug containers (kubectl debug on Kubernetes), which attach a throwaway tooling container to the running pod without touching your image.

Concretely: switch to distroless once the three previous patterns are in place and your observability holds up. Not before. You gain on size and on attack surface, but you gain nothing if every incident forces you to rebuild a :debug variant just to investigate.
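When an incident does call for it, the first workaround looks like this, a quick sketch against the image from above:

```bash
# The default entrypoint is the node runtime, so override it with sh
# to reach the busybox shell that only the :debug variant ships.
docker run --rm -it --entrypoint=sh \
    gcr.io/distroless/nodejs20-debian12:debug
```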
Dive doesn't shrink your images automatically. It shows you where the waste hides so you can act in the right place. The habit to build: before pushing an image to production, run dive on it, check the score, open the heaviest layers with Ctrl+U, and ask yourself whether each file really needs to be there. In most cases, you'll find an apt cache, a stray node_modules, or forgotten build tools within minutes.
To go further, wire Dive into CI with a threshold your team accepts (5 % is a good starting point), move your Dockerfiles to multi-stage builds, and look at distroless or Alpine base images when your dependencies allow it. A well-optimized image isn't just a comfort: it means less bandwidth on pulls, faster deployments, and a smaller attack surface. On the security side, pair this work with scanning your images for vulnerabilities using Trivy. And if you're still evaluating container runtimes, see the Docker vs Podman comparison.
Thanks for following along on this journey! 🍺