# Chapter 1: Docker Basics — Containers, Images, and the Docker Workflow
## What Problem Docker Actually Solves
Before Docker, deploying software meant managing a fragile agreement between your code and the machine running it. "It works on my machine" wasn't a joke — it was a weekly incident. Libraries conflicted, environment variables drifted, OS versions diverged between dev and prod.
Docker solves this with a single guarantee: the environment travels with the code. You package your app, its runtime, its dependencies, and its config into one artifact. That artifact runs identically on your laptop, on CI, and in production.
## Containers vs. Virtual Machines
The most common confusion when learning Docker is treating containers like lightweight VMs. They're not.
A virtual machine emulates an entire computer — hardware included. Each VM runs its own OS kernel, which means a gigabyte or more of overhead per instance, boot times measured in tens of seconds, and strict resource isolation.
A container shares the host OS kernel. It uses Linux primitives — namespaces (process, network, mount, user isolation) and cgroups (CPU, memory limits) — to create isolated environments without virtualizing hardware.
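These primitives are observable on any Linux host, not just inside Docker. As a quick sketch (this assumes a Linux machine, since the `/proc/<pid>/ns` files are kernel-provided):

```shell
# Every process has a set of namespace handles under /proc/<pid>/ns.
ls -l /proc/self/ns

# Each handle resolves to an ID like "pid:[4026531836]". Two processes in
# the same namespace resolve to the same ID; a containerized process gets
# fresh IDs here, which is what isolates it from the host.
readlink /proc/self/ns/pid
```

In the same spirit, cgroup controls (CPU and memory limits) live under `/sys/fs/cgroup`.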
| | VM | Container |
|---|---|---|
| Boot time | 30–60 s | < 1 s |
| Size | GBs | MBs |
| OS overhead | Full kernel per VM | Shared host kernel |
| Isolation | Hardware-level | Process-level |
| Use case | Full OS isolation | App + dependency packaging |
Containers are faster to start, cheaper to run, and easier to move. The trade-off: they share the host kernel, so a kernel vulnerability affects all containers on that host.
## Images and Containers
Two terms you'll use every day:
- **Image** — a read-only, layered snapshot of a filesystem. Think of it as a blueprint. Images are built from a `Dockerfile` and stored in a registry.
- **Container** — a running instance of an image. You can run many containers from one image simultaneously.
The relationship: Image → Container is like Class → Instance in OOP, or `Dockerfile` → `docker run` in practice.
### Image Layers
Docker images are built in layers. Each instruction in a Dockerfile adds a new layer on top of the previous one. Layers are cached — if a layer hasn't changed, Docker reuses it during rebuilds.
```
Image: my-app:latest
├── Layer 1: ubuntu:22.04 (base OS)
├── Layer 2: apt-get install nodejs (runtime)
├── Layer 3: COPY package.json + npm install (dependencies)
└── Layer 4: COPY . . (application code)
```
This layering is why you should put frequently changing instructions (like `COPY . .`) at the bottom of your Dockerfile — it preserves the cache for the stable layers above.
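To make the ordering rule concrete, here is a sketch of a cache-friendly Dockerfile for a typical Node.js project (the file layout is an assumption for illustration):

```dockerfile
FROM node:20-alpine
WORKDIR /app

# Dependency layers first: package files change rarely, so these
# layers stay cached across most rebuilds
COPY package*.json ./
RUN npm ci

# Application code last: it changes on every commit, but only
# invalidates the layers from this point down
COPY . .
```

If `COPY . .` came before `RUN npm ci`, every source edit would force a full dependency reinstall.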
## Writing Your First Dockerfile
A Dockerfile is a plain-text recipe for building an image. Here's a minimal Node.js example:
```dockerfile
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --only=production
COPY . .
EXPOSE 3000
CMD ["node", "src/index.js"]
```

Breaking this down instruction by instruction:
| Instruction | What it does |
|---|---|
| `FROM` | Sets the base image. Always the first instruction. |
| `WORKDIR` | Sets the working directory inside the container. Creates it if missing. |
| `COPY` | Copies files from the build context (your local machine) into the image. |
| `RUN` | Executes a shell command during the build. The result is committed as a new layer. |
| `EXPOSE` | Documents which port the container listens on. Does not publish it. |
| `CMD` | Default command when the container starts. Only one `CMD` takes effect. |
### RUN vs CMD vs ENTRYPOINT
These three are the most confused instructions:
- `RUN` executes at build time — it modifies the image. Use it to install packages or compile code.
- `CMD` executes at run time — it's the default process for the container. It can be overridden with `docker run <image> <command>`.
- `ENTRYPOINT` also executes at run time, but it's harder to override. Use it when your container should always run a specific executable (e.g., a CLI tool); `CMD` then becomes the default arguments.
```dockerfile
ENTRYPOINT ["npm", "run"]
CMD ["start"]

# docker run my-app      → runs: npm run start
# docker run my-app test → runs: npm run test
```

## Core Docker Commands
### Building an Image
```shell
docker build -t my-app:latest .
```

- `-t my-app:latest` — name and tag the image (`name:tag`)
- `.` — the build context (the directory Docker reads files from)
```shell
docker build -t my-app:1.0.0 --no-cache .
```

`--no-cache` forces a fresh build, skipping all layer caches. Useful when a dependency update isn't being picked up.
### Running a Container
```shell
docker run -d -p 3000:3000 --name my-app my-app:latest
```

| Flag | Meaning |
|---|---|
| `-d` | Detached mode — run in the background |
| `-p 3000:3000` | Map host port 3000 → container port 3000 |
| `--name my-app` | Give the container a human-readable name |
```shell
docker run -it ubuntu:22.04 bash
```

`-it` = interactive + pseudo-TTY. Opens a shell inside the container. Useful for debugging.
### Managing Containers
```shell
docker ps             # list running containers
docker ps -a          # list all containers (including stopped)
docker stop my-app    # gracefully stop (sends SIGTERM)
docker kill my-app    # force stop (sends SIGKILL)
docker rm my-app      # remove a stopped container
docker rm -f my-app   # force remove (stop + remove in one step)
```

### Managing Images
```shell
docker images                 # list local images
docker pull node:20-alpine    # download an image from a registry
docker rmi my-app:latest      # remove an image
docker image prune            # remove all dangling (untagged) images
```

### Inspecting and Debugging
```shell
docker logs my-app          # view container stdout/stderr
docker logs -f my-app       # follow logs in real time
docker exec -it my-app sh   # open a shell in a running container
docker inspect my-app       # full JSON config and state of a container
docker stats                # live CPU/memory usage across all containers
```

`docker exec` is your primary debugging tool. When something goes wrong in production, this is how you get inside a running container to investigate without restarting it.
## The `.dockerignore` File
Just like `.gitignore`, a `.dockerignore` file tells Docker which files to exclude from the build context. This matters for two reasons: build speed (sending fewer files to the Docker daemon) and security (not leaking secrets into the image).
```
# .dockerignore
node_modules
.git
.env
*.log
dist
coverage
```
Without this, `docker build` ships your entire `node_modules` directory to the daemon as part of the build context — even though you're installing fresh inside the container. On a large project that's hundreds of megabytes of unnecessary file transfer.
## Multi-Stage Builds
Multi-stage builds solve a common problem: your build tools (compilers, test frameworks, dev dependencies) shouldn't end up in the final production image.
```dockerfile
# Stage 1: build
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Stage 2: production
FROM node:20-alpine AS production
WORKDIR /app
COPY package*.json ./
RUN npm ci --only=production
COPY --from=builder /app/dist ./dist
EXPOSE 3000
CMD ["node", "dist/index.js"]
```

The final image contains only what's in the `production` stage — no TypeScript compiler, no dev dependencies, no source files. This typically cuts image size by 50–80%.
## Pushing to a Registry
Docker Hub is the default public registry. Private registries like AWS ECR, GCR, or GitHub Container Registry follow the same workflow.
```shell
# log in
docker login

# tag with registry path
docker tag my-app:latest yourusername/my-app:latest

# push
docker push yourusername/my-app:latest

# pull on another machine
docker pull yourusername/my-app:latest
```

For private registries, the registry hostname is part of the tag:
```shell
docker tag my-app:latest 123456789.dkr.ecr.ap-southeast-1.amazonaws.com/my-app:latest
docker push 123456789.dkr.ecr.ap-southeast-1.amazonaws.com/my-app:latest
```

## Common Pitfalls
**Running as root.** By default, processes inside a container run as root. That's a security risk. Add a `USER` instruction to your Dockerfile:
```dockerfile
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser
```

**Storing secrets in the image.** Never use `ENV MY_SECRET=value` in a Dockerfile for sensitive data — it's baked into the image and visible in `docker history`. Pass secrets at runtime via `-e` flags or a secrets manager.
**One process per container.** Containers are designed around a single main process. If that process exits, the container stops. Avoid running multiple services (e.g., app + database) in one container — that's what Compose is for.
**Using `latest` in production.** The `latest` tag is mutable — it points to whatever was pushed most recently. Pin to a specific version (`node:20.11.0-alpine`) for reproducible builds.
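For even stricter reproducibility, you can pin the base image by digest as well as by tag; the digest is immutable even if someone re-pushes the tag. A sketch, with the digest value left as a placeholder since it depends on the exact image you verified:

```dockerfile
# The tag is for humans; the digest is what's actually enforced.
# Replace <digest> with the sha256 value from `docker images --digests`.
FROM node:20.11.0-alpine@sha256:<digest>
```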
## What's Next
You can now build images, run containers, and understand what's happening under the hood. But a real application is never just one container — it has a database, a cache, a message queue.
Before we reach Compose, we need to understand how containers talk to each other and how they persist data. That's Chapter 2: Networking & Volumes.