📦 What Are Containers?
Containers are lightweight, standalone, executable packages that include everything needed to run a piece of software: code, runtime, system tools, libraries, and settings.
Containers vs Virtual Machines
| Feature | Containers | Virtual Machines |
|---|---|---|
| Startup Time | Seconds (or milliseconds) | Minutes |
| Size | MBs (lightweight) | GBs (includes full OS) |
| Isolation | Process-level (shared kernel) | Full OS isolation |
| Resource Usage | Minimal overhead | Significant overhead |
| Portability | Very portable (same OS kernel) | Less portable (different hypervisors) |
| Use Case | Microservices, dev environments | Legacy apps, different OS requirements |
🔧 Linux Container Fundamentals
Containers rely on three core Linux kernel features:
1. Namespaces (Isolation)
Namespaces provide isolation for different system resources:
- PID Namespace: Process isolation - container sees its own process tree
- Network Namespace: Network stack isolation - separate IP addresses, routing tables
- Mount Namespace: Filesystem isolation - separate mount points
- UTS Namespace: Hostname and domain name isolation
- IPC Namespace: Inter-process communication isolation (message queues, semaphores)
- User Namespace: User/group ID isolation - root in container != root on host
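On a Linux host you can observe namespace isolation directly with `unshare` from util-linux - a minimal sketch, assuming root access on a Linux machine:

```shell
# Create new PID and mount namespaces; --fork is required so the new
# process becomes PID 1, and --mount-proc remounts /proc to match
sudo unshare --pid --fork --mount-proc /bin/bash

# Inside the new namespace, ps shows only this shell and its children:
ps aux
```

Container runtimes do essentially this (plus the other namespaces) on every `docker run`.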
2. Control Groups (cgroups) - Resource Limiting
cgroups limit and monitor resource usage:
- CPU: Limit CPU usage (e.g., --cpus="1.5")
- Memory: Set memory limits (e.g., --memory="512m")
- Disk I/O: Limit read/write operations
- Network: Bandwidth limiting
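The flags above map directly onto cgroup settings. A quick sketch (container and image names are illustrative):

```shell
# Limit a container to 1.5 CPUs and 512 MB of RAM; the daemon translates
# these flags into cgroup limits for the container's processes
docker run -d --cpus="1.5" --memory="512m" --name limited nginx:alpine

# Inspect what Docker recorded (NanoCpus and Memory, in bytes)
docker inspect limited --format '{{.HostConfig.NanoCpus}} {{.HostConfig.Memory}}'
```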
3. Union Filesystems (Layer Management)
Union filesystems (OverlayFS, AUFS) allow containers to share base image layers:
- Each Dockerfile instruction creates a new layer
- Layers are read-only and cached
- Only the top layer is writable (container layer)
- Dramatically reduces storage and improves build times
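You can see the per-instruction layers for any local image with `docker history` - a sketch, assuming the image has been pulled:

```shell
# Each row is one layer; SIZE shows what that Dockerfile instruction added
docker history python:3.11-slim

# Images built FROM the same base share its layers on disk,
# so a second image costs only the size of its own new layers
docker image ls
```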
🏗️ Docker Architecture
```mermaid
graph LR
    Client[Docker Client<br/>docker build<br/>docker run<br/>docker push] -->|REST API| Daemon[Docker Daemon<br/>dockerd]
    Daemon --> Images[Images]
    Daemon --> Containers[Containers]
    Daemon --> Networks[Networks]
    Daemon --> Volumes[Volumes]
    Daemon -->|pull/push| Registry[Docker Registry<br/>Docker Hub]
    style Client fill:#4A90E2,color:#2e3440
    style Daemon fill:#E74C3C,color:#2e3440
    style Registry fill:#F39C12,color:#2e3440
```
Key Components
- Docker Client: CLI tool that sends commands to the daemon
- Docker Daemon (dockerd): Background service that manages images, containers, networks, and volumes
- Docker Images: Read-only templates with instructions for creating containers
- Docker Containers: Runnable instances of images
- Docker Registry: Stores Docker images (Docker Hub, AWS ECR, Google GCR)
📝 Dockerfile Best Practices
Basic Dockerfile Example
```dockerfile
# Use official base image
FROM python:3.11-slim

# Set working directory
WORKDIR /app

# Copy requirements first (layer caching!)
COPY requirements.txt .

# Install dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY . .

# Create non-root user for security
RUN useradd -m -u 1000 appuser && chown -R appuser:appuser /app
USER appuser

# Expose port
EXPOSE 8000

# Health check (slim images ship without curl, so use Python's stdlib)
HEALTHCHECK --interval=30s --timeout=3s \
    CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')" || exit 1

# Run application
CMD ["python", "app.py"]
```
Multi-Stage Builds (Critical for Production)
Multi-stage builds dramatically reduce image size by separating build and runtime environments:
```dockerfile
# Stage 1: Build
FROM node:18 AS builder
WORKDIR /app
COPY package*.json ./
# Install ALL dependencies (dev deps are needed to build), then build,
# then strip dev deps so only production modules get copied forward
RUN npm ci
COPY . .
RUN npm run build
RUN npm prune --omit=dev

# Stage 2: Production runtime
FROM node:18-alpine
WORKDIR /app

# Copy only necessary files from builder
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./

# Run as non-root
USER node
EXPOSE 3000
CMD ["node", "dist/server.js"]
```
Dockerfile Optimization Techniques
1. Layer Caching Strategy
```dockerfile
# ❌ BAD: Changes to code invalidate dependency layer
FROM python:3.11-slim
COPY . .
RUN pip install -r requirements.txt
```

```dockerfile
# ✅ GOOD: Dependencies cached unless requirements.txt changes
FROM python:3.11-slim
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
```
2. Minimize Layer Count
```dockerfile
# ❌ BAD: Each RUN creates a new layer
RUN apt-get update
RUN apt-get install -y curl
RUN apt-get install -y git
RUN rm -rf /var/lib/apt/lists/*
```

```dockerfile
# ✅ GOOD: Combine into single layer
RUN apt-get update && \
    apt-get install -y curl git && \
    rm -rf /var/lib/apt/lists/*
```
3. Use .dockerignore
```
# .dockerignore
node_modules
.git
.env
*.log
.vscode
__pycache__
*.pyc
.pytest_cache
dist
build
```
4. Choose Minimal Base Images
| Base Image | Size | Use Case |
|---|---|---|
| ubuntu:22.04 | 77MB | Full-featured, debugging tools included |
| python:3.11-slim | 125MB | Minimal Python with basic tools |
| python:3.11-alpine | 50MB | Ultra-minimal (musl libc - compatibility issues possible) |
| distroless/python3 | 53MB | No shell, no package manager - maximum security |
🌐 Docker Networking
Network Drivers
| Driver | Description | Use Case |
|---|---|---|
| bridge | Default network. Containers on same bridge can communicate | Single-host container communication |
| host | Remove network isolation - container uses host's network | High performance, no port mapping overhead |
| overlay | Multi-host networking for Docker Swarm | Distributed applications across multiple hosts |
| none | No networking - complete isolation | Maximum security, offline processing |
| macvlan | Assign MAC address - container appears as physical device | Legacy applications expecting direct network access |
Networking Commands
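The most common network operations, sketched with illustrative container and image names:

```shell
# List networks (bridge, host, and none exist by default)
docker network ls

# Create a user-defined bridge; unlike the default bridge,
# it provides DNS resolution by container name
docker network create my-bridge

# Containers on the same network reach each other by name (e.g. http://api:5000)
docker run -d --network my-bridge --name api my-api:1.0
docker run -d --network my-bridge -p 80:80 --name web nginx:alpine

# Attach a running container, inspect, and clean up
docker network connect my-bridge existing-container
docker network inspect my-bridge
docker network rm my-bridge
```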
The default bridge driver in action - containers resolve each other by name inside the network, and only mapped ports are reachable from outside:

```mermaid
graph TB
    subgraph Host[Docker Host]
        subgraph Net[bridge network]
            Web[web container<br/>nginx]
            API[api container<br/>flask]
            DB[db container<br/>postgres]
        end
        Web -->|internal DNS<br/>http://api:5000| API
        API -->|postgres://db:5432| DB
        Bridge[Docker Bridge<br/>172.18.0.1]
        Web -.-> Bridge
        API -.-> Bridge
        DB -.-> Bridge
    end
    Bridge -->|Port mapping<br/>80:80| External[External Traffic]
    style External fill:#E74C3C,color:#2e3440
```
💾 Docker Storage & Volumes
Three Types of Storage
| Type | Description | Persistence | Use Case |
|---|---|---|---|
| Volumes | Managed by Docker, stored in /var/lib/docker/volumes/ | Persists beyond container lifecycle | Database data, application state |
| Bind Mounts | Mount any host path into container | Persists on host filesystem | Development (mount source code), configs |
| tmpfs | Stored in host memory only | Lost when container stops | Sensitive data, temporary processing |
Volume Examples
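One example of each storage type from the table above (image names are illustrative):

```shell
# Named volume (managed by Docker) for database data
docker volume create pgdata
docker run -d -e POSTGRES_PASSWORD=pass -v pgdata:/var/lib/postgresql/data postgres:15

# Bind mount (host path into container) - common in development
docker run -d -v "$(pwd)":/app my-app:dev

# tmpfs mount (memory only, gone when the container stops)
docker run -d --tmpfs /tmp:rw,size=64m my-app:dev

# Inspect and clean up
docker volume ls
docker volume inspect pgdata
docker volume prune
```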
🎼 Docker Compose
Docker Compose manages multi-container applications with a single YAML file.
Complete Example: Web Application Stack
```yaml
# docker-compose.yml
version: '3.8'

services:
  # Nginx reverse proxy
  web:
    image: nginx:alpine
    ports:
      - "80:80"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
    depends_on:
      - api
    networks:
      - frontend
    restart: unless-stopped

  # Python API service
  api:
    build:
      context: ./api
      dockerfile: Dockerfile
    environment:
      DATABASE_URL: postgresql://user:pass@db:5432/myapp
      REDIS_URL: redis://cache:6379
    depends_on:
      db:
        condition: service_healthy
      cache:
        condition: service_started
    networks:
      - frontend
      - backend
    restart: unless-stopped

  # PostgreSQL database
  db:
    image: postgres:15
    environment:
      POSTGRES_USER: user
      POSTGRES_PASSWORD: pass
      POSTGRES_DB: myapp
    volumes:
      - postgres-data:/var/lib/postgresql/data
    networks:
      - backend
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U user"]
      interval: 10s
      timeout: 5s
      retries: 5
    restart: unless-stopped

  # Redis cache
  cache:
    image: redis:7-alpine
    networks:
      - backend
    restart: unless-stopped

networks:
  frontend:
  backend:

volumes:
  postgres-data:
```
Docker Compose Commands
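The day-to-day commands for the stack above (service names match the example compose file):

```shell
# Start the whole stack in the background, rebuilding images if needed
docker compose up -d --build

# Status and logs
docker compose ps
docker compose logs -f api

# Run a one-off command inside a running service container
docker compose exec db psql -U user myapp

# Scale a stateless service to three replicas
docker compose up -d --scale api=3

# Tear down; -v also removes the named volumes
docker compose down -v
```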
🔒 Security Best Practices
1. Run as Non-Root User
```dockerfile
FROM python:3.11-slim

# Create user with specific UID
RUN useradd -m -u 1000 appuser

WORKDIR /app
COPY --chown=appuser:appuser . .

# Switch to non-root user
USER appuser

CMD ["python", "app.py"]
```
2. Use Official Images from Trusted Sources
- Prefer official images: `python:3.11`, not `random-user/python`
- Pin specific versions: `nginx:1.24.0`, not `nginx:latest`
- Verify image digests for supply chain security
3. Scan Images for Vulnerabilities
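The section doesn't name a scanner; two widely used options, both installed separately, sketched with an illustrative image name:

```shell
# Trivy (Aqua Security): scan a local image for known CVEs
trivy image my-app:1.0

# Docker Scout (bundled with recent Docker releases)
docker scout cves my-app:1.0

# Grype (Anchore) is another popular choice
grype my-app:1.0
```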
4. Minimize Attack Surface
- Use minimal base images (Alpine, Distroless)
- Multi-stage builds to exclude build tools
- Don't install unnecessary packages
- Remove package manager caches
5. Secret Management
```dockerfile
# ❌ BAD: Secret baked into the image (visible in docker history)
ENV DATABASE_PASSWORD=supersecret
```

```shell
# ✅ GOOD: Pass at runtime
docker run -e DATABASE_PASSWORD="${DB_PASS}" my-app

# ✅ BETTER: Use Docker secrets (Swarm) or Kubernetes secrets
docker secret create db_password ./password.txt
docker service create --secret db_password my-app

# ✅ BEST: External secret manager (AWS Secrets Manager, HashiCorp Vault)
```
6. Resource Limits
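Limits can also be declared per service in Compose - a sketch using the Compose spec's `deploy.resources` keys (service and image names are illustrative; Compose v2 applies these limits outside Swarm as well):

```yaml
services:
  api:
    image: my-app:1.0
    deploy:
      resources:
        limits:
          cpus: "0.50"      # hard cap: half a CPU
          memory: 512M      # container is OOM-killed beyond this
        reservations:
          memory: 256M      # soft guarantee for scheduling
```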
7. Read-Only Filesystem
```shell
# Run with read-only root filesystem
docker run -d \
  --read-only \
  --tmpfs /tmp \
  --tmpfs /var/run \
  nginx
```
🚀 Image Optimization Strategies
Size Comparison Example
| Technique | Image Size | Reduction |
|---|---|---|
| Original (ubuntu base, all deps) | 1.2 GB | - |
| + Multi-stage build | 450 MB | 62% smaller |
| + Alpine base | 85 MB | 81% smaller than multi-stage |
| + Distroless | 52 MB | 39% smaller than Alpine |
Layer Caching Best Practices
- Order matters: Put least-changing layers first
- Separate dependencies from code: COPY package files → RUN install → COPY code
- Combine related commands: Use && to chain RUN commands
- Clean up in same layer: Install and clean in one RUN statement
🔄 Container Runtime Comparison
| Runtime | Description | OCI Compliant | Use Case |
|---|---|---|---|
| Docker | Full platform with daemon, CLI, build tools | Yes | Development, general use |
| containerd | Industry-standard container runtime (Docker uses it) | Yes | Kubernetes default, production |
| CRI-O | Lightweight runtime designed for Kubernetes | Yes | Kubernetes-only environments |
| Podman | Daemonless alternative to Docker | Yes | Rootless containers, no daemon |
| runc | Low-level runtime that actually runs containers | Yes (reference implementation) | Used by other runtimes |
💻 Essential Docker Commands
Image Commands
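A working set of image commands (registry path and tags are illustrative):

```shell
# Build and tag an image from the Dockerfile in the current directory
docker build -t my-app:1.0 .

# List local images
docker image ls

# Re-tag and push to a registry
docker tag my-app:1.0 registry.example.com/my-app:1.0
docker push registry.example.com/my-app:1.0

# Pull and remove
docker pull nginx:alpine
docker rmi my-app:1.0

# Remove dangling (untagged) images
docker image prune
```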
Container Commands
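The container lifecycle in commands (names are illustrative):

```shell
# Run detached, mapping host port 8080 to container port 80
docker run -d -p 8080:80 --name web nginx:alpine

# List containers (-a includes stopped ones)
docker ps -a

# Shell into a running container
docker exec -it web sh

# Follow logs
docker logs -f web

# Copy a file out of the container
docker cp web:/etc/nginx/nginx.conf ./nginx.conf

# Stop and remove
docker stop web && docker rm web
```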
System Commands
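Host-level housekeeping:

```shell
# Disk usage by images, containers, volumes, and build cache
docker system df

# Live CPU/memory stats per container
docker stats

# Daemon and client details
docker info
docker version

# Reclaim space: removes stopped containers, unused networks, dangling
# images, and build cache (-a also removes all unused images)
docker system prune -a
```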
🎯 When to Use Containers vs VMs
Use Containers When:
- Building microservices architectures
- Need rapid deployment and scaling
- Running multiple instances of same application
- Development environment consistency (dev/prod parity)
- CI/CD pipelines
- Running Linux applications on Linux host
- Want minimal resource overhead
Use VMs When:
- Need different OS than host (Windows on Linux)
- Require strong isolation for security/compliance
- Running legacy monolithic applications
- Need full OS-level isolation
- Kernel-level operations required
- Long-running stateful applications with complex dependencies
Hybrid Approach:
Many production systems use both - VMs for strong isolation, containers within VMs for efficient resource usage:
- AWS ECS/EKS: Containers run on EC2 VMs
- Google GKE: Kubernetes nodes are VMs running containers
- Multi-tenant environments: VMs per tenant, containers per service
🎓 Interview Questions & Answers
1. What's the difference between CMD and ENTRYPOINT?
CMD: Default command that can be overridden
ENTRYPOINT: Always executed, CMD becomes arguments to ENTRYPOINT
```dockerfile
# CMD only - can be overridden
FROM ubuntu
CMD ["echo", "hello"]
# docker run myimage          → "hello"
# docker run myimage echo bye → "bye"
```

```dockerfile
# ENTRYPOINT + CMD
FROM ubuntu
ENTRYPOINT ["echo"]
CMD ["hello"]
# docker run myimage     → "hello"
# docker run myimage bye → "bye"
```
2. How does Docker layer caching work?
Docker caches each layer. If a layer hasn't changed, Docker reuses the cached version. Cache is invalidated if:
- The Dockerfile instruction changes
- Files referenced by COPY/ADD change
- Any parent layer changes (invalidates all subsequent layers)
Strategy: Put least-changing instructions first, most-changing last.
3. What happens when a container stops?
- Process is sent SIGTERM (graceful shutdown)
- After 10s (default), SIGKILL is sent (force kill)
- Container layer (writable) still exists until removed
- Volumes persist
- Network connections are released
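Both the signal and the grace period are configurable:

```shell
# Wait up to 30s after SIGTERM before escalating to SIGKILL
docker stop --time 30 my-container

# In a Dockerfile, STOPSIGNAL changes which signal docker stop sends first:
# STOPSIGNAL SIGINT
```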
4. How do you debug a crashed container?
```shell
# View logs from stopped container
docker logs container-name

# Inspect exit code and state
docker inspect container-name | grep -A 10 State

# Start container with different command
docker run -it --entrypoint /bin/sh image-name

# Common exit codes:
#   0 - Success
#   1 - Application error
# 137 - SIGKILL (OOM killed, or manual kill -9)
# 139 - Segmentation fault
# 143 - SIGTERM (graceful shutdown)
```
5. What is the Docker overlay network?
Overlay networks enable containers on different Docker hosts to communicate. Used in Docker Swarm and can be used with standalone containers.
- Uses VXLAN encapsulation
- Requires key-value store (etcd, Consul) or Swarm mode
- Provides service discovery and load balancing
- Encrypts traffic between nodes (--opt encrypted)
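A minimal sketch of creating one, assuming a host where enabling Swarm mode is acceptable (the network name is illustrative):

```shell
# Overlay networks require Swarm mode (or an external key-value store)
docker swarm init

# --attachable lets standalone containers join the network;
# --opt encrypted enables IPsec encryption of the VXLAN traffic
docker network create -d overlay --attachable --opt encrypted my-overlay

docker run -d --network my-overlay nginx:alpine
```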
6. Explain Docker's copy-on-write strategy
All image layers are read-only. When a container modifies a file:
- Docker searches for file in layers (top to bottom)
- File is copied to container's writable layer
- Modification happens in the copy
- Original in image layer remains unchanged
Benefit: Multiple containers can share same image layers, saving disk space.
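Copy-on-write is easy to observe with `docker diff`, which lists exactly what ended up in the writable layer (container name is illustrative):

```shell
# Start a container and modify a file that lives in a read-only image layer
docker run -d --name cow-demo nginx:alpine
docker exec cow-demo sh -c 'echo hi > /etc/motd'

# A = added, C = changed, D = deleted - only the writable layer differs;
# the underlying image layers are untouched
docker diff cow-demo
```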
🔗 Related Technologies
Container Orchestration
- Kubernetes: Industry-standard container orchestration (covered in separate guide)
- Docker Swarm: Docker's native clustering solution (simpler than K8s)
- Amazon ECS: AWS container orchestration service
- Nomad: HashiCorp's orchestration tool
Build Tools
- BuildKit: Next-gen Docker build engine (parallel builds, caching improvements)
- Kaniko: Build images in Kubernetes without Docker daemon
- Buildah: Build OCI images without Docker
Registries
- Docker Hub: Public registry
- AWS ECR: Amazon Elastic Container Registry
- Google GCR: Google Container Registry
- Harbor: Open-source enterprise registry with security scanning
- JFrog Artifactory: Universal artifact repository