How Docker Works: Understanding Container Architecture
You've probably run docker run hello-world a dozen times and seen the "Hello from Docker!" message. But have you ever wondered what actually happens under the hood? When you execute that command, Docker doesn't just magically create a lightweight virtual machine. It's doing something more clever—and more fundamental.
Docker containers share the host operating system kernel but run in isolated user spaces with their own filesystems, network stacks, and process trees. This isolation comes from two Linux kernel features: namespaces and control groups (cgroups). Understanding these mechanisms is the key to grasping how containers work and why they're so different from traditional virtual machines.
Namespaces: Isolating the View
Namespaces are the first layer of container isolation. They provide a view of the system that's limited to what the container needs. Think of namespaces as different windows into the same underlying system, where each window shows only what the container is allowed to see.
Mount Namespace
The mount namespace controls what filesystems are visible to a process. When you run a container, its mount namespace starts with an empty view. Docker then mounts the container's filesystem image (the root filesystem) into this namespace. The container sees only its own filesystem, not the host's.
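You can observe this limited view directly. A minimal sketch (assuming the alpine image is available locally or can be pulled):

```shell
# The container sees only the image's root filesystem, not the host's
docker run --rm alpine ls /

# Bind mounts are how host paths get explicitly grafted into the namespace
docker run --rm -v /tmp:/host-tmp alpine ls /host-tmp
```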
Network Namespace
Network namespaces give each container its own network stack. This means containers can have their own IP addresses, network interfaces, and routing tables. When you run a container, Docker creates a virtual network interface (veth pair) and connects it to a bridge network.
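You can inspect the container-side end of that veth pair from inside the container (a sketch assuming the alpine image, whose busybox ip applet supports addr show):

```shell
# Show the container's own eth0 and its private IP (e.g. 172.17.0.x)
docker run --rm alpine ip addr show eth0

# The default bridge network's subnet on the host side
docker network inspect bridge --format '{{range .IPAM.Config}}{{.Subnet}}{{end}}'
```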
The output shows the container's private IP address, separate from the host's interfaces. Containers on the same Docker network can reach each other directly, while outbound traffic is NATed through the host's bridge; the host's own network interfaces stay invisible to the container by default.
PID Namespace
The PID namespace gives each container an isolated view of processes. Each container has its own process tree, with its main process running as PID 1 inside the namespace.
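A quick illustration (assuming the alpine image; pidless is an arbitrary container name):

```shell
# Inside its PID namespace, the container's main process is PID 1
docker run --rm alpine ps aux

# On the host, the same process appears under an ordinary, much higher PID
docker run -d --name pidless alpine sleep 60
docker inspect --format '{{.State.Pid}}' pidless
docker rm -f pidless
```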
On the host, that same PID 1 appears as an ordinary process with a regular host PID. This isolation prevents containers from seeing or signaling each other's processes.
User and UTS Namespaces
User namespaces map user IDs inside the container to unprivileged IDs on the host, so a process that is root inside the container can be an ordinary user outside it. The UTS namespace isolates the hostname and domain name, so each container can have its own hostname.
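A sketch of both namespaces (note that user namespace remapping is a daemon-wide setting, not a per-container flag):

```shell
# UTS namespace: the container gets its own hostname
docker run --rm --hostname web-1 alpine hostname   # prints web-1

# User namespace remapping is enabled on the daemon, e.g. in /etc/docker/daemon.json:
#   { "userns-remap": "default" }
# With it enabled, root inside the container maps to an unprivileged host UID.
```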
Control Groups: Limiting Resources
While namespaces provide isolation, control groups (cgroups) limit and account for resource usage. Cgroups ensure that one container doesn't starve the others or consume all available resources on the host.
CPU Limits
Cgroups can limit the amount of CPU time a process can use. This prevents a misbehaving container from monopolizing the host's CPU cores.
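For example, with the --cpus flag (a sketch assuming the alpine image; the busy loop exists only to generate load):

```shell
# Cap the container at 1.5 cores' worth of CPU time
docker run --rm --cpus="1.5" alpine sh -c 'timeout 3 yes > /dev/null; echo done'

# Equivalent lower-level knobs: CFS quota and period
docker run --rm --cpu-quota=150000 --cpu-period=100000 alpine echo ok
```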
When you set this limit, the container's processes are throttled once they exhaust their quota for the current scheduling period; the kernel's CFS bandwidth controller enforces the cap and shares the remaining CPU time among containers.
Memory Limits
Memory limits prevent containers from consuming excessive memory, which could cause the host to run out of RAM and trigger the OOM killer.
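For example (a sketch; setting --memory-swap equal to --memory disables extra swap):

```shell
# Limit the container to 256 MB of RAM with no additional swap
docker run --rm --memory=256m --memory-swap=256m alpine free -m
```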
If a container exceeds its memory limit, the kernel's OOM killer terminates the offending process outright; the container typically exits with code 137 rather than being asked to shut down gracefully. The SIGTERM-then-SIGKILL sequence applies to docker stop, which gives processes a grace period (10 seconds by default) before forcing termination.
Block I/O Limits
Cgroups can also limit the amount of disk I/O a container can perform. This is useful for preventing a single container from overwhelming the host's storage subsystem.
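For example (the device path here is an assumption; substitute the block device backing your Docker storage):

```shell
# Throttle reads from /dev/sda to roughly 1 MB per second
docker run --rm --device-read-bps /dev/sda:1mb alpine echo ok

# Or assign a relative weight (10-1000) instead of a hard cap
docker run --rm --blkio-weight 300 alpine echo ok
```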
The Container Lifecycle
Understanding the container lifecycle helps you debug issues and optimize your deployments. Containers go through several states during their lifetime.
Created State
When you run docker create, the container is created but not started. The image is pulled if it isn't already present locally, and the container's writable filesystem layer is prepared, but no process is running.
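For example (demo is an arbitrary container name):

```shell
docker create --name demo alpine sleep 300   # prints the new container's ID
docker ps -a --filter name=demo              # STATUS column reads "Created"
docker rm demo                               # clean up
```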
The -a flag shows all containers, including those in the Created state.
Running State
When you run docker start, the container's main process (defined in the Dockerfile's CMD or ENTRYPOINT) begins executing. The container is now actively using resources and can process requests.
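A self-contained sketch (web is an arbitrary name):

```shell
docker create --name web alpine sleep 300
docker start web
docker ps --filter name=web    # STATUS shows "Up ..."
docker rm -f web               # clean up
```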
The container appears in the output of docker ps with a status of Up.
Paused State
You can pause a running container with docker pause. This suspends every process in the container via the cgroup freezer; nothing is terminated, and the container keeps its memory state. It remains Paused until you resume it with docker unpause.
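For example (napping is an arbitrary name):

```shell
docker run -d --name napping alpine sleep 300
docker pause napping
docker ps --filter name=napping    # STATUS shows "Up ... (Paused)"
docker unpause napping && docker rm -f napping
```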
The container status changes to Paused.
Restarting State
If a container exits and the --restart policy is set, Docker will attempt to restart it. The container goes through a Restarting state during this process.
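A sketch of a restart policy in action (flaky is an arbitrary name; the container deliberately exits non-zero):

```shell
docker run -d --name flaky --restart=on-failure:3 alpine sh -c 'sleep 1; exit 1'
sleep 10                                           # give Docker time to retry
docker inspect --format '{{.RestartCount}}' flaky  # climbs toward 3
docker rm -f flaky
```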
Exited State
When a container stops (either because you ran docker stop or because its main process exited on its own), it enters the Exited state. You can still inspect its exit code and read its logs.
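For example (oneshot is an arbitrary name; the container exits with a deliberate non-zero code):

```shell
docker run --name oneshot alpine sh -c 'exit 7'
docker inspect --format '{{.State.Status}} {{.State.ExitCode}}' oneshot   # exited 7
docker logs oneshot
docker rm oneshot
```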
Dead State
The Dead state is rare and usually indicates a daemon-side failure: Docker tried to stop or remove the container but could not release its resources (for example, a busy mount). A container killed by docker kill or the OOM killer normally ends up in the Exited state instead, typically with exit code 137.
Container vs Virtual Machine
The key difference between containers and virtual machines lies in how they use the host's resources.
| Feature | Container | Virtual Machine |
|---|---|---|
| Kernel | Shares host kernel | Has its own kernel |
| Size | Tens to hundreds of MB (image) | ~1-10+ GB |
| Startup Time | Milliseconds | Seconds to minutes |
| Resource Overhead | Minimal | Significant |
| Isolation | Process-level | Hardware-level |
Containers share the host kernel, which means they're much lighter and faster to start. However, this also means containers can only run applications compatible with the host kernel. Virtual machines have their own kernel, providing complete isolation but at the cost of higher resource usage and slower startup times.
Practical Considerations
Security Implications
Because containers share the host kernel, a vulnerability in the kernel could potentially affect all containers running on that host. This is why running containers as non-root users and keeping the host kernel updated is critical.
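A couple of common hardening flags (a sketch, not a complete security policy):

```shell
# Run as an unprivileged UID/GID and drop all Linux capabilities
docker run --rm --user 1000:1000 --cap-drop ALL alpine id

# Forbid privilege escalation (e.g. via setuid binaries)
docker run --rm --security-opt no-new-privileges alpine id
```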
Resource Contention
When multiple containers compete for resources, cgroups ensure fair distribution. However, you should still monitor resource usage to prevent one container from degrading the performance of others.
Storage Performance
Containers use copy-on-write filesystems: rather than copying the entire image at start, they reference the image's read-only layers and write changes to a thin writable layer on top. This makes startup cheap, but write-heavy workloads pay a copy-up cost through the overlay driver, so paths with heavy writes (databases, logs) are usually better placed on volumes.
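You can see the writable layer at work (scratchpad is an arbitrary name):

```shell
# Files written at runtime land in the container's writable layer, not the image
docker run --name scratchpad alpine sh -c 'echo hi > /new-file'
docker diff scratchpad    # shows "A /new-file" (A = added)
docker rm scratchpad
```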
Conclusion
Docker's container architecture relies on namespaces for isolation and cgroups for resource management. This combination allows containers to be lightweight, fast, and efficient while providing strong isolation guarantees. Understanding these mechanisms helps you debug issues, optimize resource usage, and design more reliable containerized applications.
If you're managing multiple containers and complex deployments, platforms like ServerlessBase can help you automate container orchestration, manage networking, and handle SSL certificates, so you can focus on building great applications rather than wrestling with infrastructure.
Next Steps:
- Explore Docker images and layers in the next article
- Learn how to write your first Dockerfile
- Understand Docker networking and storage options