ServerlessBase Blog

    A deep dive into Docker's container architecture, including namespaces, cgroups, and the container lifecycle.

    How Docker Works: Understanding Container Architecture

    You've probably run docker run hello-world a dozen times and seen the "Hello from Docker!" message. But have you ever wondered what actually happens under the hood? When you execute that command, Docker doesn't just magically create a lightweight virtual machine. It's doing something more clever—and more fundamental.

    Docker containers share the host operating system kernel but run in isolated user spaces with their own filesystems, network stacks, and process trees. This isolation comes from two Linux kernel features: namespaces and control groups (cgroups). Understanding these mechanisms is the key to grasping how containers work and why they're so different from traditional virtual machines.

    Namespaces: Isolating the View

    Namespaces are the first layer of container isolation. They provide a view of the system that's limited to what the container needs. Think of namespaces as different windows into the same underlying system, where each window shows only what the container is allowed to see.
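On a Linux host you can see the namespaces any process belongs to by listing /proc/&lt;pid&gt;/ns; each entry is a symlink whose target identifies the namespace. A minimal sketch (Linux-only, standard procfs, nothing Docker-specific):

```shell
# List the namespaces of the current shell process.
# Each symlink target looks like "mnt:[4026531840]"; two processes
# whose targets show the same number share that namespace.
ls -l /proc/self/ns

# The same namespace types Docker uses for containers:
# mnt (mount), net (network), pid, uts (hostname), user, ipc
readlink /proc/self/ns/mnt
```

Comparing these symlinks between a shell on the host and a shell inside a container is a direct way to confirm that the container lives in different namespaces.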

    Mount Namespace

    The mount namespace controls which filesystems are visible to a process. When Docker starts a container, it creates a new mount namespace, assembles the image's layers into a root filesystem (typically with an overlay filesystem), and makes that the container's root. The container sees only its own filesystem, not the host's.

    # Inside a container, you see only the container's filesystem
    docker run -it ubuntu:latest bash
    root@container:/# ls /
    bin  boot  dev  etc  home  lib  lib64  media  mnt  opt  proc  root  run  sbin  srv  sys  tmp  usr  var
    root@container:/# cat /etc/hostname
    container

    Network Namespace

    Network namespaces give each container its own network stack. This means containers can have their own IP addresses, network interfaces, and routing tables. When you run a container, Docker creates a virtual network interface (veth pair) and connects it to a bridge network.

    # Containers get their own IP addresses
    docker run -d --name web nginx
    docker inspect web | grep IPAddress

    The output shows the container's private IP address on the Docker bridge, separate from the host's interfaces. Containers on the same network can talk to each other directly; outbound traffic is NATed through the host, and inbound traffic reaches the container only through ports you explicitly publish with -p.

    PID Namespace

    The PID namespace gives each container an isolated view of processes. Each container gets its own process tree starting at PID 1: the container's main process is PID 1 inside the container, while the host sees that same process under a different PID.

    # Inside the container, PID 1 is the container's main process
    docker run -it ubuntu:latest bash
    root@container:/# ps aux
    PID   USER     TIME  COMMAND
    1     root     0:00  bash
    9     root     0:00  ps aux

    From the host, the container's processes are visible as ordinary processes with their own host PIDs; the remapping only changes what the container sees. This isolation prevents containers from seeing or interfering with each other's processes.

    User and UTS Namespaces

    User namespaces map user IDs inside the container to different user IDs on the host, so a process that is root inside the container can be an unprivileged user on the host. The UTS namespace isolates the hostname and domain name, so each container can have its own hostname.

    # Containers can have different hostnames
    docker run -it --hostname mycontainer ubuntu:latest bash
    root@mycontainer:/# hostname
    mycontainer
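The kernel records a process's user-namespace mapping in /proc/&lt;pid&gt;/uid_map. A quick way to see the format (Linux-only; outside a dedicated user namespace the file shows the identity mapping):

```shell
# Format: <uid inside ns> <uid outside ns> <range length>
# In the initial namespace this is typically "0 0 4294967295",
# i.e. an identity mapping covering all UIDs.
cat /proc/self/uid_map
```

Inside a container using user namespace remapping, the second column would show the unprivileged host UID that container-root actually maps to.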

    Control Groups: Limiting Resources

    While namespaces provide isolation, control groups (cgroups) limit and account for resource usage. Cgroups ensure that one container doesn't starve the others or consume all available resources on the host.

    CPU Limits

    Cgroups can limit the amount of CPU time a process can use. This prevents a misbehaving container from monopolizing the host's CPU cores.

    # Limit container to 0.5 CPU cores
    docker run --cpus="0.5" nginx

    When you set this limit, the container's processes are throttled if they try to use more CPU time than allowed. The kernel's scheduler ensures fair distribution among all containers.
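Under the hood, --cpus is implemented with the CFS bandwidth controller: Docker sets a quota of CPU time per scheduling period, and the kernel default period is 100ms. A sketch of the arithmetic behind --cpus="0.5" (values here assume the default period):

```shell
# --cpus="0.5" => the container may use half of each scheduling period
period_us=100000      # default CFS period: 100ms, in microseconds
cpus_hundredths=50    # 0.5 CPUs, expressed as hundredths to stay integer

quota_us=$((period_us * cpus_hundredths / 100))
echo "$quota_us"      # 50000: 50ms of CPU time per 100ms period
```

Once the quota is exhausted within a period, the container's runnable threads are throttled until the next period begins.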

    Memory Limits

    Memory limits prevent containers from consuming excessive memory, which could cause the host to run out of RAM and trigger the OOM killer.

    # Limit container to 512MB of memory
    docker run --memory="512m" redis

    If a container exceeds its memory limit, the kernel's OOM killer terminates a process inside it (usually the main process) and the container exits. There is no graceful shutdown phase here: the SIGTERM-then-SIGKILL sequence applies to docker stop, not to memory-limit enforcement.
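When a container's main process is killed with SIGKILL, the exit status encodes the signal as 128 plus the signal number, so exit code 137 is the telltale sign of an OOM kill. The limit itself is just a byte count. A quick check of both numbers:

```shell
# Exit code for a process killed by a signal: 128 + signal number
sigkill=9
echo $((128 + sigkill))        # 137, the classic OOM-kill exit code

# "--memory=512m" as a byte count
echo $((512 * 1024 * 1024))    # 536870912 bytes
```

Seeing 137 from a container that should have exited cleanly is a strong hint to check its memory limit before anything else.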

    Block I/O Limits

    Cgroups can also limit the amount of disk I/O a container can perform. This is useful for preventing a single container from overwhelming the host's storage subsystem.

    # Limit container to 10MB/s read and write
    docker run --device-read-bps /dev/sda:10mb --device-write-bps /dev/sda:10mb app

    The Container Lifecycle

    Understanding the container lifecycle helps you debug issues and optimize your deployments. Containers go through several states during their lifetime.

    Created State

    When you run docker create, the container is created but not started. The image is pulled if it isn't already present locally, and the container's writable filesystem layer is prepared, but no processes are running.

    # Create but don't start the container
    docker create --name myapp nginx
    docker ps -a

    The -a flag shows all containers, including those in the Created state.

    Running State

    When you run docker start, the container's main process (defined in the Dockerfile's CMD or ENTRYPOINT) begins executing. The container is now actively using resources and can process requests.

    # Start the container
    docker start myapp
    docker ps

    The container appears in the output of docker ps with a status of Up.

    Paused State

    You can pause a running container with docker pause. This uses the cgroups freezer to suspend every process in the container without terminating it, so in-memory state is preserved. The container remains in the Paused state until you resume it with docker unpause.

    # Pause the container
    docker pause myapp
    docker ps

    The container status changes to Paused.

    Restarting State

    If a container exits and the --restart policy is set, Docker will attempt to restart it. The container goes through a Restarting state during this process.

    # Run with auto-restart policy
    docker run --restart=always nginx

    Exited State

    When a container stops normally, it enters the Exited state. docker stop sends SIGTERM to the main process and, if it hasn't exited after a grace period (10 seconds by default), follows up with SIGKILL. You can then inspect the exit code and logs.

    # Stop the container
    docker stop myapp
    docker ps -a
    docker logs myapp
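The recorded exit code follows the usual Unix convention: 0 for success, non-zero for failure, 128 plus the signal number when killed by a signal. The convention itself is easy to reproduce in plain shell, no Docker required:

```shell
# A process that exits with status 3
sh -c 'exit 3'
echo $?        # 3

# Docker records the same value for a stopped container; it can be
# read with: docker inspect --format '{{.State.ExitCode}}' myapp
```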

    Dead State

    The Dead state is rare: it indicates that the Docker daemon failed to stop or remove a container cleanly, for example because a filesystem resource was still busy. A container killed with docker kill (or by the OOM killer) does not become Dead; it shows up as Exited, with exit code 137 when the signal was SIGKILL.

    # Force kill the container (sends SIGKILL)
    docker kill myapp
    docker ps -a

    Container vs Virtual Machine

    The key difference between containers and virtual machines lies in how they use the host's resources.

    Feature              Container             Virtual Machine
    Kernel               Shares host kernel    Has its own kernel
    Size                 ~100MB                ~1-10GB
    Startup Time         Milliseconds          Seconds to minutes
    Resource Overhead    Minimal               Significant
    Isolation            Process-level         Hardware-level

    Containers share the host kernel, which means they're much lighter and faster to start. However, this also means containers can only run applications compatible with the host kernel. Virtual machines have their own kernel, providing complete isolation but at the cost of higher resource usage and slower startup times.

    Practical Considerations

    Security Implications

    Because containers share the host kernel, a vulnerability in the kernel could potentially affect all containers running on that host. This is why running containers as non-root users and keeping the host kernel updated is critical.

    # Run container as non-root user
    docker run -u 1000:1000 app
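The -u flag takes a numeric UID[:GID] pair, and the IDs need not exist in the container's /etc/passwd. A sketch of how such a spec splits into its parts (variable names here are illustrative, not part of Docker):

```shell
# "-u 1000:1000" supplies a numeric UID and GID
spec="1000:1000"
uid=${spec%%:*}    # text before the first ":" -> 1000
gid=${spec##*:}    # text after the last ":"  -> 1000
echo "$uid $gid"   # 1000 1000
```

Because the IDs are purely numeric, the same -u value works across images regardless of what user accounts each image defines.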

    Resource Contention

    When multiple containers compete for resources, cgroups ensure fair distribution. However, you should still monitor resource usage to prevent one container from degrading the performance of others.

    # Monitor container resource usage
    docker stats

    Storage Performance

    Containers use copy-on-write filesystems, so starting a container does not copy the entire image. The container references the read-only image layers and writes changes to a thin writable layer on top. This makes startup cheap, but write-heavy workloads pay a penalty going through the copy-on-write driver; for such data, mount a volume, which bypasses the writable layer entirely.

    # Inspect container storage
    docker inspect myapp | grep -A 10 "Mounts"

    Conclusion

    Docker's container architecture relies on namespaces for isolation and cgroups for resource management. This combination allows containers to be lightweight, fast, and efficient while providing strong isolation guarantees. Understanding these mechanisms helps you debug issues, optimize resource usage, and design more reliable containerized applications.

    If you're managing multiple containers and complex deployments, platforms like ServerlessBase can help you automate container orchestration, manage networking, and handle SSL certificates, so you can focus on building great applications rather than wrestling with infrastructure.


    Next Steps:

    • Explore Docker images and layers in the next article
    • Learn how to write your first Dockerfile
    • Understand Docker networking and storage options
