ServerlessBase Blog

    A comprehensive guide to diagnosing and fixing common Docker problems including container crashes, network issues, storage problems, and performance bottlenecks.

    Docker Troubleshooting: Common Issues and Solutions

    You've built a containerized application, pushed it to a registry, and tried to run it on your development machine. Then everything breaks. The container exits immediately. The network connection times out. The disk fills up. Docker errors are frustrating because they often lack context, and the documentation assumes you already know what's wrong.

    This guide covers the most common Docker problems you'll encounter and how to diagnose and fix them. We'll focus on practical solutions you can apply immediately.

    Understanding Docker Debugging Tools

    Before diving into specific issues, you need to know the basic debugging commands. These are your first line of defense.

    # View container logs
    docker logs <container_id>
     
    # View logs with timestamps and follow output
    docker logs -t -f <container_id>
     
    # View detailed container information
    docker inspect <container_id>
     
    # Open a shell in a running container (use /bin/sh if the image has no bash)
    docker exec -it <container_id> /bin/bash
     
    # View container resource usage
    docker stats
     
    # View container processes
    docker top <container_id>
     
    # View detailed container configuration
    docker inspect <container_id> --format='{{json .Config}}' | jq

    The docker logs command is your primary tool for understanding what's happening inside a container. The -f flag follows the output in real-time, which is essential for debugging applications that write to stdout/stderr. docker inspect provides detailed information about the container's configuration, environment, and state.

    Container Exits Immediately

    A container that exits immediately after starting is a common issue. The exit code tells you what went wrong.

    # Run container and see exit code
    docker run --rm my-image
    echo "Exit code: $?"
     
    # Check the exit code of a container that has already stopped
    docker inspect <container_id> --format='{{.State.ExitCode}}'

    Common exit codes and their meanings:

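    Exit Code   Meaning              Common Cause
    0           Success              Main process finished without error
    1           Application error    Application crashed or reported a failure
    125         Docker error         docker run itself failed (for example, an invalid flag or a daemon error)
    126         Cannot execute       Command was found but is not executable (permissions)
    127         Command not found    Entrypoint or CMD path is wrong, or the binary is missing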
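
    The exit codes in the table above can be wrapped in a small helper so a script can print what a code means. This is a minimal sketch: the function name explain_exit_code and its wording are illustrative and not part of Docker. It also decodes codes above 128, which by shell convention mean the process was terminated by a signal (137 is 128 + 9, SIGKILL, which is what an out-of-memory kill produces).

```shell
# explain_exit_code: map a container exit code to a short description.
# The function name and the messages are illustrative, not part of Docker.
explain_exit_code() {
    code=$1
    case "$code" in
        0)   echo "success" ;;
        125) echo "docker run itself failed" ;;
        126) echo "command found but not executable" ;;
        127) echo "command not found" ;;
        *)
            if [ "$code" -gt 128 ] && [ "$code" -le 255 ]; then
                # Codes above 128 conventionally mean "killed by signal (code - 128)"
                echo "terminated by signal $((code - 128))"
            else
                echo "application error"
            fi
            ;;
    esac
}
```

    To use it on a stopped container, pass it the value printed by docker inspect <container_id> --format='{{.State.ExitCode}}'.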

    Common Causes and Solutions

    1. Entrypoint or CMD mismatch

    The container might be trying to run a command that doesn't exist or isn't executable.

    # Check the entrypoint and command
    docker inspect <container_id> --format='{{.Config.Entrypoint}} {{.Config.Cmd}}'
     
    # Start a shell in place of the entrypoint, then run the command manually
    # (use /bin/bash if the image includes it)
    docker run --rm -it --entrypoint /bin/sh my-image

    2. Missing environment variables

    Applications often require specific environment variables to function correctly.

    # Check what environment variables are set
    docker inspect <container_id> --format='{{range .Config.Env}}{{println .}}{{end}}'
     
    # Pass required variables
    docker run -e DATABASE_URL=postgres://user:pass@host:5432/db my-image

    3. Working directory issues

    The application might be trying to write to a directory that doesn't exist or isn't writable.

    # Check the working directory
    docker inspect <container_id> --format='{{.Config.WorkingDir}}'
     
    # Verify the directory exists and is writable
    # (docker exec needs a running container; for one that exits immediately, use a one-off shell)
    docker run --rm --entrypoint sh my-image -c 'ls -la /app && touch /app/test.txt'

    4. Resource limits

    The container might be hitting resource limits and being killed.

    # Check if OOM killer killed the container
    docker inspect <container_id> --format='{{.State.OOMKilled}}'
     
    # Increase memory limit
    docker run --memory="2g" my-image
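
    For the missing-environment-variable case in item 2 above, a pre-flight check on the host can catch the problem before the container starts. The sketch below is an assumption about how you might structure that check; missing_vars is a hypothetical helper, and API_KEY is only an example name.

```shell
# missing_vars: print the name of every listed variable that is unset or empty.
# Arguments must be valid shell variable names.
missing_vars() {
    for name in "$@"; do
        # Indirect lookup: read the value of the variable whose name is in $name
        eval "value=\${$name:-}"
        [ -n "$value" ] || echo "$name"
    done
    return 0
}
```

    A wrapper script could call missing_vars DATABASE_URL API_KEY and refuse to run docker run when the output is not empty.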

    Network Issues

    Network problems are among the most frustrating Docker issues. Containers can't reach each other, services time out, or connections fail unexpectedly.

    Container Cannot Connect to External Services

    Symptom: Container logs show connection refused or timeout errors when trying to reach external services.

    # Test connectivity from the container
    docker exec <container_id> curl -v https://api.example.com
     
    # Check DNS resolution
    docker exec <container_id> nslookup api.example.com
     
    # Check network interface
    docker exec <container_id> ip addr show

    Solutions:

    # Use host network mode to rule out Docker's bridge networking (debugging only, Linux hosts)
    docker run --network host my-image
     
    # Check whether a firewall rule is blocking container traffic (run this on the host, not in the container)
    sudo iptables -L -n
     
    # Override the DNS servers the container uses
    docker run --dns 8.8.8.8 --dns 8.8.4.4 my-image

    Container Cannot Reach Other Containers

    Symptom: Container A cannot connect to container B on the same Docker network.

    # Check if both containers are on the same network
    docker inspect <container_id_a> --format='{{range $net, $conf := .NetworkSettings.Networks}}{{$net}} {{end}}'
    docker inspect <container_id_b> --format='{{range $net, $conf := .NetworkSettings.Networks}}{{$net}} {{end}}'
     
    # Check the container name and its network aliases
    docker inspect <container_id_b> --format='{{.Name}} {{range .NetworkSettings.Networks}}{{.Aliases}} {{end}}'
     
    # Test connectivity between containers
    docker exec <container_id_a> ping <container_name_or_id>
    docker exec <container_id_a> curl http://<container_name>:8080

    Common issues:

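    • Containers on different networks
    • Containers on the default bridge network, where container names are not resolved by DNS (use a user-defined network)
    • Service not running inside the container, or listening only on 127.0.0.1
    • Connecting to the published host port instead of the container's own port
    • Firewall rules blocking internal traffic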
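
    The "different networks" check can be automated by comparing the network lists of two containers. This is a rough sketch that relies only on docker inspect; the helper names networks_of and shared_networks are made up for illustration.

```shell
# networks_of: print one network name per line for a container.
networks_of() {
    docker inspect "$1" \
        --format '{{range $net, $conf := .NetworkSettings.Networks}}{{println $net}}{{end}}'
}

# shared_networks: print every network that both containers are attached to.
# Prints nothing when the containers have no network in common.
shared_networks() {
    nets_a=$(networks_of "$1")
    nets_b=$(networks_of "$2")
    for net in $nets_a; do
        for other in $nets_b; do
            [ "$net" = "$other" ] && echo "$net"
        done
    done
    return 0
}
```

    If shared_networks <container_a> <container_b> prints nothing, one option is to create a user-defined network with docker network create and attach both containers to it with docker network connect.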

    Port Conflicts

    Symptom: Container fails to start with "port is already allocated" error.

    # Check what's using the port
    sudo lsof -i :8080
    sudo netstat -tulpn | grep 8080
     
    # Find the container using the port
    docker ps --filter "publish=8080"
     
    # Stop the conflicting container
    docker stop <container_id>
     
    # Or use a different port
    docker run -p 8081:8080 my-image

    Storage and Volume Issues

    Storage problems manifest as permission errors, disk full errors, or data loss.

    Permission Denied on Volumes

    Symptom: Container logs show "permission denied" when trying to write to mounted volumes.

    # Check volume permissions
    docker exec <container_id> ls -la /app/data
     
    # Check the host directory permissions
    ls -la /path/to/host/directory
     
    # Fix ownership on the host (replace 1000:1000 with the UID:GID the container process runs as)
    sudo chown -R 1000:1000 /path/to/host/directory
    sudo chmod -R 755 /path/to/host/directory

    Understanding user IDs:

    Containers run as root (UID 0) by default, but many images switch to a non-root user with a fixed UID (often 1000 or 999). A bind-mounted host directory keeps its host ownership and mode, so if that UID neither owns the directory nor has write permission on it, writes fail. Run docker exec <container_id> id to see which UID and GID the process uses, then give that UID access to the host directory.
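
    To make the UID comparison concrete, the sketch below checks whether a host directory is owned by the UID the container runs as. The helper names are hypothetical, and the check looks only at the owner: it ignores group and other permissions and ACLs, so treat a mismatch as a hint rather than proof.

```shell
# owner_uid: print the numeric UID that owns a path (parses `ls -n` output).
owner_uid() {
    ls -ldn "$1" | awk '{print $3}'
}

# check_volume_owner DIR UID: report whether DIR is owned by UID.
# Returns 0 on a match and 1 on a mismatch.
check_volume_owner() {
    dir=$1
    container_uid=$2
    host_uid=$(owner_uid "$dir")
    if [ "$host_uid" = "$container_uid" ]; then
        echo "ok: $dir is owned by UID $container_uid"
    else
        echo "mismatch: $dir is owned by UID $host_uid, container runs as UID $container_uid"
        return 1
    fi
}
```

    The container's UID can be read with docker exec <container_id> id -u and passed as the second argument.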

    Disk Full Errors

    Symptom: Container exits with "no space left on device" error.

    # Check Docker disk usage
    docker system df
     
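    # Check the size of each container's writable layer
    docker ps -a --size
     
    # Clean up unused resources (removes stopped containers, unused networks, all unused images, and build cache)
    docker system prune -a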
     
    # Check if there are large images
    docker images
     
    # Remove unused images
    docker image prune -a
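
    Because "no space left on device" is reported by the host filesystem, it can help to check how full that filesystem is before pruning. The following is a small sketch built on df; the function names and the threshold are illustrative choices.

```shell
# disk_used_percent: print the used-space percentage of the filesystem holding a path.
disk_used_percent() {
    # df -P guarantees one line per filesystem; column 5 is the capacity, e.g. "42%"
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# warn_if_full PATH THRESHOLD: return 1 when usage is at or above THRESHOLD percent.
warn_if_full() {
    used=$(disk_used_percent "$1")
    if [ "$used" -ge "$2" ]; then
        echo "warning: filesystem for $1 is ${used}% full"
        return 1
    fi
    echo "ok: filesystem for $1 is ${used}% full"
}
```

    On a typical Linux install the path to check is /var/lib/docker; docker info --format '{{.DockerRootDir}}' prints the actual data directory if it has been moved.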

    Volume Not Persisting

    Symptom: Data disappears when the container is removed.

    # Verify volume is mounted
    docker inspect <container_id> --format='{{json .Mounts}}' | jq
     
    # Check volume exists
    docker volume ls
     
    # Check volume contents
    docker run --rm -v <volume_name>:/data alpine ls -la /data
     
    # Create volume if it doesn't exist
    docker volume create <volume_name>

    Performance Issues

    Slow containers, high CPU usage, or memory leaks can indicate deeper problems.

    High CPU Usage

    Symptom: Container consuming excessive CPU resources.

    # Check CPU usage
    docker stats --no-stream
     
    # Check container processes
    docker top <container_id>
     
    # List processes inside the container with their CPU usage (requires ps in the image)
    docker exec <container_id> ps aux
     
    # Profile CPU usage
    docker exec <container_id> top -b -n 1

    Common causes:

    • Infinite loops in application code
    • Inefficient algorithms
    • Resource-intensive operations running continuously
    • Background jobs consuming resources

    High Memory Usage

    Symptom: Container consuming excessive memory or being killed by OOM.

    # Check memory usage
    docker stats --no-stream
     
    # Check if OOM killed
    docker inspect <container_id> --format='{{.State.OOMKilled}}'
     
    # Check the memory limit in bytes (0 means no limit is set)
    docker inspect <container_id> --format='{{.HostConfig.Memory}}'
     
    # Increase memory limit
    docker run --memory="4g" my-image
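
    The value of .HostConfig.Memory is a raw byte count, which is hard to read at a glance. This minimal sketch converts it to something friendlier; describe_memory_limit is a hypothetical helper name.

```shell
# describe_memory_limit BYTES: interpret the number printed by
#   docker inspect <container_id> --format='{{.HostConfig.Memory}}'
describe_memory_limit() {
    bytes=$1
    if [ "$bytes" -eq 0 ]; then
        # Docker reports 0 when no memory limit was set
        echo "unlimited"
    else
        echo "$((bytes / 1024 / 1024)) MiB"
    fi
}
```

    A container started with --memory="2g" reports 2147483648 bytes, which this helper prints as 2048 MiB.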

    Slow Container Startup

    Symptom: Container takes a long time to start or initialize.

    # Check startup logs
    docker logs <container_id>
     
    # Check if there are long-running initialization commands
    docker exec <container_id> ps aux
     
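    # Measure how long a short-lived container takes from start to exit
    time docker run --rm my-image
     
    # Summarize time spent in system calls
    # (requires strace in the image and may require --cap-add SYS_PTRACE)
    docker exec <container_id> strace -f -c command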

    Build Issues

    Build failures are common and can be caused by various issues in the Dockerfile.

    Build Context Issues

    Symptom: Build fails with "file not found" errors.

    # Rebuild with plain output to see the context transfer and the failing step
    docker build --no-cache --progress=plain .
     
    # Check whether .dockerignore excludes the missing file
    cat .dockerignore
     
    # Use a specific Dockerfile; the final argument (.) is the build context
    docker build -t myimage -f Dockerfile.prod .

    Common causes:

    • Files not in the build context directory
    • Incorrect Dockerfile path
    • Build context too large (causing timeouts)

    Layer Caching Issues

    Symptom: Build takes longer than expected or produces unexpected results.

    # See which steps are served from the cache (they are marked CACHED)
    docker build --progress=plain -t myimage .
     
    # Inspect the layers of the built image
    docker history myimage
     
    # Force a full rebuild that ignores the cache
    docker build --no-cache -t myimage .
     
    # Optimize the Dockerfile for caching:
    # put the COPY and RUN instructions that change least often first
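
    As an illustration of that ordering, the sketch below prints an example Dockerfile in which dependency installation comes before the application source is copied. Everything in it is an assumption for demonstration: a hypothetical Node.js app with package.json, package-lock.json, and server.js. The same idea applies to any package manager.

```shell
# print_cache_friendly_dockerfile: emit an example Dockerfile ordered for caching.
# The base image and file names describe a hypothetical Node.js project.
print_cache_friendly_dockerfile() {
    cat <<'EOF'
FROM node:20-slim
WORKDIR /app
# Dependency manifests change rarely: copy and install them first
COPY package.json package-lock.json ./
RUN npm ci
# Application source changes often: copy it last so edits only rebuild from here
COPY . .
CMD ["node", "server.js"]
EOF
}
```

    With this layout, editing application code invalidates only the final COPY layer, so the npm ci layer is reused from the cache.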

    Dependency Issues

    Symptom: Build fails with missing dependencies or version conflicts.

    # Build only up to the builder stage to see where dependency installation fails
    docker build --target builder -t myimage .
     
    # List installed packages (Debian/Ubuntu-based images)
    docker run --rm myimage apt list --installed
     
    # Check whether a required executable is on the PATH
    docker run --rm myimage sh -c "command -v package || echo 'Missing package'"

    Debugging Workflow

    When you encounter a Docker problem, follow this systematic approach:

    1. Gather information

      docker ps -a
      docker logs <container_id>
      docker inspect <container_id>

    2. Reproduce the issue

      docker run --rm -it my-image

    3. Isolate the problem

      # Try running the command manually
      docker run --rm -it my-image /bin/bash
       
      # Test individual components
      docker run --rm my-image command1
      docker run --rm my-image command2

    4. Check logs and errors

      docker logs <container_id>
      docker logs <container_id> 2>&1 | grep -i error

    5. Verify configuration

      docker inspect <container_id> --format='{{json .Config}}' | jq

    6. Test with minimal setup

      docker run --rm -it --entrypoint /bin/bash my-image
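
    The information-gathering steps of this workflow can be bundled into one function so the same data is collected every time. This is a sketch rather than a standard tool: collect_diagnostics is a hypothetical name, and the 50-line log tail is an arbitrary choice.

```shell
# collect_diagnostics CONTAINER: print the state, exit code, OOM flag, and recent logs.
collect_diagnostics() {
    container=$1
    echo "== state =="
    docker inspect "$container" \
        --format 'status={{.State.Status}} exit={{.State.ExitCode}} oom={{.State.OOMKilled}}'
    echo "== last 50 log lines =="
    # Merge stderr into stdout so application errors appear in the same stream
    docker logs --tail 50 "$container" 2>&1
}
```

    Redirecting the output to a file, for example collect_diagnostics <container_id> > report.txt, gives a snapshot you can attach to a bug report.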

    Common Anti-Patterns to Avoid

    1. Running containers as root

    # WRONG - Running as root
    docker run -it my-image
     
    # CORRECT - Running as non-root user
    docker run -it --user $(id -u):$(id -g) my-image

    2. Hardcoding secrets in Dockerfiles

    # WRONG - Secrets in Dockerfile
    ENV DATABASE_PASSWORD=secret123
     
    # CORRECT - Supply secrets at runtime (-e NAME with no value passes the variable through from the host environment)
    docker run -e DATABASE_PASSWORD my-image

    3. Not using health checks

    # WRONG - No health check
    docker run my-image
     
    # CORRECT - With health check
    docker run --health-cmd="curl -f http://localhost:8080/health || exit 1" my-image

    4. Ignoring resource limits

    # WRONG - No resource limits
    docker run my-image
     
    # CORRECT - With resource limits
    docker run --memory="2g" --cpus="2" my-image
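
    Once a container has a health check (item 3 above), scripts can wait for it to report healthy instead of sleeping for a fixed time. The sketch below polls docker inspect once per second; wait_healthy and the 30-second default are illustrative assumptions.

```shell
# wait_healthy CONTAINER [TIMEOUT_SECONDS]: poll the health status until it settles.
# Returns 0 when healthy, 1 when unhealthy or when the timeout expires.
wait_healthy() {
    container=$1
    timeout=${2:-30}
    elapsed=0
    status=unknown
    while [ "$elapsed" -lt "$timeout" ]; do
        # The template fails for containers without a health check; treat that as unknown
        status=$(docker inspect "$container" --format '{{.State.Health.Status}}' 2>/dev/null) \
            || status=unknown
        case "$status" in
            healthy)   echo "healthy after ${elapsed}s"; return 0 ;;
            unhealthy) echo "unhealthy after ${elapsed}s"; return 1 ;;
        esac
        sleep 1
        elapsed=$((elapsed + 1))
    done
    echo "timed out after ${timeout}s (last status: ${status})"
    return 1
}
```

    A deploy script might run wait_healthy <container_id> 60 and only route traffic to the container when the call succeeds.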

    Conclusion

    Docker troubleshooting requires a systematic approach. Start by gathering information with docker logs, docker inspect, and docker stats. Then isolate the problem by testing individual components and configurations. Remember that Docker containers are ephemeral by design—everything that needs to persist should be in volumes, and everything that needs to be configured should be in environment variables.

    The key to effective troubleshooting is understanding how Docker works under the hood: containers run as isolated processes, networks are virtual, and storage is mounted from the host. When you understand these fundamentals, debugging becomes much easier.

    Platforms like ServerlessBase simplify deployment by handling reverse proxy configuration and SSL certificate provisioning automatically, so you can focus on your application rather than fighting with Docker networking and configuration issues.
