ServerlessBase Blog

    Learn essential server documentation practices for maintaining clear, actionable, and up-to-date infrastructure records that improve team collaboration and reduce operational risks.

    Introduction to Server Documentation Best Practices

    You've just inherited a production server with no documentation. The previous admin left three months ago, and you're staring at a terminal wondering what services are running, which ports are exposed, and why the database connection string looks suspicious. This scenario happens more often than you'd think, and it's a nightmare that can be avoided with proper documentation practices.

    Good server documentation isn't just about creating files—it's about creating a living resource that evolves with your infrastructure. When documentation is accurate, up-to-date, and easy to understand, it becomes a force multiplier for your team. New engineers onboard faster, troubleshooting time drops, and the risk of catastrophic configuration errors decreases significantly.

    This guide covers the essential practices for documenting servers effectively, from inventory systems to runbooks, ensuring your infrastructure remains maintainable and your team stays aligned.

    What Makes Documentation Effective

    Effective server documentation shares several key characteristics. First, it's actionable. Generic statements like "configure the database" don't help anyone. Specific instructions like "set max_connections to 200 and restart PostgreSQL" provide clear next steps.

    Second, it's current. Documentation that describes a server configuration that hasn't existed for six months is worse than no documentation at all—it actively misleads. Every change to your infrastructure should trigger documentation updates.

    Third, it's accessible. Your team needs to find information quickly without wading through walls of text. Good documentation uses clear headings, concise language, and logical organization.

    Finally, it's collaborative. Documentation should be a living document that multiple people contribute to and maintain. When team members own specific sections, everyone stays engaged with keeping it accurate.

    Building a Server Inventory System

    A server inventory is the foundation of good documentation. It provides a bird's-eye view of all your infrastructure assets, their locations, and their purposes. Without this overview, you're flying blind.

    Inventory Data Points

    Every server entry should include these essential details:

    • Hostname - The DNS name or internal identifier
    • IP Address - Both internal and external addresses
    • Location - Data center, region, or physical location
    • Purpose - What this server does (web server, database, cache, etc.)
    • Owner - Who is responsible for this server
    • Operating System - Version and distribution
    • Services Running - List of applications and their versions
    • Resource Allocation - CPU, memory, and disk specifications
    • Network Configuration - Subnets, firewalls, and ports
    • Backup Status - Whether backups are configured and tested
    • Last Updated - When the documentation was last modified
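One way to keep these data points consistent is to model an inventory entry as a small record type. The sketch below is only an illustration, not a prescribed schema: the `ServerEntry` class, its field names, and the 90-day staleness threshold are assumptions made for this example, and it covers a subset of the fields listed above.

```python
from __future__ import annotations

from dataclasses import dataclass, field, fields
from datetime import date
from typing import Optional


@dataclass
class ServerEntry:
    """One inventory row. Field names are illustrative, not a standard schema."""
    hostname: str
    ip_address: str
    location: str
    purpose: str
    owner: str
    operating_system: str
    services: list[str] = field(default_factory=list)
    backups_tested: bool = False
    last_updated: Optional[date] = None

    def missing_fields(self) -> list[str]:
        """Return the names of fields that are empty or unset."""
        return [f.name for f in fields(self) if getattr(self, f.name) in ("", None, [])]

    def is_stale(self, today: date, max_age_days: int = 90) -> bool:
        """Treat an entry with no update date, or an old one, as needing review."""
        if self.last_updated is None:
            return True
        return (today - self.last_updated).days > max_age_days


entry = ServerEntry(
    hostname="web-01", ip_address="10.0.1.10", location="us-east-1a",
    purpose="Web server", owner="", operating_system="Ubuntu 22.04",
    services=["Nginx", "Node.js"], last_updated=date(2026, 1, 5),
)
print(entry.missing_fields())                   # ['owner']
print(entry.is_stale(today=date(2026, 3, 9)))   # False
```

A check like `missing_fields()` could run in review or CI so that incomplete entries are caught before they are merged.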

    Inventory Table Structure

    Here's a comparison of different inventory formats you might use:

    | Format | Best For | Pros | Cons |
    |--------|----------|------|------|
    | Spreadsheet (Excel/Google Sheets) | Small teams, simple infra | Easy to create, visual | Version control issues, hard to share |
    | Markdown files | Medium teams, documentation-first | Version controlled, easy to edit | Requires tooling for collaboration |
    | Database | Large teams, complex infra | Powerful queries, relationships | Overkill for small setups |
    | Configuration management tools | DevOps-heavy teams | Auto-generated, integrated | Steep learning curve |

    For most teams starting out, a simple markdown inventory file in your repository works well. It's version controlled, easy to share, and can be converted to other formats later if needed.

    Creating Your First Inventory

    Let's create a basic server inventory markdown file:

    # Server Inventory
     
    ## Production Servers
     
    | Hostname | IP Address | Location | Purpose | Owner | OS | Services |
    |----------|------------|----------|---------|-------|-----|----------|
    | web-01 | 10.0.1.10 | us-east-1a | Web server | devops-team | Ubuntu 22.04 | Nginx, Node.js |
    | db-01 | 10.0.1.20 | us-east-1a | PostgreSQL | devops-team | Ubuntu 22.04 | PostgreSQL 15 |
    | cache-01 | 10.0.1.30 | us-east-1a | Redis | devops-team | Ubuntu 22.04 | Redis 7 |
     
    ## Development Servers
     
    | Hostname | IP Address | Location | Purpose | Owner | OS | Services |
    |----------|------------|----------|---------|-------|-----|----------|
    | dev-web-01 | 10.0.2.10 | us-east-1b | Development web server | frontend-team | Ubuntu 20.04 | Nginx, Node.js |
    | dev-db-01 | 10.0.2.20 | us-east-1b | Development database | backend-team | Ubuntu 20.04 | PostgreSQL 14 |

    This inventory provides a quick reference for anyone who needs to understand your infrastructure layout. Each row represents a server, and the columns capture the essential information needed for day-to-day operations.
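If the inventory is kept as structured records, a table like the one above can be generated instead of edited by hand. The following is a minimal sketch under that assumption; `render_inventory_table` is a hypothetical helper, and the rows simply repeat values from the example inventory.

```python
def render_inventory_table(rows: list[dict[str, str]], columns: list[str]) -> str:
    """Render inventory rows as a Markdown pipe table, one row per server."""
    header = "| " + " | ".join(columns) + " |"
    divider = "|" + "|".join("-" * (len(col) + 2) for col in columns) + "|"
    body = [
        # Missing values become empty cells rather than raising an error.
        "| " + " | ".join(row.get(col, "") for col in columns) + " |"
        for row in rows
    ]
    return "\n".join([header, divider, *body])


servers = [
    {"Hostname": "web-01", "IP Address": "10.0.1.10", "Purpose": "Web server"},
    {"Hostname": "db-01", "IP Address": "10.0.1.20", "Purpose": "PostgreSQL"},
]
print(render_inventory_table(servers, ["Hostname", "IP Address", "Purpose"]))
```

Generating the table from data keeps the column layout uniform and makes it harder for a row to drift out of format.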

    Documenting Server Configuration

    Configuration documentation captures the specific settings that make each server unique. This includes system-level settings, application configurations, and network rules. The goal is to provide a complete picture of how the server is configured without overwhelming readers with every single setting.

    System Configuration Documentation

    Document the core system settings that affect server behavior:

    • Kernel parameters - Tunables like vm.swappiness, net.core.somaxconn
    • File system layout - Partition schemes, mount points, disk usage
    • User and group accounts - Admin accounts, service users, permissions
    • System services - Enabled/disabled services, startup order
    • Security settings - Firewall rules, SELinux/AppArmor policies, authentication methods

    Here's an example of documenting a server's firewall configuration:

    # Firewall Configuration for web-01
     
    # Allow SSH (port 22/tcp) from management network
    sudo ufw allow from 10.0.0.0/8 to any port 22 proto tcp

    # Allow HTTP and HTTPS
    sudo ufw allow 80/tcp
    sudo ufw allow 443/tcp

    # Allow Nginx status checks
    sudo ufw allow from 10.0.0.0/8 to any port 8080 proto tcp
     
    # Enable firewall
    sudo ufw enable

    This configuration ensures that only authorized networks can access management interfaces while allowing public traffic on web ports. Documenting these rules makes it easy to audit who can access what and to make changes safely.

    Application Configuration Documentation

    For each service running on the server, document:

    • Installation method - Package manager, Docker, manual install
    • Configuration file locations - Paths to config files
    • Environment variables - Required and optional variables
    • Startup commands - How to start, stop, and restart the service
    • Health checks - How to verify the service is running correctly

    Consider this example for a Node.js application:

    # Node.js Application Configuration
     
    # Application: my-app
    # Location: /opt/my-app
    # Process: pm2
     
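    # Environment variables (stored in /opt/my-app/.env)
    # Record variable names with placeholder values only; keep real credentials out of the documentation
    NODE_ENV=production
    DATABASE_URL=postgresql://<user>:<password>@db-01:5432/mydb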
    REDIS_URL=redis://cache-01:6379
    PORT=3000
     
    # PM2 process list
    pm2 list
     
    # Start command
    cd /opt/my-app && pm2 start dist/index.js --name my-app
     
    # Logs location
    /var/log/my-app/
     
    # Health check endpoint
    curl http://localhost:3000/health

    This documentation tells you everything you need to know about running the application: where it lives, how it's started, what configuration it needs, and how to verify it's healthy.
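Documented environment variables are easier to trust when they can be checked against the real file. The sketch below assumes a simple `KEY=VALUE` format for `.env` files and uses two hypothetical helpers, `parse_env` and `missing_variables`; it does not handle quoting or multi-line values.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values such as URLs stay intact.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_variables(text: str, required: list[str]) -> list[str]:
    """Return required variable names that are absent or set to an empty value."""
    env = parse_env(text)
    return [name for name in required if not env.get(name)]


sample = """
# Example .env contents (placeholder values)
NODE_ENV=production
REDIS_URL=
PORT=3000
"""
print(missing_variables(sample, ["NODE_ENV", "DATABASE_URL", "REDIS_URL", "PORT"]))
# ['DATABASE_URL', 'REDIS_URL']
```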

    Network Documentation

    Network configuration is often the most complex part of server documentation. Misconfigured networks cause outages that can take hours to diagnose. Clear network documentation helps you understand how traffic flows through your infrastructure.

    Network Diagrams

    Create visual representations of your network architecture:

    • Physical topology - How servers are connected in the data center
    • Logical topology - How services communicate with each other
    • Firewall rules - Which ports are open and to whom
    • Load balancer configuration - How traffic is distributed

    Here's a simple network diagram in mermaid format:

    graph TD
        A[Internet] --> B[Load Balancer]
        B --> C[web-01]
        B --> D[web-02]
        B --> E[web-03]
        C --> F[db-01]
        D --> F
        E --> F
        C --> G[cache-01]
        D --> G
        E --> G

    This diagram shows how traffic flows from the internet through a load balancer to three web servers, which then connect to a database and cache server. Visual documentation makes it easy to understand complex relationships at a glance.
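A diagram like this can also be produced from a list of connections, which keeps it in step with whatever source records those connections. The sketch below is a minimal example; `to_mermaid` is a hypothetical helper, and replacing non-word characters in node ids with underscores is a cautious choice made here, not a documented Mermaid requirement.

```python
import re


def to_mermaid(edges: list[tuple[str, str]]) -> str:
    """Emit a top-down Mermaid graph with one line per directed edge."""

    def node(name: str) -> str:
        # Use a conservative id (letters, digits, underscores) and keep the
        # original name as the visible label.
        node_id = re.sub(r"\W", "_", name)
        return f"{node_id}[{name}]"

    lines = ["graph TD"]
    for source, target in edges:
        lines.append(f"    {node(source)} --> {node(target)}")
    return "\n".join(lines)


connections = [
    ("Internet", "Load Balancer"),
    ("Load Balancer", "web-01"),
    ("web-01", "db-01"),
    ("web-01", "cache-01"),
]
print(to_mermaid(connections))
```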

    Firewall and Security Rules

    Document every firewall rule with its purpose and justification:

    # Firewall Rules Documentation
     
    # Rule 1: Allow SSH from management network
    # Purpose: Admin access
    # Source: 10.0.0.0/8
    # Destination: Any
    # Port: 22/tcp
     
    # Rule 2: Allow HTTP traffic
    # Purpose: Web server
    # Source: Any
    # Destination: Any
    # Port: 80/tcp
     
    # Rule 3: Allow HTTPS traffic
    # Purpose: Web server
    # Source: Any
    # Destination: Any
    # Port: 443/tcp
     
    # Rule 4: Allow database connections from web servers
    # Purpose: Application connectivity
    # Source: 10.0.1.0/24 (web servers)
    # Destination: 10.0.1.20 (db-01)
    # Port: 5432/tcp
     
    # Rule 5: Allow Redis connections from web servers
    # Purpose: Application caching
    # Source: 10.0.1.0/24 (web servers)
    # Destination: 10.0.1.30 (cache-01)
    # Port: 6379/tcp

    This documentation serves two purposes: it explains why each rule exists, and it provides a clear record that can be audited and updated as needed.
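Keeping the same rules as structured data makes them easier to audit. The sketch below is one possible shape, assuming TCP-only allow rules: `FirewallRule`, `ufw_command`, and `open_to_any_source` are hypothetical names, and the generated strings follow the `ufw allow` forms used earlier in this guide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FirewallRule:
    purpose: str
    source: str       # CIDR range or "any"
    destination: str  # IP address or "any"
    port: int


def ufw_command(rule: FirewallRule) -> str:
    """Build the ufw invocation for a TCP allow rule."""
    if rule.source == "any" and rule.destination == "any":
        return f"ufw allow {rule.port}/tcp"
    return (
        f"ufw allow from {rule.source} to {rule.destination} "
        f"port {rule.port} proto tcp"
    )


def open_to_any_source(rules: list[FirewallRule]) -> list[FirewallRule]:
    """Return rules reachable from any source, so a reviewer can confirm each one."""
    return [rule for rule in rules if rule.source == "any"]


rules = [
    FirewallRule("Admin access", "10.0.0.0/8", "any", 22),
    FirewallRule("Web server", "any", "any", 443),
    FirewallRule("Application connectivity", "10.0.1.0/24", "10.0.1.20", 5432),
]
for rule in rules:
    print(f"# {rule.purpose}")
    print(ufw_command(rule))
print([rule.port for rule in open_to_any_source(rules)])  # [443]
```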

    Creating Runbooks for Common Tasks

    Runbooks are step-by-step procedures for performing common operational tasks. They transform tribal knowledge into documented processes that anyone can follow. A good runbook should be so clear that a new team member can complete the task without asking questions.

    Runbook Structure

    Every runbook should include:

    • Title and purpose - What the runbook covers and when to use it
    • Prerequisites - What you need before starting (permissions, tools, etc.)
    • Steps - Numbered instructions with expected outputs
    • Rollback procedures - How to undo the changes if something goes wrong
    • Success criteria - How to verify the task completed successfully
    • Related documentation - Links to related runbooks or resources
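A structure like this can be checked automatically. The sketch below assumes runbooks are Markdown files whose sections are headings; the required heading names in `REQUIRED_SECTIONS` and the `missing_sections` helper are assumptions for illustration and should be adapted to whatever names a team actually uses.

```python
import re

# Assumed heading names; adjust to match your own runbook template.
REQUIRED_SECTIONS = [
    "Purpose",
    "Prerequisites",
    "Steps",
    "Rollback Procedure",
    "Success Criteria",
]


def missing_sections(runbook_markdown: str, required: list[str] = REQUIRED_SECTIONS) -> list[str]:
    """Return required section names that have no matching Markdown heading."""
    headings = {
        match.group(1).lower()
        for match in re.finditer(
            r"^#{1,6}[ \t]+(.+?)[ \t]*$", runbook_markdown, flags=re.MULTILINE
        )
    }
    return [name for name in required if name.lower() not in headings]


draft = """# Runbook: Restart Nginx Web Server

## Purpose
Apply configuration changes.

## Steps
1. Test the configuration.
"""
print(missing_sections(draft))
# ['Prerequisites', 'Rollback Procedure', 'Success Criteria']
```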

    Example Runbook: Restarting a Service

    Here's a runbook for restarting a web server service:

    # Runbook: Restart Nginx Web Server

    ## Purpose
    Reload or restart the Nginx web server on web-01 to apply configuration changes or resolve issues.

    ## Prerequisites
    - SSH access to web-01
    - sudo privileges
    - Verified configuration changes in /etc/nginx/nginx.conf

    ## Steps

    1. **Check current Nginx status**
       ```bash
       sudo systemctl status nginx
       ```
       Expected output: active (running)

    2. **Test Nginx configuration**
       ```bash
       sudo nginx -t
       ```
       Expected output: syntax is ok, test is successful

    3. **Gracefully reload Nginx**
       ```bash
       sudo systemctl reload nginx
       ```
       This reloads the configuration without dropping connections.

    4. **Verify Nginx is still running**
       ```bash
       sudo systemctl status nginx
       ```
       Expected output: active (running)

    5. **Test a sample page**
       ```bash
       curl -I http://localhost
       ```
       Expected output: HTTP/1.1 200 OK

    ## Rollback Procedure

    If the reload fails, restore the previous version of /etc/nginx/nginx.conf (for example, from version control or a backup), then test and reload again:

    ```bash
    sudo nginx -t && sudo systemctl reload nginx
    ```

    If Nginx has stopped entirely, restart it once the configuration test passes:

    ```bash
    sudo systemctl restart nginx
    ```

    ## Success Criteria

    - Nginx status shows "active (running)"
    - Configuration test passes
    - Sample page returns HTTP 200
    - No errors in Nginx error log

    ## Related Documentation

    - Nginx Configuration Guide
    - Server Restart Procedures
    - Troubleshooting Nginx Issues
    
    This runbook provides clear, actionable steps that anyone can follow. The success criteria ensure you verify the task completed correctly, and the rollback procedure gives you a safety net if something goes wrong.
    
    Documenting Backup and Recovery Procedures

    Backup documentation is critical for disaster recovery. Without clear procedures, backups can become useless when you actually need them. Good backup documentation covers what's backed up, how it's stored, and how to restore it.

    Backup Documentation Elements

    Document these aspects of your backup strategy:

    • What is backed up - Databases, files, configurations
    • Backup frequency - How often backups occur
    • Retention policy - How long backups are kept
    • Storage location - Where backups are stored (on-prem, cloud, offsite)
    • Backup verification - How often backups are tested
    • Restore procedures - Step-by-step instructions for restoring from backup

    Example Backup Documentation
    
    # Database Backup Documentation

    ## Database: Production PostgreSQL

    ### Backup Schedule
    - Full backups: Daily at 2:00 AM
    - Incremental backups: Every 6 hours

    ### Backup Location
    - Primary: /var/backups/postgres/
    - Secondary (offsite): S3 bucket: s3://backup-repo/prod-db/

    ### Retention Policy
    - Full backups: 30 days
    - Incremental backups: 7 days
    - Point-in-time recovery logs: 90 days

    ### Backup Verification
    - Test restore: Weekly on a staging server
    - Backup integrity: Monthly checksum verification

    ### Restore Procedure

    PostgreSQL must stay running during the restore, because psql needs a live server to connect to. This procedure assumes the dump was created with `pg_dump --clean` or that the target database is empty.

    1. **Download backup from S3**
       ```bash
       aws s3 cp s3://backup-repo/prod-db/2026-03-09-full.sql.gz /tmp/
       ```

    2. **Stop the application on each web server so nothing writes during the restore**
       ```bash
       pm2 stop my-app
       ```

    3. **Restore from backup**
       ```bash
       gunzip /tmp/2026-03-09-full.sql.gz
       sudo -u postgres psql -d production < /tmp/2026-03-09-full.sql
       ```

    4. **Verify restore**
       ```bash
       sudo -u postgres psql -d production -c "SELECT COUNT(*) FROM users;"
       ```

    5. **Start the application again**
       ```bash
       pm2 start my-app
       ```

    ### Success Criteria

    - Restore completes without errors
    - Data count matches expected values
    - Application can connect to database
    - No errors in PostgreSQL logs
    
    This documentation ensures that when disaster strikes, you have a clear path to recovery. The test restore procedure catches backup failures before they become critical issues.
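A documented retention policy can be checked with a few lines of date arithmetic. The sketch below assumes each backup is identified by a name and the date it was taken; `expired_backups` is a hypothetical helper, and the file names other than the one from the restore procedure are made up for the example.

```python
from datetime import date, timedelta


def expired_backups(backups: dict[str, date], today: date, retention_days: int) -> list[str]:
    """Return names of backups taken before the retention cutoff, sorted by name."""
    cutoff = today - timedelta(days=retention_days)
    return sorted(name for name, taken in backups.items() if taken < cutoff)


full_backups = {
    "2026-03-09-full.sql.gz": date(2026, 3, 9),
    "2026-02-07-full.sql.gz": date(2026, 2, 7),  # exactly 30 days old: still kept
    "2026-02-06-full.sql.gz": date(2026, 2, 6),  # 31 days old: past retention
}
print(expired_backups(full_backups, today=date(2026, 3, 9), retention_days=30))
# ['2026-02-06-full.sql.gz']
```

Running a check like this on a schedule is one way to confirm that what is stored still matches the retention policy written in the documentation.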
    
    Version Control and Collaboration

    Documentation should be treated like code: version controlled and collaboratively maintained. This ensures changes are tracked, reviewed, and can be rolled back if needed.

    Documentation Workflow

    1. Create documentation in your repository - Store documentation alongside your code
    2. Use pull requests for changes - Require review before merging documentation updates
    3. Tag versions - Document changes in release notes or changelogs
    4. Keep it simple - Use markdown for simplicity and compatibility
    5. Automate where possible - Generate documentation from configuration files

    Example Documentation Workflow
    
    # Create a branch and edit the documentation file
    git checkout -b docs/add-server-inventory
    vim docs/inventory.md

    # Stage the file, then review what will be committed
    # (a plain `git diff` shows nothing for a new, untracked file)
    git add docs/inventory.md
    git diff --staged docs/inventory.md

    # Commit changes
    git commit -m "Add server inventory documentation"

    # Push the branch (assumes the remote is named origin) and create a pull request
    git push -u origin docs/add-server-inventory
    gh pr create --title "Add server inventory documentation" --body "Adds comprehensive server inventory with all production and development servers."

    # After review and approval, merge the pull request and delete the branch
    gh pr merge --merge --delete-branch

    This workflow ensures that documentation changes go through the same review process as code changes, maintaining quality and consistency across your documentation.

    Common Documentation Pitfalls

    Even with good intentions, teams often fall into traps that leave their documentation worse than useless.

    Pitfall 1: Documentation That Doesn't Match Reality

    The most common issue is documentation that describes a configuration that no longer exists. This happens when documentation isn't updated when configurations change.

    Solution: Automate documentation generation from configuration files. If your infrastructure is defined in Terraform, generate documentation from the Terraform state. If you use Docker Compose, generate documentation from the compose file.
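As a rough illustration of the Terraform case, the sketch below turns state data into a Markdown resource table. It assumes the input is the JSON printed by `terraform show -json`, with resources nested under `values.root_module` and `child_modules`; that layout is worth confirming against your Terraform version. The helper names and the sample state are hypothetical.

```python
import json


def resources_from_state(state: dict) -> list[dict]:
    """Collect resources from the root module and any nested child modules."""

    def walk(module: dict):
        yield from module.get("resources", [])
        for child in module.get("child_modules", []):
            yield from walk(child)

    return list(walk(state.get("values", {}).get("root_module", {})))


def render_resource_table(state: dict) -> str:
    """Render one Markdown table row per resource, sorted by address."""
    lines = ["| Address | Type |", "|---------|------|"]
    for resource in sorted(resources_from_state(state), key=lambda r: r["address"]):
        lines.append(f"| {resource['address']} | {resource['type']} |")
    return "\n".join(lines)


# Hypothetical, heavily trimmed sample of `terraform show -json` output.
SAMPLE_STATE = json.loads("""
{
  "values": {
    "root_module": {
      "resources": [
        {"address": "aws_instance.web", "type": "aws_instance"}
      ],
      "child_modules": [
        {"resources": [
          {"address": "module.db.aws_db_instance.main", "type": "aws_db_instance"}
        ]}
      ]
    }
  }
}
""")
print(render_resource_table(SAMPLE_STATE))
```

Regenerating this table in CI after each apply is one way to keep the inventory from drifting away from what is actually deployed.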

    Pitfall 2: Over-Documentation

    Writing pages of documentation for simple tasks creates a barrier to entry. New team members won't read it, and experienced team members will ignore it.

    Solution: Focus on the essential information. If a task can be explained in 5 steps, don't write 50. Use examples and code snippets to convey information concisely.

    Pitfall 3: Documentation Written for One Person

    Documentation that reflects one person's mental model won't help others. What's obvious to you may be confusing to someone else.

    Solution: Write documentation for your audience. If you have both junior and senior engineers, create different levels of detail. Use analogies and examples to make concepts accessible.

    Pitfall 4: Documentation That's Hard to Find

    If team members can't find documentation when they need it, they'll stop using it and create tribal knowledge instead.

    Solution: Organize documentation logically with clear navigation. Use search-friendly titles and consistent naming conventions. Consider creating a documentation index or table of contents.
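A documentation index does not have to be maintained by hand. The sketch below builds one from the first top-level heading of each Markdown file; `build_index` is a hypothetical helper, and the file paths and contents are placeholders. In practice the mapping would be filled by reading the files in your docs directory.

```python
import re


def build_index(documents: dict[str, str]) -> str:
    """Build a Markdown link list, titled by each document's first level-one heading."""
    entries = []
    for path in sorted(documents):
        match = re.search(r"^#[ \t]+(.+?)[ \t]*$", documents[path], flags=re.MULTILINE)
        # Fall back to the path when a file has no level-one heading.
        title = match.group(1) if match else path
        entries.append(f"- [{title}]({path})")
    return "\n".join(entries)


docs = {
    "docs/inventory.md": "# Server Inventory\n\n## Production Servers\n",
    "docs/runbooks/restart-nginx.md": "# Runbook: Restart Nginx Web Server\n",
    "docs/notes.md": "Scratch notes without a heading\n",
}
print(build_index(docs))
```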

    Tools for Server Documentation

    Several tools can help you create and maintain better server documentation.

    Markdown Editors

    • VS Code - With markdown extensions for preview and linting
    • Typora - WYSIWYG markdown editor
    • Obsidian - Knowledge management with linking capabilities

    Documentation Generators

    • MkDocs - Static site generator for markdown documentation
    • Docusaurus - React-based documentation site
    • GitBook - Collaborative documentation platform

    Infrastructure as Code Tools

    • Terraform - Generate documentation from infrastructure code
    • Ansible - Create documentation from playbooks
    • Pulumi - Document infrastructure as code

    Wiki Platforms

    • Confluence - Enterprise wiki with integrations
    • Notion - Flexible documentation platform
    • Wiki.js - Self-hosted markdown wiki

    Conclusion

    Server documentation is a critical component of operational excellence. When done well, it transforms tribal knowledge into accessible, maintainable resources that help your team work more effectively. The key is to focus on accuracy, currency, and usability.

    Start with a simple server inventory and build from there. Document your most critical systems first, then expand to less critical ones. Make documentation a habit—update it whenever you make configuration changes. Encourage team members to contribute and review documentation.

    Good documentation doesn't need to be perfect. It needs to be useful and to evolve with your infrastructure. The goal is to reduce the time it takes to onboard new team members, troubleshoot issues, and make changes to your systems.

    Platforms like ServerlessBase can help automate parts of your documentation process by generating infrastructure documentation from your deployments. This reduces the manual effort required to keep documentation up to date and ensures it stays in sync with your actual infrastructure.

    Start documenting your servers today. Your future self—and your team—will thank you when you're no longer the only person who knows how your infrastructure works.
