How to Read and Analyze Linux System Logs
You've just deployed an application to your server, and something isn't working. The logs are full of cryptic error messages, but you don't know where to start. Reading Linux system logs is a fundamental skill for any system administrator or DevOps engineer. Without understanding what your logs are telling you, you're flying blind.
This guide will teach you how to navigate the Linux logging ecosystem, identify common issues, and use tools to analyze logs effectively. You'll learn where logs live, how to read them, and how to extract meaningful information from the noise.
Understanding the Linux Logging Ecosystem
Modern Linux distributions collect logs through systemd's journald service, which gathers messages from the kernel, system services, and applications. In parallel, a traditional syslog daemon (usually rsyslog) writes plain-text log files under /var/log/, where they persist across reboots. The journal itself lives in /run/log/journal/ (volatile, cleared on reboot) or /var/log/journal/ (persistent, if that directory exists). The most important locations for log analysis are therefore /var/log/ and /var/log/journal/.
When you run journalctl without arguments, you see all journal entries from all services. This can be overwhelming, so it's crucial to understand how to filter and search through this data.
Different services write to different log files. The kernel logs go to /var/log/kern.log, system messages go to /var/log/syslog or /var/log/messages, and authentication logs go to /var/log/auth.log (Debian/Ubuntu) or /var/log/secure (RHEL/CentOS). Understanding which log file contains which type of information saves time when troubleshooting.
Common Log Files and Their Purposes
Linux systems generate logs for every major component. Here's a breakdown of the most important log files you'll encounter:
| Log File | Purpose | Typical Issues Found |
|---|---|---|
| /var/log/syslog or /var/log/messages | General system messages | Hardware errors, service startup failures |
| /var/log/auth.log or /var/log/secure | Authentication events | Failed login attempts, SSH issues |
| /var/log/kern.log | Kernel messages | Driver problems, hardware failures |
| /var/log/dmesg | Boot-time kernel messages | Boot failures, driver loading issues |
| /var/log/nginx/error.log | Nginx web server errors | Configuration errors, upstream connection failures |
| /var/log/apache2/error.log | Apache web server errors | Configuration errors, permission issues |
| /var/log/mysql/error.log | MySQL database errors | Connection timeouts, query failures |
| /var/log/docker.log (or journalctl -u docker) | Docker daemon logs | Container crashes, network issues |
Each log file serves a specific purpose. The syslog or messages file contains general system information, while auth.log tracks authentication events. Kernel logs (kern.log and dmesg) are critical for hardware and driver issues. Application-specific logs like nginx/error.log contain errors from your web server.
When troubleshooting, start with the most relevant log file based on the symptoms you're seeing. A web application error usually points to the application or web server logs, while a database connection failure points to the database logs.
Reading Logs with journalctl
The journalctl command is the primary tool for reading logs managed by journald. It provides powerful filtering options to find exactly what you need.
To see all logs from the current boot:
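```bash
journalctl -b
```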
To see logs from a specific service:
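```bash
journalctl -u nginx.service
```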
To see logs from the last hour:
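```bash
journalctl --since "1 hour ago"
```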
To see logs from a specific time range:
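```bash
journalctl --since "2024-01-15 10:00:00" --until "2024-01-15 11:00:00"
```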
To follow logs in real-time (similar to tail -f):
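```bash
journalctl -f
```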
The -u flag filters by service name, -b shows logs from the current boot, and --since/--until specify time ranges. These flags combine well. For example, to see all Nginx errors from the last hour:
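```bash
journalctl -u nginx.service --since "1 hour ago" | grep -i error
```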
This command filters Nginx logs for the last hour and extracts only error messages.
Analyzing Log Files Directly
While journalctl is convenient for systemd-managed logs, many applications write to traditional log files. The tail command is your primary tool for reading these files.
To see the last 10 lines of a log file:
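```bash
tail -n 10 /var/log/syslog
```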
To follow logs in real-time:
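```bash
tail -f /var/log/syslog
```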
To see the last 50 lines and then follow:
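```bash
tail -n 50 -f /var/log/syslog
```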
The -n flag specifies the number of lines, and -f enables follow mode. This is essential for monitoring logs while an application runs.
For searching within log files, grep is indispensable:
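```bash
grep "error" /var/log/syslog
```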
To search case-insensitively:
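```bash
grep -i "error" /var/log/syslog
```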
To count occurrences:
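```bash
grep -c "error" /var/log/syslog
```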
To search for multiple patterns:
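```bash
grep -E "error|warning|critical" /var/log/syslog
```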
The -i flag makes the search case-insensitive, -c counts matches, and -E enables extended regular expressions for complex patterns.
Identifying Common Log Patterns
Certain log patterns indicate specific types of issues. Recognizing these patterns speeds up troubleshooting significantly.
Failed login attempts typically appear in /var/log/auth.log or /var/log/secure:
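For example (hostnames and addresses here are illustrative):

```text
Jan 15 10:23:45 server sshd[1234]: Failed password for invalid user admin from 203.0.113.5 port 52311 ssh2
Jan 15 10:23:48 server sshd[1234]: Failed password for invalid user admin from 203.0.113.5 port 52313 ssh2
```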
A burst of failed attempts, especially from a single address, may indicate a brute-force attack. You can count them with:
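```bash
grep -c "Failed password" /var/log/auth.log
```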
Service startup failures appear in syslog or the service's own log file:
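For example (illustrative journal entries):

```text
Jan 15 10:24:02 server nginx[2345]: nginx: [emerg] unexpected "}" in /etc/nginx/nginx.conf:42
Jan 15 10:24:02 server systemd[1]: nginx.service: Control process exited, code=exited, status=1/FAILURE
Jan 15 10:24:02 server systemd[1]: nginx.service: Failed with result 'exit-code'.
```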
This indicates Nginx failed to start. Check the configuration with nginx -t to identify syntax errors.
Database connection errors in MySQL logs:
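For example (the exact format varies by MySQL version):

```text
2024-01-15 10:25:00 0 [ERROR] Can't start server: Bind on TCP/IP port: Address already in use
2024-01-15 10:25:00 0 [ERROR] Do you already have another mysqld server running on port: 3306 ?
2024-01-15 10:25:00 0 [ERROR] Aborting
```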
This means another process is already bound to the MySQL port. Check with ss -tlnp | grep 3306 (or netstat -tlnp | grep 3306 on older systems).
Disk space issues appear in syslog:
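For example (myapp is a placeholder process name):

```text
Jan 15 10:26:11 server myapp[3456]: write error: No space left on device
```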
This indicates the filesystem is full. Check disk usage with df -h and identify large files with du -sh * | sort -rh.
Memory exhaustion appears in kern.log:
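For example (an illustrative OOM-killer entry):

```text
Jan 15 10:27:30 server kernel: Out of memory: Killed process 4567 (mysqld) total-vm:1843200kB, anon-rss:1024000kB, file-rss:0kB
```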
This means the system is running out of memory. Check memory usage with free -h and identify memory-hungry processes with top or htop.
Practical Log Analysis Walkthrough
Let's walk through a real-world troubleshooting scenario. Your web application is returning 500 errors, and you need to find the root cause.
Step 1: Check the web server logs
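```bash
sudo tail -n 50 /var/log/nginx/error.log
```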
You see repeated errors:
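An entry like this (values illustrative):

```text
2024/01/15 10:30:01 [error] 2345#2345: *7 connect() failed (111: Connection refused) while connecting to upstream, client: 203.0.113.10, server: example.com, request: "GET /api/users HTTP/1.1", upstream: "http://127.0.0.1:3000/api/users", host: "example.com"
```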
The error shows that Nginx can't connect to the upstream application on port 3000.
Step 2: Check if the application is running
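Assuming the app runs as a systemd service (myapp here is a placeholder for your unit name):

```bash
sudo systemctl status myapp
```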
The output shows:
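```text
● myapp.service - My application
     Loaded: loaded (/etc/systemd/system/myapp.service; enabled)
     Active: failed (Result: exit-code) since Mon 2024-01-15 10:28:00 UTC; 5min ago
```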
The application is not running.
Step 3: Check application logs
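Again using the placeholder service name:

```bash
sudo journalctl -u myapp -n 50
```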
You see:
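```text
Jan 15 10:28:00 server myapp[5678]: OSError: [Errno 98] Address already in use
```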
The application can't start because port 3000 is already in use.
Step 4: Identify what's using port 3000
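```bash
sudo lsof -i :3000
```

(Alternatively, sudo ss -tlnp | grep 3000 gives equivalent information.)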
Output:
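```text
COMMAND  PID    USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
python3  1234 appuser   5u  IPv4 123456      0t0  TCP *:3000 (LISTEN)
```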
Another Python process is using port 3000.
Step 5: Check if it's the same application
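Using the PID from the previous step (1234 here is illustrative):

```bash
ps -fp 1234
```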
Output:
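Something like (paths illustrative):

```text
UID        PID  PPID  C STIME TTY      TIME     CMD
appuser   1234     1  0 09:00 ?        00:00:05 python3 /opt/myapp/app.py
```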
It's a different instance of the same application.
Step 6: Stop the conflicting process
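```bash
sudo kill 1234
```

(Use kill -9 only as a last resort if the process ignores the default SIGTERM.)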
Step 7: Start the application
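```bash
sudo systemctl start myapp
```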
Step 8: Verify the application is running
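```bash
sudo systemctl status myapp
```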
Output:
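```text
● myapp.service - My application
     Loaded: loaded (/etc/systemd/system/myapp.service; enabled)
     Active: active (running) since Mon 2024-01-15 10:35:00 UTC; 10s ago
```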
The application is now running, and Nginx can connect to it. Check the Nginx logs again:
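```bash
sudo tail -n 20 /var/log/nginx/error.log
```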
No more errors. The issue is resolved.
This walkthrough demonstrates the systematic approach to log analysis: identify the error, find the relevant logs, trace the root cause, and implement a fix. Each step builds on the previous one, using logs as your guide.
Log Rotation and Management
Logs grow quickly and can fill your disk if not managed properly. Linux uses logrotate to manage log rotation, which archives old logs and creates new ones.
The main configuration file is /etc/logrotate.conf, and system-specific configurations are in /etc/logrotate.d/. A typical logrotate configuration looks like:
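```text
/var/log/nginx/*.log {
    daily
    rotate 14
    compress
    delaycompress
    missingok
    notifempty
    sharedscripts
    postrotate
        systemctl reload nginx > /dev/null 2>&1 || true
    endscript
}
```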
This configuration rotates Nginx logs daily, keeps 14 days of logs, compresses old logs, and reloads Nginx after rotation.
To check if logrotate is working:
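```bash
sudo logrotate -d /etc/logrotate.conf
```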
The -d flag runs in debug mode without actually rotating logs. If something looks wrong, check logrotate's state file, typically /var/lib/logrotate/status (Debian/Ubuntu) or /var/lib/logrotate/logrotate.status (RHEL), which records when each log was last rotated.
For manual log rotation:
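```bash
sudo logrotate -f /etc/logrotate.d/nginx
```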
The -f flag forces rotation even if the rotation conditions aren't met.
Advanced Log Analysis Techniques
For more sophisticated analysis, consider these techniques:
Structured logging: Use JSON-formatted logs for easier parsing. Most modern applications support this:
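A minimal sketch of why JSON logs are easier to work with (the log entry and /tmp path are illustrative):

```bash
# Write an illustrative JSON-formatted log entry
echo '{"timestamp":"2024-01-15T10:00:00Z","level":"error","message":"upstream connection refused"}' > /tmp/app.log

# Filter error-level entries. With jq installed you could select on the
# parsed field instead:  jq 'select(.level == "error")' /tmp/app.log
grep '"level":"error"' /tmp/app.log
```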
Log aggregation: Centralize logs from multiple servers using tools like ELK Stack (Elasticsearch, Logstash, Kibana) or Loki (Grafana's log aggregation system).
Log analysis tools: Use specialized tools like grep, awk, and sed for pattern matching and text processing:
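For example, a classic awk pipeline that counts requests per HTTP status code, shown here on sample data (this assumes the common "combined" access-log format, where the status code is field 9):

```bash
# Illustrative access-log lines in the "combined" format
cat > /tmp/access.log <<'EOF'
203.0.113.10 - - [15/Jan/2024:10:00:01 +0000] "GET / HTTP/1.1" 200 512 "-" "curl/8.0"
203.0.113.10 - - [15/Jan/2024:10:00:02 +0000] "GET /missing HTTP/1.1" 404 153 "-" "curl/8.0"
203.0.113.11 - - [15/Jan/2024:10:00:03 +0000] "GET / HTTP/1.1" 200 512 "-" "curl/8.0"
EOF

# Count requests per HTTP status code (field 9 in this format)
awk '{print $9}' /tmp/access.log | sort | uniq -c | sort -rn
```

Point the same pipeline at /var/log/nginx/access.log to summarize real traffic.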
Log monitoring: Set up alerts for critical log patterns using tools like logwatch or custom scripts:
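A minimal custom-script sketch: alert when failed SSH logins exceed a threshold (the function name and threshold are assumptions; adapt to your environment):

```bash
# check_failed_logins FILE THRESHOLD
# Prints an alert when "Failed password" lines in FILE exceed THRESHOLD.
check_failed_logins() {
    # grep -c prints the match count; fall back to 0 if the file is missing
    count=$(grep -c "Failed password" "$1" 2>/dev/null) || count=0
    if [ "$count" -gt "$2" ]; then
        echo "ALERT: $count failed login attempts in $1"
    fi
}
```

Run it from cron, e.g. check_failed_logins /var/log/auth.log 10, and pipe the output to mail or a webhook.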
Best Practices for Log Analysis
Effective log analysis follows these principles:
Always check the most recent logs first: Newer logs are more likely to contain the current issue.
Use appropriate tools for the job: journalctl for systemd logs, tail and grep for traditional logs, specialized tools for structured logs.
Understand the log format: Each log file has its own format. Read the documentation or examine sample logs to understand the structure.
Combine multiple log sources: Issues often span multiple components. Check application logs, web server logs, and system logs together.
Document your findings: Keep a record of common issues and their solutions. This builds institutional knowledge.
Automate repetitive tasks: Create scripts for common log analysis tasks to save time.
Monitor log growth: Ensure log rotation is configured and working to prevent disk space issues.
Use log levels appropriately: Applications should use appropriate log levels (debug, info, warning, error, critical) to help filter logs.
Conclusion
Reading and analyzing Linux system logs is a critical skill for any system administrator or DevOps engineer. By understanding where logs live, how to read them, and how to identify common patterns, you can troubleshoot issues efficiently and keep your systems running smoothly.
The key takeaways: logs are your primary diagnostic tool; journalctl and tail are your main reading tools; grep is your search tool; and systematic analysis (identify the error, find the relevant logs, trace the root cause, implement a fix) is your approach.
Start by familiarizing yourself with the log files in /var/log/ and mastering the journalctl, tail, and grep commands. As you gain experience, you'll develop an intuition for common patterns and faster troubleshooting workflows.
Platforms like ServerlessBase simplify deployment and monitoring, but understanding your logs remains essential. When issues arise, your ability to read and analyze logs will determine how quickly you can resolve them and keep your applications running smoothly.
For more information on monitoring and troubleshooting, check out the ServerlessBase documentation on monitoring and applications.