Solving High Disk Usage in Linux: Tools, Tips, and Techniques for Sysadmins


High disk usage can significantly slow down Linux systems, leading to degraded performance, slow response times, and even outages. Understanding the root causes of high disk usage and the techniques to diagnose and resolve them is crucial for any sysadmin. This guide covers the essential tools, tips, and best practices for diagnosing and reducing disk usage.


Common Causes of High Disk Usage

Before jumping into the tools and techniques, it’s important to understand the common culprits behind high disk usage:

  1. Log File Growth: System or application logs that grow without rotation can fill up disk space.
  2. Database Write I/O: High-activity databases write to disk frequently, causing heavy disk I/O.
  3. Large Temporary Files: Applications can leave behind large temporary files that consume space.
  4. Disk Fragmentation: Rare on Linux filesystems, but it can still occur under certain conditions.
  5. Backup Files: Old or redundant backup files that haven’t been pruned over time.
  6. Unoptimized Applications: Applications that are poorly tuned for disk I/O, causing excessive reads or writes.

Step 1: Identify the Disk Hog

The first step in resolving disk usage issues is identifying what’s consuming space and causing high I/O. The following tools help pinpoint the responsible files and processes.

1. du (Disk Usage)

du is the go-to command for checking directory sizes on Linux. It shows disk usage for individual directories and files.

Example command:

du -h --max-depth=1 /

This command shows the disk usage for each top-level directory in a human-readable format. Increase the depth to drill down further into subdirectories.
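For example, a quick sketch of drilling into /var (an assumed starting point; point it at whichever tree the top-level scan flagged) and sorting so the largest directories appear last:

```shell
# Drill two levels into /var and sort the results by size so the
# largest directories appear at the bottom of the output.
du -h --max-depth=2 /var 2>/dev/null | sort -h | tail -n 10
```

The `2>/dev/null` suppresses permission-denied noise when scanning as a non-root user.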

2. df (Disk Free)

df gives an overview of the used and available disk space on all mounted filesystems.

Example command:

df -h

This shows the total, used, and available space for all disks in human-readable format.
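df can also report inode usage, which is worth checking: a filesystem can report “No space left on device” even with free blocks if it has run out of inodes, often because of millions of tiny files:

```shell
# Show inode usage per filesystem; an IUse% near 100% means inode
# exhaustion even when block space is still free.
df -i
```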

3. ncdu (NCurses Disk Usage)

ncdu is an interactive disk usage analyzer with a user-friendly interface. It’s especially useful for navigating large directories and identifying large files.

To install:

sudo apt install ncdu   # On Debian/Ubuntu systems
sudo yum install ncdu   # On Red Hat/CentOS systems

To run it:

ncdu /

4. iotop

iotop is a top-like utility that shows real-time disk I/O usage by processes.

To install:

sudo apt install iotop

Run it using:

sudo iotop -oPa

The -o flag shows only processes actually doing I/O, -P lists processes rather than individual threads, and -a displays accumulated I/O since iotop started instead of current bandwidth.

5. lsof (List Open Files)

lsof is invaluable for identifying which files are being accessed by a particular process.

Example command:

lsof +D /path/to/directory

This command lists all files opened in a specific directory. It’s useful when tracking down which files or directories are contributing to high disk I/O.
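One classic scenario lsof can expose: a process holds a deleted file open, so df reports the space as used while du can’t find it anywhere in the tree. A hedged way to check, assuming lsof is installed:

```shell
# List open files whose on-disk link count is below 1, i.e. files that
# were deleted while still held open. The space is only released when
# the owning process closes the file or is restarted.
sudo lsof -nP +L1
```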


Step 2: Analyze and Resolve Disk Usage Issues

Once you’ve identified the source of high disk usage, the next step is to resolve it. Below are some common issues and how to fix them:

1. Clean Up Log Files

Large or growing log files are one of the most common causes of high disk usage. Use the following command to track down large log files:

find /var/log -type f -name "*.log" -size +100M

To clean up logs safely, you can compress them or rotate them using logrotate:

  • Log Rotation: If you’re not already using logrotate, configure it to automatically archive old logs and prevent them from growing indefinitely.

Example logrotate configuration:

/var/log/syslog {
    rotate 7
    daily
    compress
    missingok
    notifempty
    create 640 syslog adm
}

2. Purge Old Backups

Old backup files often take up a large amount of disk space. Use du to locate large backup files and either move them to external storage or delete outdated versions.

Example:

find /backup -type f -mtime +30

This command finds files older than 30 days in the /backup directory.
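A hedged sketch of taking this one step further: move the old backups to an archive location instead of deleting them outright (the /mnt/archive path here is an assumption; review the find output before running the move):

```shell
# Move backups older than 30 days to an archive mount. -print0/xargs -0
# handles filenames containing spaces safely; -r skips the mv entirely
# when nothing matches.
find /backup -type f -mtime +30 -print0 \
  | xargs -0 -r -I{} mv {} /mnt/archive/
```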

3. Handle Large Temporary Files

Temporary files can accumulate over time. Regularly clean the /tmp directory or any other directories where large files accumulate.

To remove files that haven’t been accessed in 7 days (verify first that no running application still depends on them):

find /tmp -type f -atime +7 -delete

4. Optimize Databases

If your MySQL or PostgreSQL databases are generating high write I/O, look into optimization strategies:

  • For MySQL, tools like MySQLTuner can help identify inefficiencies.
  • Optimize tables: Regularly running OPTIMIZE TABLE on fragmented or heavily used tables reclaims unused space and defragments the data file.
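Before tuning, it helps to confirm the database is actually the disk hog. A quick check against MySQL’s data directory (/var/lib/mysql is the assumed default; PostgreSQL typically uses /var/lib/postgresql):

```shell
# Show the five largest entries under the MySQL data directory.
sudo du -h /var/lib/mysql 2>/dev/null | sort -h | tail -n 5
```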

Step 3: Monitor and Automate

Automating disk usage monitoring can prevent future issues.

1. Set Up Disk Monitoring

Tools like Nagios or Zabbix can monitor disk usage and alert you when space runs low or disk I/O spikes. Both tools support plugins for monitoring specific metrics, such as disk space and I/O usage.
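Even without a full monitoring stack, a small cron-driven script can provide a baseline alert. A minimal sketch, assuming a 90% threshold and a working local `mail` command (both are assumptions to adjust for your environment):

```shell
#!/bin/sh
# Alert when any filesystem exceeds THRESHOLD percent used.
# df -P guarantees a stable, POSIX column layout for parsing.
THRESHOLD=90
df -P | awk 'NR>1 {gsub("%","",$5); print $5, $6}' \
  | while read -r used mount; do
      if [ "$used" -gt "$THRESHOLD" ]; then
        echo "Disk usage on $mount is ${used}%" \
          | mail -s "Disk alert: $mount" root
      fi
    done
```

Run it from cron every few minutes, or fold the same check into an existing monitoring agent.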

2. Automate Cleanup

Consider writing scripts to automate the cleanup of old files, temporary data, and log rotation. An example cron job to delete files in /tmp older than 7 days:

0 3 * * * find /tmp -type f -atime +7 -delete

This runs the cleanup daily at 3 AM.

3. Enable Quotas

For multi-user environments, enabling disk quotas ensures that individual users or groups don’t consume excessive disk space.

To enable quotas:

  1. Install the quota tools:

sudo apt install quota   # On Debian/Ubuntu systems
sudo yum install quota   # On Red Hat/CentOS systems

  2. Edit /etc/fstab and add usrquota or grpquota to the mount options of the appropriate filesystem.
  3. Remount the filesystem:

sudo mount -o remount /

  4. Initialize the quota database and turn quotas on:

sudo quotacheck -cum /
sudo quotaon -v /

  5. Check disk usage for users or groups:

sudo repquota -a

Conclusion

High disk usage can have many underlying causes, from large log files and inefficient databases to old backups and temporary files. Using the right tools like du, iotop, lsof, and MySQLTuner, sysadmins can diagnose and resolve disk usage issues efficiently. Combining these diagnostics with regular cleanups, optimizations, and automation ensures that your system remains performant and stable in the long run.

By implementing proactive monitoring and maintenance routines, you’ll avoid sudden disk space issues, keep disk usage under control, and ensure optimal system performance.
