Creating Incremental Backups with rsync: Complete Guide

Learn how to create efficient incremental backups using rsync. Master delta transfers, automation, and best practices for optimal backup strategies.


Table of Contents

- [Introduction to rsync](#introduction-to-rsync)
- [Understanding Incremental Backups](#understanding-incremental-backups)
- [Basic rsync Syntax](#basic-rsync-syntax)
- [Command Options and Flags](#command-options-and-flags)
- [Setting Up Incremental Backups](#setting-up-incremental-backups)
- [Advanced Techniques](#advanced-techniques)
- [Practical Examples](#practical-examples)
- [Automation and Scheduling](#automation-and-scheduling)
- [Monitoring and Verification](#monitoring-and-verification)
- [Troubleshooting](#troubleshooting)
- [Best Practices](#best-practices)

Introduction to rsync

rsync (remote synchronization) is a powerful command-line utility for efficiently transferring and synchronizing files across computer systems. It is particularly well-suited for creating incremental backups due to its ability to transfer only the differences between source and destination files, significantly reducing bandwidth usage and backup time.

Key Features of rsync

| Feature | Description |
|---------|-------------|
| Delta Transfer | Only transfers changed portions of files |
| Compression | Built-in compression to reduce transfer size |
| Preservation | Maintains file permissions, timestamps, and ownership |
| Remote Sync | Works over SSH for secure remote backups |
| Exclusion Patterns | Flexible file and directory exclusion rules |
| Dry Run Mode | Test operations without making changes |

Why Use rsync for Incremental Backups

Incremental backups with rsync offer several advantages over traditional full backup methods:

- Efficiency: Only modified files are transferred
- Speed: Faster backup operations after the initial sync
- Bandwidth Conservation: Minimal network usage for remote backups
- Storage Optimization: Reduced storage requirements
- Flexibility: Customizable backup strategies

Understanding Incremental Backups

Incremental backups are a backup strategy where only files that have changed since the last backup are copied to the backup destination. This approach contrasts with full backups, which copy all files regardless of whether they have changed.

Backup Types Comparison

| Backup Type | Description | Pros | Cons |
|-------------|-------------|------|------|
| Full Backup | Complete copy of all data | Simple restore process | Time-consuming, storage-intensive |
| Incremental | Only files changed since the last backup | Fast, efficient storage use | More complex restore chain |
| Differential | Files changed since the last full backup | Simpler restore than incremental | Grows larger over time |

How rsync Implements Incremental Backups

rsync achieves incremental functionality through several mechanisms:

1. File Modification Time: Compares timestamps between source and destination
2. File Size Comparison: Checks whether file sizes differ
3. Checksum Verification: Optional deep comparison using checksums (see the example below)
4. Hard Link Creation: Creates space-efficient snapshots using hard links
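As a quick illustration of the difference between the default quick check and a full checksum pass (the paths are placeholders):

```bash
# Default quick check: compare size and modification time only
rsync -av /source/ /destination/

# Force a full checksum comparison (slower, but catches silently changed files)
rsync -avc /source/ /destination/
```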

Basic rsync Syntax

The fundamental syntax for rsync follows this pattern:

```bash
rsync [OPTIONS] SOURCE DESTINATION
```

Basic Command Structure

```bash
# Local to local backup
rsync -av /source/directory/ /backup/destination/

# Local to remote backup
rsync -av /source/directory/ user@remote-host:/backup/destination/

# Remote to local backup
rsync -av user@remote-host:/source/directory/ /local/backup/
```

Important Syntax Notes

- Trailing Slash Significance: The presence or absence of a trailing slash on the source directory affects behavior (see the example below)
  - rsync -av /source/dir/ /dest/ copies the contents of dir into dest
  - rsync -av /source/dir /dest/ copies dir itself into dest
- Path Specifications: Always use absolute paths for consistency
- Quoting: Use quotes around paths containing spaces or special characters
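The trailing-slash rule is easiest to see with a dry run. The paths below are hypothetical (assume dir contains a single file.txt), and -n ensures nothing is actually copied:

```bash
# With a trailing slash: the contents of dir land directly in /dest
rsync -avn /source/dir/ /dest/    # would create /dest/file.txt

# Without a trailing slash: dir itself is created inside /dest
rsync -avn /source/dir /dest/     # would create /dest/dir/file.txt
```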

Command Options and Flags

Understanding rsync options is crucial for creating effective incremental backup strategies.

Essential Options

| Option | Long Form | Description |
|--------|-----------|-------------|
| -a | --archive | Archive mode (preserves permissions, times, etc.) |
| -v | --verbose | Verbose output |
| -r | --recursive | Recurse into directories |
| -u | --update | Skip files that are newer on the destination |
| -n | --dry-run | Show what would be transferred without doing it |
| -z | --compress | Compress file data during transfer |
| -h | --human-readable | Output numbers in a human-readable format |

Advanced Options for Incremental Backups

| Option | Description |
|--------|-------------|
| --delete | Delete extraneous files from the destination |
| --delete-excluded | Also delete excluded files from the destination |
| --backup | Make backups of files that would be overwritten or deleted |
| --backup-dir=DIR | Store those backups in the specified directory |
| --suffix=SUFFIX | Suffix appended to backup files (default ~) |
| --link-dest=DIR | Hard-link to unchanged files in DIR |
| --exclude=PATTERN | Exclude files matching PATTERN |
| --exclude-from=FILE | Read exclude patterns from FILE |
| --include=PATTERN | Include files matching PATTERN |
| --files-from=FILE | Read the list of files to transfer from FILE |
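To show how the --backup family fits together, here is a hedged sketch that moves any file rsync would overwrite or delete into a dated side directory (the paths are illustrative):

```bash
# Replaced and deleted files are preserved under /backup/changes-YYYY-MM-DD
rsync -av --delete \
    --backup --backup-dir="/backup/changes-$(date +%F)" \
    /source/data/ /backup/current/
```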

Progress and Logging Options

| Option | Description |
|--------|-------------|
| --progress | Show progress during transfer |
| --stats | Print file-transfer statistics |
| --log-file=FILE | Log what rsync is doing to FILE |
| --itemize-changes | Output a change summary for all updates |
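A common pattern is to combine the dry-run and reporting options to preview a backup before running it for real:

```bash
# Preview: list every change rsync would make, plus summary statistics
rsync -avn --itemize-changes --stats /source/data/ /backup/destination/
```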

Setting Up Incremental Backups

Method 1: Simple Incremental Backup

The most straightforward approach uses rsync's built-in incremental capabilities:

```bash
#!/bin/bash
# Simple incremental backup script

SOURCE="/home/user/documents"
DESTINATION="/backup/documents"
LOGFILE="/var/log/backup.log"

rsync -avh --delete --stats --log-file="$LOGFILE" "$SOURCE/" "$DESTINATION/"
```

Method 2: Snapshot-Style Backups with Hard Links

This method creates multiple backup snapshots while using hard links to save space:

```bash
#!/bin/bash
# Snapshot-style incremental backup

SOURCE="/home/user/data"
BACKUP_ROOT="/backup/snapshots"
DATE=$(date +%Y-%m-%d_%H-%M-%S)
CURRENT_BACKUP="$BACKUP_ROOT/backup-$DATE"
LATEST_LINK="$BACKUP_ROOT/latest"

# Create backup directory
mkdir -p "$CURRENT_BACKUP"

# Perform backup, hard-linking unchanged files to the previous backup
if [ -d "$LATEST_LINK" ]; then
    rsync -av --delete --link-dest="$LATEST_LINK" "$SOURCE/" "$CURRENT_BACKUP/"
else
    rsync -av --delete "$SOURCE/" "$CURRENT_BACKUP/"
fi

# Update the "latest" symlink
rm -f "$LATEST_LINK"
ln -s "$CURRENT_BACKUP" "$LATEST_LINK"
```

Method 3: Incremental with Exclusions

Create backups while excluding unnecessary files:

```bash
#!/bin/bash
# Incremental backup with exclusions

SOURCE="/home/user"
DESTINATION="/backup/user-data"
EXCLUDE_FILE="/etc/backup-exclude.txt"

# Create the exclude file if it doesn't exist
if [ ! -f "$EXCLUDE_FILE" ]; then
    cat > "$EXCLUDE_FILE" << EOF
*.tmp
*.cache
.git/
node_modules/
__pycache__/
*.log
Trash/
.thumbnails/
EOF
fi

rsync -avh --delete --exclude-from="$EXCLUDE_FILE" "$SOURCE/" "$DESTINATION/"
```

Advanced Techniques

Using rsync with SSH for Remote Backups

For secure remote backups, rsync can tunnel through SSH:

```bash
# Remote backup over SSH
rsync -avz -e ssh /local/data/ user@backup-server:/remote/backup/

# Using a specific SSH key
rsync -avz -e "ssh -i /path/to/private/key" /local/data/ user@backup-server:/remote/backup/

# Custom SSH port
rsync -avz -e "ssh -p 2222" /local/data/ user@backup-server:/remote/backup/
```

Bandwidth Limiting

Control bandwidth usage during backups:

```bash
# Limit bandwidth to 1000 KB/s
rsync -av --bwlimit=1000 /source/ /destination/

# Recent rsync releases (3.1+) also accept size suffixes, e.g. 1.5 MB/s
rsync -av --bwlimit=1.5m /source/ /destination/

# Note: rsync cannot measure "available bandwidth" by itself; pick a fixed
# limit, or use an external traffic shaper for dynamic control.
```

Multi-Destination Backups

Create backups to multiple destinations:

```bash
#!/bin/bash
# Multi-destination backup script

SOURCE="/important/data"
DESTINATIONS=(
    "/local/backup"
    "user@server1:/remote/backup"
    "user@server2:/offsite/backup"
)

for dest in "${DESTINATIONS[@]}"; do
    echo "Backing up to $dest"
    rsync -avz --delete "$SOURCE/" "$dest/" || echo "Backup to $dest failed"
done
```

Practical Examples

Example 1: Daily Incremental Home Directory Backup

```bash
#!/bin/bash
# Daily home directory backup script
# File: /usr/local/bin/daily-backup.sh

# Configuration
USER="john"
SOURCE="/home/$USER"
BACKUP_ROOT="/backup/daily"
DATE=$(date +%Y-%m-%d)
BACKUP_DIR="$BACKUP_ROOT/$DATE"
LATEST_LINK="$BACKUP_ROOT/latest"
LOG_FILE="/var/log/daily-backup.log"

# Create backup directory
mkdir -p "$BACKUP_DIR"

# Exclusion patterns
EXCLUDE_PATTERNS=(
    "*.tmp"
    "*.cache"
    ".cache/"
    ".local/share/Trash/"
    "Downloads/temp/"
    ".mozilla/firefox/*/Cache/"
    ".thunderbird/*/ImapMail/"
)

# Build exclude arguments (as an array so patterns are passed safely)
EXCLUDE_ARGS=()
for pattern in "${EXCLUDE_PATTERNS[@]}"; do
    EXCLUDE_ARGS+=(--exclude="$pattern")
done

# Log backup start
echo "$(date): Starting backup of $SOURCE" >> "$LOG_FILE"

# Perform backup
if [ -d "$LATEST_LINK" ]; then
    rsync -av --delete --link-dest="$LATEST_LINK" "${EXCLUDE_ARGS[@]}" "$SOURCE/" "$BACKUP_DIR/" >> "$LOG_FILE" 2>&1
else
    rsync -av --delete "${EXCLUDE_ARGS[@]}" "$SOURCE/" "$BACKUP_DIR/" >> "$LOG_FILE" 2>&1
fi
RSYNC_EXIT=$?

# Check if backup was successful
if [ "$RSYNC_EXIT" -eq 0 ]; then
    # Update latest link
    rm -f "$LATEST_LINK"
    ln -s "$BACKUP_DIR" "$LATEST_LINK"
    echo "$(date): Backup completed successfully" >> "$LOG_FILE"
else
    echo "$(date): Backup failed with exit code $RSYNC_EXIT" >> "$LOG_FILE"
    exit 1
fi

# Clean up old backups (keep 30 days)
find "$BACKUP_ROOT" -maxdepth 1 -type d -name "20*" -mtime +30 -exec rm -rf {} \;
```

Example 2: Database Backup with rsync

```bash
#!/bin/bash
# Database backup with rsync
# File: /usr/local/bin/db-backup.sh

DB_NAME="production_db"
DB_USER="backup_user"
DUMP_DIR="/tmp/db_dumps"
BACKUP_DIR="/backup/database"
REMOTE_BACKUP="backup-server:/backup/db"

# Create dump directory
mkdir -p "$DUMP_DIR"

# Create database dump
# Note: "-p" with no attached value prompts for the password interactively;
# for unattended runs, supply credentials via an option file such as ~/.my.cnf.
mysqldump -u "$DB_USER" -p "$DB_NAME" > "$DUMP_DIR/${DB_NAME}_$(date +%Y%m%d_%H%M%S).sql"

# Compress dumps older than one day
find "$DUMP_DIR" -name "*.sql" -mtime +1 -exec gzip {} \;

# Sync to local backup
rsync -av --delete "$DUMP_DIR/" "$BACKUP_DIR/"

# Sync to remote backup
rsync -avz --delete "$DUMP_DIR/" "$REMOTE_BACKUP/"

# Clean up old dumps (keep 7 days)
find "$DUMP_DIR" -name "*.sql.gz" -mtime +7 -delete
```

Example 3: Website Backup Script

```bash
#!/bin/bash
# Website incremental backup script
# File: /usr/local/bin/website-backup.sh

WEBSITE_ROOT="/var/www/html"
BACKUP_ROOT="/backup/website"
REMOTE_SERVER="backup.example.com"
REMOTE_USER="backup"
REMOTE_PATH="/backup/websites/$(hostname)"

# Local backup first
DATE=$(date +%Y-%m-%d_%H-%M-%S)
LOCAL_BACKUP="$BACKUP_ROOT/$DATE"
LATEST_LOCAL="$BACKUP_ROOT/latest"

mkdir -p "$LOCAL_BACKUP"

# Perform local incremental backup
if [ -d "$LATEST_LOCAL" ]; then
    rsync -av --delete --link-dest="$LATEST_LOCAL" \
        --exclude="*.log" \
        --exclude="cache/" \
        --exclude="tmp/" \
        "$WEBSITE_ROOT/" "$LOCAL_BACKUP/"
else
    rsync -av --delete \
        --exclude="*.log" \
        --exclude="cache/" \
        --exclude="tmp/" \
        "$WEBSITE_ROOT/" "$LOCAL_BACKUP/"
fi

# Update latest link
rm -f "$LATEST_LOCAL"
ln -s "$LOCAL_BACKUP" "$LATEST_LOCAL"

# Sync to remote server
rsync -avz --delete -e ssh \
    "$LOCAL_BACKUP/" \
    "$REMOTE_USER@$REMOTE_SERVER:$REMOTE_PATH/"

# Clean up old local backups (keep 14 days)
find "$BACKUP_ROOT" -maxdepth 1 -type d -name "20*" -mtime +14 -exec rm -rf {} \;
```

Automation and Scheduling

Using Cron for Scheduled Backups

Create automated backup schedules using cron:

```bash
# Edit the crontab
crontab -e

# Then add the backup schedules:

# Daily backup at 2 AM
0 2 * * * /usr/local/bin/daily-backup.sh

# Weekly backup every Sunday at 3 AM
0 3 * * 0 /usr/local/bin/weekly-backup.sh

# Monthly backup on the first day of the month at 4 AM
0 4 1 * * /usr/local/bin/monthly-backup.sh
```

Systemd Timer Alternative

Create a systemd service and timer:

```ini
# /etc/systemd/system/incremental-backup.service
[Unit]
Description=Incremental Backup Service
After=network.target

[Service]
Type=oneshot
ExecStart=/usr/local/bin/incremental-backup.sh
User=backup
Group=backup
```

```ini
# /etc/systemd/system/incremental-backup.timer
[Unit]
Description=Run incremental backup daily
Requires=incremental-backup.service

[Timer]
OnCalendar=daily
Persistent=true

[Install]
WantedBy=timers.target
```

Enable and start the timer:

```bash
systemctl daemon-reload
systemctl enable incremental-backup.timer
systemctl start incremental-backup.timer
```
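To confirm the schedule took effect, you can list the timer and inspect the most recent run of the service:

```bash
# Show next and last trigger times for the backup timer
systemctl list-timers incremental-backup.timer

# Inspect the most recent service run
systemctl status incremental-backup.service
```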

Monitoring and Verification

Backup Verification Script

```bash
#!/bin/bash
# Backup verification script
# File: /usr/local/bin/verify-backup.sh

SOURCE="/home/user/documents"
BACKUP="/backup/documents"
LOG_FILE="/var/log/backup-verification.log"

echo "$(date): Starting backup verification" >> "$LOG_FILE"

# Check if backup directory exists
if [ ! -d "$BACKUP" ]; then
    echo "$(date): ERROR - Backup directory does not exist" >> "$LOG_FILE"
    exit 1
fi

# Compare file counts
SOURCE_COUNT=$(find "$SOURCE" -type f | wc -l)
BACKUP_COUNT=$(find "$BACKUP" -type f | wc -l)

echo "$(date): Source files: $SOURCE_COUNT, Backup files: $BACKUP_COUNT" >> "$LOG_FILE"

# Verify checksums of critical files
CRITICAL_FILES=(
    "important_document.pdf"
    "database_dump.sql"
    "configuration.conf"
)

for file in "${CRITICAL_FILES[@]}"; do
    if [ -f "$SOURCE/$file" ] && [ -f "$BACKUP/$file" ]; then
        SOURCE_MD5=$(md5sum "$SOURCE/$file" | cut -d' ' -f1)
        BACKUP_MD5=$(md5sum "$BACKUP/$file" | cut -d' ' -f1)
        if [ "$SOURCE_MD5" = "$BACKUP_MD5" ]; then
            echo "$(date): VERIFIED - $file" >> "$LOG_FILE"
        else
            echo "$(date): ERROR - $file checksum mismatch" >> "$LOG_FILE"
        fi
    fi
done

echo "$(date): Backup verification completed" >> "$LOG_FILE"
```

Monitoring Backup Size and Growth

```bash
#!/bin/bash
# Monitor backup size and growth
# File: /usr/local/bin/monitor-backup-size.sh

BACKUP_DIR="/backup"
LOG_FILE="/var/log/backup-size.log"

# Get current backup size
CURRENT_SIZE=$(du -sh "$BACKUP_DIR" | cut -f1)
CURRENT_SIZE_BYTES=$(du -sb "$BACKUP_DIR" | cut -f1)

# Log the current size with an ISO date so it can be matched later
echo "$(date +%Y-%m-%d),$CURRENT_SIZE,$CURRENT_SIZE_BYTES" >> "$LOG_FILE"

# Check growth rate (compare with yesterday's entry)
YESTERDAY=$(date -d "1 day ago" +%Y-%m-%d)
YESTERDAY_SIZE=$(grep "^$YESTERDAY," "$LOG_FILE" | tail -1 | cut -d',' -f3)

if [ -n "$YESTERDAY_SIZE" ]; then
    GROWTH=$((CURRENT_SIZE_BYTES - YESTERDAY_SIZE))
    GROWTH_HUMAN=$(echo "$GROWTH" | awk '{print $1/1024/1024/1024 " GB"}')
    echo "$(date +%Y-%m-%d): Backup growth: $GROWTH_HUMAN" >> "$LOG_FILE"
fi
```

Troubleshooting

Common rsync Issues and Solutions

| Issue | Symptoms | Solution |
|-------|----------|----------|
| Permission denied | rsync: recv_generator: mkdir failed | Check destination permissions; use sudo if needed |
| SSH connection failed | ssh: connect to host port 22: Connection refused | Verify the SSH service, firewall rules, and credentials |
| Vanished source files | "file has vanished" warnings, exit code 24 | Usually harmless for files deleted mid-transfer; re-run or exclude volatile directories |
| Backups saturate the network | Other traffic slows down during transfers | Use --bwlimit to cap rsync's bandwidth usage |
| Partial transfers | rsync error: some files could not be transferred (exit code 23) | Check disk space and file permissions on both ends |
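Because exit code 24 only means that some source files vanished mid-transfer, wrapper scripts often treat it as a warning rather than a failure. A minimal sketch (the paths are placeholders):

```bash
rsync -av /source/ /destination/
status=$?

case "$status" in
    0)  echo "Backup OK" ;;
    24) echo "Warning: some source files vanished during transfer (exit 24)" ;;
    *)  echo "Backup failed with rsync exit code $status" >&2
        exit "$status" ;;
esac
```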

Debugging rsync Operations

Enable verbose output and debugging:

```bash
# Maximum verbosity
rsync -avvv --debug=ALL /source/ /destination/

# Dry run with detailed output
rsync -avvn --itemize-changes /source/ /destination/

# Log all operations to a file
rsync -av --log-file=/tmp/rsync.log /source/ /destination/
```

Recovery Procedures

```bash
#!/bin/bash
# Backup recovery script
# File: /usr/local/bin/recover-backup.sh

BACKUP_DIR="/backup/documents/latest"
RECOVERY_DIR="/recovery/documents"
LOG_FILE="/var/log/recovery.log"

echo "$(date): Starting recovery from $BACKUP_DIR to $RECOVERY_DIR" >> "$LOG_FILE"

# Create recovery directory
mkdir -p "$RECOVERY_DIR"

# Restore files
rsync -av --progress "$BACKUP_DIR/" "$RECOVERY_DIR/" >> "$LOG_FILE" 2>&1

if [ $? -eq 0 ]; then
    echo "$(date): Recovery completed successfully" >> "$LOG_FILE"
else
    echo "$(date): Recovery failed" >> "$LOG_FILE"
    exit 1
fi
```

Best Practices

Security Considerations

1. SSH Key Authentication: Use SSH keys instead of passwords for remote backups (see the sketch below)
2. Backup Encryption: Encrypt backup destinations, especially for sensitive data
3. Access Control: Limit backup script permissions and user access
4. Network Security: Use a VPN or secure networks for backup transfers
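A minimal sketch of key-based SSH setup for an unattended backup job; the key path and host name are hypothetical, and ssh-copy-id assumes password login is still possible for the initial copy:

```bash
# Generate a dedicated, passphrase-less key pair for backups
ssh-keygen -t ed25519 -f ~/.ssh/backup_ed25519 -N ""

# Install the public key on the backup server
ssh-copy-id -i ~/.ssh/backup_ed25519.pub user@backup-server

# Reference the key explicitly in backup jobs
rsync -avz -e "ssh -i ~/.ssh/backup_ed25519" /local/data/ user@backup-server:/remote/backup/
```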

Performance Optimization

| Technique | Description | Command Example |
|-----------|-------------|-----------------|
| Compression | Enable compression for remote transfers | rsync -avz |
| Bandwidth Limiting | Control network usage | rsync -av --bwlimit=1000 |
| Parallel Processing | Run multiple rsync processes | Use xargs or GNU parallel (see the sketch below) |
| Exclude Unnecessary Files | Skip temporary and cache files | rsync -av --exclude="*.tmp" |
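The parallel-processing row can be sketched as follows, assuming GNU find and xargs and a source tree that splits cleanly into top-level subdirectories (all paths are placeholders); keep the job count modest so disks and the network are not saturated:

```bash
# Back up each top-level subdirectory of /source with up to 4 concurrent rsync jobs.
# Note: files sitting directly in /source (not inside a subdirectory) are not covered,
# and /backup must already exist.
find /source -mindepth 1 -maxdepth 1 -type d -printf '%f\0' |
    xargs -0 -P4 -I{} rsync -a --delete "/source/{}/" "/backup/{}/"
```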

Maintenance Tasks

1. Regular Testing: Periodically test backup restoration procedures
2. Log Rotation: Implement log rotation so backup logs don't grow without bound (a sample logrotate configuration follows this list)
3. Storage Monitoring: Monitor backup storage usage and plan for capacity
4. Documentation: Maintain documentation of backup procedures and schedules
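For the log-rotation task, one hedged option is a logrotate drop-in covering the log files used in the earlier scripts; the retention values here are only illustrative:

```
# /etc/logrotate.d/rsync-backups
/var/log/daily-backup.log /var/log/backup-verification.log {
    weekly
    rotate 8
    compress
    missingok
    notifempty
}
```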

Backup Strategy Recommendations

| Backup Type | Frequency | Retention | Purpose |
|-------------|-----------|-----------|---------|
| Incremental | Daily | 30 days | Regular data protection |
| Weekly Full | Weekly | 12 weeks | Medium-term recovery |
| Monthly Archive | Monthly | 12 months | Long-term retention |
| Yearly Archive | Yearly | 7 years | Compliance and historical data |

Script Template for Production Use

```bash
#!/bin/bash
# Production-ready incremental backup script
# File: /usr/local/bin/production-backup.sh

set -euo pipefail  # Exit on error, undefined variables, pipe failures

# Configuration
readonly SCRIPT_NAME=$(basename "$0")
readonly LOCK_FILE="/var/run/${SCRIPT_NAME}.lock"
readonly LOG_FILE="/var/log/${SCRIPT_NAME}.log"
readonly CONFIG_FILE="/etc/backup.conf"

# Source configuration
if [ -f "$CONFIG_FILE" ]; then
    source "$CONFIG_FILE"
else
    echo "Configuration file $CONFIG_FILE not found" >&2
    exit 1
fi

# Functions
log_message() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') [$SCRIPT_NAME] $1" >> "$LOG_FILE"
}

cleanup() {
    rm -f "$LOCK_FILE"
    log_message "Backup process finished"
}

error_exit() {
    log_message "ERROR: $1"
    exit 1
}

# Check for lock file (exit without touching the other process's lock)
if [ -f "$LOCK_FILE" ]; then
    log_message "ERROR: Another backup process is already running"
    exit 1
fi

# Create lock file and remove it on exit
echo $$ > "$LOCK_FILE"
trap cleanup EXIT

# Start backup process
log_message "Starting backup process"

# Validate source directory
if [ ! -d "$SOURCE_DIR" ]; then
    error_exit "Source directory $SOURCE_DIR does not exist"
fi

# Create backup directory
mkdir -p "$BACKUP_DIR" || error_exit "Failed to create backup directory"

# Perform backup
log_message "Backing up $SOURCE_DIR to $BACKUP_DIR"
rsync -av --delete --stats \
    --exclude-from="$EXCLUDE_FILE" \
    "$SOURCE_DIR/" "$BACKUP_DIR/" >> "$LOG_FILE" 2>&1 || error_exit "Backup failed"

log_message "Backup completed successfully"
```
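The script above sources /etc/backup.conf and expects it to define at least the variables it references (SOURCE_DIR, BACKUP_DIR, EXCLUDE_FILE). A minimal example with placeholder paths:

```bash
# /etc/backup.conf - sourced by production-backup.sh
SOURCE_DIR="/home"
BACKUP_DIR="/backup/home"
EXCLUDE_FILE="/etc/backup-exclude.txt"
```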

This comprehensive guide provides the foundation for implementing robust incremental backup solutions using rsync. The techniques and examples presented can be adapted to various environments and requirements, ensuring data protection through efficient and reliable backup strategies.

Tags

  • Automation
  • backup
  • file-sync
  • rsync
