# Complete Guide to wget: The Ultimate File Download Tool
## Table of Contents

1. [Introduction](#introduction)
2. [Installation](#installation)
3. [Basic Syntax and Usage](#basic-syntax-and-usage)
4. [Common Options and Parameters](#common-options-and-parameters)
5. [Download Scenarios](#download-scenarios)
6. [Advanced Features](#advanced-features)
7. [Configuration Files](#configuration-files)
8. [Error Handling and Troubleshooting](#error-handling-and-troubleshooting)
9. [Best Practices](#best-practices)
10. [Examples and Use Cases](#examples-and-use-cases)
## Introduction
wget is a free, non-interactive command-line utility for downloading files from the web. It supports HTTP, HTTPS, and FTP protocols and provides extensive functionality for retrieving files, entire websites, and handling various download scenarios. The name "wget" comes from "World Wide Web" and "get," reflecting its primary purpose of retrieving content from the internet.
wget is particularly powerful because it can work in the background, handle network interruptions gracefully, and resume interrupted downloads. It's an essential tool for system administrators, developers, and power users who need reliable file downloading capabilities.
### Key Features

- Non-interactive operation suitable for scripts and automation
- Recursive downloading capabilities
- Resume interrupted downloads
- Bandwidth throttling
- Proxy support
- SSL/TLS support
- Cookie handling
- User agent customization
- Mirror entire websites
- Background operation
## Installation
### Linux Systems
Most Linux distributions include wget by default. If not installed, use the following commands:
Ubuntu/Debian:
```bash
sudo apt update
sudo apt install wget
```
CentOS/RHEL/Fedora:
```bash
# CentOS/RHEL
sudo yum install wget
# or, on newer versions
sudo dnf install wget

# Fedora
sudo dnf install wget
```
Arch Linux:
```bash
sudo pacman -S wget
```
### macOS
Using Homebrew:
```bash
brew install wget
```
Using MacPorts:
```bash
sudo port install wget
```
### Windows
Download wget for Windows from the GNU wget website or use package managers like Chocolatey:
```powershell
choco install wget
```
### Verification
Verify installation by checking the version:
```bash
wget --version
```
## Basic Syntax and Usage
### General Syntax

```bash
wget [options] [URL]
```
### Simple Download
The most basic usage is downloading a single file:
```bash
wget https://example.com/file.zip
```
This command downloads the file to the current directory with its original filename.
### Specifying an Output Filename
Use the -O option to specify a different filename:
```bash
wget -O newname.zip https://example.com/file.zip
```
### Downloading to a Specific Directory
Use the -P option to specify the download directory:
```bash
wget -P /path/to/directory https://example.com/file.zip
```
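Note that -O accepts a full path rather than just a filename, so renaming a file and placing it in a target directory can be done in one step; a minimal sketch (the URL and paths are placeholders):
```bash
# Rename the file and place it in a specific directory in one step
wget -O /path/to/directory/newname.zip https://example.com/file.zip
```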
## Common Options and Parameters
### Output and Logging Options
| Option | Description | Example |
|--------|-------------|---------|
| -O filename | Save document to filename | wget -O document.pdf https://example.com/doc.pdf |
| -P directory | Save files to directory | wget -P ~/Downloads https://example.com/file.zip |
| -o logfile | Log messages to logfile | wget -o download.log https://example.com/file.zip |
| -a logfile | Append messages to logfile | wget -a download.log https://example.com/file.zip |
| -q | Quiet mode (no output) | wget -q https://example.com/file.zip |
| -v | Verbose mode | wget -v https://example.com/file.zip |
| --progress=type | Progress indicator type | wget --progress=bar https://example.com/file.zip |
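Several of these flags combine naturally. For instance, -o routes wget's status messages to a log file instead of the terminal, keeping script output clean without discarding the record; a small sketch (the URL and filenames are placeholders):
```bash
# Save under an explicit name and send all status messages to a log file
wget -O report.pdf -o download.log https://example.com/doc.pdf
```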
### Download Control Options
| Option | Description | Example |
|--------|-------------|---------|
| -c | Continue partial downloads | wget -c https://example.com/largefile.zip |
| -t number | Number of retries | wget -t 5 https://example.com/file.zip |
| -T seconds | Timeout in seconds | wget -T 30 https://example.com/file.zip |
| --limit-rate=rate | Limit download rate | wget --limit-rate=200k https://example.com/file.zip |
| -w seconds | Wait between retrievals | wget -w 2 https://example.com/file.zip |
| --random-wait | Random wait between downloads | wget --random-wait https://example.com/file.zip |
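On unreliable links these controls are typically used together; a sketch of a patient, resumable download built only from the flags above (the URL is a placeholder):
```bash
# Resume a partial file, retry up to 5 times with a 30-second timeout,
# and stay under 200 KB/s
wget -c -t 5 -T 30 --limit-rate=200k https://example.com/largefile.zip
```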
### HTTP Options
| Option | Description | Example |
|--------|-------------|---------|
| --user-agent=agent | Set user agent string | wget --user-agent="Mozilla/5.0" https://example.com/file.zip |
| --referer=url | Set referer URL | wget --referer="https://example.com" https://example.com/file.zip |
| --header=header | Add custom header | wget --header="Accept: application/json" https://api.example.com/data |
| --post-data=string | Send POST data | wget --post-data="param=value" https://example.com/api |
| --no-cookies | Disable cookies (enabled by default) | wget --no-cookies https://example.com/file.zip |
| --load-cookies=file | Load cookies from file | wget --load-cookies=cookies.txt https://example.com/file.zip |
| --save-cookies=file | Save cookies to file | wget --save-cookies=cookies.txt https://example.com/file.zip |
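The cookie options are often paired to carry a session across requests. A minimal sketch, assuming a hypothetical form-based login (the endpoint and form field names are made up); --keep-session-cookies, not listed above, tells wget to save session cookies as well:
```bash
# Log in and capture the session cookie (form fields are hypothetical)
wget --save-cookies=cookies.txt --keep-session-cookies \
     --post-data="user=john&pass=secret" \
     https://example.com/login

# Reuse the stored session for an authenticated download
wget --load-cookies=cookies.txt https://example.com/protected/file.zip
```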
### Authentication Options
| Option | Description | Example |
|--------|-------------|---------|
| --http-user=user | HTTP username | wget --http-user=john https://example.com/protected/file.zip |
| --http-password=pass | HTTP password | wget --http-password=secret https://example.com/protected/file.zip |
| --ask-password | Prompt for password | wget --http-user=john --ask-password https://example.com/protected/file.zip |
| --certificate=file | Client certificate file | wget --certificate=client.crt https://secure.example.com/file.zip |
| --private-key=file | Private key file | wget --private-key=client.key https://secure.example.com/file.zip |
## Download Scenarios
### Single File Download
Basic file download with progress indication:
```bash
wget https://releases.ubuntu.com/20.04/ubuntu-20.04.3-desktop-amd64.iso
```
### Multiple File Downloads
Download multiple files by specifying multiple URLs:
```bash
wget https://example.com/file1.zip https://example.com/file2.zip https://example.com/file3.zip
```
### Download from a File List
Create a file containing URLs and use -i option:
```bash
# Create urls.txt with one URL per line
echo "https://example.com/file1.zip" > urls.txt
echo "https://example.com/file2.zip" >> urls.txt
echo "https://example.com/file3.zip" >> urls.txt

# Download all files from the list
wget -i urls.txt
```

### Resume Interrupted Downloads
Use the -c option to continue interrupted downloads:
```bash
wget -c https://example.com/largefile.zip
```
### Background Downloads
Use the -b option for background downloads:
```bash
wget -b https://example.com/largefile.zip
```
Check progress with:
```bash
tail -f wget-log
```
### Bandwidth Limiting
Limit download speed to avoid consuming all available bandwidth:
```bash
# Limit to 200 KB/s
wget --limit-rate=200k https://example.com/largefile.zip

# Limit to 1 MB/s
wget --limit-rate=1m https://example.com/largefile.zip
```

## Advanced Features
### Recursive Downloads
wget can recursively download entire websites or directory structures:
#### Basic Recursive Download
```bash
wget -r https://example.com/
```
#### Website Mirroring
Create a complete mirror of a website:
```bash
wget --mirror --convert-links --adjust-extension --page-requisites --no-parent https://example.com/
```
Options explanation:
- --mirror: Enable mirroring options
- --convert-links: Convert links for local viewing
- --adjust-extension: Add appropriate extensions to files
- --page-requisites: Download CSS, images, etc.
- --no-parent: Don't ascend to parent directory
#### Recursive Download Options
| Option | Description | Example |
|--------|-------------|---------|
| -r | Recursive download | wget -r https://example.com/ |
| -l depth | Maximum recursion depth | wget -r -l 3 https://example.com/ |
| -A pattern | Accept only matching files | wget -r -A "*.pdf" https://example.com/ |
| -R pattern | Reject matching files | wget -r -R ".gif,.jpg" https://example.com/ |
| --include-directories=list | Include only specified directories | wget -r --include-directories=docs https://example.com/ |
| --exclude-directories=list | Exclude specified directories | wget -r --exclude-directories=temp https://example.com/ |
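These filters compose; for example, a shallow crawl that collects only PDFs from a documentation tree, using only the options above plus --no-parent (the paths are placeholders):
```bash
# Only PDFs, at most two levels deep, staying inside /docs
wget -r -l 2 -A "*.pdf" --include-directories=docs --no-parent https://example.com/docs/
```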
### FTP Downloads
wget supports FTP protocol for downloading files:
#### Anonymous FTP
```bash
wget ftp://ftp.example.com/pub/file.zip
```
#### FTP with Authentication
```bash
wget --ftp-user=username --ftp-password=password ftp://ftp.example.com/private/file.zip
```
#### Recursive FTP Download
```bash
wget -r ftp://ftp.example.com/pub/directory/
```
### Proxy Support
wget can work through various proxy servers:
#### HTTP Proxy
```bash
# wget has no dedicated proxy flag; pass the wgetrc settings with -e
wget -e use_proxy=yes -e http_proxy=proxy.example.com:8080 https://example.com/file.zip
```
#### Environment Variables
Set proxy through environment variables:
```bash
export http_proxy=http://proxy.example.com:8080
export https_proxy=http://proxy.example.com:8080
export ftp_proxy=http://proxy.example.com:8080
wget https://example.com/file.zip
```
### SSL/TLS Options
| Option | Description | Example |
|--------|-------------|---------|
| --secure-protocol=protocol | Choose SSL/TLS protocol | wget --secure-protocol=TLSv1_2 https://example.com/file.zip |
| --no-check-certificate | Don't verify SSL certificates | wget --no-check-certificate https://self-signed.example.com/file.zip |
| --ca-certificate=file | Use specific CA certificate | wget --ca-certificate=ca.crt https://example.com/file.zip |
| --ca-directory=directory | CA certificates directory | wget --ca-directory=/etc/ssl/certs https://example.com/file.zip |
## Configuration Files
### Global Configuration
wget reads configuration from several locations:
1. System-wide configuration: /etc/wgetrc
2. User configuration: ~/.wgetrc
3. Environment variable: $WGETRC
### Configuration File Format
Configuration files use simple key-value pairs:
```
# ~/.wgetrc example

# Set default timeout
timeout = 30

# Set default number of retries
tries = 3

# Set default user agent
user_agent = Mozilla/5.0 (compatible; wget)

# Enable cookies by default
cookies = on

# Set default download directory
dir_prefix = /home/user/Downloads

# Limit download rate
limit_rate = 500k

# Enable continue by default
continue = on

# Be more verbose
verbose = on
```

### Common Configuration Options
| Option | Description | Example |
|--------|-------------|---------|
| timeout | Default timeout | timeout = 30 |
| tries | Default retry count | tries = 5 |
| user_agent | Default user agent | user_agent = MyBot/1.0 |
| limit_rate | Default rate limit | limit_rate = 200k |
| dir_prefix | Default download directory | dir_prefix = /downloads |
| continue | Enable resume by default | continue = on |
| recursive | Enable recursive by default | recursive = off |
## Error Handling and Troubleshooting
### Common Exit Codes
| Exit Code | Description |
|-----------|-------------|
| 0 | No problems occurred |
| 1 | Generic error code |
| 2 | Parse error |
| 3 | File I/O error |
| 4 | Network failure |
| 5 | SSL verification failure |
| 6 | Username/password authentication failure |
| 7 | Protocol errors |
| 8 | Server issued an error response |
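In scripts, these codes can drive error handling directly; a minimal sketch (the URL is a placeholder):
```bash
wget -q https://example.com/file.zip
STATUS=$?
case $STATUS in
    0) echo "Download completed" ;;
    4) echo "Network failure - consider retrying" ;;
    8) echo "Server issued an error response (e.g. 404)" ;;
    *) echo "wget failed with exit code $STATUS" ;;
esac
```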
### Common Issues and Solutions
#### SSL Certificate Problems
Problem: SSL certificate verification fails
```bash
ERROR: cannot verify example.com's certificate
```
Solutions:
```bash
# Option 1: Skip certificate verification (not recommended for production)
wget --no-check-certificate https://example.com/file.zip

# Option 2: Update the CA certificates
sudo apt update && sudo apt install ca-certificates

# Option 3: Specify the CA certificate
wget --ca-certificate=/path/to/ca.crt https://example.com/file.zip
```

#### Connection Timeouts
Problem: Downloads time out frequently
Solutions:
```bash
# Increase the timeout and retry count
wget -T 60 -t 10 https://example.com/file.zip

# Add a wait between retries
wget -T 60 -t 10 -w 5 https://example.com/file.zip
```

#### Rate Limiting
Problem: Server blocks requests due to high frequency
Solutions:
```bash
# Add a random wait between requests
wget --random-wait --wait=1 -r https://example.com/

# Limit the download rate
wget --limit-rate=100k https://example.com/file.zip
```

#### 403 Forbidden Errors
Problem: Server returns 403 Forbidden
Solutions:
```bash
# Set a browser-like user agent
wget --user-agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36" https://example.com/file.zip

# Set the referer
wget --referer="https://example.com" https://example.com/file.zip

# Add custom headers
wget --header="Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8" https://example.com/file.zip
```

### Debugging Options
| Option | Description | Example |
|--------|-------------|---------|
| --debug | Enable debug output | wget --debug https://example.com/file.zip |
| -v | Verbose output | wget -v https://example.com/file.zip |
| --server-response | Show server headers | wget --server-response https://example.com/file.zip |
| --spider | Check if file exists without downloading | wget --spider https://example.com/file.zip |
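--spider pairs well with exit-code checks to confirm a URL is reachable before committing to a large transfer; a small sketch (the URL is a placeholder):
```bash
# Probe the URL first; wget --spider exits nonzero if it is unavailable
if wget -q --spider https://example.com/file.zip; then
    wget -c https://example.com/file.zip
else
    echo "URL is not reachable" >&2
fi
```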
## Best Practices
### Security Considerations
1. Verify SSL certificates: Avoid using --no-check-certificate in production
2. Use HTTPS when available: Prefer secure connections
3. Protect credentials: Use configuration files with proper permissions for authentication
4. Validate downloads: Check file integrity using checksums when available (see the sketch below)
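A minimal sketch of checksum validation, assuming the publisher ships a SHA-256 file alongside the download (both URLs are placeholders):
```bash
# Fetch the file and its published checksum
wget https://example.com/file.zip
wget https://example.com/file.zip.sha256

# sha256sum -c exits nonzero on a mismatch
sha256sum -c file.zip.sha256 || echo "Checksum mismatch!" >&2
```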
### Performance Optimization
1. Use appropriate retry settings: Balance between reliability and speed
2. Implement rate limiting: Respect server resources and avoid being blocked
3. Resume interrupted downloads: Use -c for large files
4. Optimize recursive downloads: Use appropriate depth limits and filters
### Automation Best Practices
1. Use absolute paths: Specify full paths in scripts
2. Implement error handling: Check exit codes in scripts
3. Log activities: Use logging options for debugging and monitoring
4. Use configuration files: Centralize common settings
### Example Script
```bash
#!/bin/bash
# Download script with error handling

URL="https://example.com/largefile.zip"
OUTPUT_DIR="/downloads"
LOG_FILE="/var/log/wget.log"

# Create the output directory if it doesn't exist
mkdir -p "$OUTPUT_DIR"

# Download with conservative retry and rate settings
wget \
    --continue \
    --timeout=60 \
    --tries=5 \
    --limit-rate=500k \
    --directory-prefix="$OUTPUT_DIR" \
    --output-file="$LOG_FILE" \
    --progress=bar \
    "$URL"

# Capture the exit code before another command overwrites it
STATUS=$?
if [ $STATUS -eq 0 ]; then
    echo "Download completed successfully"
else
    echo "Download failed with exit code $STATUS"
    exit 1
fi
```

## Examples and Use Cases
### Website Mirroring
Complete website mirror for offline browsing:
```bash
wget \
--recursive \
--no-clobber \
--page-requisites \
--html-extension \
--convert-links \
--restrict-file-names=windows \
--domains example.com \
--no-parent \
https://example.com/
```
### Downloading Software Releases
Download the latest software release:
```bash
# Download with resume capability and a progress bar
wget \
    --continue \
    --progress=bar:force \
    --timeout=30 \
    --tries=5 \
    https://github.com/user/project/releases/download/v1.0.0/software.tar.gz
```

### API Data Retrieval
Download data from REST API:
```bash
# Download JSON data with authentication
wget \
    --header="Authorization: Bearer TOKEN" \
    --header="Accept: application/json" \
    --output-document=data.json \
    https://api.example.com/data
```

### Batch File Downloads
Download multiple files with pattern matching:
```bash
# Download all PDF files from a directory listing
wget \
    --recursive \
    --no-parent \
    --accept="*.pdf" \
    --level=1 \
    https://example.com/documents/
```

### FTP Site Synchronization
Synchronize local directory with FTP site:
```bash
# Mirror an FTP directory
wget \
    --mirror \
    --ftp-user=username \
    --ftp-password=password \
    --no-host-directories \
    --cut-dirs=1 \
    ftp://ftp.example.com/pub/files/
```

### Scheduled Downloads
Create a cron job for regular downloads:
```bash
# Add to crontab (crontab -e)
# Download daily at 2 AM
0 2 * * * /usr/bin/wget --quiet --output-document=/backup/daily-$(date +\%Y\%m\%d).sql https://example.com/backup.sql
```

### Large File Download with Monitoring
Download large files with detailed monitoring:
```bash
#!/bin/bash

URL="https://example.com/large-dataset.zip"
OUTPUT="/data/dataset.zip"
LOG="/var/log/dataset-download.log"

# Start the download in the background with detailed logging
wget \
    --continue \
    --timeout=300 \
    --tries=10 \
    --waitretry=30 \
    --limit-rate=1m \
    --progress=bar:force:noscroll \
    --output-file="$LOG" \
    --output-document="$OUTPUT" \
    "$URL" &
WGET_PID=$!

# Monitor progress while wget is running
while kill -0 $WGET_PID 2>/dev/null; do
    if [ -f "$OUTPUT" ]; then
        SIZE=$(du -h "$OUTPUT" | cut -f1)
        echo "Downloaded: $SIZE"
    fi
    sleep 10
done

wait $WGET_PID
EXIT_CODE=$?

if [ $EXIT_CODE -eq 0 ]; then
    echo "Download completed successfully"
    # Verify download if checksum available
    # sha256sum -c dataset.zip.sha256
else
    echo "Download failed with exit code $EXIT_CODE"
fi
```
This comprehensive guide covers the essential aspects of using wget for file downloads. The tool's flexibility and extensive options make it suitable for everything from simple file downloads to complex web scraping and site mirroring tasks. Understanding these features and best practices will help you use wget effectively in various scenarios.