When a security incident hits, the difference between a minor disruption and a catastrophic breach often comes down to one thing: preparation. Large enterprises have dedicated Security Operations Centers (SOCs) with dozens of analysts. But what about a small IT team of 2-5 people managing an entire infrastructure? You need an incident response plan that is practical, actionable, and realistic for your team size.
This guide provides a complete incident response framework designed specifically for small IT teams, with templates, checklists, and procedures you can implement today.
Why Small Teams Need an IR Plan
Small teams are not immune to security incidents. Without dedicated security staff they are often more exposed: incidents go undetected longer, responses are ad hoc and inconsistent, and critical steps get missed in the chaos of an active incident.
An incident response plan provides:
- Clear procedures when stress is high and thinking is difficult
- Role assignments so everyone knows their responsibilities
- Communication templates to avoid saying the wrong thing to stakeholders
- Documentation requirements for legal and regulatory compliance
- Recovery checklists to ensure nothing is missed during restoration
Phase 1: Preparation
The most important phase happens before any incident occurs.
Build Your IR Team
Even with a small team, assign clear roles:
- Incident Commander (IC) — Makes decisions, coordinates response, communicates with management. Usually the senior sysadmin or IT manager.
- Technical Lead — Performs the technical investigation and remediation. Your strongest technical person.
- Communications Lead — Handles stakeholder notifications, documentation, and external communication. Can be the IC in small teams.
Essential Documentation to Prepare
- Asset inventory — Know every server, network device, and critical application in your environment
- Network diagram — Current and accurate, including all VLANs, subnets, and external connections
- Contact list — Team members, management, legal counsel, insurance provider, and external IR services (have a contract in place before you need it)
- Baseline documentation — What does "normal" look like? Document typical network traffic, running processes, and scheduled tasks
- Backup verification — Regularly test that backups are complete and restorable
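A baseline is only useful if you can compare against it quickly. The sketch below assumes you save each snapshot as a plain text file with one entry per line (listening ports, process names, cron entries); the function name `new_since_baseline` and the file names are illustrative, not part of any standard tool.

```shell
# Sketch: print entries that appear in the current snapshot but not in the baseline.
# $1 = baseline file, $2 = current file (one entry per line in each).
new_since_baseline() {
    # -F fixed strings, -x whole-line match, -v invert, -f read patterns from the baseline
    grep -Fxv -f "$1" "$2" | sort -u
}

# Example (not run here): record listening sockets once, compare again during triage.
#   ss -Htln | awk '{print $4}' > baseline_ports.txt
#   ss -Htln | awk '{print $4}' > current_ports.txt
#   new_since_baseline baseline_ports.txt current_ports.txt
```

Anything the function prints is new since the baseline was taken and deserves a closer look. It does not report entries that have disappeared.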
Tools to Have Ready
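# Incident Response Toolkit (Debian/Ubuntu package names; keep an offline copy on a dedicated USB drive)
# Forensic tools
sudo apt install sleuthkit autopsy -y
# Volatility 3 is published on PyPI and may not be packaged for apt on your release;
# use a virtual environment or pipx if your release blocks system-wide pip installs
python3 -m pip install volatility3
# Network analysis (the Wireshark CLI is packaged as tshark on Debian/Ubuntu)
sudo apt install tcpdump tshark nmap -y
# Log analysis
sudo apt install logwatch goaccess jq -y
# System analysis
sudo apt install sysstat lsof strace -y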
Critical: Keep a copy of your IR toolkit on offline media (USB drive) that is tested regularly. During an active incident, you may not be able to download tools or access package repositories.
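One way to make "tested regularly" concrete is a checksum manifest for the toolkit drive. This is a minimal sketch, assuming GNU coreutils `sha256sum` is available; the function names and the `MANIFEST.sha256` file name are illustrative choices.

```shell
# Sketch: record a SHA-256 checksum for every file under the toolkit directory ($1).
make_manifest() {
    ( cd "$1" && find . -type f ! -name MANIFEST.sha256 -exec sha256sum {} + > MANIFEST.sha256 )
}

# Sketch: re-check the files against the manifest.
# A non-zero exit status means at least one file is missing or has changed.
verify_manifest() {
    ( cd "$1" && sha256sum -c MANIFEST.sha256 )
}

# Example (not run here), with /mnt/usb/ir-toolkit as an assumed mount path:
#   make_manifest /mnt/usb/ir-toolkit      # after building or updating the toolkit
#   verify_manifest /mnt/usb/ir-toolkit    # during each scheduled toolkit check
```

Keeping a copy of the manifest somewhere other than the drive itself makes it harder for a tampered drive to vouch for itself.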
Phase 2: Detection and Analysis
Common Indicators of Compromise (IoCs)
- Unexpected outbound network connections, especially to known-bad IP ranges
- Unusual process activity — unfamiliar processes, processes running at odd times
- Failed login spikes from a single IP or across multiple accounts
- File integrity changes — modified system binaries, new files in sensitive directories
- DNS anomalies — requests to unusual domains, high DNS query volumes
- Unexpected scheduled tasks or cron jobs
- Disabled security tools (antivirus, logging, firewall rules)
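As an illustration of the failed-login indicator above, the sketch below counts failed SSH password attempts per source IP. It assumes OpenSSH's usual "Failed password for ... from IP port N" log lines; the log path varies by distribution (commonly /var/log/auth.log on Debian/Ubuntu and /var/log/secure on RHEL-family systems), and the function name is illustrative.

```shell
# Sketch: count failed SSH password attempts per source IP in the log file given as $1.
failed_logins_by_ip() {
    # Scan each matching line from the end for "from"; the next field is the source IP
    awk '/Failed password/ {
        for (i = NF - 1; i >= 1; i--) if ($i == "from") { print $(i + 1); break }
    }' "$1" | sort | uniq -c | sort -rn
}

# Example (not run here):
#   sudo sh -c '. ./triage.sh; failed_logins_by_ip /var/log/auth.log' | head -10
```

A single IP with a large count suggests brute forcing. Many IPs with small counts against the same accounts can point to a distributed attempt, which this per-IP view will not highlight on its own.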
Initial Triage Checklist
When a potential incident is detected, work through this checklist:
- Confirm the incident — Is this a real security event or a false positive?
- Determine scope — How many systems are affected?
- Assess severity — What data or services are at risk?
- Activate the IR plan — Notify the IC and begin formal response procedures
- Start the incident log — Document everything from this point forward with timestamps
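The incident log from the last checklist item can be as simple as an append-only text file. Here is a minimal sketch, assuming a tab-separated format of UTC timestamp, recorder, and note; the `INCIDENT_LOG` variable and `log_event` function are hypothetical names.

```shell
# Sketch: append timestamped entries to an incident log.
# INCIDENT_LOG is an assumed variable; point it at storage outside the affected systems.
INCIDENT_LOG="${INCIDENT_LOG:-./incident.log}"

log_event() {
    # ISO 8601 UTC timestamp, the recording user, then the free-text note
    printf '%s\t%s\t%s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "${USER:-unknown}" "$*" >> "$INCIDENT_LOG"
}

# Example (not run here):
#   log_event "Alert confirmed: outbound traffic from web01 to unknown host"
#   log_event "IC notified, IR plan activated"
```

Using UTC avoids confusion when team members or servers sit in different time zones, and it makes the post-incident timeline easier to assemble.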
Quick Triage Commands
# Check for unusual network connections
ss -tlnp # Listening ports
ss -tnp # Active connections
netstat -an | grep ESTABLISHED | awk '{print $5}' | sort | uniq -c | sort -rn
# Check for suspicious processes
ps auxf # Process tree
ps aux --sort=-%cpu | head -20
ls -la /proc/*/exe 2>/dev/null | grep deleted
# Check recent logins
last -20
lastb -20 # Failed logins
who -a
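# Check for unauthorized cron jobs (run as root to read other users' crontabs)
for user in $(cut -f1 -d: /etc/passwd); do
    entries=$(crontab -l -u "$user" 2>/dev/null | grep -Ev '^[[:space:]]*(#|$)')
    # Print the user header first, then that user's active entries
    [ -n "$entries" ] && printf -- '--- %s ---\n%s\n' "$user" "$entries"
done
# System-wide cron locations are not covered by crontab -l
ls -la /etc/crontab /etc/cron.d/ 2>/dev/null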
# Check for recently modified files in sensitive directories
find /etc /usr/bin /usr/sbin -mtime -1 -type f 2>/dev/null
find /tmp /var/tmp -type f -executable 2>/dev/null
Phase 3: Containment
The goal of containment is to stop the incident from spreading while preserving evidence for investigation.
Short-Term Containment
- Isolate affected systems: disconnect them from the network but do NOT power them off, because powering off destroys volatile evidence such as memory contents and active connections
- Block malicious IPs at the firewall level
- Disable compromised accounts immediately
- Preserve evidence: take memory dumps and disk images before making any changes to the affected systems
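# Isolate a server by blocking all traffic except SSH from your IP
# Rule positions are explicit so the ACCEPT rules sit above the DROP rules;
# inserting DROP at the top afterwards would cut off your own session
sudo iptables -I INPUT 1 -p tcp -s YOUR_ADMIN_IP --dport 22 -j ACCEPT
sudo iptables -I INPUT 2 -j DROP
sudo iptables -I OUTPUT 1 -p tcp -d YOUR_ADMIN_IP --sport 22 -j ACCEPT
sudo iptables -I OUTPUT 2 -j DROP
# These rules cover IPv4 only; apply the equivalent with ip6tables if IPv6 is enabled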
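# Capture memory for forensic analysis
# Reading /dev/mem with dd is blocked or truncated on most modern kernels (CONFIG_STRICT_DEVMEM),
# so use a dedicated acquisition tool such as AVML or LiME from your offline toolkit (path below is assumed)
sudo /mnt/usb/avml /mnt/usb/memory_dump_$(date +%Y%m%d_%H%M%S).lime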
# Create a disk image
sudo dd if=/dev/sda of=/mnt/usb/disk_image.raw bs=64K status=progress
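The "disable compromised accounts" step above can be sketched in the same style. This is one possible approach for local Linux accounts, not a complete procedure: it locks the password, expires the account, and ends its running processes. The `DRY_RUN` switch and both function names are illustrative, and accounts managed in a directory service (LDAP, Active Directory) need to be disabled there instead.

```shell
# Sketch: lock a local account and end its sessions.
# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 and run as root to execute.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

lock_account() {
    user="$1"
    # --lock disables the password; --expiredate 1 expires the account itself
    run usermod --lock --expiredate 1 "$user"
    # End the account's existing sessions and processes
    run pkill -KILL -u "$user"
}

# Example (not run here): review the commands first, then execute for real.
#   lock_account alice
#   DRY_RUN=0; lock_account alice
```

Locking the password alone does not stop logins with an SSH key that is already installed, which is why the sketch also expires the account. Capture evidence of the account's running processes before killing them if you still need it.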
Long-Term Containment
- Apply temporary fixes (patches, configuration changes) to prevent re-exploitation
- Increase monitoring on affected and adjacent systems
- Implement additional access controls
Phase 4: Eradication
Once contained, remove the threat completely:
- Identify the root cause — How did the attacker get in? Unpatched vulnerability? Stolen credentials? Social engineering?
- Remove malware and backdoors — Check for webshells, modified binaries, unauthorized SSH keys, and rogue user accounts
- Patch vulnerabilities — Apply all relevant security updates
- Reset credentials — Change all passwords and API keys that may have been exposed
- Review and harden — Implement additional security controls to prevent recurrence
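Checking for unauthorized SSH keys is easy to miss under pressure, so a small helper may be worth keeping in the toolkit. The sketch below assumes the default `~/.ssh/authorized_keys` location; if your sshd_config sets a different `AuthorizedKeysFile`, those paths need checking too. The function name is illustrative.

```shell
# Sketch: list every key line in each account's authorized_keys file.
# $1 = passwd-format file (defaults to /etc/passwd); run as root to read all home directories.
list_authorized_keys() {
    passwd_file="${1:-/etc/passwd}"
    # Field 1 is the account name, field 6 is the home directory
    while IFS=: read -r user x x x x home x; do
        keyfile="$home/.ssh/authorized_keys"
        [ -f "$keyfile" ] || continue
        # Skip blank lines and comments; prefix each key with the owning account
        awk -v u="$user" '!/^[[:space:]]*(#|$)/ { print u ": " $0 }' "$keyfile"
    done < "$passwd_file"
}

# Example (not run here):
#   sudo sh -c '. ./triage.sh; list_authorized_keys' > keys_found.txt
```

Compare the output against the keys you expect for each account, and treat any key you cannot attribute to a known person or service as suspect.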
Phase 5: Recovery
Bring systems back online carefully and methodically:
- Restore from known-good backups taken before the earliest known point of compromise, and verify them before use
- Rebuild compromised systems from scratch when possible
- Implement enhanced monitoring before reconnecting to the network
- Gradually restore services, starting with the most critical
- Verify system integrity after restoration
- Monitor closely for 30-90 days for signs of persistent access
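Choosing which backup to restore depends on when the compromise began. As a rough sketch, assuming backups are individual files whose modification time reflects when they were taken, and that GNU find is available, the helper below picks the newest backup from before a cutoff time. The function name, directory, and dates are hypothetical.

```shell
# Sketch: print the newest file in a backup directory that was last modified
# at or before the cutoff. $1 = backup directory, $2 = cutoff such as "2024-03-10 00:00:00".
latest_backup_before() {
    # ! -newermt keeps files not modified after the cutoff; %T@ is the mtime in epoch seconds
    find "$1" -type f ! -newermt "$2" -printf '%T@ %p\n' |
        sort -n | tail -n 1 | cut -d' ' -f2-
}

# Example (not run here), with an assumed backup path and compromise date:
#   latest_backup_before /srv/backups "2024-03-10 00:00:00"
```

File timestamps can be altered by an attacker who reached the backup storage, so treat the result as a starting point and confirm it against backup job logs or checksums recorded elsewhere.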
Phase 6: Lessons Learned
Within 1-2 weeks of incident resolution, conduct a post-incident review:
- What happened and when? (Complete timeline)
- How was the incident detected?
- What worked well in the response?
- What could be improved?
- What preventive measures should be implemented?
- Does the IR plan need updating?
Document findings and update your IR plan accordingly. Share non-sensitive lessons with the broader team to improve organizational security awareness.