The Incident Response Lifecycle
When detection fires, incident response (IR) kicks in. A calm, practiced process is the difference between a contained event and a catastrophe.
The NIST IR Lifecycle
1. Preparation
2. Detection & Analysis
3. Containment, Eradication & Recovery
4. Post-Incident Activity (Lessons Learned)
↺ feeds back into Preparation
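The loop above can be sketched as a small cyclic enumeration. This is only an illustration: the names `Phase` and `next_phase` are hypothetical, and a real incident often moves back and forth between detection and containment rather than strictly forward.

```python
from enum import Enum


class Phase(Enum):
    """The four lifecycle phases listed above, in order."""
    PREPARATION = 1
    DETECTION_AND_ANALYSIS = 2
    CONTAINMENT_ERADICATION_RECOVERY = 3
    POST_INCIDENT_ACTIVITY = 4


def next_phase(current: Phase) -> Phase:
    """Return the phase after `current`, wrapping back to Preparation."""
    phases = list(Phase)  # Enum members iterate in definition order
    return phases[(phases.index(current) + 1) % len(phases)]
```

Calling `next_phase(Phase.POST_INCIDENT_ACTIVITY)` returns `Phase.PREPARATION`, which is the feedback arrow in the diagram.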
1. Preparation (before anything happens)
- An IR plan and playbooks for common scenarios.
- A defined IR team with roles and contacts.
- Tools, logging, and access ready in advance.
- Training and tabletop exercises — practice under calm conditions.
The work you do before an incident determines how well you handle it. You can't write the plan during the fire.
2. Detection & Analysis
- Validate the alert: is it a real incident or a false positive?
- Determine scope and severity: which systems, data, and users are affected.
- Classify and prioritize. Begin documenting a timeline immediately.
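A minimal sketch of the last two bullets, assuming a team records its timeline in code rather than in a ticketing tool. The `Incident`, `TimelineEntry`, and `classify_severity` names are hypothetical, and the severity thresholds are illustrative placeholders, not a standard.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class TimelineEntry:
    timestamp: datetime
    author: str
    note: str


@dataclass
class Incident:
    title: str
    timeline: List[TimelineEntry] = field(default_factory=list)

    def log(self, author: str, note: str,
            timestamp: Optional[datetime] = None) -> TimelineEntry:
        """Append a note; defaults to the current time in UTC."""
        entry = TimelineEntry(timestamp or datetime.now(timezone.utc), author, note)
        self.timeline.append(entry)
        return entry


def classify_severity(hosts_affected: int, sensitive_data_involved: bool) -> str:
    """Hypothetical triage rule; tune the thresholds to your environment."""
    if sensitive_data_involved or hosts_affected >= 50:
        return "high"
    if hosts_affected >= 5:
        return "medium"
    return "low"


incident = Incident("Suspicious logins on mail gateway")
incident.log("analyst-1", "Alert validated: not a false positive")
incident.log("analyst-1", f"Severity: {classify_severity(12, False)}")
```

Recording timestamps in UTC from the first entry avoids having to reconcile time zones when the timeline is reviewed later.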
3. Containment, Eradication & Recovery
Containment — stop the bleeding:
- Short-term: isolate affected hosts, block command-and-control (C2) traffic, disable compromised accounts.
- Long-term: temporary fixes to keep operating while you clean up.
Eradication — remove the threat: delete malware, close the vulnerability, reset credentials.
Recovery — restore to normal: rebuild from known-good, restore data, monitor closely for recurrence before declaring "all clear."
4. Post-Incident (Lessons Learned)
- A blameless post-mortem: what happened, what worked, what didn't.
- Concrete improvements: new detections, control gaps, plan updates.
- The goal is learning, not blame — blame drives people to hide problems.
Preserve Evidence
During containment, preserve forensic evidence. Don't power off a compromised machine, because the contents of volatile memory are lost. If investigation or legal action is likely, capture memory and disk images first, and maintain chain of custody for everything you collect.
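One common way to support chain of custody is to hash each captured image at acquisition and re-check the hash whenever the evidence changes hands. The sketch below assumes the image already exists as a file on disk. `CustodyRecord`, `record_custody`, and `verify_unchanged` are hypothetical names, and a real process would also cover physical handling and signed logs.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large disk images do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


@dataclass(frozen=True)
class CustodyRecord:
    evidence_path: str
    sha256: str
    handler: str        # who held the evidence at this step
    action: str         # e.g. "acquired", "transferred", "analyzed"
    recorded_at: datetime


def record_custody(path: Path, handler: str, action: str) -> CustodyRecord:
    """Create an immutable custody entry with the file's current hash."""
    return CustodyRecord(str(path), sha256_of_file(path), handler, action,
                         datetime.now(timezone.utc))


def verify_unchanged(path: Path, record: CustodyRecord) -> bool:
    """True if the file still matches the hash stored in the custody entry."""
    return sha256_of_file(path) == record.sha256
```

A typical use would be `record = record_custody(Path("memory.img"), "analyst-1", "acquired")` right after capture, then `verify_unchanged(...)` before each later analysis step. The file name here is a placeholder.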
Roles, Communication & Legal
- Designate an incident commander to coordinate.
- Plan internal and external communications in advance (legal, PR, customers).
- Know your regulatory notification deadlines (e.g., GDPR requires notifying the supervisory authority within 72 hours of becoming aware of a personal data breach).
- Engage legal/compliance early for reportable incidents.
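Tracking a notification deadline is simple date arithmetic, shown here as a hedged sketch. The 72-hour window models the GDPR example above; other regulations use different windows, and deciding when the organization "became aware" is a question for legal counsel, not code. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: 72 hours, matching the GDPR example; pass another window for other regimes.
NOTIFICATION_WINDOW = timedelta(hours=72)


def notification_deadline(became_aware_at: datetime,
                          window: timedelta = NOTIFICATION_WINDOW) -> datetime:
    """Return the latest time by which notification must be sent."""
    if became_aware_at.tzinfo is None:
        raise ValueError("use a timezone-aware timestamp to avoid ambiguity")
    return became_aware_at + window


def time_remaining(became_aware_at: datetime, now: datetime) -> timedelta:
    """Time left before the deadline; negative if it has already passed."""
    return notification_deadline(became_aware_at) - now


aware = datetime(2024, 3, 1, 9, 0, tzinfo=timezone.utc)
print(notification_deadline(aware))  # 2024-03-04 09:00:00+00:00
```

Requiring timezone-aware timestamps is a deliberate choice: a deadline computed from a naive local time can be off by hours when the response team spans regions.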
Key Metrics
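- MTTD: Mean Time to Detect, the average time from the start of an incident to its detection.
- MTTR: Mean Time to Respond (or Recover). The two readings measure different intervals, so state which one you report.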
Driving these down is the SOC's north star — speed limits the damage an attacker can do.
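Both metrics are averages over per-incident intervals. The sketch below assumes each incident has three recorded timestamps and measures MTTR from detection to resolution. That is one common convention, not the only one, and the `IncidentTimes` fields are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean
from typing import List


@dataclass
class IncidentTimes:
    occurred_at: datetime   # when the malicious activity began
    detected_at: datetime   # when the team became aware of it
    resolved_at: datetime   # when recovery was declared complete


def mean_time_to_detect(incidents: List[IncidentTimes]) -> timedelta:
    """Average of (detected_at - occurred_at) across incidents."""
    seconds = mean([(i.detected_at - i.occurred_at).total_seconds() for i in incidents])
    return timedelta(seconds=seconds)


def mean_time_to_recover(incidents: List[IncidentTimes]) -> timedelta:
    """Average of (resolved_at - detected_at); assumes the 'recover' reading of MTTR."""
    seconds = mean([(i.resolved_at - i.detected_at).total_seconds() for i in incidents])
    return timedelta(seconds=seconds)


history = [
    IncidentTimes(datetime(2024, 1, 1, 0), datetime(2024, 1, 1, 2), datetime(2024, 1, 1, 6)),
    IncidentTimes(datetime(2024, 2, 1, 0), datetime(2024, 2, 1, 4), datetime(2024, 2, 1, 12)),
]
print(mean_time_to_detect(history))   # 3:00:00
print(mean_time_to_recover(history))  # 6:00:00
```

A mean can hide a single very slow incident, so it may be worth reporting the median or worst case next to it.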