SE-Radio Episode 301: Jason Hand on Handling Outages
Bryan Reinero talks with Jason Hand about handling outages and responding to failures. The episode explores basic problem-solving strategies and diagnostic techniques, organizing teams to address incidents efficiently, communicating with stakeholders, learning from incidents, and managing stress.
Related Links
- Episode 284 – John Allspaw on System Failures: Preventing, Responding, and Learning From
- Episode 225 – Brendan Gregg on Systems Performance
Tags: Alerting, devops, Diagnostics, Incident Response, monitoring
What about that book link you talked about?