Tag: monitoring

SE-Radio Episode 301: Jason Hand on Handling Outages

Filed in Episodes on August 29, 2017

Bryan Reinero talks with Jason Hand about handling outages and responding to failures. The episode explores basic problem-solving strategies and diagnostic techniques, organizing teams to address incidents efficiently, communicating with stakeholders, learning from incidents, and managing stress. Related links: Episode 284 – John Allspaw on System Failures: Preventing, Responding, and Learning From; Episode 225 […]

SE-Radio Episode 277: Gil Tene on Tail Latency

Filed in Episodes on December 14, 2016

Gil Tene joins Robert Blumen for a discussion of tail latency. What is latency? What is “tail latency”? Why are the upper percentiles of latency more relevant to humans? How is human interaction with an application influenced by tail latency? What are the economics of tail latency? What are the origins of tail latency within […]

SE-Radio Episode 276: Björn Rabenstein on Site Reliability Engineering

Filed in Episodes on December 6, 2016

Björn Rabenstein discusses the field of Site Reliability Engineering (SRE) with host Robert Blumen. The term SRE has recently emerged to mean Google’s approach to DevOps. The publication of Google’s book on SRE has brought many of its practices into more public discussion. The interview covers: what is distinct about SRE versus DevOps; the SRE […]

SE-Radio Episode 270: Brian Brazil on Prometheus Monitoring

Filed in Episodes on October 4, 2016

Jeff Meyerson talks with Brian Brazil about monitoring with Prometheus, an open-source tool for monitoring distributed applications. Brian is the founder of Robust Perception, a company offering Prometheus engineering and consulting. The high-level goal of Prometheus is to allow developers to focus on services rather than individual instances of a given service. Prometheus […]
