Tag: distributed systems

SE-Radio Episode 235: Ben Hindman on Apache Mesos

Filed in Episodes on August 17, 2015

Ben Hindman talks to Jeff Meyerson about Apache Mesos, a distributed systems kernel. Mesos abstracts away many of the hassles of managing a distributed system. Hindman starts with a high-level explanation of Mesos and describes the problems he encountered trying to run multiple instances of Hadoop against a single data set. He then discusses how Twitter uses […]

SE-Radio Episode 233: Fangjin Yang on OLAP and the Druid Real-Time Analytical Data Store

Filed in Episodes on July 28, 2015

Fangjin Yang, creator of the Druid real-time analytical database, talks with Robert Blumen. They discuss the OLAP (online analytical processing) domain, OLAP concepts (hypercube, dimension, metric, and pivot), types of OLAP queries (roll-up, drill-down, and slicing and dicing), use cases for OLAP by organizations, the OLAP store’s position in the enterprise workflow, what “real time” […]

Episode 227: Eric Brewer: The CAP Theorem, Then and Now

Filed in Episodes on May 27, 2015

Robert Blumen talks with Eric Brewer, who formulated the CAP (consistency, availability, partition tolerance) theorem. The first part of the show focuses on Brewer’s original thesis presented at the 2000 ACM Symposium on Principles of Distributed Computing (PODC): What set of problems motivated the formulation of CAP? How was it understood at the time? What are […]

Episode 222: Nathan Marz on Real-Time Processing with Apache Storm

Filed in Episodes on March 6, 2015

Nathan Marz is the creator of Apache Storm, a real-time stream processing system. Storm does for stream processing what Hadoop does for batch processing. The project began when Nathan was working on aggregating Twitter data using a queue-and-worker system he had designed. Companies that use Storm include Spotify, Yelp, and WebMD, among many others. Jeff and Nathan […]

Episode 220: Jon Gifford on Logging and Logging Infrastructure

Filed in Episodes on February 18, 2015

Robert Blumen talks to Jon Gifford of Loggly about logging and logging infrastructure. Topics include logging defined, purposes of logging, uses of logging in understanding the run-time behavior of programs, who produces logs, who consumes logs and for what reasons, software as the consumer of logs, log formats (structured versus free form), log meta-data, logging […]
