Tag: streaming

SE-Radio Episode 272: Frances Perry on Apache Beam

Filed in Episodes by on October 25, 2016 2 Comments
SE-Radio Episode 272: Frances Perry on Apache Beam

Jeff Meyerson talks with Frances Perry about Apache Beam, a unified batch and stream processing model. Topics include a history of batch and stream processing, from MapReduce to the Lambda Architecture to the more recent Dataflow model, originally defined in a Google paper. Dataflow overcomes the problem of event time skew by using watermarks and […]

Continue Reading »

Episode 222: Nathan Marz on Real-Time Processing with Apache Storm

Filed in Episodes by on March 6, 2015 3 Comments
Episode 222: Nathan Marz on Real-Time Processing with Apache Storm

Nathan Marz is the creator of Apache Storm, a real-time streaming application. Storm does for stream processing what Hadoop does for batch processing. The project began when Nathan was working on aggregating Twitter data using a queue-and-worker system he had designed. Many companies use Storm, including Spotify, Yelp, WebMD, and many others. Jeff and Nathan […]

Continue Reading »

Episode 219: Apache Kafka with Jun Rao

Filed in Episodes by on February 10, 2015 6 Comments
Episode 219: Apache Kafka with Jun Rao

Jeff Meyerson talks to Jun Rao, a software engineer and researcher (formerly of LinkedIn). Jun has spent much of his time researching MapReduce, scalable databases, query processing, and other facets of the data warehouse. For the past three years, he has been a committer to the Apache Kafka project. Jeff and Jun first compare streaming […]

Continue Reading »