Tag: streaming

Episode 436: Apache Samza with Yi Pan

Filed in Episodes on November 24, 2020

Yi Pan, lead maintainer of Apache Samza, discusses the internals of the Samza project as well as the stream processing ecosystem. Host Adam Conrad spoke with Pan about the three core aspects of the Samza framework, how it compares to other streaming systems like Spark and Flink, as well as advice on how to handle […]


Episode 433: Jay Kreps on ksqlDB

Filed in Episodes on November 6, 2020

Jay Kreps, CEO and co-founder of Confluent, discusses ksqlDB, a SQL engine for data in Apache Kafka. Jay talks about stream processing, Kafka, and how ksqlDB lets that data be queried with push and pull queries, similar to a relational database. Jay discusses some of the similarities and differences between SQL databases and […]


SE-Radio Episode 346: Stephan Ewen on Streaming Architecture

Filed in Episodes on November 14, 2018

Stephan Ewen, one of the original creators of Apache Flink, discusses streaming architecture. Streaming architecture has become more important because it enables real-time computation on big data. Edaena Salinas spoke with Stephan Ewen about the comparison between batch processing and stream processing. Stephan explained the architecture components and the types of applications that can be […]


SE-Radio Episode 272: Frances Perry on Apache Beam

Filed in Episodes on October 25, 2016

Jeff Meyerson talks with Frances Perry about Apache Beam, a unified batch and stream processing model. Topics include a history of batch and stream processing, from MapReduce to the Lambda Architecture to the more recent Dataflow model, originally defined in a Google paper. Dataflow overcomes the problem of event time skew by using watermarks and […]


Episode 222: Nathan Marz on Real-Time Processing with Apache Storm

Filed in Episodes on March 6, 2015

Nathan Marz is the creator of Apache Storm, a real-time stream processing system. Storm does for stream processing what Hadoop does for batch processing. The project began when Nathan was working on aggregating Twitter data using a queue-and-worker system he had designed. Companies using Storm include Spotify, Yelp, WebMD, and many others. Jeff and Nathan […]
