Tag: apache
Episode 436: Apache Samza with Yi Pan

Yi Pan, lead maintainer of Apache Samza, discusses the internals of the Samza project as well as the stream processing ecosystem. Host Adam Conrad spoke with Pan about the three core aspects of the Samza framework, how it compares to other streaming systems like Spark and Flink, as well as advice on how to handle […]
Episode 398: Apache Kudu with Adar Lieber-Dembo

Adar Lieber-Dembo from Cloudera discusses Apache Kudu, a columnar data storage system for fast analytics and fast ingestion of large datasets. Kudu takes its inspiration from systems in the Hadoop ecosystem, but it addresses many of their shortcomings. SE Radio’s Akshay Manchale spoke with Adar about the motivations behind building Kudu, features available for […]
Episode 229: Flavio Junqueira on Distributed Coordination with Apache ZooKeeper

Flavio Junqueira is a co-author of ZooKeeper: Distributed Process Coordination. Flavio and Jeff Meyerson begin by defining ZooKeeper and talking about what it is and isn’t. ZooKeeper can be thought of as a patch against certain fallacies of distributed computing: that the network is secure, has zero latency, has infinite bandwidth, and so on. With […]