Tag: big data

SE-Radio Episode 346: Stephan Ewen on Streaming Architecture

Filed in Episodes on November 14, 2018

Stephan Ewen, one of the original creators of Apache Flink, discusses streaming architecture. Streaming architecture has become more important because it enables real-time computation on big data. Edaena Salinas spoke with Stephan Ewen about how batch processing compares with stream processing. Stephan explained the architecture components and the types of applications that can be […]


SE-Radio Episode 260: Haoyuan Li on Alluxio

Filed in Episodes on June 14, 2016

Jeff Meyerson talks to Haoyuan Li about Alluxio, a memory-centric distributed storage system. The costs of memory and disk capacity are both decreasing every year, but only the throughput of memory is increasing exponentially. This trend is creating opportunities in big data processing. Alluxio is an open source, memory-centric, distributed, and reliable storage […]


SE-Radio Episode 235: Ben Hindman on Apache Mesos

Filed in Episodes on August 17, 2015

Ben Hindman talks to Jeff Meyerson about Apache Mesos, a distributed systems kernel. Mesos abstracts away many of the hassles of managing a distributed system. Hindman starts with a high-level explanation of Mesos and describes the problems he encountered trying to run multiple instances of Hadoop against a single data set. He then discusses how Twitter uses […]


Episode 219: Apache Kafka with Jun Rao

Filed in Episodes on February 10, 2015

Jeff Meyerson talks to Jun Rao, a software engineer and researcher (formerly of LinkedIn). Jun has spent much of his time researching MapReduce, scalable databases, query processing, and other facets of the data warehouse. For the past three years, he has been a committer to the Apache Kafka project. Jeff and Jun first compare streaming […]


Episode 193: Apache Mahout

Filed in Episodes on April 22, 2013

Recording Venue: Skype
Guest: Grant Ingersoll

Grant Ingersoll, founder of the Mahout project, talks with Robert about machine learning. The conversation begins with an introduction to machine learning and the forces driving the adoption of this technique. Grant explains the three main use cases, similarity metrics, supervised versus unsupervised learning, and the use of large data […]
