Episode 222: Nathan Marz on Real-Time Processing with Apache Storm

Filed in Episodes on March 6, 2015
Nathan Marz

Nathan Marz is the creator of Apache Storm, a distributed system for real-time stream processing. Storm does for stream processing what Hadoop does for batch processing. The project began when Nathan was working on aggregating Twitter data using a queue-and-worker system he had designed. Storm is used by many companies, including Spotify, Yelp, and WebMD. Jeff and Nathan talk about the basic abstractions of Storm: spouts (sources of streams), bolts (which process input streams and produce new output streams), and topologies (networks of spouts and bolts). These simplifying core concepts are analogous to map and reduce in Hadoop, and Nathan attributes Storm’s success to their simplicity. After exploring the basics of Storm, Jeff and Nathan talk about the fundamentals of the Lambda Architecture, which can be formed by combining Storm with a batch tool such as Hadoop. The conversation continues with examples, common failure cases, and the guarantees Storm provides.
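
To make the three abstractions concrete, here is a minimal single-process sketch in plain Python. It is not Storm's API: the class names (`SentenceSpout`, `SplitBolt`, `CountBolt`) and the `run_topology` helper are hypothetical, invented for illustration, and a real Storm topology runs its spouts and bolts in parallel across a cluster rather than in a simple loop. The example is a word count, with a spout emitting sentences and two bolts chained behind it.

```python
from collections import Counter, deque


class SentenceSpout:
    """Hypothetical spout: the source of a stream, emitting one sentence per call."""

    def __init__(self, sentences):
        self._pending = deque(sentences)

    def next_tuple(self):
        # Return the next sentence, or None once the source is exhausted.
        return self._pending.popleft() if self._pending else None


class SplitBolt:
    """Hypothetical bolt: consumes sentences and emits a stream of words."""

    def process(self, sentence):
        return sentence.lower().split()


class CountBolt:
    """Hypothetical bolt: consumes words and emits (word, running count) pairs."""

    def __init__(self):
        self.counts = Counter()

    def process(self, word):
        self.counts[word] += 1
        return [(word, self.counts[word])]


def run_topology(spout, bolts):
    """Toy 'topology': wire the spout into a linear chain of bolts and drain it."""
    outputs = []
    while (item := spout.next_tuple()) is not None:
        stream = [item]
        for bolt in bolts:
            # Each bolt turns every input tuple into zero or more output tuples.
            stream = [out for tup in stream for out in bolt.process(tup)]
        outputs.extend(stream)
    return outputs


counter = CountBolt()
results = run_topology(
    SentenceSpout(["the cow jumped", "over the moon"]),
    [SplitBolt(), counter],
)
print(results[-1])        # last emitted (word, count) pair
print(counter.counts["the"])
```

The point of the sketch is only the shape: a source that emits tuples, processing steps that each transform one stream into another, and a wiring step that connects them. Distribution, parallelism, and the processing guarantees discussed in the episode are deliberately left out.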

Venue: Internet
