Nathan Marz is the creator of Apache Storm, a distributed system for real-time stream processing. Storm does for stream processing what Hadoop does for batch processing. The project began when Nathan was aggregating Twitter data with a queue-and-worker system he had designed. Companies that use Storm include Spotify, Yelp, and WebMD. Jeff and Nathan talk about the basic abstractions of Storm: spouts (sources of streams), bolts (which process input streams and produce new output streams), and topologies (networks of spouts and bolts). These core concepts play a role analogous to map and reduce in Hadoop, and Nathan attributes Storm’s success to their simplicity. After exploring the basics of Storm, Jeff and Nathan turn to the fundamentals of the Lambda architecture, which can be formed by pairing Storm with a batch tool such as Hadoop. The conversation continues with examples, common failure cases, and the guarantees Storm provides.
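For listeners who want a concrete picture of spouts, bolts, and topologies before the episode, here is a minimal sketch in plain Python. It is not Storm's API: the function names (`sentence_spout`, `split_bolt`, `count_bolt`, `run_topology`) are hypothetical, the "topology" is a simple linear chain rather than a general network, and everything runs in a single process with none of Storm's distribution or fault tolerance.

```python
from collections import Counter
from typing import Callable, Iterable, Iterator, List, Tuple

# All names below are hypothetical and for illustration only; this is not Storm's API.

def sentence_spout() -> Iterator[str]:
    """Spout: the source of a stream (here, a fixed list of sentences)."""
    yield from ["storm processes streams", "hadoop processes batches"]

def split_bolt(stream: Iterable[str]) -> Iterator[str]:
    """Bolt: consumes a stream of sentences and emits a stream of words."""
    for sentence in stream:
        yield from sentence.split()

def count_bolt(stream: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Bolt: consumes words and emits (word, running count) pairs."""
    counts: Counter = Counter()
    for word in stream:
        counts[word] += 1
        yield word, counts[word]

def run_topology(spout: Callable[[], Iterable], bolts: List[Callable]) -> list:
    """Topology: wires a spout to a chain of bolts and drains the result."""
    stream = spout()
    for bolt in bolts:
        stream = bolt(stream)
    return list(stream)

results = run_topology(sentence_spout, [split_bolt, count_bolt])
for word, count in results:
    print(word, count)
```

Each bolt only knows about its input stream and its output stream, which is the kind of simplicity the episode credits for Storm's adoption. A real topology can fan out and merge streams across many machines, which this sketch does not attempt to model.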
- Apache Storm project: https://storm.apache.org
- Storm on Twitter: https://twitter.com/stormprocessor
- Nathan Marz’s homepage: http://nathanmarz.com
- Nathan Marz’s Twitter: https://twitter.com/nathanmarz
- Manning book on Storm: http://manning.com/marz
- Nathan Marz presents Storm on YouTube: https://www.youtube.com/watch?v=bdps8tE0gYo
Tags: apache, Apache Storm, consumers, distributed systems, fault tolerance, hadoop, hash functions, Lambda architecture, MapReduce, message broker, Nathan Marz, producers, real-time processing, scalability, Storm, streaming