April 22, 2013

SE Radio 193: Apache Mahout

Recording Venue: Skype
Guest: Grant Ingersoll
Grant Ingersoll, founder of the Mahout project, talks with Robert about machine learning. The conversation begins with an introduction to machine learning and the forces driving the adoption of this technique. Grant explains the three main use cases, similarity metrics, supervised versus unsupervised learning, and the use of large data sets. He also provides a brief history of the Mahout project and the connection between Mahout and Hadoop. The remainder of the episode dives into the three main use cases: recommendations, clustering, and classification. Grant and Robert discuss each use case, illustrating it with examples and a typical algorithm. Recommendation is a technique for identifying items that a user would like to buy, use, or otherwise consume, based on the preferences of similar users. Clustering is the partitioning of a data set into a small number of sets of similar items. Classification is the assignment of new items to a small number of existing sets.
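To make the recommendation use case concrete, here is a minimal sketch of user-based recommendation in plain Python. It is not Mahout code and is not taken from the episode: the `ratings` data, the function names, and the choice of cosine similarity as the similarity metric are all assumptions made for illustration. Each unseen item is scored by the similarity-weighted average of the ratings that similar users gave it.

```python
from math import sqrt

# Hypothetical ratings (user -> {item: rating}); invented for illustration.
ratings = {
    "alice": {"A": 5.0, "B": 3.0, "C": 4.0},
    "bob":   {"A": 5.0, "B": 3.0, "D": 5.0},
    "carol": {"B": 5.0, "C": 1.0, "E": 4.0},
}


def cosine_similarity(a, b):
    """Cosine similarity between two sparse rating vectors (dicts)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0  # no items in common, so treat the users as unrelated
    dot = sum(a[item] * b[item] for item in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def recommend(user, ratings, top_n=3):
    """Rank items the user has not rated by similarity-weighted average rating."""
    own = ratings[user]
    weighted_sum = {}
    weight_total = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(own, other_ratings)
        if sim <= 0.0:
            continue  # ignore users with no measurable similarity
        for item, rating in other_ratings.items():
            if item in own:
                continue  # only suggest items the user has not rated
            weighted_sum[item] = weighted_sum.get(item, 0.0) + sim * rating
            weight_total[item] = weight_total.get(item, 0.0) + sim
    ranked = sorted(
        ((weighted_sum[i] / weight_total[i], i) for i in weighted_sum),
        reverse=True,
    )
    return [item for _, item in ranked[:top_n]]


print(recommend("alice", ratings))
```

With this toy data, alice's tastes overlap with both bob's and carol's, so the items they rated that she has not (D and E) become the candidates, ordered by predicted rating. A production recommender would work on far larger data sets, which is where the Hadoop connection discussed in the episode comes in.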

Show Notes

The Mahout Project
Mahout in Action (book)
Grant Ingersoll’s blog
Data Science Central hub
Free class on Machine Learning from Stanford University
Programming Collective Intelligence
Grant Ingersoll is a co-author of the book, Taming Text
LucidWorks

2 comments

Joe says:

May 4, 2013 at 8:39 pm

When you were talking about ways machine learning is used, I immediately thought about lastfm.com. It’s a really cool website that tracks what music you’re listening to and makes suggestions for you based on what other people who listen to that same music like.
Episode 214: Grant Ingersoll on his book, Taming Text : Software Engineering Radio says:

November 11, 2014 at 11:50 pm

[…] SE Radio episode 193 on Apache Mahout: http://www.se-radio.net/2013/04/episode-193-apache-mahout/ […]
