Tag: big data
Jeff Meyerson talks to Haoyuan Li about Alluxio, a memory-centric distributed storage system. The costs of memory and of disk capacity are both decreasing every year, but only the throughput of memory is increasing exponentially. This trend is creating opportunities in big data processing. Alluxio is an open source, memory-centric, distributed, and reliable storage […]
Jeff Meyerson talks to Jun Rao, a software engineer and researcher (formerly of LinkedIn). Jun has spent much of his time researching MapReduce, scalable databases, query processing, and other facets of the data warehouse. For the past three years, he has been a committer to the Apache Kafka project. Jeff and Jun first compare streaming […]
Recording Venue: Skype. Guest: Grant Ingersoll. Grant Ingersoll, founder of the Mahout project, talks with Robert about machine learning. The conversation begins with an introduction to machine learning and the forces driving the adoption of this technique. Grant explains the three main use cases, similarity metrics, supervised versus unsupervised learning, and the use of large data […]
Dave explains why reading source code is at least as important a skill as writing source code. He shares approaches for how to get to grips with unknown and undocumented source code even if it is non-trivial in size. He finishes with advice for how to get started reading code.