Chris Wensel of Cascading talks Hadoop with Sohrab Modi of Sun
Posted by Jonathan Gray in Cascading, Hadoop/HBase, MapReduce, Video on December 16th, 2008
An interesting conversation between Chris Wensel, founder of Concurrent Inc and author of Cascading, and Sohrab Modi, VP Chief Technology Office at Sun Microsystems.
Sun has definitely been paying attention to Hadoop (and increasingly HBase), so it will be interesting to see whether they can make a case for running this new distributed software model, which is built around commodity hardware, on (typically high-end) Sun hardware. Sohrab mentions increasing the disk-to-core ratio on Hadoop nodes above the 1-to-1 ratio typical in many clusters today.
This thinking seems at odds with much of the Hadoop community, whose jobs are often CPU-bound, or who feel that more nodes with fewer disks are better than fewer nodes with more disks.
They don’t speak about HBase, but from that perspective it might make sense to put more disks in each node and run fewer nodes overall, especially with the new findings from George Porter about tapping the local FS when possible. However, it still seems to me that, on the surface, Sun hardware does not necessarily fit the new distributed, commodity hardware model.
Part one of that conversation (originally posted here):
[youtube=http://www.youtube.com/watch?v=CMt-IqQlnQ8&hl=en&fs=1&rel=0&color1=0x234900&color2=0x4e9e00]
Part two:
[youtube=http://www.youtube.com/watch?v=YtkaDQOuJ4k&hl=en&fs=1&rel=0&color1=0x234900&color2=0x4e9e00]
Another Hadoop-related video of interest, by Stefan Groschupf of Scale Unlimited, visualizes the evolution of the Hadoop codebase:
[vimeo vimeo.com/2513321]