Sanjay Radia, Yahoo Inc.

Hadoop: Scalable Storage and Computing

The Hadoop system provides a distributed file system and a framework for processing very large amounts of data using the MapReduce paradigm. The system scales horizontally in compute cycles, IO bandwidth, and storage capacity. Hadoop is in daily use on large clusters of several thousand machines at Yahoo and on smaller clusters at many other organizations. Hadoop is available under an Apache open source license. This talk gives a brief overview of the Hadoop system and focuses on a couple of areas of new development that are ongoing at Yahoo.

Speaker Bio

Sanjay leads the Hadoop Distributed File System project at Yahoo, where it is in daily use on large clusters of several thousand machines. Previously he held senior positions at Cassatt, Sun Microsystems, and INRIA, where he developed systems software for distributed systems and grid/utility computing infrastructures. He has published numerous papers and holds several patents. Sanjay has a PhD in Computer Science from the University of Waterloo, Canada.