Term of the Moment

graphics formats


Look Up Another Term


Definition: Hadoop


An open source big data framework from the Apache Software Foundation designed to handle huge amounts of data on clusters of servers. The storage is handled by the Hadoop Distributed File System (HDFS), and the data are sorted and summarized in parallel by Hadoop MapReduce, a version of Google's MapReduce. Required Java files are included in Hadoop Common, and Hadoop YARN provides the cluster management.

Originally written for the Nutch Web crawler for spidering the Web, in 2008, Yahoo!'s Search Webmap was the first very large implementation of Hadoop running on 10,000 Linux servers. Search Webmap ran in a third less time than Yahoo!'s previous search engine.

A Stuffed Elephant
The Hadoop name comes from a favorite stuffed elephant of the son of developer Doug Cutting. See Google File System, MapReduce and Apache Spark.