In the last two years, the Apache Hadoop software library has emerged as a
veritable Swiss Army knife of data management and analytical infrastructure.
The Hadoop toolset has been positioned as a universal platform for all types
of commercial and organizational analytical needs.

Hadoop is an ideal solution for use cases in which the data is easily
partitioned and distributed. For example, consider keyword searches, a major
component of SEO. Simply identifying and counting distinct words in text data
is a central part of the process for keyword-based searches. No matter how
many pieces of text you have, each document, article, blog post, or other
piece of content is distinct from the others.
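
The independence of each piece of text can be illustrated with a small
sketch. This is plain Python rather than Hadoop code, and the names
count_words, merge_counts, and the sample documents are hypothetical,
chosen only for illustration. Each document is counted with no knowledge
of the others, and the partial counts are combined afterwards.

```python
import re
from collections import Counter


def count_words(text):
    """Count word occurrences in a single document, in isolation."""
    # Lowercase the text and pull out runs of letters, digits, and apostrophes.
    words = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(words)


def merge_counts(per_document_counts):
    """Combine independently computed per-document counts into one total."""
    total = Counter()
    for counts in per_document_counts:
        total.update(counts)
    return total


# Hypothetical sample data: each entry stands for one distinct piece of content.
documents = {
    "post-1": "Hadoop stores data. Hadoop processes data.",
    "post-2": "Keyword search counts words.",
}

# Each call touches only its own document, so the calls could run on
# different machines without any coordination.
per_document = {doc_id: count_words(text) for doc_id, text in documents.items()}
totals = merge_counts(per_document.values())
```

Because no call to count_words depends on another, the work can be split
across as many workers as there are documents, which is the property that
makes this kind of problem easy to partition and distribute.
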

To enable keyword search, a program counts how often each word occurs in
each distinct document or item of text. Clearly, this can be done in
isolation: to count the num...