What are the key features and benefits of Hadoop? Why is Hadoop such a successful platform?
Apache Hadoop, usually called simply Hadoop, is a software
framework and platform for reading, processing, storing and analyzing very
large amounts of data. Several features of Hadoop make it a very
powerful solution for data analytics.
Hadoop is Distributed
With Hadoop, from a few to hundreds or thousands of
commodity servers (called nodes) can be connected (forming a cluster) to work
together to achieve whatever processing power and storage capability is needed.
The software platform enables the nodes to work together, passing work and data
between them. Data and processing are distributed across nodes, which spreads the
load and significantly reduces the impact of failure.
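To make the idea of spreading data and work across nodes concrete, here is a minimal sketch in plain Python. It is a toy model, not Hadoop code: the function names (split_across_nodes, count_words_on_node, merge_counts), the round-robin placement and the sample records are all assumptions made for illustration. Each "node" counts words in its own share of the data, and the partial results are then combined.

```python
from collections import Counter


def split_across_nodes(records, num_nodes):
    """Assign records to nodes round-robin (a stand-in for real data placement)."""
    shards = [[] for _ in range(num_nodes)]
    for i, record in enumerate(records):
        shards[i % num_nodes].append(record)
    return shards


def count_words_on_node(shard):
    """Work done locally on one node: count the words in its own shard."""
    counts = Counter()
    for line in shard:
        counts.update(line.split())
    return counts


def merge_counts(partials):
    """Combine the per-node partial counts into one overall result."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


# Hypothetical sample data, spread over three simulated nodes.
records = ["big data big cluster", "data node", "cluster node node"]
shards = split_across_nodes(records, num_nodes=3)
partials = [count_words_on_node(shard) for shard in shards]
totals = merge_counts(partials)
```

In this sketch no single node ever sees all of the data, yet the merged totals match what one machine would compute on its own. That property is what lets the load be shared.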
Hadoop is Scalable
In the past, to achieve extremely powerful computing, a company would have to buy very expensive, large, monolithic computers. As data growth exploded, eventually even those supercomputers would become insufficient. With Hadoop, a cluster can start with a few commodity servers and grow to hundreds or thousands of them, with nodes added relatively easily as more processing power and storage capability are needed. This allows a company or project to start out small and then grow as needed inexpensively, with far less concern about hitting a hard capacity limit.
Hadoop is Fault Tolerant
Hadoop was designed and built around the fact that there
will be frequent failures on the commodity hardware servers that make up the
Hadoop cluster. When a failure occurs, the software handles the automatic reassignment
of work and replication of data to other nodes in the cluster, and the system
continues to function properly without manual intervention. When a node
recovers, from a reboot for example, it will rejoin the cluster automatically
and become available for work.
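The recovery behavior described above can be sketched with a small simulation. This is a toy model under stated assumptions, not Hadoop's actual placement or recovery logic: the function names (place_blocks, handle_node_failure), the node and block names, the round-robin placement and the replication factor of 3 are all chosen for illustration. Each block of data is kept on several nodes, and when a node fails, any block that has lost a copy is copied to another live node.

```python
def place_blocks(block_ids, nodes, replicas=3):
    """Place each block on `replicas` distinct nodes, round-robin.

    Assumes replicas <= len(nodes) so the chosen nodes are distinct.
    """
    placement = {}
    for i, block in enumerate(block_ids):
        placement[block] = {nodes[(i + r) % len(nodes)] for r in range(replicas)}
    return placement


def handle_node_failure(placement, failed, live_nodes, replicas=3):
    """Drop the failed node, then top up under-replicated blocks on live nodes."""
    for holders in placement.values():
        holders.discard(failed)
        candidates = [node for node in live_nodes if node not in holders]
        while len(holders) < replicas and candidates:
            holders.add(candidates.pop(0))
    return placement


# Hypothetical four-node cluster holding three blocks.
nodes = ["node1", "node2", "node3", "node4"]
placement = place_blocks(["blk_a", "blk_b", "blk_c"], nodes)

# Simulate node2 failing: its copies are lost and re-created elsewhere.
live = [node for node in nodes if node != "node2"]
placement = handle_node_failure(placement, "node2", live)
```

After the simulated failure, every block is back to three copies and none of them lives on the failed node, so work that needs that data can continue on the remaining nodes.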
Hadoop is backed by the power of Open Source
Hadoop is open source software, which means that it can be downloaded, installed, used and even modified for free. It is managed by the renowned non-profit group, the Apache Software Foundation (ASF), hence the name Apache Hadoop. The group is made up of many brilliant people from all over the world, many of whom work at some of the top technology companies, who commit their time to managing the software. In addition, there are also many developers who contribute code to enhance Hadoop, add new features and functionality to it, or build new tools that work with it. The various tools that have been built over the years to complement core Hadoop make up what is called the Hadoop ecosystem. With a large community of people from all over the world continuously adding to the growth of the Hadoop ecosystem in a well-managed way, it is likely to keep getting better and become useful for many more use cases.
These are the reasons Hadoop has become such a force within the data world. Although there is some hype around the big data phenomenon, the benefits and solutions based on the Hadoop ecosystem are real.
You can learn more at https://hadoop.apache.org