Difference Between Hadoop and HBase
Last Updated: 24 Sep, 2021
Hadoop: Hadoop is an open-source framework from Apache used to store and process large datasets distributed across a cluster of servers. Its four main components are the Hadoop Distributed File System (HDFS), YARN, MapReduce, and Hadoop Common (the shared libraries). It handles not only large volumes of data but also a mixture of structured, semi-structured, and unstructured information. Amazon, IBM, Microsoft, Cloudera, ScienceSoft, Pivotal, and Hortonworks are some of the companies using Hadoop technology.
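To make the MapReduce component concrete, here is a minimal sketch of the map, shuffle, and reduce steps as a word count in plain Python. It runs in a single process and does not use Hadoop itself; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are illustrative assumptions, not part of any Hadoop API.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for each word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence over a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["hadoop stores data", "hbase stores rows"]))
```

In a real cluster the map and reduce functions would run in parallel on many nodes, each working on the part of the data stored locally; the sketch only shows the shape of the computation.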
HBase: HBase is an open-source database from Apache that runs on top of a Hadoop cluster. It is a non-relational (NoSQL) database management system. Its three important components are the HMaster, the RegionServers, and ZooKeeper. Capital One, JPMorgan Chase, Apple, MTB, AT&T, and Lockheed Martin are some of the companies using HBase.
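As a rough illustration of how HBase organizes data, the sketch below models a table as row key, then column (family and qualifier), then timestamped versions of a value. It is a toy in-memory model written for this article, not the HBase client API; the class `ToyTable` and its `put` and `get` methods are hypothetical names.

```python
from collections import defaultdict


class ToyTable:
    """In-memory stand-in for an HBase-style table (illustrative only)."""

    def __init__(self, families):
        # Column families are fixed when the table is created.
        self.families = set(families)
        # row key -> "family:qualifier" -> {timestamp: value}
        self.rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, family, qualifier, value, ts):
        """Store one version of a cell; qualifiers can be added freely."""
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        self.rows[row][f"{family}:{qualifier}"][ts] = value

    def get(self, row, family, qualifier):
        """Return the newest version of a cell, or None if it is absent."""
        versions = self.rows.get(row, {}).get(f"{family}:{qualifier}", {})
        if not versions:
            return None
        return versions[max(versions)]


if __name__ == "__main__":
    table = ToyTable(["info"])
    table.put("user1", "info", "email", "a@example.com", ts=1)
    table.put("user1", "info", "email", "b@example.com", ts=2)
    print(table.get("user1", "info", "email"))
```

Writing the same cell twice does not overwrite the old value; it adds a newer version, and a read returns the latest one by default. That is the sense in which data can be read and written many times.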
Below is a table of differences between Hadoop and HBase:
S.No. | Hadoop | HBase
1 | Hadoop is a collection of software tools | HBase is a part of the Hadoop ecosystem
2 | Stores data sets as files in a distributed environment | Stores data in a column-oriented manner
3 | Hadoop is a framework | HBase is a NoSQL database
4 | Data are stored in the form of chunks (blocks) | Data are stored in the form of key/value pairs
5 | Data in HDFS cannot be changed in place at run time | HBase allows data to be inserted, updated, and deleted at run time
6 | A file can be written only once but read many times | Data can be read and written multiple times
7 | Hadoop has high-latency operations, suited to batch processing | HBase has low-latency operations, suited to random reads and writes
8 | Data in HDFS can be processed through MapReduce | HBase can be accessed through shell commands, the Java API, and REST
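The "chunks" mentioned in the table are fixed-size blocks that HDFS cuts each file into before spreading them across the cluster. The following sketch shows the arithmetic only; `split_into_blocks` is a hypothetical helper, and the 128 MB block size is an assumed, commonly used default that is configurable in practice.

```python
MB = 1024 * 1024


def split_into_blocks(file_size, block_size=128 * MB):
    """Return (offset, length) pairs for the blocks of a file of file_size bytes."""
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    blocks = []
    offset = 0
    while offset < file_size:
        # The final block holds whatever is left and may be smaller.
        length = min(block_size, file_size - offset)
        blocks.append((offset, length))
        offset += length
    return blocks


if __name__ == "__main__":
    for offset, length in split_into_blocks(300 * MB):
        print(f"offset={offset // MB} MB, length={length // MB} MB")
```

A 300 MB file under this assumption becomes two full 128 MB blocks and one 44 MB block. Because processing works on whole blocks, HDFS suits large sequential scans, while HBase adds keyed access to individual rows on top.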