Difference Between Hadoop and HBase

Last Updated : 24 Sep, 2021

Hadoop: Hadoop is an open-source framework from Apache for storing and processing large datasets distributed across a cluster of servers. Its four main components are the Hadoop Distributed File System (HDFS), YARN, MapReduce, and Hadoop Common (the shared libraries). It handles not only large volumes of data but also a mixture of structured, semi-structured, and unstructured data. Amazon, IBM, Microsoft, Cloudera, ScienceSoft, Pivotal, and Hortonworks are some of the companies using Hadoop technology.
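To make the idea of storing a dataset "distributed across a cluster" more concrete, here is a minimal sketch of how a file might be cut into fixed-size blocks and replicated across nodes. It assumes the usual HDFS defaults of a 128 MB block size and a replication factor of 3. The function name `plan_blocks`, the node names, and the round-robin placement are hypothetical and only for illustration; real HDFS placement is rack-aware and decided by the NameNode.

```python
import math

BLOCK_SIZE = 128 * 1024 * 1024  # assumed HDFS default block size (dfs.blocksize)
REPLICATION = 3                 # assumed HDFS default replication (dfs.replication)


def plan_blocks(file_size, nodes, block_size=BLOCK_SIZE, replication=REPLICATION):
    """Return a list of (block_index, block_length, nodes_holding_a_replica)."""
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    n_blocks = math.ceil(file_size / block_size)
    plan = []
    for i in range(n_blocks):
        # Every block is full-sized except possibly the last one.
        length = min(block_size, file_size - i * block_size)
        # Hypothetical round-robin placement, NOT the real HDFS policy.
        holders = [nodes[(i + r) % len(nodes)] for r in range(replication)]
        plan.append((i, length, holders))
    return plan


if __name__ == "__main__":
    # A 300 MB file on a four-node cluster.
    for index, length, holders in plan_blocks(300 * 1024 * 1024,
                                              ["node1", "node2", "node3", "node4"]):
        print(index, length, holders)
```

With these assumptions, a 300 MB file becomes three blocks (128 MB, 128 MB, and 44 MB), each held by three different nodes, so losing a single node does not lose any block.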

HBase: HBase is an open-source database from Apache that runs on a Hadoop cluster. It is a non-relational (NoSQL) database management system. Its three important components are HMaster, Region Server, and ZooKeeper. Capital One, JPMorgan Chase, Apple, MTB, AT&T, and Lockheed Martin are some of the companies using HBase.
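As a rough illustration of what a Region Server does: an HBase table is divided into regions by row key range, and each region is served by one Region Server. The sketch below models only that lookup step. The class `RegionDirectory`, the split keys, and the server names are hypothetical; in a real cluster HMaster assigns regions and clients locate them with the help of ZooKeeper, not with an in-memory list like this.

```python
import bisect


class RegionDirectory:
    """Toy lookup table: which region server holds a given row key."""

    def __init__(self, split_keys, servers):
        # N sorted split keys define N + 1 contiguous row-key ranges (regions).
        if list(split_keys) != sorted(split_keys):
            raise ValueError("split keys must be sorted")
        if len(servers) != len(split_keys) + 1:
            raise ValueError("need exactly one server per region")
        self.split_keys = list(split_keys)
        self.servers = list(servers)

    def locate(self, row_key):
        # A region covers [start_key, end_key): a row key equal to a split key
        # belongs to the region that starts at that key, hence bisect_right.
        return self.servers[bisect.bisect_right(self.split_keys, row_key)]


if __name__ == "__main__":
    directory = RegionDirectory(["g", "p"], ["rs1", "rs2", "rs3"])
    for key in ["apple", "g", "hadoop", "zebra"]:
        print(key, "->", directory.locate(key))
```

Because rows are kept sorted by row key, finding the responsible server is a range lookup rather than a scan over all servers.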

Below is a table of differences between Hadoop and HBase:

S.No.	Hadoop	HBase
1	Hadoop is a collection of software tools	HBase is a part of the Hadoop ecosystem
2	Stores data sets in a distributed environment	Stores data in a column-oriented manner
3	Hadoop is a framework	HBase is a NoSQL database
4	Data is stored in the form of chunks (blocks)	Data is stored in the form of key/value pairs
5	Hadoop does not allow run-time changes	HBase allows run-time changes
6	A file is written once and can be read many times	Data can be read and written multiple times
7	Hadoop has high-latency (batch-oriented) operations	HBase has low-latency operations (random reads and writes)
8	HDFS can be accessed through MapReduce jobs	HBase can be accessed through shell commands, the Java API, or REST
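Rows 4 to 6 of the table can be sketched in a few lines of code. The example below is a toy model, not the HDFS or HBase API: `WriteOnceFile` and `ColumnFamilyTable` are hypothetical classes that only mimic the two access patterns, namely a file that is written once and read many times versus a store of key/value cells that can be updated and extended at run time. It assumes the HBase convention of addressing a cell as "family:qualifier", with column families fixed up front and qualifiers added freely.

```python
class WriteOnceFile:
    """Toy model of write-once, read-many storage."""

    def __init__(self):
        self._data = None

    def write(self, data):
        if self._data is not None:
            raise PermissionError("file already written; rewriting is not allowed")
        self._data = data

    def read(self):
        return self._data


class ColumnFamilyTable:
    """Toy model of a table mapping row key -> 'family:qualifier' -> value."""

    def __init__(self, families):
        self.families = set(families)  # families are declared up front
        self.rows = {}

    def put(self, row_key, column, value):
        family, _, _qualifier = column.partition(":")
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        # Overwrites an existing cell or creates a new qualifier on the fly.
        self.rows.setdefault(row_key, {})[column] = value

    def get(self, row_key, column):
        return self.rows.get(row_key, {}).get(column)


if __name__ == "__main__":
    log = WriteOnceFile()
    log.write("2021-09-24 first version")
    try:
        log.write("second version")
    except PermissionError as err:
        print("rejected:", err)

    users = ColumnFamilyTable(["info"])
    users.put("user1", "info:name", "Ada")
    users.put("user1", "info:name", "Grace")         # update in place
    users.put("user1", "info:email", "g@example.com")  # new column at run time
    print(users.get("user1", "info:name"))
```

The second write to the file is rejected, while the table accepts both an update to an existing cell and a brand-new column without any schema change.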
