How to Install Single Node Cluster Hadoop on Windows?

Last Updated : 06 Oct, 2021

Hadoop can be installed in two ways: on a single-node cluster or on a multi-node cluster. Both are described briefly below, but this section covers installation on a single-node cluster only.

Single Node Cluster and Multi-Node Cluster:

Single Node Cluster – Runs one DataNode, with the NameNode, DataNode, ResourceManager, and NodeManager all set up on a single machine. It is used for studying and testing purposes.
Multi-Node Cluster – Runs more than one DataNode, with each DataNode on a different machine.

Installation steps on a Single Node Cluster

The steps for installing a single-node Hadoop cluster on Windows are as follows.

Prerequisites:

Java: Java JDK (installed)
Hadoop: Hadoop package (downloaded)

Step 1: Verify the Java installation

javac -version
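The same check can be scripted, which is handy when the setup is repeated on several machines. The sketch below assumes Python is available on the machine; the helper name javac_version is made up for illustration and is not part of Java or Hadoop.

```python
import shutil
import subprocess


def javac_version():
    """Return the text printed by `javac -version`, or None if javac is not on Path."""
    javac = shutil.which("javac")  # searches the directories listed in Path
    if javac is None:
        return None
    result = subprocess.run([javac, "-version"], capture_output=True, text=True)
    # JDK 8 prints the version to stderr, later JDKs print it to stdout.
    return (result.stdout or result.stderr).strip()


if __name__ == "__main__":
    version = javac_version()
    print(version if version else "javac was not found on Path")
```

If the function returns None, the JDK is either not installed or its bin folder is not yet on Path (see Step 5).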


Step 2: Extract Hadoop to C:\Hadoop

Extract the downloaded Hadoop package to C:\Hadoop.

Step 3: Setting up the HADOOP_HOME variable

Use the Windows environment variable settings to create a HADOOP_HOME variable that points to the folder where Hadoop was extracted in Step 2.


Step 4: Set JAVA_HOME variable

Use the Windows environment variable settings to create a JAVA_HOME variable that points to the folder where the JDK is installed.


Step 5: Set Hadoop and Java bin directory path

Add the bin directories of Hadoop and Java (%HADOOP_HOME%\bin and %JAVA_HOME%\bin) to the Path variable.
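Steps 3 to 5 are easy to get slightly wrong, for example by forgetting one of the bin entries. A minimal sketch of a consistency check follows, assuming Python is installed; check_hadoop_env is a hypothetical helper written for this article, not a Hadoop tool. Open a new Command Prompt before running it, since environment variable changes only reach newly started processes.

```python
import os
from pathlib import PureWindowsPath


def check_hadoop_env(env):
    """Return a list of problems found in an environment mapping (empty if none)."""
    problems = []
    # Path entries are separated by ';' on Windows; compare case-insensitively
    # and ignore trailing backslashes by normalising through PureWindowsPath.
    path_entries = {
        str(PureWindowsPath(entry)).lower()
        for entry in env.get("PATH", "").split(";")
        if entry.strip()
    }
    for var in ("HADOOP_HOME", "JAVA_HOME"):
        home = env.get(var)
        if not home:
            problems.append(f"{var} is not set")
            continue
        bin_dir = str(PureWindowsPath(home) / "bin").lower()
        if bin_dir not in path_entries:
            problems.append(f"{var}\\bin is not on Path")
    return problems


if __name__ == "__main__":
    # On Windows, os.environ stores variable names in upper case, so "PATH" matches Path.
    for line in check_hadoop_env(os.environ) or ["Environment looks consistent"]:
        print(line)
```

The function takes the environment as an argument rather than reading os.environ directly, so it can also be tried against a hand-written dictionary.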

Step 6: Hadoop Configuration

For Hadoop configuration, we need to modify the five files listed below (found in the etc\hadoop folder of the Hadoop installation) and create two folders:

1. core-site.xml
2. mapred-site.xml
3. hdfs-site.xml
4. yarn-site.xml
5. hadoop-env.cmd
6. Create two folders, datanode and namenode

Step 6.1: core-site.xml configuration

<configuration>
   <property>
       <name>fs.defaultFS</name>
       <value>hdfs://localhost:9000</value>
   </property>
</configuration>

Step 6.2: mapred-site.xml configuration

<configuration>
   <property>
       <name>mapreduce.framework.name</name>
       <value>yarn</value>
   </property>
</configuration>

Step 6.3: hdfs-site.xml configuration

<configuration>
   <property>
       <name>dfs.replication</name>
       <value>1</value>
   </property>
   <property>
       <name>dfs.namenode.name.dir</name>
       <value>C:\hadoop-2.8.0\data\namenode</value>
   </property>
   <property>
       <name>dfs.datanode.data.dir</name>
       <value>C:\hadoop-2.8.0\data\datanode</value>
   </property>
</configuration>

Step 6.4: yarn-site.xml configuration

<configuration>
   <property>
       <name>yarn.nodemanager.aux-services</name>
       <value>mapreduce_shuffle</value>
   </property>
   <property>
       <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
       <value>org.apache.hadoop.mapred.ShuffleHandler</value>
   </property>
</configuration>
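Hadoop generally does not complain about a property name it does not recognise, so a typo in one of these files can go unnoticed. One way to review what a file actually defines is to list its properties. The following is a minimal sketch, assuming Python is installed; read_site_properties is an illustrative helper, not part of Hadoop, and the sample text simply repeats the core-site.xml content from Step 6.1.

```python
import xml.etree.ElementTree as ET


def read_site_properties(xml_text):
    """Parse the text of a Hadoop *-site.xml file into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    if root.tag != "configuration":
        raise ValueError(f"expected <configuration> root, found <{root.tag}>")
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is None:
            continue  # skip a <property> that has no <name>
        # Strip stray whitespace such as trailing spaces after a closing tag.
        props[name.strip()] = (value or "").strip()
    return props


sample = """<configuration>
   <property>
       <name>fs.defaultFS</name>
       <value>hdfs://localhost:9000</value>
   </property>
</configuration>"""

print(read_site_properties(sample))  # prints {'fs.defaultFS': 'hdfs://localhost:9000'}
```

Text that is not well-formed XML, such as a missing closing tag, makes ET.fromstring raise a ParseError, which catches another common editing mistake.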

Step 6.5: hadoop-env.cmd configuration

set "JAVA_HOME=C:\Java"

Here C:\Java is the folder where the JDK (1.8.0) is installed.

Step 6.6: Create datanode and namenode folders

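1. Create the folder "data" under "C:\hadoop-2.8.0"
2. Create the folder "datanode" under "C:\hadoop-2.8.0\data"
3. Create the folder "namenode" under "C:\hadoop-2.8.0\data"

These paths must match the dfs.namenode.name.dir and dfs.datanode.data.dir values set in Step 6.3. If Hadoop was extracted to a different folder (Step 2 uses C:\Hadoop), use that folder in both places.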
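The three folders can also be created in one go. A short sketch, again assuming Python is installed; create_data_dirs is a hypothetical helper name, and the folder passed in should be the actual Hadoop installation folder.

```python
from pathlib import Path


def create_data_dirs(hadoop_home):
    """Create data/namenode and data/datanode under hadoop_home; return both paths."""
    data_dir = Path(hadoop_home) / "data"
    created = []
    for name in ("namenode", "datanode"):
        target = data_dir / name
        # parents=True also creates "data"; exist_ok=True makes reruns harmless.
        target.mkdir(parents=True, exist_ok=True)
        created.append(target)
    return created


# Example call (uncomment and adjust the folder before running):
# create_data_dirs(r"C:\hadoop-2.8.0")
```

Running it a second time changes nothing, so it is safe to rerun after correcting a path.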