Monday, 14 October 2013


Running Hadoop on CentOS Linux (Single-Node Cluster)

Follow the steps below:

1). Install a CentOS ISO image in VMware Workstation.

2). Download Java 1.6 (JDK 6u43), choosing 32-bit or 64-bit to match your CentOS install.

For 32-bit:
jdk-6u43-linux-i586-rpm.bin

3). Give execute permission to jdk-6u43-linux-i586-rpm.bin:

$chmod 755 jdk-6u43-linux-i586-rpm.bin

4). Install Java as the root user (from the directory containing the downloaded binary):

#./jdk-6u43-linux-i586-rpm.bin

5). Export JAVA_HOME:
#export JAVA_HOME=/usr/java/jdk1.6.0_43
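
To confirm the JDK landed where JAVA_HOME points (a quick sanity check; the reported build number may differ on your machine):

#$JAVA_HOME/bin/java -version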

6). Download the required Hadoop version,
ex: hadoop-1.0.3.tar.gz

7). Switch to the training user:
#su training
password: password

8). Unzip the Hadoop tar file:
$cd hadoop
$tar -zxvf hadoop-1.0.3.tar.gz

9). Make Hadoop recognize Java:
$cd ~/hadoop/hadoop-1.0.3/conf
$vi hadoop-env.sh

Add the following line, then save and quit:
export JAVA_HOME=/usr/java/jdk1.6.0_43
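
To check that Hadoop now picks up the JDK, run the bundled version command from the installation directory (bin/ is spelled out because PATH is not set up until step 11):

$cd ~/hadoop/hadoop-1.0.3
$bin/hadoop version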

10). Configure HADOOP_HOME to point to the Hadoop installation directory and export it (done via .bashrc in the next step).

11). Go to your home dir:
$cd ~

Open the .bashrc file in the vi editor and add the following lines:
export HADOOP_HOME=<hadoop installed location>
export PATH=$PATH:$HADOOP_HOME/bin
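
For the layout used in this tutorial (Hadoop extracted under /home/training/hadoop, as in step 12), the two lines would be:

export HADOOP_HOME=/home/training/hadoop/hadoop-1.0.3
export PATH=$PATH:$HADOOP_HOME/bin

Reload the file so the change takes effect in the current shell:

$source ~/.bashrc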

Notes:

a). Add the user to the sudoers file.
Go to the root user, open the /etc/sudoers file, and add the following line:
training ALL=(ALL) NOPASSWD:ALL
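
A safer way to edit this file is through visudo, which validates the sudoers syntax before saving:

#visudo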

b). Make update-alternatives work.
Add the following line to the .bashrc file in your home dir:

export PATH=$PATH:/sbin:/usr/sbin:/usr/local/sbin

c). Make jps run.

Go to your home dir:
$cd ~

Open the .bashrc file and add the following line:

export PATH=$PATH:/usr/java/jdk1.6.0_43/bin

d). For SSH configuration,
switch to the root user:

#ssh-keygen
(press Enter at each prompt to accept the default key location and an empty passphrase)

#cp .ssh/id_rsa.pub .ssh/authorized_keys
#chmod 700 .ssh
#chmod 640 .ssh/authorized_keys
#chmod 600 .ssh/id_rsa

#cp .ssh/authorized_keys /home/training/.ssh/authorized_keys
#cp .ssh/id_rsa /home/training/.ssh/id_rsa
#chown training /home/training/.ssh/authorized_keys
#chown training /home/training/.ssh/id_rsa

#chmod 700 /home/training/.ssh
#chmod 640 /home/training/.ssh/authorized_keys
#chmod 600 /home/training/.ssh/id_rsa

Open the /etc/ssh/ssh_config file and set:
StrictHostKeyChecking no

#service sshd restart
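
To confirm passwordless SSH now works for the training user (a quick check; it should log in without asking for a password):

$ssh localhost
$exit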



12). Set all the configurations:

#vi /home/training/hadoop/hadoop-1.0.3/conf/core-site.xml
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:8020</value>
</property>

#vi /home/training/hadoop/hadoop-1.0.3/conf/mapred-site.xml

<property>
  <name>mapred.job.tracker</name>
  <value>localhost:8021</value>
</property>

#vi /home/training/hadoop/hadoop-1.0.3/conf/hdfs-site.xml

<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
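
Note that each <property> block above must sit inside the <configuration> root element that already exists in these files. core-site.xml, for example, ends up as:

<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:8020</value>
  </property>
</configuration>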


13). Format the namenode:

$hadoop namenode -format

14). Start all the services:

$/home/training/hadoop/hadoop-1.0.3/bin/start-all.sh
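
If everything started cleanly, jps (made runnable in note c) should list the five single-node daemons, NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker, plus Jps itself:

$jps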

15). Open a browser and check whether the services started or not.

Type http://localhost:50070 (NameNode web UI) or http://localhost:50030 (JobTracker web UI) into the browser.



16). Install Eclipse on the node.

Change to the root user and do the following.

Download Eclipse,
ex: eclipse-java-europa-winter-linux-gtk.tar.gz

Create a dir named eclipse under /home/training.

Copy the downloaded file to the eclipse folder and untar the file:

tar -zxvf eclipse-java-europa-winter-linux-gtk.tar.gz

17). Change permissions for the eclipse dir:

chmod -R +r /home/training/eclipse

18). Create an Eclipse executable on the /usr/bin path:

touch /usr/bin/eclipse

chmod 755 /usr/bin/eclipse


## Open the eclipse file with your favourite editor ##
nano -w /usr/bin/eclipse

## Paste the following content into the file ##
#!/bin/sh
export ECLIPSE_HOME="/home/training/eclipse"

# "$@" (rather than $*) keeps quoted arguments intact
exec $ECLIPSE_HOME/eclipse "$@"
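
You can verify the launcher from any terminal before setting up the desktop icon (/usr/bin is already on PATH):

$eclipse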


19). Bring an Eclipse icon onto the desktop.

## Create the following file with your favourite editor ##
/usr/share/applications/eclipse.desktop

## Add following content to file and save ##
[Desktop Entry]
Encoding=UTF-8
Name=Eclipse
Comment=Eclipse SDK 4.2.1
Exec=eclipse
Icon=/home/training/eclipse/icon.xpm
Terminal=false
Type=Application
Categories=GNOME;Application;Development;
StartupNotify=true

After successful installation, go to Applications -> Programming -> Eclipse, right-click, and choose "Add this launcher to desktop".

Launch Eclipse by double-clicking the Eclipse icon on the desktop.

Click on New Project --> select Map/Reduce Project --> click on Configure Hadoop install directory and give <hadoop install location>.

20). Install Hive.

Download any stable version of Hive,
ex: hive-0.10.0-bin.tar.gz

Unzip the downloaded file:
$tar -zxvf hive-0.10.0-bin.tar.gz

Open the .bashrc file and add the following lines:

export HIVE_HOME=/home/training/hive-0.10.0-bin
export PATH=$PATH:$HIVE_HOME/bin


$hadoop fs -mkdir /tmp
$hadoop fs -chmod a+w /tmp/
$hadoop fs -mkdir /user/hive/warehouse
$hadoop fs -chmod a+w /user/hive/warehouse
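
With the warehouse directories in place, you can smoke-test the install from the Hive shell (SHOW TABLES on a fresh install should simply return an empty list):

$hive
hive> SHOW TABLES;
hive> quit;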

Do the following to fix the log4j warning issue:
Change org.apache.hadoop.metrics.jvm.EventCounter to org.apache.hadoop.log.metrics.EventCounter in:
/home/training/hive-0.10.0-bin/conf/hive-log4j.properties
/home/training/hive-0.10.0-bin/conf/hive-exec-log4j.properties
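
One way to apply the change to both files at once (a sketch using sed; editing them by hand works equally well):

$sed -i 's/org\.apache\.hadoop\.metrics\.jvm\.EventCounter/org.apache.hadoop.log.metrics.EventCounter/' \
  /home/training/hive-0.10.0-bin/conf/hive-log4j.properties \
  /home/training/hive-0.10.0-bin/conf/hive-exec-log4j.properties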



Websites for reference:

http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
http://www.bromlays.com/en/left/hadoop-single-node-configuration-centos/
http://linux.jamesjara.com/2012/02/installing-hadoophivederby-on-centos.html

