Running Hadoop on CentOS Linux (Single-Node Cluster)
Follow the steps below:
1). Install CentOS from the ISO image in VMware Workstation.
2). Download Java 1.6 (JDK 6u43), choosing the build for your CentOS architecture (32-bit or 64-bit).
For 32-bit:
jdk-6u43-linux-i586-rpm.bin
3). Give execute permission to jdk-6u43-linux-i586-rpm.bin:
$chmod 755 jdk-6u43-linux-i586-rpm.bin
4). Install Java as the root user (from the directory containing the installer):
#./jdk-6u43-linux-i586-rpm.bin
5). Export JAVA_HOME:
#export JAVA_HOME=/usr/java/jdk1.6.0_43
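To verify the installation (a quick sanity check; the reported version should match the JDK you installed):
#/usr/java/jdk1.6.0_43/bin/java -version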
6). Download the required Hadoop version.
ex: hadoop-1.0.3.tar.gz
7). Switch to the training user:
#su training
password:password
8). Unpack the Hadoop tar file:
$cd hadoop
$tar -zxvf hadoop-1.0.3.tar.gz
9). Make Hadoop recognize Java:
$cd ~/hadoop/hadoop-1.0.3/conf
$vi hadoop-env.sh
Add the following line:
export JAVA_HOME=/usr/java/jdk1.6.0_43
Save and quit.
10). Configure HADOOP_HOME to point to the Hadoop installation directory and export it; this is done in .bashrc in the next step.
11). Go to your home directory:
$cd ~
Open the .bashrc file in the vi editor and add the following lines:
export HADOOP_HOME=<hadoop installed location>
export PATH=$PATH:$HADOOP_HOME/bin
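Reload .bashrc and confirm that Hadoop is on the PATH (the version line should match your download):
$source ~/.bashrc
$hadoop version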
Notes:
a). Add the user to the sudoers file.
Go to the root user, open the /etc/sudoers file (preferably with visudo), and add the following line:
training ALL=(ALL) NOPASSWD:ALL
b). Make update-alternatives work.
Add the following line to the .bashrc file in your home directory:
export PATH=$PATH:/sbin:/usr/sbin:/usr/local/sbin
c). Make jps runnable.
Go to your home directory ($cd ~), open the .bashrc file, and add the following line:
export PATH=$PATH:/usr/java/jdk1.6.0_43/bin
d). SSH configuration.
Switch to the root user and generate a key pair, pressing Enter at each prompt to accept the default file location and an empty passphrase:
#ssh-keygen
#cp .ssh/id_rsa.pub .ssh/authorized_keys
#chmod 700 .ssh
#chmod 640 .ssh/authorized_keys
#chmod 600 .ssh/id_rsa
#cp .ssh/authorized_keys /home/training/.ssh/authorized_keys
#cp .ssh/id_rsa /home/training/.ssh/id_rsa
#chown training /home/training/.ssh/authorized_keys
#chown training /home/training/.ssh/id_rsa
#chmod 700 /home/training/.ssh
#chmod 640 /home/training/.ssh/authorized_keys
#chmod 600 /home/training/.ssh/id_rsa
Open the /etc/ssh/ssh_config file and set:
StrictHostKeyChecking no
#service sshd restart
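To confirm that passwordless SSH works for the training user (Hadoop's start/stop scripts depend on it), the following login should succeed without a password prompt:
$ssh localhost
$exit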
12). Set all the configuration files.
$vi /home/training/hadoop/hadoop-1.0.3/conf/core-site.xml
Add the following between the <configuration> tags:
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:8020</value>
</property>
$vi /home/training/hadoop/hadoop-1.0.3/conf/mapred-site.xml
Add the following between the <configuration> tags:
<property>
<name>mapred.job.tracker</name>
<value>localhost:8021</value>
</property>
$vi /home/training/hadoop/hadoop-1.0.3/conf/hdfs-site.xml
Add the following between the <configuration> tags:
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
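For reference, core-site.xml should look roughly like this after the edit (header lines may differ slightly in your copy):
<?xml version="1.0"?>
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:8020</value>
</property>
</configuration>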
13). Format the namenode (do this only once; reformatting erases existing HDFS metadata):
$hadoop namenode -format
14). Start all the services:
$/home/training/hadoop/hadoop-1.0.3/bin/start-all.sh
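If everything started, jps (made available in note c above) should list the five Hadoop 1.x daemons, each preceded by its process ID: NameNode, DataNode, SecondaryNameNode, JobTracker, and TaskTracker (plus Jps itself):
$jps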
15). Open a browser and check whether the services started:
http://localhost:50070 (NameNode web UI) or http://localhost:50030 (JobTracker web UI)
16). Install Eclipse on the node.
Change to the root user and do the following:
Download Eclipse, ex: eclipse-java-europa-winter-linux-gtk.tar.gz
Copy the downloaded file to /home/training and untar it there (the archive extracts into an eclipse directory, giving /home/training/eclipse):
tar -zxvf eclipse-java-europa-winter-linux-gtk.tar.gz
17). Change permissions on the eclipse directory:
chmod -R +r /home/training/eclipse
18). Create an Eclipse executable on the /usr/bin path:
touch /usr/bin/eclipse
chmod 755 /usr/bin/eclipse
## Open eclipse file with your favourite editor ##
nano -w /usr/bin/eclipse
## Paste following content to file ##
#!/bin/sh
export ECLIPSE_HOME="/home/training/eclipse"
"$ECLIPSE_HOME/eclipse" "$@"
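To test the launcher from a terminal (as the training user), Eclipse should start in the background:
$eclipse &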
19). Put an Eclipse icon on the desktop.
## Create the following file with your favourite editor ##
/usr/share/applications/eclipse.desktop
## Add following content to file and save ##
[Desktop Entry]
Encoding=UTF-8
Name=Eclipse
Comment=Eclipse SDK 4.2.1
Exec=eclipse
Icon=/home/training/eclipse/icon.xpm
Terminal=false
Type=Application
Categories=GNOME;Application;Development;
StartupNotify=true
After successful installation, go to Applications -> Programming -> Eclipse, right-click, and choose "Add this launcher to desktop".
Launch Eclipse by double-clicking the Eclipse icon on the desktop.
Click on New Project --> select Map/Reduce Project --> click on "Configure Hadoop install directory" and give <hadoop install location>.
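Note: the Map/Reduce project type is provided by the Hadoop Eclipse plugin, not by Eclipse itself. If it does not appear, copy the hadoop-eclipse-plugin jar into Eclipse's plugins directory and restart Eclipse. A sketch, assuming the jar ships under contrib/eclipse-plugin in your Hadoop distribution (some releases require building it from src/contrib/eclipse-plugin first):
$cp /home/training/hadoop/hadoop-1.0.3/contrib/eclipse-plugin/*.jar /home/training/eclipse/plugins/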
20). Install Hive.
Download any stable version of Hive, ex: hive-0.10.0-bin.tar.gz
Unpack the downloaded file:
$tar -zxvf hive-0.10.0-bin.tar.gz
Open the .bashrc file and add the following lines:
export HIVE_HOME=/home/training/hive-0.10.0-bin
export PATH=$PATH:$HIVE_HOME/bin
Then create the HDFS directories Hive uses and make them world-writable:
$hadoop fs -mkdir /tmp
$hadoop fs -chmod a+w /tmp
$hadoop fs -mkdir /user/hive/warehouse
$hadoop fs -chmod a+w /user/hive/warehouse
To fix the log4j warning issue, change org.apache.hadoop.metrics.jvm.EventCounter to org.apache.hadoop.log.metrics.EventCounter in:
/home/training/hive-0.10.0-bin/conf/hive-log4j.properties
/home/training/hive-0.10.0-bin/conf/hive-exec-log4j.properties
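A one-liner sketch for the same edit (sed rewrites both files in place; back them up first if unsure):
$sed -i 's/org.apache.hadoop.metrics.jvm.EventCounter/org.apache.hadoop.log.metrics.EventCounter/' /home/training/hive-0.10.0-bin/conf/hive-log4j.properties /home/training/hive-0.10.0-bin/conf/hive-exec-log4j.properties
As a quick smoke test (assuming HDFS is running and .bashrc has been re-sourced), the Hive shell should start and answer a simple query:
$hive
hive> show tables;
hive> quit;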
References:
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
http://www.bromlays.com/en/left/hadoop-single-node-configuration-centos/
http://linux.jamesjara.com/2012/02/installing-hadoophivederby-on-centos.html