Hadoop Installation and Configuration

The core of Hadoop consists of HDFS and MapReduce.

First, install Hadoop.

Download Hadoop and unpack it into a local directory, or install it with Homebrew:

> brew install hadoop

Configure passwordless SSH login. (Recent OpenSSH releases disable DSA keys by default, so an RSA key is used here.)

> ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa

Append the generated public key to the file of authorized keys:

> cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
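To confirm the key really landed in the authorized keys file, a small string check like the one below can help. The function name `key_authorized` is just an illustration, not a standard tool; the demo uses throwaway files under `/tmp` so it is safe to run anywhere:

```shell
# key_authorized <pubkey-file> <authorized_keys-file>:
# succeed if every line of the public-key file already appears verbatim
# in the authorized_keys file.
key_authorized() {
  grep -qFxf "$1" "$2"   # -F fixed strings, -x whole-line match, -f patterns from file
}

# Demo with throwaway files (in practice, point it at the key generated
# above and at ~/.ssh/authorized_keys):
echo "ssh-rsa AAAAexample user@host" > /tmp/demo_key.pub
cat /tmp/demo_key.pub >> /tmp/demo_authorized_keys
key_authorized /tmp/demo_key.pub /tmp/demo_authorized_keys && echo "key present"
```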

Next, test whether the setup works:

> ssh localhost

If you get a "connection refused" error, check whether Remote Login is enabled; on macOS it can be turned on in System Preferences under Sharing.

Set the environment variables:

export HADOOP_HOME=/Users/hadoop/hadoop-1.2.1
export PATH=$PATH:$HADOOP_HOME/bin
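A quick sanity check that the variables took effect (pure shell, so it works before Hadoop itself runs). The path mirrors the example above; adjust it to wherever you unpacked Hadoop:

```shell
# Set the variables as above (example path; adjust to your install).
export HADOOP_HOME=/Users/hadoop/hadoop-1.2.1
export PATH="$PATH:$HADOOP_HOME/bin"

# Verify that $HADOOP_HOME/bin actually ended up on PATH.
case ":$PATH:" in
  *":$HADOOP_HOME/bin:"*) echo "hadoop bin on PATH" ;;
  *)                      echo "hadoop bin missing from PATH" ;;
esac
```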

Configure Hadoop.

../hadoop/conf/hadoop-env.sh

export JAVA_HOME=/System/Library/Frameworks/JavaVM.framework/Versions/1.6.0/Home
export HADOOP_HEAPSIZE=2000
export HADOOP_OPTS="-Djava.security.krb5.realm=OX.AC.UK -Djava.security.krb5.kdc=kdc0.ox.ac.uk:kdc1.ox.ac.uk"

../hadoop/etc/hadoop/core-site.xml

<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/tmp/hadoop-${user.name}</value>
        <description>A base for other temporary directories. Must be a local path, not an hdfs:// URI.</description>
    </property>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:8020</value>
    </property>
</configuration>

../hadoop/etc/hadoop/hdfs-site.xml

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>

../hadoop/etc/hadoop/mapred-site.xml

<configuration>
    <property>
        <name>mapred.job.tracker</name>
        <value>localhost:9001</value>
    </property>
    <property>
        <name>mapred.tasktracker.map.tasks.maximum</name>
        <value>2</value>
    </property>
    <property>
        <name>mapred.tasktracker.reduce.tasks.maximum</name>
        <value>2</value>
    </property>
</configuration>
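Typos in these XML files (like a missing closing tag) are easy to make. A crude string check can at least confirm that each expected property name is present; `has_property` below is only an illustration, not a real Hadoop tool, and it does plain string matching rather than XML parsing:

```shell
# has_property <file> <name>: succeed if the *-site.xml file mentions the
# given property name. String match only; it does not validate the XML.
has_property() {
  grep -q "<name>$2</name>" "$1"
}

# Demo against a throwaway file shaped like the configs above:
cat > /tmp/demo-site.xml <<'EOF'
<configuration>
    <property>
        <name>mapred.job.tracker</name>
        <value>localhost:9001</value>
    </property>
</configuration>
EOF
has_property /tmp/demo-site.xml mapred.job.tracker && echo "property found"
```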

Run.

> cd /usr/local/Cellar/hadoop/2.8.0/libexec
# Format the filesystem
> bin/hdfs namenode -format
# Start the NameNode and DataNode daemons
> sbin/start-dfs.sh
# Start the ResourceManager and NodeManager daemons
> sbin/start-yarn.sh
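After both scripts finish, `jps` should list all five daemons. A helper like the one below can flag anything missing; the name `check_daemons` is illustrative, and the daemon names assume Hadoop 2.x with both HDFS and YARN started:

```shell
# check_daemons: given jps output as its argument, print any expected
# Hadoop 2.x daemon that is not listed. -w matches whole words, so
# "SecondaryNameNode" does not count as "NameNode".
check_daemons() {
  for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    printf '%s\n' "$1" | grep -qw "$d" || echo "missing: $d"
  done
}

# Real usage:   check_daemons "$(jps)"
# Demo with canned output in which YARN was not started:
check_daemons "1234 NameNode
2345 DataNode
3456 SecondaryNameNode"
# → missing: ResourceManager
# → missing: NodeManager
```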

View cluster and application information in the ResourceManager web UI:
http://localhost:8088

View HDFS status in the NameNode web UI:
http://localhost:50070
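These port numbers are easy to mix up, so the mapping below records the Hadoop 2.x defaults in one place. The helper name `hadoop_ui_port` is just for illustration, and the ports change if you override them in the configs:

```shell
# hadoop_ui_port: map a daemon to its default web UI port (Hadoop 2.x defaults).
hadoop_ui_port() {
  case "$1" in
    resourcemanager) echo 8088  ;;  # cluster / application overview
    namenode)        echo 50070 ;;  # HDFS status
    datanode)        echo 50075 ;;  # per-DataNode status
    *) echo "unknown daemon: $1" >&2; return 1 ;;
  esac
}

hadoop_ui_port namenode   # → 50070
```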

