Set up a client machine that connects to the Hadoop cluster on the server. My own machine has limited horsepower, so large-scale computation still has to run on the server. The notes below record the whole process of configuring the client.

Environment: CentOS 7 + Hadoop 3.0.3 + JDK 8

Prerequisites

  1. Configure a static IP address
  2. Set the hostname
  3. Map the hostname to the IP address
  4. Stop the firewall and disable it from starting on boot
  5. Install and configure the JDK
    Note: all of the steps above are covered in my earlier Linux notes; a short sketch of steps 2-4 follows this list.
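    As a minimal sketch on CentOS 7, steps 2-4 might look like the commands below. The client hostname client1.cloud.briup.com is only an example, and the 172.16.0.4 -> computer1.cloud.briup.com mapping is assumed from the addresses used later in this post; replace both with your own values.
    # Set this machine's hostname (example name)
    hostnamectl set-hostname client1.cloud.briup.com
    # Map the cluster master's hostname to its IP address
    echo "172.16.0.4 computer1.cloud.briup.com" >> /etc/hosts
    # Stop the firewall and keep it from starting on boot
    systemctl stop firewalld
    systemctl disable firewalld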

Setup

  1. Create the installation directory
    mkdir /usr/local/apps/
  2. Extract hadoop-3.0.3.tar.gz into apps/
    tar -zxvf hadoop-3.0.3.tar.gz -C /usr/local/apps/
  3. Create a symbolic link
    Make hadoop point to hadoop-3.0.3 (run this inside /usr/local/apps/)
    ln -s hadoop-3.0.3 hadoop
  4. Configure the Hadoop environment variables
    vi ~/.bashrc
    Add the following (adjust HADOOP_USER_NAME and JAVA_HOME to your own user and JDK path)
    export HADOOP_HOME=/usr/local/apps/hadoop-3.0.3
    export HADOOP_PREFIX=$HADOOP_HOME
    export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
    export LD_LIBRARY_PATH=$HADOOP_HOME/lib/native
    export HADOOP_USER_NAME=xujie
    export JAVA_HOME=/usr/local/apps/jdk1.8.0_101
    export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
    Reload it
    source ~/.bashrc
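    As a quick sanity check after reloading (assuming the paths above), the client's Hadoop binaries and configuration directory should now resolve:
    # Should report Hadoop 3.0.3
    hadoop version
    # Should print /usr/local/apps/hadoop-3.0.3/etc/hadoop
    echo $HADOOP_CONF_DIR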
  5. Edit the configuration file core-site.xml
    vi /usr/local/apps/hadoop-3.0.3/etc/hadoop/core-site.xml
    Add the following property inside the <configuration> tag

    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://computer1.cloud.briup.com:9000</value>
      </property>
    </configuration>
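    Once core-site.xml is saved, the client resolves its default filesystem from it; a quick way to confirm (with the environment from step 4 loaded):
    # Should print hdfs://computer1.cloud.briup.com:9000
    hdfs getconf -confKey fs.defaultFS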
  6. Edit the configuration file mapred-site.xml
    vi /usr/local/apps/hadoop-3.0.3/etc/hadoop/mapred-site.xml
    Add the following properties inside the <configuration> tag

    <configuration>
      <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
      </property>
      <property>
        <name>mapreduce.app-submission.cross-platform</name>
        <value>true</value>
      </property>
      <property>
        <name>mapreduce.admin.user.env</name>
        <value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
      </property>
      <property>
        <name>yarn.app.mapreduce.am.env</name>
        <value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
      </property>
      <property>
        <name>mapreduce.application.classpath</name>
        <value>/opt/hadoop/etc/hadoop:/opt/hadoop/share/hadoop/common/lib/*:/opt/hadoop/share/hadoop/common/*:/opt/hadoop/share/hadoop/hdfs:/opt/hadoop/share/hadoop/hdfs/lib/*:/opt/hadoop/share/hadoop/hdfs/*:/opt/hadoop/share/hadoop/mapreduce/lib/*:/opt/hadoop/share/hadoop/mapreduce/*:/opt/hadoop/share/hadoop/yarn:/opt/hadoop/share/hadoop/yarn/lib/*:/opt/hadoop/share/hadoop/yarn/*</value>
      </property>
    </configuration>
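    Note that mapreduce.application.classpath lists paths as they exist on the cluster nodes (here under /opt/hadoop), not on the client. If you are unsure of the right value for your cluster, one option is to run the command below on a cluster node and paste its output into the property:
    # Run on a cluster node; prints that node's Hadoop classpath
    hadoop classpath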
  7. Edit the configuration file yarn-site.xml
    vi /usr/local/apps/hadoop-3.0.3/etc/hadoop/yarn-site.xml
    Add the following properties inside the <configuration> tag

    <configuration>
      <property>
        <name>yarn.resourcemanager.hostname</name>
        <!-- hostname (or IP) of the master node -->
        <value>computer1.cloud.briup.com</value>
      </property>
      <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
      </property>
    </configuration>
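    With yarn-site.xml in place, the client should be able to reach the ResourceManager. Listing the cluster's NodeManagers is a simple connectivity test (assuming the cluster is running and reachable from this machine):
    # Should print the NodeManagers registered with the ResourceManager
    yarn node -list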
  8. List the root directory of the cluster's HDFS
    hdfs dfs -ls /

  9. Create your own home directory on HDFS
    hdfs dfs -mkdir /user/xujie
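    A quick read/write test against the new home directory (the file is just an example; add -p to -mkdir above if /user does not yet exist on the cluster):
    # Upload a small local file, then list it back
    hdfs dfs -put /etc/hosts /user/xujie/
    hdfs dfs -ls /user/xujie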
  10. Test by estimating the value of pi
    cd /usr/local/apps/hadoop/share/hadoop/mapreduce
    hadoop jar hadoop-mapreduce-examples-3.0.3.jar pi 4 5000000
    Estimates pi with 4 map tasks and 5,000,000 samples per map
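    If the pi job runs, the other bundled examples should too. A wordcount over the file uploaded in step 9 is another common smoke test (paths are only examples; the output directory must not exist beforehand):
    hadoop jar hadoop-mapreduce-examples-3.0.3.jar wordcount /user/xujie/hosts /user/xujie/wc-out
    # Inspect the result
    hdfs dfs -cat /user/xujie/wc-out/part-r-00000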
  11. Check the web UIs in a browser
    172.16.0.4:8088 (YARN ResourceManager web UI)
    172.16.0.4:9870 (HDFS NameNode web UI)
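    If no browser is available on the client, roughly equivalent information can be pulled from the command line:
    # Applications known to the ResourceManager (the port 8088 UI)
    yarn application -list
    # HDFS capacity summary (part of what the port 9870 UI shows)
    hdfs dfs -df -h /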

Last updated: October 8, 2018, 18:25

Original link: https://www.lousenjay.top/2018/08/20/hadoop3.0单机模式搭建/