Hadoop Deployment Guide

Introduction to Hadoop

What is Hadoop

1) Hadoop is a distributed systems infrastructure developed by the Apache Foundation.

2) It mainly solves the problems of storing and analyzing massive amounts of data.

3) More broadly, Hadoop usually refers to a wider concept: the Hadoop ecosystem.

Hadoop history

1) Lucene is an open-source framework created by Doug Cutting. Written in Java, it implements full-text search similar to Google's, providing a full-text search engine architecture with complete query and indexing engines.

2) At the end of 2001, Lucene became a subproject of the Apache Foundation.

3) In massive-data scenarios, Lucene faced the same difficulties as Google: data was hard to store and retrieval was slow.

4) They studied and imitated Google's approach to these problems, producing a miniature version: Nutch.

5) Google can be said to be the source of Hadoop's ideas (Google's three big data papers):

a. GFS -> HDFS
b. Map-Reduce -> MR
c. BigTable -> HBase

6) In 2003-2004, Google published some of the details behind GFS and MapReduce. Building on them, Doug Cutting and others spent two years of spare time implementing the DFS and MapReduce mechanisms, which made Nutch's performance soar.

7) In 2005, Hadoop was formally brought into the Apache Foundation as part of Nutch, a subproject of Lucene.

8) In March 2006, Map-Reduce and the Nutch Distributed File System (NDFS) were incorporated into the Hadoop project. Hadoop was officially born, marking the arrival of the big data era.

9) The name comes from a toy elephant that belonged to Doug Cutting's son.

The three major Hadoop distributions

The three major Hadoop distributions are Apache, Cloudera, and Hortonworks.
The Apache version is the most original (most basic) and the best for getting started.
Cloudera integrates many big data frameworks; its corresponding product is CDH.
Hortonworks has better documentation; its corresponding product is HDP.

Advantages of Hadoop

1) High reliability: Hadoop keeps multiple replicas of data at the storage layer, so the failure of a compute element or storage node does not lead to data loss.

2) High scalability: tasks and data are distributed across the cluster, which can easily scale out to thousands of nodes.

3) Efficiency: following the MapReduce model, Hadoop works in parallel to speed up task processing.

4) High fault tolerance: failed tasks are automatically reassigned.

Hadoop ecosystem: related technologies at a glance

1) Sqoop: an open-source tool mainly used to transfer data between Hadoop/Hive and traditional databases (such as MySQL). It can import data from a relational database (e.g., MySQL, Oracle) into HDFS, and export data from HDFS back into a relational database.
2) Flume: a highly available, highly reliable, distributed system for collecting, aggregating, and transporting massive amounts of log data. Flume supports customizing various data senders in a logging system to collect data.
3) Kafka: a high-throughput distributed publish-subscribe messaging system.
4) Storm: used for "continuous computation", running continuous queries over data streams and emitting results to users as a stream while they are computed.
5) Spark: one of the most popular open-source in-memory big data computing frameworks. It can run computations over big data stored in Hadoop.
6) Flink: another very popular open-source big data computing framework, used mostly in real-time computing scenarios.
7) Oozie: a workflow scheduling system for managing Hadoop jobs.
8) HBase: a distributed, column-oriented open-source database. Unlike a typical relational database, HBase is suited to storing unstructured data.
9) Hive: a data warehouse tool built on Hadoop that maps structured data files to database tables and offers simple SQL querying, translating SQL statements into MapReduce jobs. Its advantage is a low learning curve: simple MapReduce statistics can be produced quickly with SQL-like statements, without writing dedicated MapReduce applications, which makes it well suited to statistical analysis in data warehouses.
10) ZooKeeper: a reliable coordination system for large distributed systems, providing configuration maintenance, naming, distributed synchronization, group services, and more.

Deployment: machine planning

Configure the hostnames:

11.8.37.50 ops01 # command: hostnamectl --static set-hostname ops01
11.8.36.63 ops02 # command: hostnamectl --static set-hostname ops02
11.8.36.76 ops03 # command: hostnamectl --static set-hostname ops03

Add the following ops entries to the hosts file on all 3 Linux machines:

root@ops01:/root #cat /etc/hosts
127.0.0.1 ydt-cisp-ops01
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

11.16.0.176 rancher.mydomain.com
11.8.38.123 www.tongtongcf.com
# hadoop+k8s
11.8.37.50 ops01
11.8.36.63 ops02
11.8.36.76 ops03
root@ops02:/root #cat /etc/hosts
127.0.0.1 ydt-cdep-ops02
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

11.16.0.176 rancher.mydomain.com
11.8.38.123 www.tongtongcf.com
# hadoop+k8s
11.8.37.50 ops01
11.8.36.63 ops02
11.8.36.76 ops03
root@ops03:/root #cat /etc/hosts
127.0.0.1 ydt-cdep-ops03
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

11.16.0.176 rancher.mydomain.com
11.8.38.123 www.tongtongcf.com
# hadoop+k8s
11.8.37.50 ops01
11.8.36.63 ops02
11.8.36.76 ops03

Local machine: the Windows hosts file C:\Windows\System32\drivers\etc\hosts (adjust the path to your actual setup)

Add the address mappings and save:

# ops entries: let the browser reach the web pages by hostname, instead of repeatedly copy-pasting IP addresses
11.8.37.50 ops01
11.8.36.63 ops02
11.8.36.76 ops03

[ Note ] The Windows machine here is simply your own computer, used for accessing the web pages. If you are deploying on cloud hosts or remote servers with public addresses, the IPs configured in hosts must be the public-facing IPs, not the internal ones.
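
Before moving on, it is worth confirming that the name resolution actually works. A minimal check from any of the Linux machines (a sketch; on Windows, ping each hostname instead):

# each hostname should resolve and answer one ping
for i in ops01 ops02 ops03; do ping -c 1 $i; done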

Initial environment preparation [ run on all 3 machines ]:

root@ops01:/root #
root@ops01:/root #cat /etc/redhat-release
CentOS Linux release 7.7.1908 (Core)
root@ops01:/root #
root@ops01:/root #systemctl stop firewalld
root@ops01:/root #systemctl disable firewalld
root@ops01:/root #
root@ops01:/root #useradd wangting
root@ops01:/root #passwd wangting
root@ops01:/root #
# [ In visudo, add the wangting  ALL=(ALL)  ALL line at the position shown below ]  run on all 3 machines
root@ops01:/root #visudo
##
## Allow root to run any commands anywhere
root    ALL=(ALL)       ALL
wangting  ALL=(ALL)  ALL
## Allows members of the 'sys' group to run networking, software,
## service management apps and more.

From this point on, all subsequent operations can be performed as the regular user.

root@ops01:/root #
root@ops01:/root #su - wangting
Last login: Fri Mar 12 11:44:08 CST 2021 from ops03 on pts/0
wangting@ops01:/home/wangting >
wangting@ops01:/home/wangting >sudo mkdir /opt/module
wangting@ops01:/home/wangting >sudo mkdir /opt/software
wangting@ops01:/home/wangting >
wangting@ops01:/home/wangting >
wangting@ops01:/home/wangting >sudo chown -R wangting:wangting /opt/module /opt/software
[sudo] password for wangting: 
wangting@ops01:/home/wangting >ll /opt/ | grep -E "software|module"
drwxr-xr-x   3 wangting wangting 4096 Mar 11 15:26 module
drwxr-xr-x   2 wangting wangting 4096 Mar 11 15:24 software
wangting@ops01:/home/wangting >

Set up passwordless login across the 3 machines

[ For a first-time setup, it is recommended to run the following on all 3 machines, so that each machine can log in to the others without a password ]:

wangting@ops01:/home/wangting >
wangting@ops01:/home/wangting >ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/wangting/.ssh/id_rsa): 
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /home/wangting/.ssh/id_rsa.
Your public key has been saved in /home/wangting/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:Of44IIzyrF18sHw7W7BWo55ZHFEMizm2D+BWiamYc7s wangting@ops02
The key's randomart image is:
+---[RSA 2048]----+
|        .o.      |
|     o + o.      |
|    + B o        |
| o o + o o       |
|+ oo+ + S        |
|.o.=oo.X +       |
| +. =.*.*        |
| .oo =.*.o       |
|..E   *o...      |
+----[SHA256]-----+
wangting@ops01:/home/wangting >
wangting@ops01:/home/wangting >ssh-copy-id ops01
/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/wangting/.ssh/id_rsa.pub"
The authenticity of host 'ops01 (11.8.37.50)' can't be established.
ECDSA key fingerprint is SHA256:s1nzA+BJgp+a3aOHKX2ORe4So2omxVFZ0Dvk6E7LjmA.
ECDSA key fingerprint is MD5:e3:63:06:62:48:1f:20:82:e3:c7:f4:49:25:9c:3b:fa.
Are you sure you want to continue connecting (yes/no)? yes
/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
wangting@ops01's password: 

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'ops01'"
and check to make sure that only the key(s) you wanted were added.

wangting@ops01:/home/wangting >ssh-copy-id ops02
/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/wangting/.ssh/id_rsa.pub"
The authenticity of host 'ops02 (11.8.36.63)' can't be established.
ECDSA key fingerprint is SHA256:s1nzA+BJgp+a3aOHKX2ORe4So2omxVFZ0Dvk6E7LjmA.
ECDSA key fingerprint is MD5:e3:63:06:62:48:1f:20:82:e3:c7:f4:49:25:9c:3b:fa.
Are you sure you want to continue connecting (yes/no)? yes
/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
wangting@ops01's password: 

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'ops02'"
and check to make sure that only the key(s) you wanted were added.

wangting@ops01:/home/wangting >ssh-copy-id ops03
/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/wangting/.ssh/id_rsa.pub"
The authenticity of host 'ops03 (11.8.36.76)' can't be established.
ECDSA key fingerprint is SHA256:s1nzA+BJgp+a3aOHKX2ORe4So2omxVFZ0Dvk6E7LjmA.
ECDSA key fingerprint is MD5:e3:63:06:62:48:1f:20:82:e3:c7:f4:49:25:9c:3b:fa.
Are you sure you want to continue connecting (yes/no)? yes
/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
wangting@ops01's password: 

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'ops03'"
and check to make sure that only the key(s) you wanted were added.

[ Note ] The above sets up mutual passwordless login between the 3 machines for the regular user wangting. >>>>> You must also, as root on ops01, configure passwordless login to ops01, ops02, and ops03. <<<<<
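
A minimal sketch of that root-side setup, run as root on ops01 (accept the key-generation prompts with Enter; each ssh-copy-id asks for the root password once):

# as root on ops01
ssh-keygen -t rsa
for i in ops01 ops02 ops03; do ssh-copy-id $i; done
# verify: each line should print the hostname without a password prompt
for i in ops01 ops02 ops03; do ssh $i hostname; done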

Install the JDK and Hadoop

jdk-8u171-linux-x64.tar.gz download link:

Link: https://pan.baidu.com/s/16O6hnl4fagmkRS2ch9ZhEg
Extraction code: 7nko

hadoop-3.1.3.tar.gz download link:

Link: https://pan.baidu.com/s/15kIc8pLWsCo3oTaPfqKtAw
Extraction code: xhno

Download the 2 packages and upload them to /opt/software/ on ops01.
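
If the packages were downloaded to a local Linux machine, one way to upload them (a sketch, assuming the archives sit in the current directory; a graphical tool such as WinSCP works too on Windows):

scp jdk-8u171-linux-x64.tar.gz hadoop-3.1.3.tar.gz wangting@ops01:/opt/software/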

wangting@ops01:/home/wangting >
wangting@ops01:/home/wangting >cd /opt/software/
wangting@ops01:/opt/software >ll
total 330160
-rw-r--r-- 1 wangting wangting 338075860 Mar 11 14:34 hadoop-3.1.3.tar.gz
-rw-r--r-- 1 wangting wangting    375860 Mar 11 14:34 jdk-8u171-linux-x64.tar.gz
wangting@ops01:/opt/software >
wangting@ops01:/home/wangting >for i in ops01 ops02 ops03;do echo "=====  $i  =====" && ssh $i "java -version";done
=====  ops01  =====
openjdk version "1.8.0_232"
OpenJDK Runtime Environment (build 1.8.0_232-b09)
OpenJDK 64-Bit Server VM (build 25.232-b09, mixed mode)
=====  ops02  =====
openjdk version "1.8.0_232"
OpenJDK Runtime Environment (build 1.8.0_232-b09)
OpenJDK 64-Bit Server VM (build 25.232-b09, mixed mode)
=====  ops03  =====
openjdk version "1.8.0_232"
OpenJDK Runtime Environment (build 1.8.0_232-b09)
OpenJDK 64-Bit Server VM (build 25.232-b09, mixed mode)
wangting@ops01:/home/wangting >

Java installation

The Java environment on these machines was already deployed earlier; querying the version gives the output above.

If the JDK has not been deployed yet, proceed as follows:

wangting@ops01:/opt/software >tar -zxvf jdk-8u171-linux-x64.tar.gz -C /opt/module/

wangting@ops01:/opt/software >sudo vim /etc/profile
[ Add the following ]
#JAVA_HOME
export JAVA_HOME=/opt/module/jdk1.8.0_171
export PATH=$PATH:$JAVA_HOME/bin
Save and exit

wangting@ops01:/opt/software >source /etc/profile
wangting@ops01:/opt/software >
wangting@ops01:/opt/software >java -version
openjdk version "1.8.0_232"
OpenJDK Runtime Environment (build 1.8.0_232-b09)
OpenJDK 64-Bit Server VM (build 25.232-b09, mixed mode)

[ Note ] The JDK must be deployed on all 3 machines.
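
Rather than repeating the extraction by hand, one way to push the JDK to the other two nodes (a sketch that relies on the passwordless login set up earlier):

# copy the unpacked JDK to ops02 and ops03
for i in ops02 ops03; do scp -r /opt/module/jdk1.8.0_171 $i:/opt/module/; done
# then append the same JAVA_HOME lines to /etc/profile on each node and run: source /etc/profile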

Hadoop installation

wangting@ops01:/home/wangting >cd /opt/software/
wangting@ops01:/opt/software >tar -zxvf hadoop-3.1.3.tar.gz -C /opt/module/
wangting@ops01:/opt/software >cd /opt/module/hadoop-3.1.3/
wangting@ops01:/opt/module/hadoop-3.1.3 >pwd
/opt/module/hadoop-3.1.3
wangting@ops01:/opt/module/hadoop-3.1.3 >sudo vim /etc/profile
[ Add the following ]
#HADOOP_HOME
export HADOOP_HOME=/opt/module/hadoop-3.1.3
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
wangting@ops01:/opt/module/hadoop-3.1.3 >source /etc/profile
wangting@ops01:/opt/module/hadoop-3.1.3 >

[ Note ] Hadoop must be deployed on all 3 machines.
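
As with the JDK, the unpacked Hadoop directory can be copied to the other nodes instead of re-extracting (sketch):

for i in ops02 ops03; do scp -r /opt/module/hadoop-3.1.3 $i:/opt/module/; done
# append the same HADOOP_HOME lines to /etc/profile on ops02 and ops03, then source /etc/profile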

Verify the deployment

wangting@ops01:/home/wangting >hadoop version
Hadoop 3.1.3
Source code repository https://gitbox.apache.org/repos/asf/hadoop.git -r ba631c436b806728f8ec2f54ab1e289526c90579
Compiled by ztang on 2019-09-12T02:47Z
Compiled with protoc 2.5.0
From source with checksum ec785077c385118ac91aadde5ec9799
This command was run using /opt/module/hadoop-3.1.3/share/hadoop/common/hadoop-common-3.1.3.jar
wangting@ops01:/home/wangting >

wangting@ops02:/home/wangting >hadoop version
Hadoop 3.1.3
Source code repository https://gitbox.apache.org/repos/asf/hadoop.git -r ba631c436b806728f8ec2f54ab1e289526c90579
Compiled by ztang on 2019-09-12T02:47Z
Compiled with protoc 2.5.0
From source with checksum ec785077c385118ac91aadde5ec9799
This command was run using /opt/module/hadoop-3.1.3/share/hadoop/common/hadoop-common-3.1.3.jar

wangting@ops03:/home/wangting >hadoop version
Hadoop 3.1.3
Source code repository https://gitbox.apache.org/repos/asf/hadoop.git -r ba631c436b806728f8ec2f54ab1e289526c90579
Compiled by ztang on 2019-09-12T02:47Z
Compiled with protoc 2.5.0
From source with checksum ec785077c385118ac91aadde5ec9799
This command was run using /opt/module/hadoop-3.1.3/share/hadoop/common/hadoop-common-3.1.3.jar
wangting@ops01:/home/wangting >cd /opt/module/hadoop-3.1.3/
wangting@ops01:/opt/module/hadoop-3.1.3 >ll
total 212
drwxr-xr-x 2 wangting wangting   4096 Sep 12  2019 bin
drwxrwxr-x 4 wangting wangting   4096 Mar 12 11:44 data
drwxr-xr-x 3 wangting wangting   4096 Sep 12  2019 etc
drwxr-xr-x 2 wangting wangting   4096 Sep 12  2019 include
drwxrwxr-x 2 wangting wangting   4096 Mar 12 10:59 input
drwxr-xr-x 3 wangting wangting   4096 Sep 12  2019 lib
drwxr-xr-x 4 wangting wangting   4096 Sep 12  2019 libexec
-rw-rw-r-- 1 wangting wangting 147145 Sep  4  2019 LICENSE.txt
drwxrwxr-x 3 wangting wangting   4096 Mar 12 14:43 logs
-rw-rw-r-- 1 wangting wangting  21867 Sep  4  2019 NOTICE.txt
-rw-rw-r-- 1 wangting wangting   1366 Sep  4  2019 README.txt
drwxr-xr-x 3 wangting wangting   4096 Mar 12 11:45 sbin
drwxr-xr-x 4 wangting wangting   4096 Sep 12  2019 share

(1) bin: scripts for operating the Hadoop services (HDFS, YARN)
(2) etc: Hadoop's configuration directory, holding the Hadoop configuration files
(3) lib: Hadoop's native libraries (data compression and decompression)
(4) sbin: scripts for starting and stopping the Hadoop services
(5) share: Hadoop's dependency jars, documentation, and official examples

[ Note ]: do not install the NameNode (ops01) and the SecondaryNameNode (ops03) on the same server.
[ Note ]: the ResourceManager (ops02) is also memory-hungry; do not place it on the same machine as the NameNode or SecondaryNameNode.

Based on available resources, a reasonable allocation is as follows (for reference only):
ops01 : NameNode | DataNode / NodeManager
ops02 : DataNode / ResourceManager | NodeManager
ops03 : SecondaryNameNode | DataNode / NodeManager

Configure the cluster

(1) Core configuration file: core-site.xml

wangting@ops01:/home/wangting >
wangting@ops01:/home/wangting >cd /opt/module/hadoop-3.1.3/etc/hadoop
wangting@ops01:/opt/module/hadoop-3.1.3/etc/hadoop >vim core-site.xml
File contents [ configure on all 3 machines ]:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://ops01:8020</value>
    </property>
    <property>
        <name>hadoop.data.dir</name>
        <value>/opt/module/hadoop-3.1.3/data</value>
    </property>
    <property>
        <name>hadoop.proxyuser.wangting.hosts</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.wangting.groups</name>
        <value>*</value>
    </property>
</configuration>
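
With fs.defaultFS set this way, relative HDFS paths resolve against hdfs://ops01:8020, so once the cluster is running the two commands below are equivalent (a quick illustration):

hadoop fs -ls /
hadoop fs -ls hdfs://ops01:8020/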

(2) HDFS configuration file: hdfs-site.xml

wangting@ops01:/opt/module/hadoop-3.1.3/etc/hadoop >vim hdfs-site.xml

File contents [ configure on all 3 machines ]:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file://${hadoop.data.dir}/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file://${hadoop.data.dir}/data</value>
  </property>
  <property>
    <name>dfs.namenode.checkpoint.dir</name>
    <value>file://${hadoop.data.dir}/namesecondary</value>
  </property>
  <property>
    <name>dfs.client.datanode-restart.timeout</name>
    <value>30</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>ops03:9868</value>
  </property>
</configuration>

(3) YARN configuration file: yarn-site.xml

wangting@ops01:/opt/module/hadoop-3.1.3/etc/hadoop >vim yarn-site.xml

File contents [ configure on all 3 machines ]:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>ops02</value>
    </property>
    <property>
        <name>yarn.nodemanager.env-whitelist</name>
        <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
    </property>
</configuration>

(4) MapReduce configuration file: mapred-site.xml

wangting@ops01:/opt/module/hadoop-3.1.3/etc/hadoop >vim mapred-site.xml

File contents [ configure on all 3 machines ]:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

Configure workers

wangting@ops01:/home/wangting >vim /opt/module/hadoop-3.1.3/etc/hadoop/workers
ops01
ops02
ops03

[ Note ] Configure this on all 3 machines.

[ Note ] Entries in this file must not have trailing spaces, and the file must not contain blank lines.
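
To keep the file identical everywhere, one way to push it from ops01 to the other nodes (sketch):

for i in ops02 ops03; do scp /opt/module/hadoop-3.1.3/etc/hadoop/workers $i:/opt/module/hadoop-3.1.3/etc/hadoop/; done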

Start the cluster

(1) If this is the first time the cluster is started, format the NameNode on ops01 (before formatting, be sure to stop any namenode and datanode processes left from a previous run, and then delete the data and logs directories):
wangting@ops01:/home/wangting >hdfs namenode -format
(2) Start HDFS:
wangting@ops01:/home/wangting >start-dfs.sh
(3) On the node where the ResourceManager is configured [ops02], start YARN:
wangting@ops02:/home/wangting >start-yarn.sh
(4) View the SecondaryNameNode from the web:
(a) Open http://ops03:9868/status.html in a browser

[ Note ] The browser can use ops03 in place of the IP because the Windows hosts file was configured earlier.
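
To confirm that HDFS came up with all three DataNodes, a quick command-line check (sketch; the report should show 3 live datanodes):

hdfs dfsadmin -report | grep -E "Live datanodes|Name:"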

With 3 machines and quite a few services, it is easy to lose track at first, so here is a simple convenience script:

wangting@ops01:/home/wangting >vim jpsall.sh 
#!/bin/bash
echo "=============   ops01   ============="
ssh ops01 jps
echo "=============   ops02   ============="
ssh ops02 jps
echo "=============   ops03   ============="
ssh ops03 jps
echo "====================================="
echo "正常配置:"
echo "ops01 : NameNode|DataNode  / NodeManager"
echo "ops02 : DataNode  / ResourceManager|NodeManager"
echo "ops03 : SecondaryNameNode|DataNode  / NodeManager"
wangting@ops01:/home/wangting >chmod +x jpsall.sh
wangting@ops01:/home/wangting >sudo cp jpsall.sh /usr/bin/
wangting@ops01:/home/wangting >scp jpsall.sh ops02:/usr/bin/
scp: /usr/bin//jpsall.sh: Permission denied
wangting@ops01:/home/wangting >sudo scp jpsall.sh ops02:/usr/bin/
[sudo] password for wangting: 
  
wangting@ops01:/home/wangting >sudo scp jpsall.sh ops03:/usr/bin/
wangting@ops01:/home/wangting >
[ Example run ]:
wangting@ops01:/home/wangting >
wangting@ops01:/home/wangting >jpsall.sh 
=============   ops01   =============
91696 Jps
104531 NameNode
43619 NodeManager
104683 DataNode
43885 JobHistoryServer
=============   ops02   =============
130467 DataNode
68151 Jps
40603 NodeManager
40476 ResourceManager
=============   ops03   =============
91764 DataNode
91895 SecondaryNameNode
121382 NodeManager
11471 Jps
=====================================
Expected layout:
ops01 : NameNode|DataNode  / NodeManager
ops02 : DataNode  / ResourceManager|NodeManager
ops03 : SecondaryNameNode|DataNode  / NodeManager

This instantly lists the services running on each node so they can be compared against the expected layout, making it easy to spot a service that died or misbehaves.

Test the cluster

Test common operations

wangting@ops01:/home/wangting >hadoop fs -mkdir -p /user/wangting/input
wangting@ops01:/home/wangting >cd /opt/module/hadoop-3.1.3/
wangting@ops01:/opt/module/hadoop-3.1.3 >ls
bin  data  etc  include  input  lib  libexec  LICENSE.txt  logs  NOTICE.txt  README.txt  sbin  share
wangting@ops01:/opt/module/hadoop-3.1.3 >hadoop fs -put README.txt /user/wangting/input
wangting@ops01:/opt/module/hadoop-3.1.3 >hadoop fs -ls /user/wangting/input
2021-03-12 16:56:46,146 INFO Configuration.deprecation: No unit for dfs.client.datanode-restart.timeout(30) assuming SECONDS
Found 1 items
-rw-r--r--   3 wangting supergroup       1366 2021-03-12 11:49 /user/wangting/input/README.txt
wangting@ops01:/opt/module/hadoop-3.1.3 >
wangting@ops01:/opt/module/hadoop-3.1.3 >hadoop fs -put  /opt/software/hadoop-3.1.3.tar.gz /
wangting@ops01:/opt/module/hadoop-3.1.3 >hadoop fs -ls /
2021-03-12 16:57:56,400 INFO Configuration.deprecation: No unit for dfs.client.datanode-restart.timeout(30) assuming SECONDS
Found 3 items
-rw-r--r--   3 wangting supergroup  338075860 2021-03-12 11:50 /hadoop-3.1.3.tar.gz
drwx------   - wangting supergroup          0 2021-03-12 14:46 /tmp
drwxr-xr-x   - wangting supergroup          0 2021-03-12 11:48 /user
wangting@ops01:/home/wangting >cd ~/test/
wangting@ops01:/home/wangting/test >pwd
/home/wangting/test
wangting@ops01:/home/wangting/test >ls
wangting@ops01:/home/wangting/test >hadoop fs -get /hadoop-3.1.3.tar.gz ./
2021-03-12 16:59:09,820 INFO Configuration.deprecation: No unit for dfs.client.datanode-restart.timeout(30) assuming SECONDS
2021-03-12 16:59:10,602 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
wangting@ops01:/home/wangting/test >ls
hadoop-3.1.3.tar.gz
wangting@ops01:/home/wangting/test >

Run the wordcount example

wangting@ops01:/home/wangting >hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar wordcount /user/wangting/input /user/wangting/output
wangting@ops01:/home/wangting >hadoop fs -ls /user/wangting/output
2021-03-12 17:02:31,321 INFO Configuration.deprecation: No unit for dfs.client.datanode-restart.timeout(30) assuming SECONDS
Found 2 items
-rw-r--r--   3 wangting supergroup          0 2021-03-12 14:46 /user/wangting/output/_SUCCESS
-rw-r--r--   3 wangting supergroup       1306 2021-03-12 14:46 /user/wangting/output/part-r-00000
wangting@ops01:/home/wangting >
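
To look at the actual word counts, the result file can be printed directly from HDFS (sketch):

hadoop fs -cat /user/wangting/output/part-r-00000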

Configure the history server

To review the run history of completed jobs, the history server needs to be configured. The steps are as follows:

Edit mapred-site.xml (vi mapred-site.xml) [ required on all 3 machines ]

[ Add the following configuration to the file, appending the <property> blocks after the existing content ]
<!-- History server address -->
<property>
    <name>mapreduce.jobhistory.address</name>
    <value>ops01:10020</value>
</property>

<!-- History server web UI address -->
<property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>ops01:19888</value>
</property>

Start the history server on ops01:

wangting@ops01:/home/wangting >mr-jobhistory-daemon.sh start historyserver
wangting@ops01:/home/wangting >
wangting@ops01:/home/wangting >jps
104531 NameNode
43619 NodeManager
98504 Jps
104683 DataNode
43885 JobHistoryServer			# the newly added history server

View JobHistory:
http://ops01:19888/jobhistory

Configure log aggregation

Log aggregation: after an application finishes running, its logs are uploaded to HDFS.
Benefit: the run details of a program can be viewed conveniently, which helps development and debugging.
[ Note ]: enabling log aggregation requires restarting the NodeManager, ResourceManager, and HistoryServer.

1. Edit yarn-site.xml (vim yarn-site.xml) [ required on all 3 machines ]

[ Add the following configuration, appending the <property> blocks after the existing content ]
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.log.server.url</name>
        <value>http://ops01:19888/jobhistory/logs</value>
    </property>
    <property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>604800</value>
    </property>

[ Stop the services first, then start them again ]

# Stop NodeManager, ResourceManager, and the HistoryServer
Run on ops02:
wangting@ops02:/home/wangting >stop-yarn.sh
Run on ops01:
wangting@ops01:/home/wangting >mr-jobhistory-daemon.sh stop historyserver

# Start NodeManager, ResourceManager, and the HistoryServer
Run on ops02:
wangting@ops02:/home/wangting >start-yarn.sh
Run on ops01:
wangting@ops01:/home/wangting >mr-jobhistory-daemon.sh start historyserver

# Delete the existing output directory on HDFS
wangting@ops01:/home/wangting >hdfs dfs -rm -R /user/wangting/output

# Run the WordCount program again
wangting@ops01:/home/wangting >hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar wordcount /user/wangting/input /user/wangting/output
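
With aggregation enabled, the logs of a finished job can also be pulled from the command line (a sketch; application_xxx is a placeholder for the real application ID shown at submission time or on the 8088 page):

# list finished applications, then fetch the aggregated logs of one
yarn application -list -appStates FINISHED
yarn logs -applicationId application_xxx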

View cluster node information:

http://ops02:8088/cluster

View the file system from the web UI:

http://ops01:9870/