1. About Hadoop, Hive, and related components
- Ubuntu installed in a virtual machine
- Cluster: no (single node)
- VM configured with 8 cores, 8 GB RAM, and about 20 GB of disk
- JDK, Hadoop, HDFS, HBase, Hive, and MySQL 8 are all installed
- JDK: 1.8.0_411
- Hadoop version: 3.3.5
- MySQL version: 8.0.31
- Hive version: 4.0.1
2. Starting Hive: the default port 10000 fails to come up
2.1 Hadoop and HDFS listening ports are as shown below
2.2 The HDFS web UI is reachable on port 9870
2.3 The Hadoop (YARN) web UI on port 8088 is also reachable
2.4 Starting hiveserver2
- Hadoop, HDFS, and MySQL were all started successfully beforehand
- hiveserver2 was then started
- Checking the listening ports (see the check sketched below)
- hiveserver2's default port 10000 is not being listened on
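For reference, a minimal way to confirm whether HiveServer2 is actually listening (ss ships with modern Ubuntu; netstat -ntlp works as an older alternative):
# list listening TCP ports and filter for HiveServer2's default port
ss -ntlp | grep 10000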
2.5 Starting Hive
2.5.1 About starting Hive
- After hiveserver2 starts, port 10000 is not being listened on
- bin/hive starts the Hive CLI (in Hive 4.x this launches Beeline)
- !connect jdbc:hive2://localhost:10000 connects to Hive
- root / 123456 are the MySQL account and password (the full flow is sketched below)
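The connection flow from the shell, for reference (root/123456 here mirror the MySQL credentials above; with HiveServer2's default NONE authentication the username mainly governs HDFS permissions):
# launch the CLI (in Hive 4.x, bin/hive starts Beeline)
bin/hive
# at the Beeline prompt, connect to HiveServer2
!connect jdbc:hive2://localhost:10000
# Beeline then prompts for the username and password: root / 123456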
2.5.2 hive-site.xml configuration
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
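<!-- Note: with mysql-connector-j 8.x the canonical driver class is com.mysql.cj.jdbc.Driver;
the legacy com.mysql.jdbc.Driver used below still works but logs a deprecation warning. -->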
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
<description>username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
<description>password to use against metastore database</description>
</property>
<!-- Skip verification of the Hive metastore schema version; to enforce the check, the schema in MySQL would have to be upgraded to match -->
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
</property>
<property>
<name>hive.cli.print.current.db</name>
<value>true</value>
</property>
<property>
<name>hive.cli.print.header</name>
<value>true</value>
</property>
<!-- hiveserver2 -->
<property>
<name>hive.server2.thrift.port</name>
<value>10000</value>
</property>
<property>
<name>hive.server2.thrift.bind.host</name>
<value>localhost</value>
</property>
<property>
<name>hive.server2.enable.doAs</name>
<value>true</value>
</property>
</configuration>
2.5.3 About MySQL
- MySQL is also installed in the virtual machine
- IP: 192.168.31.102
- Account: root
- Password: 123456
- The account and password are for a personal environment and personal use only
- The figures below show the connection made through the Navicat tool
2.5.4 Hive and MySQL
- The MySQL JDBC driver jar, mysql-connector-j-8.0.31.jar, is also placed under Hive's lib directory (a copy command is sketched below)
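Dropping the driver in place is a single copy; the source path below is an assumption for wherever the jar was unpacked:
# copy the MySQL Connector/J jar into Hive's lib directory (HIVE_HOME is set in ~/.bashrc, section 5.3.6)
cp /home/hadoop/mysql-connector-j-8.0.31.jar $HIVE_HOME/lib/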
3. Fixing the Hive startup failure
3.1 Fixing the hiveserver2 startup failure
- Initialize the metastore schema from Hive's bin directory (a follow-up check is sketched below)
# initialize the metastore schema
schematool -dbType mysql -initSchema
- Checking the ports again: 10000 and 10002 are both up
- The HiveServer2 web UI is reachable at http://192.168.31.228:10002/
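To confirm the initialization took, schematool can also report the connection details and schema version (an optional verification, not part of the original steps):
# print the metastore connection URL and schema version
schematool -dbType mysql -info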
3.2 Starting Hive
- !connect jdbc:hive2://localhost:10000 to connect
- Account: root, password: 123456 (a one-line alternative is sketched below)
- After starting, the corresponding table data loads
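The same connection can also be opened non-interactively with Beeline's -u/-n/-p flags (a convenience, not required by the steps above):
# connect to HiveServer2 in one line
beeline -u jdbc:hive2://localhost:10000 -n root -p 123456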
4. Inserting data with Hive
- Insert data (this assumes the usr table already exists; a possible definition is sketched below):
insert into usr(id, name, age) values (1, 'Pot', 18);
- Query the data: select * from usr;
- Operation log
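For completeness, a minimal DDL that would match the insert above; the actual usr table in this environment may differ:
-- hypothetical definition of the usr table
create table usr(id int, name string, age int);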
5. A Hive application example
5.1 Create the input files under HDFS
- The input directory holds the files file1.txt and file2.txt
- echo "hello world" > file1.txt
- echo "hello hadoop" > file2.txt
- Reference command: ./bin/hdfs dfs -put /home/hadoop/file1.txt input
- Reference command: ./bin/hdfs dfs -put /home/hadoop/file2.txt input
- file1.txt and file2.txt can be written ahead of time under /home/hadoop and then loaded into the input directory in HDFS (see the note below on creating the directory first)
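One step worth making explicit: the input directory should exist in HDFS before the -put commands, and the upload can be verified afterwards (paths here are relative to the user's HDFS home directory):
# create the target directory, then confirm both files landed
./bin/hdfs dfs -mkdir -p input
./bin/hdfs dfs -ls input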
5.2 Hive command operations
5.2.1 Hive commands
create table docs(line string);
load data inpath 'input' overwrite into table docs;
create table word_count as
select word, count(1) as count from
(select explode(split(line, ' ')) as word from docs) w
group by word
order by word;
5.2.2 Creating the table, with result figure
- DDL statement: create table docs(line string);
5.2.3 Load file1.txt and file2.txt from HDFS's input directory
- load data inpath 'input' overwrite into table docs;
- The input directory contains only file1.txt and file2.txt
- The result is shown in the figure below (a sanity-check query follows)
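A quick sanity check after the load (optional; note that load data inpath moves the files out of input):
-- confirm the two lines were loaded into the table
select * from docs;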
5.2.4 Creating a table from a SQL statement
- The SQL statement, which involves a subquery:
create table word_count as
select word, count(1) as count from
(select explode(split(line, ' ')) as word from docs) w
group by word
order by word;
- Screenshot of the operation:
5.2.5 Viewing the final result
- Query against the word_count table:
select * from word_count;
- The result of the operation is shown in the figure below
5.2.6 The HiveServer2 web UI
- The history of executed SQL statements can be viewed there
- http://192.168.31.228:10002/
5.3 Configuration files
5.3.1 core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/usr/local/hadoop/hadoop-3.3.5/tmp</value>
<description>Abase for other temporary directories.</description>
</property>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
<property>
<name>hadoop.proxyuser.hadoop.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hadoop.hosts</name>
<value>*</value>
</property>
</configuration>
5.3.2 hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop/hadoop-3.3.5/tmp/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop/hadoop-3.3.5/tmp/dfs/data</value>
</property>
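<!-- Note: in Hadoop 3.x this key is named dfs.permissions.enabled; the deprecated dfs.permissions still maps to it. -->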
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
</configuration>
5.3.3 yarn-site.xml
<?xml version="1.0"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.env-whitelist</name>
<value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_HOME,PATH,LANG,TZ,HADOOP_MAPRED_HOME</value>
</property>
</configuration>
5.3.4 mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.application.classpath</name>
<value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
</property>
</configuration>
5.3.5 hive-site.xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
<description>username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
<description>password to use against metastore database</description>
</property>
</configuration>
5.3.6 ~/.bashrc configuration
# java_home
export JAVA_HOME=/usr/local/java/jdk1.8.0_411
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
# hadoop
export HADOOP_HOME=/usr/local/hadoop/hadoop-3.3.5
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
# ssh
export PDSH_RCMD_TYPE=ssh
# hive
export HIVE_HOME=/usr/local/hadoop/hive
export PATH=$PATH:$HIVE_HOME/bin
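After editing ~/.bashrc, the new variables take effect in the current shell once the file is reloaded:
# reload the profile so the exports above apply
source ~/.bashrc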
6. Summary and retrospective
- Ubuntu running inside the VM was unstable and prone to restarting
- The official tutorial and online tutorials were not fully digested; a first-time install, deployment, and hands-on run will inevitably hit all kinds of problems
- When stuck on a problem, the solutions found online did not fully resolve it and cost too much time; reinstalling and redoing the steps turned out to be faster
- Before hands-on practice, read the official tutorial a few more times and then other tutorials, so the difficulties and pitfalls are anticipated
- The issue above, where ports 10000 and 10002 were not listening after hiveserver2 started, came down to the missing metastore schema initialization, a step the tutorial left out
- Whether it is a pitfall or a difficulty, recording and summarizing works best; when stuck for too long, just reinstall
- While installing and working with Hive, Ubuntu, Hadoop, HDFS, Hive, and MySQL were all reinstalled in the VM