Hadoop
Source: Internet | Published by: 图形界面软件 | Editor: 程序博客网 | Time: 2024/06/03 20:08
Configuration Files
Hadoop configuration is driven by two types of important configuration files:
- Read-only default configuration - core-default.xml, hdfs-default.xml, yarn-default.xml and mapred-default.xml.
- Site-specific configuration - conf/core-site.xml, conf/hdfs-site.xml, conf/yarn-site.xml and conf/mapred-site.xml.
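The relationship between the two file types can be pictured as a layered lookup in which site-specific values override the read-only defaults. The Python sketch below only illustrates that precedence using small inline XML strings. It is not Hadoop's own loader, the helper names `load_properties` and `effective_config` are hypothetical, and the two default values are included purely for illustration.

```python
import xml.etree.ElementTree as ET


def load_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop-style XML string."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def effective_config(default_xml, site_xml):
    """Start from the defaults, then let site-specific values win."""
    merged = load_properties(default_xml)
    merged.update(load_properties(site_xml))
    return merged


# Illustrative stand-in for core-default.xml (values assumed for the example).
DEFAULT_XML = """<configuration>
  <property><name>fs.defaultFS</name><value>file:///</value></property>
  <property><name>io.file.buffer.size</name><value>4096</value></property>
</configuration>"""

# Illustrative stand-in for conf/core-site.xml.
SITE_XML = """<configuration>
  <property><name>fs.defaultFS</name><value>hdfs://master:8020</value></property>
</configuration>"""

config = effective_config(DEFAULT_XML, SITE_XML)
print(config["fs.defaultFS"])          # value taken from the site file
print(config["io.file.buffer.size"])   # value falls through from the defaults
```

Properties that the site file does not mention keep their default value, which is why a site file only needs to list the settings that differ.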
Configuring Environment of Hadoop Daemons
Administrators should use the conf/hadoop-env.sh and conf/yarn-env.sh scripts to do site-specific customization of the Hadoop daemons' process environment.
At the very least you should specify the JAVA_HOME so that it is correctly defined on each remote node.
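A quick way to confirm that JAVA_HOME is usable on a node is to check that it is set and that it contains a `bin/java` executable. The following Python sketch assumes a Unix-like layout. The helper `java_home_problem` is hypothetical and is not part of Hadoop.

```python
import os


def java_home_problem(env):
    """Return a description of what is wrong with JAVA_HOME, or None if it looks fine."""
    java_home = env.get("JAVA_HOME")
    if not java_home:
        return "JAVA_HOME is not set"
    java_bin = os.path.join(java_home, "bin", "java")
    if not os.path.isfile(java_bin):
        return "no java executable at " + java_bin
    return None


# Check the environment of the current process.
print(java_home_problem(os.environ) or "JAVA_HOME looks usable")
```

Running the same check on every node, with the environment the daemons will actually see, helps catch a node whose JDK lives in a different location.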
Configuring the Hadoop Daemons in Non-Secure Mode
- conf/core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:8020</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:///home/aboutyun/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>hadoop.proxyuser.aboutyun.hosts</name>
    <value>*</value>
    <description>The aboutyun user may act as a proxy for users on any host.</description>
  </property>
  <property>
    <name>hadoop.proxyuser.aboutyun.groups</name>
    <value>*</value>
    <description>The aboutyun user may act as a proxy for users in any group.</description>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
</configuration>
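Hadoop's proxy-user mechanism:
Take user peerslee submitting a job through the proxy user aboutyun as an example. When peerslee submits the job, aboutyun takes it over and is responsible for requesting and supervising the job's resources.
When the job reads a file from HDFS, however, the permission check is performed as peerslee, and once the job has finished, the job list also shows peerslee as the job's user. Everything else is handled by aboutyun, which is what makes it a "proxy".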
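One place where the proxy relationship becomes visible is the WebHDFS REST API, which accepts a `doas` query parameter naming the user being impersonated. The Python sketch below only builds such a URL and does not contact a cluster. It assumes non-secure mode (so the caller is identified by `user.name`), WebHDFS enabled on the NameNode web port 50070, and a hypothetical path `/user/peerslee`. The helper `webhdfs_proxy_url` is illustrative.

```python
from urllib.parse import quote, urlencode


def webhdfs_proxy_url(namenode, path, op, real_user, effective_user):
    """Build a WebHDFS URL in which real_user acts on behalf of effective_user."""
    query = urlencode({
        "op": op,
        "user.name": real_user,   # the proxy user making the request
        "doas": effective_user,   # the user whose permissions are checked
    })
    return "http://{}/webhdfs/v1{}?{}".format(namenode, quote(path), query)


# aboutyun lists a directory on behalf of peerslee (path is a made-up example).
url = webhdfs_proxy_url("master:50070", "/user/peerslee", "LISTSTATUS",
                        "aboutyun", "peerslee")
print(url)
```

With the `hadoop.proxyuser.aboutyun.hosts` and `hadoop.proxyuser.aboutyun.groups` settings above both set to `*`, aboutyun is allowed to impersonate users from any host and any group. Narrower values restrict who may be impersonated and from where.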
Note: the tmp directory must be created.
- conf/hdfs-site.xml
<configuration>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>master:9001</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///home/aboutyun/hadoop/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///home/aboutyun/hadoop/datanode</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
</configuration>
Note: create the namenode and datanode directories on the local file system.
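The local directories named in the configuration (tmp, namenode, datanode) can be created by hand, or with a small script that turns the configured `file://` values into paths. The Python sketch below is one way to do that. The helpers `local_path` and `ensure_dirs` are hypothetical, and the script only prints a converted path rather than creating anything until `ensure_dirs` is called.

```python
import os
from urllib.parse import urlparse


def local_path(value):
    """Convert a file:// URI or a plain path from the config into a local path."""
    parsed = urlparse(value)
    if parsed.scheme == "file":
        return parsed.path
    if parsed.scheme == "":
        return value
    raise ValueError("not a local path: " + value)


def ensure_dirs(values):
    """Create each configured directory if it does not exist; return the paths."""
    created = []
    for value in values:
        path = local_path(value)
        os.makedirs(path, exist_ok=True)
        created.append(path)
    return created


# Show how a configured value maps to a directory on disk.
print(local_path("file:///home/aboutyun/hadoop/namenode"))
```

On a real node, calling `ensure_dirs` with the values of `hadoop.tmp.dir`, `dfs.namenode.name.dir` and `dfs.datanode.data.dir` as the user that runs the daemons would create all three directories in one step.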
HDFS web UI: ip:50070, where ip is the address of the NameNode host.
- conf/yarn-site.xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>master:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>master:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>master:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>master:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>master:8088</value>
  </property>
</configuration>
- conf/mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>master:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>master:19888</value>
  </property>
</configuration>
References:
Hadoop configuration file parameters explained
Hadoop Cluster Setup