Statistics System Deployment
I. Server Preparation
1. 192.168.1.23
2. 192.168.1.230
3. 192.168.1.248
4. 192.168.1.246
5. 192.168.1.232
II. Software Preparation
1. ZooKeeper 3.4.6: manages HA (high availability) failover for the namenode and the Spark master
2. Hadoop 2.4.1: stores the logs
3. Spark 1.3.1: runs the statistics computations
4. MySQL 5.6.23: stores the statistics results
5. JDK 1.6: runtime environment
6. Scala 2.10.4: runtime environment
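Once the packages are unpacked under /usr/local/software and the environment variables from Section IV are in place, a quick sanity check of the runtimes (a sketch; the exact version strings depend on the builds installed):
java -version      # expect 1.6.x
scala -version     # expect 2.10.4
hadoop version     # expect 2.4.1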
III. Server Environment Configuration
1. Create a regular user, username heren, password heren2015#@!12
Create the user: useradd heren
Set the password: passwd heren (enter heren2015#@!12 at the prompt)
Create the application directory: mkdir -p /usr/local/software
Give heren ownership of it: chown -R heren:heren /usr/local/software
Create the data directories: mkdir -p /data/hrdata/hadoop/
mkdir -p /data/hrdata/spark1.3/
mkdir -p /data/hrdata/zookeeper/
chown -R heren:heren /data/hrdata
Affected servers: 192.168.1.23 192.168.1.230 192.168.1.248 192.168.1.246 192.168.1.232
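A minimal sketch to repeat this setup on all five machines from a single host, assuming root SSH access is available at this stage and that sshd listens on port 8085 (as the SSH options in later sections suggest); passwd is left out because it prompts interactively:
for ip in 192.168.1.23 192.168.1.230 192.168.1.248 192.168.1.246 192.168.1.232; do
  ssh -p 8085 root@$ip "useradd heren; \
    mkdir -p /usr/local/software /data/hrdata/hadoop /data/hrdata/spark1.3 /data/hrdata/zookeeper; \
    chown -R heren:heren /usr/local/software /data/hrdata"
done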
2. Configure passwordless SSH login
Log in as the heren user
Generate a key pair: ssh-keygen -t rsa
Append the public key to the authorized-keys file: cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
Restrict the file's permissions: chmod 600 ~/.ssh/authorized_keys
Copy the public key to the other servers: ssh-copy-id (e.g. ssh-copy-id heren@node20)
Affected servers: 192.168.1.23 192.168.1.230 192.168.1.248 192.168.1.246 192.168.1.232
All five servers must be able to reach one another without a password, as verified below
Configure the /etc/hosts file:
192.168.1.23 node10
192.168.1.230 node20
192.168.1.248 node30
192.168.1.246 node40
192.168.1.232 node50
Affected servers: 192.168.1.23 192.168.1.230 192.168.1.248 192.168.1.246 192.168.1.232
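A quick way to confirm the full mesh, run from each node as heren (a sketch; add -p 8085 if sshd listens on the non-default port used elsewhere in this document):
for h in node10 node20 node30 node40 node50; do
  ssh -o StrictHostKeyChecking=no $h hostname
done
Each iteration should print a hostname without prompting for a password.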
3. Disable the firewall
chkconfig iptables off
service iptables stop
Affected servers: 192.168.1.23 192.168.1.230 192.168.1.248 192.168.1.246 192.168.1.232
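To verify (these service/chkconfig commands assume RHEL/CentOS 6, which the iptables commands above imply):
service iptables status     # expect: iptables: Firewall is not running.
chkconfig --list iptables   # expect: off for all runlevels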
IV. Application Deployment
Install hadoop, jdk, scala, spark1.3, and zookeeper under the /usr/local/software directory
Configure the environment variables:
######jdk1.6##########
export JAVA_HOME=/usr/local/software/jdk
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export JRE_HOME=$JAVA_HOME/jre
######scala2.10.4##########
export SCALA_HOME=/usr/local/software/scala
export PATH=$SCALA_HOME/bin:$PATH
######hadoop2.4.1########
export HADOOP_HOME=/usr/local/software/hadoop
export PATH=$HADOOP_HOME/bin:$PATH
####spark1.3.1#######
export SPARK_HOME=/usr/local/software/spark
export PATH=$SPARK_HOME/bin:$PATH
#######mysql5.6########
export PATH=/usr/local/mysql/bin:$PATH
alias mysql_start="mysqld_safe&"
alias mysql_stop="mysqladmin -uroot -p shutdown"
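These exports belong in the shell profile (the original does not say which file; /etc/profile or heren's ~/.bash_profile are the usual choices). Reload it and confirm the tools resolve:
source /etc/profile    # or: source ~/.bash_profile
which java scala hadoop spark-submit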
V. Application Configuration
1. Configure ZooKeeper, directory /usr/local/software/zookeeper/conf/
Edit the ZooKeeper configuration file /usr/local/software/zookeeper/conf/zoo.cfg:
dataLogDir=/data/hrdata/zookeeper/logs
dataDir=/data/hrdata/zookeeper/data
server.1=192.168.1.248:2888:3888
server.2=192.168.1.246:2888:3888
server.3=192.168.1.232:2888:3888
Affected servers: 192.168.1.248 192.168.1.246 192.168.1.232
Set the three ZooKeeper server ids. Each command below corresponds to the number after "server." in the lines above and must be run only on that server:
On 192.168.1.248: echo '1' > /data/hrdata/zookeeper/data/myid
On 192.168.1.246: echo '2' > /data/hrdata/zookeeper/data/myid
On 192.168.1.232: echo '3' > /data/hrdata/zookeeper/data/myid
Start ZooKeeper: sh /usr/local/software/zookeeper/bin/zkServer.sh start
Check whether ZooKeeper started successfully: sh /usr/local/software/zookeeper/bin/zkServer.sh status
Stop ZooKeeper: sh /usr/local/software/zookeeper/bin/zkServer.sh stop
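Beyond zkServer.sh status, the ensemble can be probed with ZooKeeper's built-in four-letter commands (assumes nc is installed):
for ip in 192.168.1.248 192.168.1.246 192.168.1.232; do
  echo ruok | nc $ip 2181    # a healthy server replies "imok"
done
echo stat | nc 192.168.1.248 2181    # shows Mode: leader or follower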
2. Configure Hadoop, directory /usr/local/software/hadoop/etc/hadoop/
core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://mycluster</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>node30:2181,node40:2181,node50:2181</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/data/hrdata/hadoop/journal</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/data/hrdata/hadoop/tmp</value>
</property>
<property>
<name>hadoop.security.authorization</name>
<value>false</value>
</property>
</configuration>
hadoop-env.sh
export JAVA_HOME=/usr/local/software/jdk
export HADOOP_SSH_OPTS="-o StrictHostKeyChecking=no -p 8085"
export HADOOP_LOG_DIR='/data/hrdata/hadoop/logs'
HADOOP_PID_DIR='/usr/local/software/hadoop/pid'
hdfs-site.xml
<configuration>
<property>
<name>dfs.nameservices</name>
<value>mycluster</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/data/hrdata/hadoop/dfs/name</value>
</property>
<property>
<name>dfs.ha.namenodes.mycluster</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn1</name>
<value>node10:9000</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn2</name>
<value>node20:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn1</name>
<value>node10:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn2</name>
<value>node20:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://node30:8485;node40:8485;node50:8485/mycluster</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.mycluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence(heren:8085)</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/heren/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>node30:2181,node40:2181,node50:2181</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/data/hrdata/hadoop/dfs/data</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<property>
<name>dfs.block.size</name>
<value>67108864</value>
</property>
</configuration>
slaves
node30
node40
node50
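The edited configuration directory must be identical on every node. A hedged way to push it out from node10 (uses scp's -P flag for the port 8085 that HADOOP_SSH_OPTS assumes):
for h in node20 node30 node40 node50; do
  scp -P 8085 -r /usr/local/software/hadoop/etc/hadoop heren@$h:/usr/local/software/hadoop/etc/
done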
3. Hadoop Format Preparation
Start the journalnodes on 192.168.1.248, 192.168.1.246, and 192.168.1.232:
/usr/local/software/hadoop/sbin/hadoop-daemon.sh start journalnode
Format the namenode on 192.168.1.23:
/usr/local/software/hadoop/bin/hadoop namenode -format
/usr/local/software/hadoop/sbin/hadoop-daemon.sh start namenode
On 192.168.1.230, sync the Hadoop metadata from 192.168.1.23:
/usr/local/software/hadoop/bin/hdfs namenode -bootstrapStandby -force
Initialize the ZooKeeper znode on 192.168.1.23:
/usr/local/software/hadoop/bin/hdfs zkfc -formatZK
Commands to start and stop the cluster:
/usr/local/software/hadoop/sbin/start-dfs.sh
/usr/local/software/hadoop/sbin/stop-dfs.sh
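Once start-dfs.sh completes, a hedged health check for the HA pair:
/usr/local/software/hadoop/bin/hdfs haadmin -getServiceState nn1    # one of nn1/nn2 reports active
/usr/local/software/hadoop/bin/hdfs haadmin -getServiceState nn2    # the other reports standby
jps    # NameNode + DFSZKFailoverController on node10/node20; JournalNode, DataNode, QuorumPeerMain on node30-node50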
4. Configure Spark, directory /usr/local/software/spark1.3/conf/
spark-defaults.conf
spark.master spark://node10:7077,node20:7077
spark.serializer org.apache.spark.serializer.KryoSerializer
spark.eventLog.enabled true
spark.eventLog.dir hdfs://mycluster/spark1.3logs
spark.eventLog.compress true
spark.shuffle.manager hash
spark.sql.shuffle.partitions 20
spark.cores.max 8
spark.executor.memory 3G
spark.executor.extraClassPath /usr/local/software/spark1.3/lib/mysql-connector-java-5.1.13.jar
spark-env.sh
export SPARK_DRIVER_MEMORY=2G
export HADOOP_CONF_DIR=/usr/local/software/hadoop/etc/hadoop
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=node30:2181,node40:2181,node50:2181 -Dspark.deploy.zookeeper.dir=/spark1.3"
export SPARK_HISTORY_OPTS="-Dspark.history.ui.port=7777 -Dspark.history.retainedApplications=3 -Dspark.history.fs.logDirectory=hdfs://mycluster/spark1.3logs"
export SPARK_SSH_OPTS="-o StrictHostKeyChecking=no -p 8085"
export SCALA_HOME=/usr/local/software/scala
export JAVA_HOME=/usr/local/software/jdk
export SPARK_WORKER_MEMORY=7G
export SPARK_LOCAL_DIRS=/data/hrdata/spark1.3/local
export SPARK_WORKER_DIR=/data/hrdata/spark1.3/work
export SPARK_PID_DIR=/usr/local/software/spark1.3/pid
export SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true"
slaves
node30
node40
node50
Start and stop the Spark cluster:
/usr/local/software/spark1.3/sbin/start-all.sh
/usr/local/software/spark1.3/sbin/stop-all.sh
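Note that sbin/start-all.sh starts a master only on the node it is run from; for the node10,node20 HA pair in spark.master, the standby master is typically started separately on the other node with sbin/start-master.sh. A hedged smoke test once HDFS and Spark are both up: create the event-log directory named by spark.eventLog.dir, then submit the bundled SparkPi example (the examples jar name varies by build):
hadoop fs -mkdir -p /spark1.3logs
/usr/local/software/spark1.3/bin/spark-submit \
  --master spark://node10:7077,node20:7077 \
  --class org.apache.spark.examples.SparkPi \
  /usr/local/software/spark1.3/lib/spark-examples-1.3.1-hadoop2.4.0.jar 100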