Hadoop Development and Debugging Environment
Source: Internet | Editor: 程序博客网 | Date: 2024/06/16 09:57
I. Goal
Install Ubuntu 14.04 (64-bit) in a virtual machine, then install Hadoop 2.6.0 (pseudo-distributed mode) along with Pig, Hive, and Mahout, to serve as a development and debugging environment.
II. Installation
1. Configure SSH
ssh-keygen -t rsa
cd ~/.ssh
cat id_rsa.pub >> ~/.ssh/authorized_keys
2. Prepare the Software
Install the JDK and mysql-server directly with apt-get:
sudo apt-get install openjdk-7-jre
sudo apt-get install openjdk-7-jdk
sudo apt-get install mysql-server
Download the following packages and data file:
hadoop-2.6.0.tar.gz
pig-0.15.0.tar.gz
apache-hive-1.1.1-bin.tar.gz
apache-mahout-distribution-0.9.tar.gz
mysql-connector-java-5.1.39.tar.gz
synthetic_control.data
3. Set the Environment Variables
Extract the packages, copy them to /usr/local, and add the following settings to .bashrc:
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop
export PIG_HOME=/usr/local/pig
export PIG_CLASSPATH=$HADOOP_HOME/etc/hadoop/
export HIVE_HOME=/usr/local/hive
export HIVE_CLASSPATH=$HADOOP_HOME/etc/hadoop/
export MAHOUT_HOME=/usr/local/mahout
export MAHOUT_CONF_DIR=/usr/local/mahout/conf
export PATH=.:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PIG_HOME/bin:$HIVE_HOME/bin:$MAHOUT_HOME/bin:$PATH
Check the MySQL installation:
sudo /etc/init.d/mysql status
Check that Java runs:
java -version
4. Configure Hadoop in Pseudo-Distributed Mode
core-site.xml
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/usr/local/hadoop/tmp</value>
    <description>Abase for other temporary directories.</description>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/local/hadoop/tmp/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/usr/local/hadoop/tmp/dfs/data</value>
  </property>
</configuration>
Format the NameNode:
hadoop namenode -format
Start HDFS and YARN:
start-dfs.sh
start-yarn.sh
Check the running daemons:
jps
View the web UI in a browser:
http://localhost:50070
5. Configure Pig
Pig needs no dedicated configuration. The commands below load /etc/passwd and project out the first field (the username); run them to verify the installation.
hdfs dfs -put /etc/passwd /user/oliver/passwd
pig -x mapreduce
A = load 'passwd' using PigStorage(':');
B = foreach A generate $0 as id;
dump B;
6. Configure Hive
Generate the configuration files:
cp hive-env.sh.template hive-env.sh
cp hive-default.xml.template hive-site.xml
Edit hive-env.sh:
export HADOOP_HOME=/usr/local/hadoop
export HIVE_CONF_DIR=/usr/local/hive/conf
Edit hive-site.xml:
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost:3306/hive</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hive</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
  <description>Username to use against metastore database</description>
</property>
<property>
  <name>hive.exec.local.scratchdir</name>
  <value>/tmp/hive</value>
  <description>Local scratch space for Hive jobs</description>
</property>
<property>
  <name>hive.downloaded.resources.dir</name>
  <value>/tmp/hive</value>
  <description>Temporary local directory for added resources in the remote file system.</description>
</property>
Copy the JAR files:
cp mysql-connector-java-5.1.39-bin.jar /usr/local/hive/lib
cp jline-2.12.jar /usr/local/hadoop/share/hadoop/yarn/lib
Note that Pig cannot work with jline-2.12.jar; swap the original jline JAR back in before running Pig.
Create the metastore database and user in MySQL:
insert into mysql.user(Host,User,Password) values("localhost","hive",password("hive"));
create database hive;
grant all on hive.* to hive@'%' identified by 'hive';
grant all on hive.* to hive@'localhost' identified by 'hive';
flush privileges;
Initialize the metastore schema with Hive's schematool:
schematool -dbType mysql --initSchema
Check the database:
mysql -uhive -phive
use hive;
show tables;
Start the metastore service; if it starts cleanly, the installation is complete.
hive --service metastore
7. Configure Mahout
Extract the package, copy it to /usr/local/mahout, and set the environment variables; nothing else is needed.
Download the test data and run a test job:
wget http://archive.ics.uci.edu/ml/databases/synthetic_control/synthetic_control.data
hdfs dfs -mkdir /testdata
hdfs dfs -put ./synthetic_control.data /testdata
hadoop jar /usr/local/mahout/mahout-examples-0.9-job.jar org.apache.mahout.clustering.syntheticcontrol.kmeans.Job
With that, the development and debugging environment is ready.