Apache Hadoop 2.6.4 (64-bit) installation guide for Windows 8.1 (no virtual machine or Cygwin required)
First, download the Apache Hadoop 2.6.4 tar.gz package and extract it to a local drive. Make sure the path contains no spaces; otherwise you will have to use Windows 8.3 short-path notation in the configuration files.
Second, make sure your operating system is 64-bit and that .NET Framework 4.0 or later is installed.
Third, make sure a 64-bit JDK is installed; this guide uses JDK 1.8.
Fourth, download hadoop-commin-2.2.zip from GitHub. The official Apache Hadoop tarball is missing the native libraries needed to run on Windows (hadoop.dll, winutils.exe, libwinutils.lib, etc.); the GitHub package contains 64-bit builds of these libraries compiled by the community. Extract it and overwrite the bin directory under your Hadoop installation.
If you would rather build Hadoop on Windows yourself, see the official documentation: the Hadoop 2.X Windows build guide.
Next comes configuration (largely adapted from the official documentation):
My Hadoop directory is D:\tools\hadoop26, referred to below as %h_home%.
First, edit the script %h_home%\etc\hadoop\hadoop-env.cmd and append the environment variable definitions below at the end of the file. Also locate the existing JAVA_HOME setting near the top of the file and change it to your 64-bit JDK path; this step is essential. For example, mine is: set JAVA_HOME=E:\java\tools1\java
set HADOOP_PREFIX=D:\tools\hadoop26
set HADOOP_CONF_DIR=%HADOOP_PREFIX%\etc\hadoop
set YARN_CONF_DIR=%HADOOP_CONF_DIR%
set PATH=%PATH%;%HADOOP_PREFIX%\bin
Here is the modified hadoop-env.cmd:
@echo off
@rem Licensed to the Apache Software Foundation (ASF) under one or more
@rem contributor license agreements. See the NOTICE file distributed with
@rem this work for additional information regarding copyright ownership.
@rem The ASF licenses this file to You under the Apache License, Version 2.0
@rem (the "License"); you may not use this file except in compliance with
@rem the License. You may obtain a copy of the License at
@rem
@rem     http://www.apache.org/licenses/LICENSE-2.0
@rem
@rem Unless required by applicable law or agreed to in writing, software
@rem distributed under the License is distributed on an "AS IS" BASIS,
@rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
@rem See the License for the specific language governing permissions and
@rem limitations under the License.

@rem Set Hadoop-specific environment variables here.

@rem The only required environment variable is JAVA_HOME. All others are
@rem optional. When running a distributed configuration it is best to
@rem set JAVA_HOME in this file, so that it is correctly defined on
@rem remote nodes.

@rem The java implementation to use. Required.
set JAVA_HOME=E:\java\tools1\java

@rem The jsvc implementation to use. Jsvc is required to run secure datanodes.
@rem set JSVC_HOME=%JSVC_HOME%

@rem set HADOOP_CONF_DIR=

@rem Extra Java CLASSPATH elements. Automatically insert capacity-scheduler.
if exist %HADOOP_HOME%\contrib\capacity-scheduler (
  if not defined HADOOP_CLASSPATH (
    set HADOOP_CLASSPATH=%HADOOP_HOME%\contrib\capacity-scheduler\*.jar
  ) else (
    set HADOOP_CLASSPATH=%HADOOP_CLASSPATH%;%HADOOP_HOME%\contrib\capacity-scheduler\*.jar
  )
)

@rem The maximum amount of heap to use, in MB. Default is 1000.
@rem set HADOOP_HEAPSIZE=
@rem set HADOOP_NAMENODE_INIT_HEAPSIZE=""

@rem Extra Java runtime options. Empty by default.
@rem set HADOOP_OPTS=%HADOOP_OPTS% -Djava.net.preferIPv4Stack=true

@rem Command specific options appended to HADOOP_OPTS when specified
if not defined HADOOP_SECURITY_LOGGER (
  set HADOOP_SECURITY_LOGGER=INFO,RFAS
)
if not defined HDFS_AUDIT_LOGGER (
  set HDFS_AUDIT_LOGGER=INFO,NullAppender
)

set HADOOP_NAMENODE_OPTS=-Dhadoop.security.logger=%HADOOP_SECURITY_LOGGER% -Dhdfs.audit.logger=%HDFS_AUDIT_LOGGER% %HADOOP_NAMENODE_OPTS%
set HADOOP_DATANODE_OPTS=-Dhadoop.security.logger=ERROR,RFAS %HADOOP_DATANODE_OPTS%
set HADOOP_SECONDARYNAMENODE_OPTS=-Dhadoop.security.logger=%HADOOP_SECURITY_LOGGER% -Dhdfs.audit.logger=%HDFS_AUDIT_LOGGER% %HADOOP_SECONDARYNAMENODE_OPTS%

@rem The following applies to multiple commands (fs, dfs, fsck, distcp etc)
set HADOOP_CLIENT_OPTS=-Xmx512m %HADOOP_CLIENT_OPTS%
@rem set HADOOP_JAVA_PLATFORM_OPTS="-XX:-UsePerfData %HADOOP_JAVA_PLATFORM_OPTS%"

@rem On secure datanodes, user to run the datanode as after dropping privileges
set HADOOP_SECURE_DN_USER=%HADOOP_SECURE_DN_USER%

@rem Where log files are stored. %HADOOP_HOME%/logs by default.
@rem set HADOOP_LOG_DIR=%HADOOP_LOG_DIR%\%USERNAME%

@rem Where log files are stored in the secure data environment.
set HADOOP_SECURE_DN_LOG_DIR=%HADOOP_LOG_DIR%\%HADOOP_HDFS_USER%

@rem The directory where pid files are stored. /tmp by default.
@rem NOTE: this should be set to a directory that can only be written to by
@rem       the user that will run the hadoop daemons. Otherwise there is the
@rem       potential for a symlink attack.
set HADOOP_PID_DIR=%HADOOP_PID_DIR%
set HADOOP_SECURE_DN_PID_DIR=%HADOOP_PID_DIR%

@rem A string representing this instance of hadoop. %USERNAME% by default.
set HADOOP_IDENT_STRING=%USERNAME%

set HADOOP_PREFIX=D:\tools\hadoop26
set HADOOP_CONF_DIR=%HADOOP_PREFIX%\etc\hadoop
set YARN_CONF_DIR=%HADOOP_CONF_DIR%
set PATH=%PATH%;%HADOOP_PREFIX%\bin
Next, in the same directory, find or create core-site.xml and set its contents as follows (in Hadoop 2.x the property fs.default.name is deprecated in favor of fs.defaultFS, but both still work):
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://0.0.0.0:9000</value>
  </property>
</configuration>
Next is hdfs-site.xml, edited the same way. By default, Hadoop uses a \tmp directory on the drive where it is installed as the HDFS storage location; for example, if Hadoop is extracted to E:\, it will create E:\tmp to hold the HDFS file system.
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
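If you would rather keep HDFS data out of \tmp, the storage directories can be set explicitly in hdfs-site.xml. A minimal sketch, with hypothetical paths under the install directory (dfs.namenode.name.dir and dfs.datanode.data.dir are the standard Hadoop 2.x property names; adjust the paths to your setup):

```xml
<property>
  <name>dfs.namenode.name.dir</name>
  <value>/D:/tools/hadoop26/data/namenode</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/D:/tools/hadoop26/data/datanode</value>
</property>
```

These two property elements go inside the same &lt;configuration&gt; element as dfs.replication above.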
That covers the basic configuration.
Next, we configure a YARN example:
Still in the same configuration directory, edit or create mapred-site.xml. A template file of the same name exists there and can be copied directly, then modified. Note: replace %USERNAME% in the file with your Windows user name; the default is Administrator.
<configuration>
  <property>
    <name>mapreduce.job.user.name</name>
    <value>%USERNAME%</value>
  </property>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>yarn.apps.stagingDir</name>
    <value>/user/%USERNAME%/staging</value>
  </property>
  <property>
    <name>mapreduce.jobtracker.address</name>
    <value>local</value>
  </property>
</configuration>
Finally, create yarn-site.xml with the following content:
<configuration>
  <property>
    <name>yarn.server.resourcemanager.address</name>
    <value>0.0.0.0:8020</value>
  </property>
  <property>
    <name>yarn.server.resourcemanager.application.expiry.interval</name>
    <value>60000</value>
  </property>
  <property>
    <name>yarn.server.nodemanager.address</name>
    <value>0.0.0.0:45454</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.server.nodemanager.remote-app-log-dir</name>
    <value>/app-logs</value>
  </property>
  <property>
    <name>yarn.nodemanager.log-dirs</name>
    <value>/dep/logs/userlogs</value>
  </property>
  <property>
    <name>yarn.server.mapreduce-appmanager.attempt-listener.bindAddress</name>
    <value>0.0.0.0</value>
  </property>
  <property>
    <name>yarn.server.mapreduce-appmanager.client-service.bindAddress</name>
    <value>0.0.0.0</value>
  </property>
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>-1</value>
  </property>
  <property>
    <name>yarn.application.classpath</name>
    <value>%HADOOP_CONF_DIR%,%HADOOP_COMMON_HOME%/share/hadoop/common/*,%HADOOP_COMMON_HOME%/share/hadoop/common/lib/*,%HADOOP_HDFS_HOME%/share/hadoop/hdfs/*,%HADOOP_HDFS_HOME%/share/hadoop/hdfs/lib/*,%HADOOP_MAPRED_HOME%/share/hadoop/mapreduce/*,%HADOOP_MAPRED_HOME%/share/hadoop/mapreduce/lib/*,%HADOOP_YARN_HOME%/share/hadoop/yarn/*,%HADOOP_YARN_HOME%/share/hadoop/yarn/lib/*</value>
  </property>
</configuration>
OK, all the configuration is done. Open a Command Prompt as Administrator, change to the Hadoop installation directory, and perform the following steps.
1. Change to the etc\hadoop directory and run the hadoop-env.cmd script to set the environment variables for the current command window.
2. Format the HDFS file system (it is recommended to switch to the bin directory first, then run the command):
%HADOOP_PREFIX%\bin\hdfs namenode -format
3. Start the Hadoop daemons:
%HADOOP_PREFIX%\sbin\start-all.cmd
The startup output:

Checking the running processes, one of them may show without a name; this is normal:
Place winutils.exe and hadoop.dll into the D:\tools\hadoop26\bin directory:
Open Eclipse (the Hadoop plugin for Eclipse must be installed first) and configure it as follows:
Create a new MapReduce project and copy the following code into it:
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class mywc {

  // Mapper class: splits each input line into tokens and emits (word, 1)
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reducer class: sums the counts emitted for each word
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {

    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    if (args.length < 2) {
      System.err.println("Usage: mywc <input path> <output path>");
      System.exit(2);
    }
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(mywc.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
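The mapper and reducer above amount to "tokenize, then sum per token". As a quick sanity check of that logic outside Hadoop, here is a plain-Java sketch (no Hadoop dependencies; the class and method names are my own, for illustration only) that counts words the same way the job does:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.StringTokenizer;

public class LocalWordCount {

    // Mirrors TokenizerMapper + IntSumReducer: split on whitespace, sum per token.
    static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new HashMap<>();
        StringTokenizer itr = new StringTokenizer(text);
        while (itr.hasMoreTokens()) {
            counts.merge(itr.nextToken(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> c = count("hello hadoop hello windows");
        System.out.println(c.get("hello"));  // prints 2
        System.out.println(c.get("hadoop")); // prints 1
    }
}
```

Running this against a sample line is a useful way to predict what the cluster job will write to its output directory before submitting it.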
You should get results like the following:
That's it.