My Hadoop learning notes 2 -- running wordcount
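First, start Hadoop (in Hadoop 2.x start-all.sh is deprecated in favor of start-dfs.sh and start-yarn.sh, but it still works):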
sh /usr/local/hadoop/sbin/start-all.sh
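Import the data into HDFS (the working directory is the Hadoop root directory). In Hadoop 2.x, "hadoop dfs" still works but prints a deprecation warning; "./bin/hdfs dfs" is the current form.
1. Create the data directory
./bin/hadoop dfs -mkdir -p /user/guoyakui/hadoopfile
That is: ./bin/hadoop dfs -mkdir -p /user/<username>/<custom folder>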
2. Copy the data into the data directory
./bin/hadoop dfs -copyFromLocal /Users/guoyakui/Desktop/hadoop/data /user/guoyakui/hadoopfile
That is: ./bin/hadoop dfs -copyFromLocal <local data path> <HDFS directory created above>
3. Once the copy has finished, list the directory to check it
./bin/hadoop dfs -ls /user/guoyakui/hadoopfile
That is: ./bin/hadoop dfs -ls /user/<username>/<custom folder>
Output (here listing the data subdirectory that was copied in):
┌─[guoyakui@guoyakuideMBP] - [/usr/local/hadoop] - [二 5 23, 15:47]
└─[$] <> ./bin/hadoop dfs -ls /user/guoyakui/hadoopfile/data
-rw-r--r-- 1 guoyakui supergroup 1580879 2017-05-23 14:56 /user/guoyakui/hadoopfile/data/4300-0.txt
-rw-r--r-- 1 guoyakui supergroup 1428841 2017-05-23 14:56 /user/guoyakui/hadoopfile/data/5000-8.txt
-rw-r--r-- 1 guoyakui supergroup 674570 2017-05-23 14:56 /user/guoyakui/hadoopfile/data/pg20417.txt
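The three import commands follow one pattern: everything lives under /user/<username>/<custom folder>. As a rough sketch of that pattern, the Python snippet below only builds the command lines and does not run them. The function name hdfs_import_commands and its parameters are hypothetical helpers introduced here for illustration, not part of Hadoop.

```python
import shlex


def hdfs_import_commands(user, folder, local_data, hadoop_bin="./bin/hadoop"):
    """Build the mkdir / copyFromLocal / ls commands as argument lists.

    Nothing is executed; the lists can be handed to subprocess.run later.
    """
    target = f"/user/{user}/{folder}"
    return [
        [hadoop_bin, "dfs", "-mkdir", "-p", target],
        [hadoop_bin, "dfs", "-copyFromLocal", local_data, target],
        [hadoop_bin, "dfs", "-ls", target],
    ]


# Values taken from the commands in this post.
commands = hdfs_import_commands(
    "guoyakui", "hadoopfile", "/Users/guoyakui/Desktop/hadoop/data"
)
for argv in commands:
    print(shlex.join(argv))
```

Keeping the commands as argument lists avoids shell quoting problems if a local path ever contains spaces.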
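4. Run the WordCount example
./bin/hadoop jar share/hadoop/mapreduce/sources/hadoop-mapreduce-examples-2.8.0-sources.jar org.apache.hadoop.examples.WordCount /user/guoyakui/hadoopfile/data /user/guoyakui/hadoopfile/data-output
That is: ./bin/hadoop jar <path to the examples jar> <class to run> <input path> <output path>
Note: the output path must not exist before the job starts, otherwise the job refuses to run. The jar used above is the -sources jar, which normally holds only .java files; the job presumably still runs because the compiled examples jar in share/hadoop/mapreduce is already on Hadoop's classpath. The more usual form points at share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.0.jar and passes wordcount as the program name.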
Output (the full log is long, so only the tail is shown):
17/05/23 15:50:16 INFO mapred.LocalJobRunner: reduce > reduce
17/05/23 15:50:16 INFO mapred.Task: Task 'attempt_local1414386995_0001_r_000000_0' done.
17/05/23 15:50:16 INFO mapred.LocalJobRunner: Finishing task: attempt_local1414386995_0001_r_000000_0
17/05/23 15:50:16 INFO mapred.LocalJobRunner: reduce task executor complete.
17/05/23 15:50:17 INFO mapreduce.Job: map 100% reduce 100%
17/05/23 15:50:17 INFO mapreduce.Job: Job job_local1414386995_0001 completed successfully
17/05/23 15:50:17 INFO mapreduce.Job: Counters: 35
    File System Counters
        FILE: Number of bytes read=4121088
        FILE: Number of bytes written=8782066
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=11959179
        HDFS: Number of bytes written=879197
        HDFS: Number of read operations=33
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=6
    Map-Reduce Framework
        Map input records=78096
        Map output records=629882
        Map output bytes=6091113
        Map output materialized bytes=1454541
        Input split bytes=403
        Combine input records=629882
        Combine output records=100609
        Reduce input groups=81942
        Reduce shuffle bytes=1454541
        Reduce input records=100609
        Reduce output records=81942
        Spilled Records=201218
        Shuffled Maps =3
        Failed Shuffles=0
        Merged Map outputs=3
        GC time elapsed (ms)=14
        Total committed heap usage (bytes)=1789919232
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=3684290
    File Output Format Counters
        Bytes Written=879197
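To make the record counters in this log easier to read, here is a small in-memory imitation of the WordCount data flow in Python. It is only a sketch, not Hadoop code. It assumes the stock example behaves as its source suggests: the mapper emits (word, 1) for every whitespace-separated token, and the same summing step is used as both combiner and reducer. The three toy strings stand in for the three input files, and all function names are made up for illustration.

```python
from collections import Counter


def map_phase(text):
    """Emit one (word, 1) pair per whitespace-separated token."""
    return [(token, 1) for token in text.split()]


def sum_by_key(pairs):
    """Sum the values per key; plays the role of both combiner and reducer."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return sorted(totals.items())


# One toy "split" per input file (the real job had three map tasks).
splits = ["the cat saw the dog", "the dog ran", "a cat ran"]

map_outputs = [map_phase(text) for text in splits]
combined = [sum_by_key(pairs) for pairs in map_outputs]  # runs once per map task
reduced = sum_by_key(pair for part in combined for pair in part)

map_output_records = sum(len(pairs) for pairs in map_outputs)
combine_output_records = sum(len(pairs) for pairs in combined)
reduce_output_records = len(reduced)

print(map_output_records, combine_output_records, reduce_output_records)
print(reduced)
```

The real log reads the same way: 629882 map output records shrink to 100609 after the per-map combine step, and the reducer merges those into 81942 distinct words. Combining happens separately in each map task, so a word that appears in several files is still counted once per file at that stage, which is why the combine output is larger than the final number of words.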
5. Check the results
a. List the output files
./bin/hadoop dfs -ls /user/guoyakui/hadoopfile/data-output
┌─[guoyakui@guoyakuideMBP] - [/usr/local/hadoop] - [二 5 23, 15:50]
└─[$] <> ./bin/hadoop dfs -ls /user/guoyakui/hadoopfile/data-output
Found 2 items
-rw-r--r-- 1 guoyakui supergroup 0 2017-05-23 15:21 /user/guoyakui/hadoopfile/data-output/_SUCCESS
-rw-r--r-- 1 guoyakui supergroup 879197 2017-05-23 15:21 /user/guoyakui/hadoopfile/data-output/part-r-00000
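A quick sanity check is to compare the file sizes from the two listings with the byte counters the job printed. The snippet below does the arithmetic with numbers copied by hand from this post; the variable names are just labels chosen for this sketch.

```python
# Sizes of the three input files, from the -ls listing of the data directory.
input_sizes = {
    "4300-0.txt": 1580879,
    "5000-8.txt": 1428841,
    "pg20417.txt": 674570,
}

# Size of part-r-00000, from the -ls listing of data-output.
output_size = 879197

# Counters printed at the end of the job log.
bytes_read_counter = 3684290      # File Input Format Counters: Bytes Read
bytes_written_counter = 879197    # File Output Format Counters: Bytes Written

total_input = sum(input_sizes.values())
print("input bytes:", total_input, "matches counter:", total_input == bytes_read_counter)
print("output bytes:", output_size, "matches counter:", output_size == bytes_written_counter)
```

Both pairs agree, which suggests the job read all three input files in full and that part-r-00000 is the complete output.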
b. View the file contents
./bin/hadoop dfs -cat /user/guoyakui/hadoopfile/data-output/part-r-00000
Output (the file is long, so only a small part is shown):
—A 40
—About 2
—Adiutorium 1
—Afraid 2
—After 7
—After, 1
—Afterwits, 1
—Again, 1
—Agonising 1
—Ah 3
—Ah, 10
—Aha! 1
—Aha... 1
—Ahem! 1
—Alas, 1
—All 8
—Am 2
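The output treats "—After" and "—After," as different words, because the example splits on whitespace only and leaves punctuation attached. The Python sketch below parses a few result lines and shows what merging those variants would look like. It assumes each line of part-r-00000 is a word and a count separated by a tab (the default for text output). The function name and the set of stripped characters are choices made for this illustration only.

```python
from collections import Counter


def parse_wordcount(lines):
    """Parse 'word<TAB>count' lines into a dict of word -> count."""
    counts = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        word, _, count = line.rpartition("\t")
        counts[word] = int(count)
    return counts


# A few lines from the output above, written with explicit tabs.
sample = ["—After\t7", "—After,\t1", "—Ah\t3", "—Ah,\t10"]
counts = parse_wordcount(sample)

# Two most frequent raw tokens (ties broken alphabetically).
top = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:2]

# Merge variants that differ only by leading or trailing punctuation.
merged = Counter()
for word, count in counts.items():
    merged[word.strip("—,.!?")] += count

print(top)
print(dict(merged))
```

To run this on the real result, the file could first be copied out of HDFS (for example with the -get option of the same dfs command) and then passed to parse_wordcount line by line.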