Using MapReduce to read the username of every user in a file

Source: Internet | Published by: Internet | Editor: 程序博客网 | Time: 2024/05/16 12:39

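What to prepare:
[Image]

<1> A .txt file like the one in the image above. Note that the file must be encoded as UTF-8 so that it is read correctly in the development tool. The fields in it are, in order: username, login status, IP address.
The fields are separated by spaces.
<2> A working cluster with Hadoop installed, since MapReduce is used here. A code editor must already be connected to Hadoop successfully; the editor used here is Eclipse and the language is Java.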
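The record layout from item <1> can be checked locally before any cluster is involved. The following is a minimal sketch, not part of the original job: the class name UserLineParser and the sample record are made up for illustration. It shows how a space-separated line is split and why a run of several spaces or a leading space changes the result when splitting on a single whitespace character.

```java
import java.util.Arrays;

public class UserLineParser {

    // Returns the first whitespace-separated field of a record line.
    // trim() removes leading/trailing whitespace; "\\s+" treats a run of
    // spaces or tabs as a single separator.
    public static String userName(String line) {
        String[] fields = line.trim().split("\\s+");
        return fields[0];
    }

    public static void main(String[] args) {
        // Hypothetical sample record: username, login status, IP address.
        // Note the two spaces after the username.
        String line = "alice  success 192.168.1.10";

        // Splitting on a single whitespace character keeps an empty field
        // between the two consecutive spaces.
        System.out.println(Arrays.toString(line.split("\\s")));
        // prints [alice, , success, 192.168.1.10]

        // Splitting on one-or-more whitespace characters yields exactly three fields.
        System.out.println(Arrays.toString(line.trim().split("\\s+")));
        // prints [alice, success, 192.168.1.10]

        System.out.println(userName(line)); // prints alice
    }
}
```

For this article only the first field is needed, so splitting on a single whitespace character works as long as no line starts with a space; the stricter form matters once the later fields are read by index.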
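Implementation steps
<1> Upload the file described above to HDFS and note the upload path. You can use SecureFX, or FileZilla together with Xshell, to transfer the local file to the Linux system, and then use an hdfs command (for example, hdfs dfs -put) to copy the file from the Linux system to HDFS.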
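<2> Create a new Maven project in the code editor.
[Image]
If the New menu has no Maven Project option, choose Other, search for Maven in it, and click Next.

[Image]

Then follow the settings shown in the image below.

[Image]
<3> Create a package -> create a class -> start writing the code.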

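/*
 * Analysis: to list the usernames we only need to split each line.
 * The map step does the splitting and the reduce step just passes the key through.
 * The username is used as the key, and the value carries no data (NullWritable).
 */
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class UserName {

    // Mapper: emits (username, null) for every input line
    public static class UserNameMap extends Mapper<LongWritable, Text, Text, NullWritable> {
        // Reusable fields
        private String[] infos;
        private final NullWritable oValue = NullWritable.get();
        private final Text oKey = new Text();

        // Override the map method
        @Override
        protected void map(LongWritable key, Text value, Mapper<LongWritable, Text, Text, NullWritable>.Context context)
                throws IOException, InterruptedException {
            // Parse one line of the file; \\s splits the line on whitespace
            infos = value.toString().split("\\s");
            // The username is the first field, so it is at index 0 after splitting
            oKey.set(infos[0]);
            // Emit the key and the (empty) value
            context.write(oKey, oValue);
        }
    }

    // Reducer: only passes the key through, nothing else to do
    public static class UserNameReduce extends Reducer<Text, NullWritable, Text, NullWritable> {
        private final NullWritable oValue = NullWritable.get();

        // Override the reduce method
        @Override
        protected void reduce(Text key, Iterable<NullWritable> values, Reducer<Text, NullWritable, Text, NullWritable>.Context context)
                throws IOException, InterruptedException {
            context.write(key, oValue);
        }
    }

    // Wire the mapper and reducer into a job; any exception is simply thrown
    public static void main(String[] args) throws Exception {
        // Create the configuration object
        Configuration configuration = new Configuration();
        // Create the job object
        Job job = Job.getInstance(configuration);
        // Set the class used to locate the jar: ClassName.class
        job.setJarByClass(UserName.class);
        job.setJobName("List the username of every user in the file");
        job.setMapperClass(UserNameMap.class);
        job.setReducerClass(UserNameReduce.class);
        // Set the output types. The map output types are the same as the reduce
        // output types, so setting the job output types once is enough
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);
        // Set the data source and destination (replace the placeholders with real HDFS paths)
        Path inputPath = new Path("path of the input file on HDFS");
        Path outputPath = new Path("path on HDFS where the output should be stored");
        FileInputFormat.addInputPath(job, inputPath);
        // Delete the output directory if it already exists, otherwise the job fails
        outputPath.getFileSystem(configuration).delete(outputPath, true);
        FileOutputFormat.setOutputPath(job, outputPath);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}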
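To see what the job computes without a cluster, the three phases can be imitated in plain Java. This is only a rough sketch under stated assumptions: the class UserNameLocalSketch and the sample lines are hypothetical, a sorted set stands in for the framework's group-by-key step, and a single reducer is assumed. It is not how Hadoop runs the job internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class UserNameLocalSketch {

    // Imitates map -> shuffle -> reduce for the username job on in-memory lines.
    public static List<String> distinctUserNames(List<String> lines) {
        // "Shuffle": identical keys are grouped together. A TreeSet keeps one
        // entry per key in sorted order (for ASCII names this matches the
        // order in which a single reducer would see its keys).
        Set<String> groupedKeys = new TreeSet<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            // "Map": take the first field as the key; the value is ignored.
            String userName = trimmed.split("\\s+")[0];
            groupedKeys.add(userName);
        }
        // "Reduce": write each distinct key exactly once.
        return new ArrayList<>(groupedKeys);
    }

    public static void main(String[] args) {
        // Hypothetical sample records: username, login status, IP address.
        List<String> lines = Arrays.asList(
                "bob success 10.0.0.2",
                "alice success 10.0.0.1",
                "bob fail 10.0.0.2");
        for (String name : distinctUserNames(lines)) {
            System.out.println(name);
        }
        // prints:
        // alice
        // bob
    }
}
```

Because the reducer writes once per key rather than once per value, a user who appears on many lines still produces a single output line.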

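<4> Run result: the job lists the usernames. Each distinct username appears once, because the reducer writes every key a single time.
[Image]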
