How to Find Common Friends with MapReduce

Source: Internet. Editor: 程序博客网. Time: 2024/05/21 22:16

Suppose we have a data set similar to the following:

A:B,C,D,F,E,O
B:A,C,E,K
C:F,A,D,I
D:A,E,F,L
E:B,C,D,M,L
F:A,B,C,D,E,O,M
G:A,C,D,E,F
H:A,C,D,E,O
I:A,O
J:B,O
K:A,C,D
L:D,E,F
M:E,F,G

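The letter before the colon stands for a person's QQ number, and the letters after the colon are that person's friends. The goal is to write a program that finds the common friends of any two people. The idea is as follows.

Step one: read the whole file and produce output of the form <friend, person...>, that is, for each friend, the list of people who have that friend. Then run a second MapReduce pass: within each <friend, person person...> record, combine the people in the list pairwise, use each pair (as a Text) as the key, and let the reduce phase collect all the friends those two people have in common.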
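Before the Hadoop version, the two steps described above can be sketched on in-memory collections. This is only an illustration under assumptions: the class name `CommonFriendsSketch`, the method names `invert` and `commonFriends`, and the three-person input are made up for the example and are not part of the original jobs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical in-memory sketch of the two MapReduce passes (no Hadoop involved).
class CommonFriendsSketch {

    // Step one: turn person -> friends into friend -> persons who list that friend.
    static Map<String, List<String>> invert(Map<String, List<String>> friendsOf) {
        Map<String, List<String>> personsOf = new TreeMap<>();
        for (Map.Entry<String, List<String>> e : friendsOf.entrySet()) {
            for (String friend : e.getValue()) {
                personsOf.computeIfAbsent(friend, k -> new ArrayList<>()).add(e.getKey());
            }
        }
        return personsOf;
    }

    // Step two: for each friend, pair up the persons and record the friend under each pair.
    static Map<String, Set<String>> commonFriends(Map<String, List<String>> personsOf) {
        Map<String, Set<String>> result = new TreeMap<>();
        for (Map.Entry<String, List<String>> e : personsOf.entrySet()) {
            List<String> persons = new ArrayList<>(e.getValue());
            Collections.sort(persons); // so a pair key is always "smaller,larger"
            for (int i = 0; i < persons.size() - 1; i++) {
                for (int j = i + 1; j < persons.size(); j++) {
                    String pair = persons.get(i) + "," + persons.get(j);
                    result.computeIfAbsent(pair, k -> new TreeSet<>()).add(e.getKey());
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample input, smaller than the data set in the article.
        Map<String, List<String>> friendsOf = new TreeMap<>();
        friendsOf.put("A", List.of("B", "C", "D"));
        friendsOf.put("B", List.of("A", "C", "E"));
        friendsOf.put("C", List.of("A", "D"));
        System.out.println(commonFriends(invert(friendsOf)));
        // prints {A,B=[C], A,C=[D], B,C=[A]}
    }
}
```

In the real jobs the grouping done here by `computeIfAbsent` is what the shuffle phase does between map and reduce.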

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ShareFriendsStepOne {

    static class ShareFriendsStepOneMapper extends Mapper<LongWritable, Text, Text, Text> {

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Input line looks like: A:B,C,D,E
            String line = value.toString();
            String[] person_friends = line.split(":");
            String person = person_friends[0];
            String friends = person_friends[1];
            for (String friend : friends.split(",")) {
                // Emit <friend, person>
                context.write(new Text(friend), new Text(person));
            }
        }
    }

    static class ShareFriendsStepOneReducer extends Reducer<Text, Text, Text, Text> {

        @Override
        protected void reduce(Text friend, Iterable<Text> persons, Context context)
                throws IOException, InterruptedException {
            // Join everyone who has this friend into one comma-terminated list
            StringBuilder sb = new StringBuilder();
            for (Text person : persons) {
                sb.append(person).append(",");
            }
            context.write(friend, new Text(sb.toString()));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf);

        job.setJarByClass(ShareFriendsStepOne.class);

        job.setMapperClass(ShareFriendsStepOneMapper.class);
        job.setReducerClass(ShareFriendsStepOneReducer.class);

        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        // args[0] is the input path, args[1] is the output path
        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        boolean res = job.waitForCompletion(true);
        System.exit(res ? 0 : 1);
    }
}

The output at this point is:

A I,K,C,B,G,F,H,O,D,
B A,F,J,E,
C A,E,B,H,F,G,K,
D G,C,K,A,L,F,E,H,
E G,M,L,H,A,F,B,D,
F L,M,D,C,G,A,
G M,
H O,
I O,C,
J O,
K B,
L D,E,
M E,F,
O A,H,I,J,F,

Step two: use each pair of people as the key and let reduce find the friends those two people have in common.

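import java.io.IOException;
import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ShareFriendsStepTwo {

    static class ShareFriendsStepTwoMapper extends Mapper<LongWritable, Text, Text, Text> {

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // The input is the output of the previous job, for example:
            // A	I,K,C,B
            // friend, then the persons who have that friend
            String line = value.toString();
            String[] friend_persons = line.split("\t");
            String friend = friend_persons[0];
            String[] persons = friend_persons[1].split(",");
            // Sort so that a pair is always written in the same order (A,B and never B,A)
            Arrays.sort(persons);
            for (int i = 0; i < persons.length - 1; i++) {
                // j must run up to persons.length (not persons.length - 1),
                // otherwise the last person in the sorted list is never paired
                for (int j = i + 1; j < persons.length; j++) {
                    String pair = persons[i] + "," + persons[j];
                    // Emit <person,person , friend> so that identical pairs reach the same reducer
                    context.write(new Text(pair), new Text(friend));
                }
            }
        }
    }

    static class ShareFriendsStepTwoReducer extends Reducer<Text, Text, Text, Text> {

        @Override
        protected void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            // Join all common friends of this pair, each followed by "-"
            StringBuilder sb = new StringBuilder();
            for (Text value : values) {
                sb.append(value).append("-");
            }
            context.write(key, new Text(sb.toString()));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf);

        job.setJarByClass(ShareFriendsStepTwo.class);

        job.setMapperClass(ShareFriendsStepTwoMapper.class);
        job.setReducerClass(ShareFriendsStepTwoReducer.class);

        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        // args[0] is the output directory of step one, args[1] is the final output path
        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        boolean res = job.waitForCompletion(true);
        System.exit(res ? 0 : 1);
    }
}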
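The pairing loop of the step-two mapper can be isolated as plain Java to see what it emits for one record. This is a minimal sketch, not the mapper itself: the class name `PairSketch` and the method `pairs` are hypothetical. The first two inputs are the person lists of the `M` and `I` lines in the step-one output, trailing comma included.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that mirrors the nested loop of the step-two mapper.
class PairSketch {

    // Returns every unordered pair of the given persons as "x,y" with x sorted before y.
    static List<String> pairs(String[] persons) {
        String[] sorted = persons.clone();
        Arrays.sort(sorted);
        List<String> out = new ArrayList<>();
        for (int i = 0; i < sorted.length - 1; i++) {
            // The upper bound is sorted.length so the last person is paired as well.
            for (int j = i + 1; j < sorted.length; j++) {
                out.add(sorted[i] + "," + sorted[j]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // String.split drops the trailing empty string left by the final comma.
        System.out.println(pairs("E,F,".split(",")));  // prints [E,F]
        System.out.println(pairs("O,C,".split(",")));  // prints [C,O]
        System.out.println(pairs("B,A,F".split(",")));  // prints [A,B, A,F, B,F]
    }
}
```

Sorting first is what guarantees that a pair found under one friend and the same pair found under another friend produce an identical key, so the shuffle sends both to the same reduce call. A list of n persons yields n(n-1)/2 pairs.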

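The final output from the original run is shown below. Note that this listing was produced with the inner loop bound `j < persons.length - 1`, which skips the last person in each sorted list, so some pairs and some common friends are missing. For example, the D line of the step-one output contains both A and L, yet D does not appear under A,L here.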

A,B C-E-
A,C F-D-
A,D E-F-
A,E B-C-D-
A,F C-D-B-E-O-
A,G D-E-F-C-
A,H E-O-C-D-
A,I O-
A,K D-
A,L F-E-
B,C A-
B,D E-A-
B,E C-
B,F E-A-C-
B,G C-E-A-
B,H E-C-A-
B,I A-
B,K A-
B,L E-
C,D F-A-
C,E D-
C,F D-A-
C,G F-A-D-
C,H A-D-
C,I A-
C,K D-A-
C,L F-
D,F E-A-
D,G A-E-F-
D,H A-E-
D,I A-
D,K A-
D,L F-E-
E,F C-D-B-
E,G D-C-
E,H D-C-
E,K D-
F,G C-E-D-A-
F,H C-A-D-E-O-
F,I A-O-
F,K D-A-
F,L E-
G,H D-E-C-A-
G,I A-
G,K A-D-
G,L F-E-
H,I A-O-
H,K A-D-
H,L E-
I,K A-
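One way to sanity-check an individual line of this output is to intersect the two friend lists directly. The sketch below does that for lines taken from the sample input at the top of the article; the class name `IntersectionCheck` and the method `common` are hypothetical and exist only for this check.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical cross-check: common friends of two people by direct set intersection.
class IntersectionCheck {

    // Each argument is the comma-separated friend list of one person, as in the input file.
    static Set<String> common(String friendsOfX, String friendsOfY) {
        Set<String> result = new TreeSet<>(Arrays.asList(friendsOfX.split(",")));
        result.retainAll(Arrays.asList(friendsOfY.split(",")));
        return result;
    }

    public static void main(String[] args) {
        // A:B,C,D,F,E,O and B:A,C,E,K from the sample input
        System.out.println(common("B,C,D,F,E,O", "A,C,E,K"));  // prints [C, E]
        // B:A,C,E,K and C:F,A,D,I from the sample input
        System.out.println(common("A,C,E,K", "F,A,D,I"));      // prints [A]
    }
}
```

These two results agree with the `A,B` and `B,C` lines above. Direct intersection is fine for spot checks on a few pairs, while the two-job approach avoids comparing every pair of people against each other on a large data set.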
