一些算法的MapReduce实现——好友推荐
来源:互联网 发布:淘宝网秒杀系统异常 编辑:程序博客网 时间:2024/05/22 02:00
Problem
”If two people in a social network have a friend in common, then there is an increased likelihood that they will become friends themselves at some point in the future.“
------by Networks, Crowds, and Markets: Reasoning About a Highly Connected World
也就是说如果B和C有一个共同好友A,那么未来B和C成为好友的可能性会很大,这种叫“三角闭合”原理,我们会发现这种由朋友连接成的图为无向图,因为朋友是相互的。但是像Facebook,微博这种粉丝关系就是有向图,因为你可能follow某明星,但是明星又不会粉你,那么你和明星之间就是单向连接,就是有向图。
Input
输入格式为一行,格式如下
<user><TAB><comma-separated list of user's friends>
Output
我们想向用户U推荐还不是其好友,但是和用户U共享好友的用户,最多推送N个,已共享好友数降序排列。
输出格式:
<user><TAB><comma-separated list of people the user may know>
Pseudocode
假设n=10,即推荐10个好友
map(key, value): [user, friends] = value.split("\t") friends = friends.split(",") for i = 0 to friends.length-1: emit(user, (1, friends[i])) // Paths of length 1 for j = i+1 to friends.length-1: emit(friends[i], (2, friends[j])) // Paths of length 2 emit(friends[j], (2, friends[i])) // Paths of length 2 reduce(key, values): hash = {} for (path_length, user) in values: if path_length == 1: // Paths of length 1 hash[user] = -1 else if path_length == 2: // Paths of length 2 if user in hash: if hash[user] != -1: hash[user]++ else: hash[user] = 1 // Remove paths of length 1. hash = {k:v for k,v in hash.items() if v != -1} // Convert hash to list. list = hash.items() // Sort key-value pairs in the list by values (number of common friends). list = sorted(list, key=lambda x: x[1]) MAX_RECOMMENDATION_COUNT = 10 // Output at most MAX_RECOMMENDATION_COUNT keys with the highest values (number of common friends). list = [k for k,v in list[:MAX_RECOMMENDATION_COUNT]] emit(key, ",".join(list)
Hadoop Code
import java.io.IOException;import java.util.*;import java.util.Map.Entry; import org.apache.commons.lang.*;import org.apache.hadoop.fs.Path;import org.apache.hadoop.conf.*;import org.apache.hadoop.io.*;import org.apache.hadoop.mapreduce.*;import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat; public class FriendshipRecommender { public static class Map extends Mapper<LongWritable, Text, IntWritable, Text> { public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException { String line = value.toString(); String[] userAndFriends = line.split("\t"); if (userAndFriends.length == 2) { String user = userAndFriends[0]; IntWritable userKey = new IntWritable(Integer.parseInt(user)); String[] friends = userAndFriends[1].split(","); String friend1; IntWritable friend1Key = new IntWritable(); Text friend1Value = new Text(); String friend2; IntWritable friend2Key = new IntWritable(); Text friend2Value = new Text(); for (int i = 0; i < friends.length; i++) { friend1 = friends[i]; friend1Value.set("1," + friend1); context.write(userKey, friend1Value); // Paths of length 1. friend1Key.set(Integer.parseInt(friend1)); friend1Value.set("2," + friend1); for (int j = i+1; j < friends.length; j++) { friend2 = friends[j]; friend2Key.set(Integer.parseInt(friend2)); friend2Value.set("2," + friend2); context.write(friend1Key, friend2Value); // Paths of length 2. context.write(friend2Key, friend1Value); // Paths of length 2. } } } } } public static class Reduce extends Reducer<IntWritable, Text, IntWritable, Text> { public void reduce(IntWritable key, Iterable<Text> values, Context context) throws IOException, InterruptedException { String[] value; HashMap<String, Integer> hash = new HashMap<String, Integer>(); for (Text val : values) { value = (val.toString()).split(","); if (value[0].equals("1")) { // Paths of length 1. hash.put(value[1], -1); } else if (value[0].equals("2")) { // Paths of length 2. if (hash.containsKey(value[1])) { if (hash.get(value[1]) != -1) { hash.put(value[1], hash.get(value[1]) + 1); } } else { hash.put(value[1], 1); } } } // Convert hash to list and remove paths of length 1. ArrayList<Entry<String, Integer>> list = new ArrayList<Entry<String, Integer>>(); for (Entry<String, Integer> entry : hash.entrySet()) { if (entry.getValue() != -1) { // Exclude paths of length 1. list.add(entry); } } // Sort key-value pairs in the list by values (number of common friends). Collections.sort(list, new Comparator<Entry<String, Integer>>() { public int compare(Entry<String, Integer> e1, Entry<String, Integer> e2) { return e2.getValue().compareTo(e1.getValue()); } }); int MAX_RECOMMENDATION_COUNT = 10; if (MAX_RECOMMENDATION_COUNT < 1) { // Output all key-value pairs in the list. context.write(key, new Text(StringUtils.join(list, ","))); } else { // Output at most MAX_RECOMMENDATION_COUNT keys with the highest values (number of common friends). ArrayList<String> top = new ArrayList<String>(); for (int i = 0; i < Math.min(MAX_RECOMMENDATION_COUNT, list.size()); i++) { top.add(list.get(i).getKey()); } context.write(key, new Text(StringUtils.join(top, ","))); } } } public static void main(String[] args) throws Exception { Configuration conf = new Configuration(); Job job = new Job(conf, "FriendshipRecommender"); job.setJarByClass(FriendshipRecommender.class); job.setOutputKeyClass(IntWritable.class); job.setOutputValueClass(Text.class); job.setMapperClass(Map.class); job.setReducerClass(Reduce.class); job.setInputFormatClass(TextInputFormat.class); job.setOutputFormatClass(TextOutputFormat.class); FileInputFormat.addInputPath(job, new Path(args[0])); FileOutputFormat.setOutputPath(job, new Path(args[1])); job.waitForCompletion(true); } }
以上程序可以很好的解决如下图1的问题,那么类似于图2这样的问题,就是说,A和B是好友,B和C互为好友,C和D是好友,那么能否用MapReduce实现把D推荐给A呐??
Reference
translated by http://importantfish.com/people-you-may-know-friendship-recommendation-with-hadoop/
测试数据:下载
http://developer.51cto.com/art/201301/375661.htm
0 0
- 一些算法的MapReduce实现——好友推荐
- MapReduce实现QQ好友推荐
- MapReduce实现QQ好友推荐
- mapreduce实现QQ好友推荐
- MapReduce实现QQ好友推荐
- MapReduce——求两个人的共同好友算法
- 基于mapreduce实现好友推荐功能
- 一些算法的MapReduce实现——最小生成树
- 一些算法的MapReduce实现——MapReduce Job的单元测试实例
- 好友推荐算法-基于关系的推荐
- 用hadoop2.7.1 mapreduce实现QQ好友推荐功能
- MapReduce之推荐算法实现
- 一些算法的MapReduce实现——矩阵相乘一步实现
- 一些算法的MapReduce实现——矩阵-向量乘法实现
- 一些算法的MapReduce实现——倒排索引实现
- MapReduce学习之好友推荐
- Hadoop/MapReduce 好友推荐解决方案
- 一些算法的MapReduce实现——有向图求边的交集
- 工业级国产曲线绘制工具CChart的主页开通
- 小心多任务设计被滥用
- Android4.3前后DNS解析简单研究
- ext4.2入门简单小例子(button的事件--对话框的几种使用情况)
- Java自学视频整理(持续更新中...)
- 一些算法的MapReduce实现——好友推荐
- php的扩展和嵌入--c++类的扩展开发
- python上传文件
- Java线程同步问题:设备独占
- 怎么快速开启/关闭windows7 aero特效?
- 嵌入式 获取文件真正的大小示例,经典短小精悍,以及文件上锁
- cortex-A8汇编指令练习一
- c++类的构造函数详解
- 基于C#弹幕类射击游戏的实现——(七)弹幕类实现