Item-Based Recommendations with Hadoop
Source: Internet | Editor: 程序博客网 | Time: 2024/05/01 21:44
Mahout implements Item-Based Collaborative Filtering on top of MapReduce. Here I try running it.
1. Install Hadoop
2. Download Mahout and unpack it
3. Prepare the data
Download the 1 Million MovieLens Dataset, unpack it to get ratings.dat, and run
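To make the idea concrete, here is a minimal single-machine sketch of item-based collaborative filtering. It is not Mahout's MapReduce implementation: the function names, the choice of cosine similarity, and the toy data are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def item_vectors(prefs):
    """Invert {user: {item: rating}} into {item: {user: rating}}."""
    vectors = defaultdict(dict)
    for user, ratings in prefs.items():
        for item, rating in ratings.items():
            vectors[item][user] = rating
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse item vectors keyed by user."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[u] * b[u] for u in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def recommend(prefs, user, n=2):
    """Score each unseen item by a similarity-weighted average of the user's ratings."""
    vectors = item_vectors(prefs)
    seen = prefs[user]
    scores = {}
    for candidate, candidate_vec in vectors.items():
        if candidate in seen:
            continue
        weighted_sum = 0.0
        weight_total = 0.0
        for item, rating in seen.items():
            sim = cosine(candidate_vec, vectors[item])
            weighted_sum += sim * rating
            weight_total += abs(sim)
        if weight_total > 0:
            scores[candidate] = weighted_sum / weight_total
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Toy data (made up): user 1 has not rated item 30 yet.
toy_prefs = {
    1: {10: 5, 20: 5},
    2: {10: 5, 20: 5, 30: 4},
    3: {10: 1, 30: 5},
}
recommendations = recommend(toy_prefs, 1)
```

The distributed job splits the same two phases (item-item similarity, then per-user scoring) across MapReduce stages.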
sed 's/::\([0-9]\{1,\}\)::\([0-9]\{1\}\)::[0-9]\{1,\}$/,\1,\2/' ratings.dat
to convert it into the required format.
4. Run
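For readers who prefer not to use sed, the same conversion can be sketched in Python. This assumes, as the sed pattern implies, that each line of ratings.dat looks like UserID::MovieID::Rating::Timestamp with a single-digit rating. The function names and the sample line are illustrative only.

```python
import re

# userID::movieID::rating::timestamp, with a single-digit rating as in the sed pattern
RATING_LINE = re.compile(r"([0-9]+)::([0-9]+)::([0-9])::[0-9]+")


def convert_line(line):
    """Return 'userID,movieID,rating' for a well-formed line, else None."""
    match = RATING_LINE.fullmatch(line.strip())
    if match is None:
        return None
    return ",".join(match.groups())


def convert_lines(lines):
    """Convert an iterable of raw lines, skipping any that do not match."""
    converted = []
    for line in lines:
        result = convert_line(line)
        if result is not None:
            converted.append(result)
    return converted


sample = ["1::1193::5::978300760\n", "not a rating line\n"]
converted_sample = convert_lines(sample)
```

Unlike the sed command, which passes non-matching lines through unchanged, this sketch drops them. Both versions discard the timestamp.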
mahout recommenditembased -s SIMILARITY_LOGLIKELIHOOD -i /path/to/input/file -o /path/to/desired/output -n 25
Parameters:
MAHOUT-JOB: /home/laxe/apple/mahout/mahout-examples-0.11.0-job.jar
Job-Specific Options:
  --input (-i) input
      Path to job input directory.
  --output (-o) output
      The directory pathname for output.
  --numRecommendations (-n) numRecommendations
      Number of recommendations per user.
  --usersFile usersFile
      File of users to recommend for.
  --itemsFile itemsFile
      File of items to recommend for.
  --filterFile (-f) filterFile
      File containing comma-separated userID,itemID pairs. Used to exclude the item from the recommendations for that user (optional).
  --userItemFile (-uif) userItemFile
      File containing comma-separated userID,itemID pairs (optional). Used to include only these items into recommendations. Cannot be used together with usersFile or itemsFile.
  --booleanData (-b) booleanData
      Treat input as without pref values.
  --maxPrefsPerUser (-mxp) maxPrefsPerUser
      Maximum number of preferences considered per user in final recommendation phase.
  --minPrefsPerUser (-mp) minPrefsPerUser
      Ignore users with less preferences than this in the similarity computation (default: 1).
  --maxSimilaritiesPerItem (-m) maxSimilaritiesPerItem
      Maximum number of similarities considered per item.
  --maxPrefsInItemSimilarity (-mpiis) maxPrefsInItemSimilarity
      Max number of preferences to consider per user or item in the item similarity computation phase, users or items with more preferences will be sampled down (default: 500).
  --similarityClassname (-s) similarityClassname
      Name of distributed similarity measures class to instantiate, alternatively use one of the predefined similarities ([SIMILARITY_COOCCURRENCE, SIMILARITY_LOGLIKELIHOOD, SIMILARITY_TANIMOTO_COEFFICIENT, SIMILARITY_CITY_BLOCK, SIMILARITY_COSINE, SIMILARITY_PEARSON_CORRELATION, SIMILARITY_EUCLIDEAN_DISTANCE]).
  --threshold (-tr) threshold
      Discard item pairs with a similarity value below this.
  --outputPathForSimilarityMatrix (-opfsm) outputPathForSimilarityMatrix
      Write the item similarity matrix to this path (optional).
  --randomSeed randomSeed
      Use this seed for sampling.
  --sequencefileOutput
      Write the output into a Sequence File instead of a text file.
  --help (-h)
      Print out help.
  --tempDir tempDir
      Intermediate output directory.
  --startPhase startPhase
      First phase to run.
  --endPhase endPhase
      Last phase to run.
Specify HDFS directories while running on hadoop; else specify local file system directories.
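The run above selects SIMILARITY_LOGLIKELIHOOD. As a rough illustration of what a log-likelihood similarity measures, the sketch below computes the log-likelihood ratio (G² statistic) from a 2x2 table of co-occurrence counts. The mapping to a similarity value, 1 - 1/(1 + LLR), is my understanding of how Mahout does it and should be treated as an assumption. The function names are illustrative.

```python
import math


def x_log_x(x):
    """x * ln(x), with the convention that 0 * ln(0) = 0."""
    return 0.0 if x == 0 else x * math.log(x)


def entropy(*counts):
    """Unnormalized Shannon entropy of a set of counts."""
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)


def log_likelihood_ratio(k11, k12, k21, k22):
    """
    k11: users who interacted with both item A and item B
    k12: users who interacted with A but not B
    k21: users who interacted with B but not A
    k22: users who interacted with neither
    """
    row_entropy = entropy(k11 + k12, k21 + k22)
    column_entropy = entropy(k11 + k21, k12 + k22)
    matrix_entropy = entropy(k11, k12, k21, k22)
    # Clamp tiny negative values caused by floating-point rounding.
    return max(0.0, 2.0 * (row_entropy + column_entropy - matrix_entropy))


def llr_similarity(k11, k12, k21, k22):
    """Map the ratio into [0, 1); assumed to mirror Mahout's convention."""
    return 1.0 - 1.0 / (1.0 + log_likelihood_ratio(k11, k12, k21, k22))


independent = log_likelihood_ratio(10, 10, 10, 10)   # no association between A and B
associated = log_likelihood_ratio(10, 0, 0, 10)      # A and B always occur together
```

A table whose rows and columns are independent gives a ratio near zero, while strongly associated items give a large ratio and a similarity close to 1.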
References
Introduction to Item-Based Recommendations with Hadoop
Distributed Mahout: Item-based recommendation