Programming Collective Intelligence

Source: Internet. Editor: 程序博客网. Time: 2024/05/01 14:26

user-based collaborative filtering

item-based collaborative filtering: The general technique is to precompute the most similar items for each item. Then, when you wish to make recommendations to a user, you look at his top-rated items and create a weighted list of the items most similar to those. The important difference here is that, although the first step requires you to examine all the data, comparisons between items will not change as often as comparisons between users. This means you do not have to continuously calculate each item’s most similar items.
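The two steps described above (precompute each item's most similar items, then build a weighted list for one user) might be sketched as follows. This is only an illustration under assumptions: the `ratings` data, the function names, and the choice of an inverse-Euclidean-distance similarity are all hypothetical and not taken from these notes. For brevity it also weights every item the user has rated rather than only the top-rated ones.

```python
from math import sqrt

# Hypothetical toy data: user -> {item: rating}
ratings = {
    "alice": {"A": 5.0, "B": 3.0, "C": 4.0},
    "bob": {"A": 4.0, "B": 1.0, "D": 5.0},
    "carol": {"B": 4.0, "C": 5.0, "D": 2.0},
}


def invert(prefs):
    """Flip user -> item -> rating into item -> user -> rating."""
    by_item = {}
    for user, items in prefs.items():
        for item, score in items.items():
            by_item.setdefault(item, {})[user] = score
    return by_item


def similarity(a, b):
    """Similarity in (0, 1] from the Euclidean distance over shared raters.

    Returns 0.0 when no user rated both items. This metric is an assumed
    choice; any other item-to-item similarity could be swapped in.
    """
    shared = [user for user in a if user in b]
    if not shared:
        return 0.0
    dist = sqrt(sum((a[user] - b[user]) ** 2 for user in shared))
    return 1.0 / (1.0 + dist)


def build_item_table(prefs, n=10):
    """Step 1 (the expensive, infrequent one): top-n similar items per item."""
    by_item = invert(prefs)
    table = {}
    for item, raters in by_item.items():
        scores = [
            (similarity(raters, other_raters), other)
            for other, other_raters in by_item.items()
            if other != item
        ]
        scores.sort(reverse=True)
        table[item] = scores[:n]
    return table


def recommend(prefs, table, user):
    """Step 2 (cheap, per request): similarity-weighted average of the
    user's own ratings, computed for each item the user has not rated."""
    rated = prefs[user]
    totals, sim_sums = {}, {}
    for item, rating in rated.items():
        for sim, other in table[item]:
            if other in rated or sim == 0.0:
                continue
            totals[other] = totals.get(other, 0.0) + sim * rating
            sim_sums[other] = sim_sums.get(other, 0.0) + sim
    ranked = [(totals[i] / sim_sums[i], i) for i in totals]
    ranked.sort(reverse=True)
    return ranked


item_table = build_item_table(ratings)  # rebuild only occasionally
recommendations = recommend(ratings, item_table, "alice")
```

Only `recommend` runs per request. `build_item_table` touches all the data, but because item-to-item comparisons change slowly, it can be rerun on a schedule rather than on every rating.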

*** Item-based filtering is significantly faster than user-based when getting a list of recommendations for a large dataset, but it does have the additional overhead of maintaining the item similarity table. Also, there is a difference in accuracy that depends on how “sparse” the dataset is. Item-based filtering usually outperforms user-based filtering in sparse datasets, and the two perform about equally in dense datasets. Having said that, user-based filtering is simpler to implement and doesn’t have the extra steps, so it is often more appropriate with smaller in-memory datasets that change very frequently. Finally, in some applications, showing people which other users have preferences similar to their own has its own value.

... a technique called multidimensional scaling, which will be used to find a two-dimensional representation of the dataset. The algorithm takes the difference between every pair of items and tries to make a chart in which the distances between the items match those differences.

(http://en.wikipedia.org/wiki/Multidimensional_scaling)
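To make the idea concrete, here is a minimal sketch of one way to do this: start from random 2D positions and repeatedly nudge each point so that its chart distances move toward the target differences. It is a plain gradient-descent illustration, not necessarily the variant described in the linked article. The function name `scale_down`, its parameters, and the 3-4-5 example matrix are assumptions made for the example.

```python
import random
from math import sqrt


def scale_down(target, rate=0.05, iterations=1000, seed=0):
    """Place len(target) points in 2D so pairwise distances approach target.

    target[i][j] is the desired distance (the "difference") between items
    i and j. The step size and iteration count are assumed defaults.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(target)
    pos = [[rng.random(), rng.random()] for _ in range(n)]

    for _ in range(iterations):
        moves = [[0.0, 0.0] for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                dx = pos[i][0] - pos[j][0]
                dy = pos[i][1] - pos[j][1]
                actual = max(sqrt(dx * dx + dy * dy), 1e-9)  # avoid divide by zero
                # Positive error: the points are too far apart on the chart.
                error = actual - target[i][j]
                moves[i][0] += error * dx / actual
                moves[i][1] += error * dy / actual
        # Apply all moves at once: too-far pairs pull together,
        # too-close pairs push apart.
        for i in range(n):
            pos[i][0] -= rate * moves[i][0]
            pos[i][1] -= rate * moves[i][1]
    return pos


def distance(p, q):
    """Euclidean distance between two 2D points."""
    return sqrt((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)


# Hypothetical differences between three items (a 3-4-5 triangle,
# which can be drawn exactly in two dimensions).
differences = [
    [0.0, 3.0, 4.0],
    [3.0, 0.0, 5.0],
    [4.0, 5.0, 0.0],
]
coords = scale_down(differences)
```

For real data the differences usually cannot all be satisfied in two dimensions, so the result is the layout with the smallest remaining mismatch the descent finds, not an exact match.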

0 0