Random Walk and Personalized Pagerank
Random Walks
Let’s consider the graph below, and let’s assume that we drop a piece of information on the red vertex. We’d like to know a couple of things. For example, where does it spread first? How far does it spread? Will the vertices close to the red vertex obtain more information than those far away? And finally, will the information continue to go back and forth forever, or will we eventually reach a stable distribution?
Lazy Random Walks
The idea of lazy random walks is that we allow the random walkers to remain on a vertex with probability 1/2. Hence, our formula becomes x’ = 1/2 (A + I) x. In this formula, I is the identity matrix and A is the original matrix of transition probabilities. In the animation below, the thickness of an edge corresponds to the geometric mean of the amount of information on its two adjacent vertices. The color of the edge is determined by the “information current”: the difference in the amount of information between the adjacent vertices. In other words, the thicker the edge, the more information flows along it. Edges with an adjacent vertex that has a high degree tend to be colored red. These are the vertices which contain a significantly larger part of the information than the rest of the network.
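As a minimal sketch of one lazy step, the update x’ = 1/2 (A + I) x can be written out in plain Python. The 4-vertex graph below is a made-up example, and the code assumes that A is column-stochastic (column j spreads the mass of vertex j evenly over its neighbors); the original post does not spell out that convention.

```python
def transition_matrix(adj):
    """Column-stochastic A: A[i][j] = 1/deg(j) if i is a neighbor of j, else 0."""
    n = len(adj)
    A = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for i in adj[j]:
            A[i][j] = 1.0 / len(adj[j])
    return A


def lazy_step(A, x):
    """One lazy random walk step: x' = 1/2 (A + I) x."""
    n = len(x)
    return [0.5 * (sum(A[i][j] * x[j] for j in range(n)) + x[i]) for i in range(n)]


# Hypothetical graph: a triangle 0-1-2 with a pendant vertex 3 attached to 2.
adj = [[1, 2], [0, 2], [0, 1, 3], [2]]
A = transition_matrix(adj)

x0 = [1.0, 0.0, 0.0, 0.0]   # all information starts on vertex 0
x1 = lazy_step(A, x0)
print(x1)                   # [0.5, 0.25, 0.25, 0.0]
```

Half of the information stays on vertex 0, and the other half is split evenly between its two neighbors. The total amount stays at 1.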
As we can see, the distribution converges. And not only that: it also becomes less and less apparent where the information originated. In fact, for the vertices which are well connected, the distribution of information (or random walkers, whatever you want to call it) approximates their degree distribution. This means that high-degree vertices will contain proportionally more information than low-degree vertices.
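The convergence claim can be checked numerically on a small example. The sketch below iterates the lazy update many times on a hypothetical connected, undirected 4-vertex graph and compares the result with degree divided by total degree. The graph, the number of steps and the function name are illustrative choices, not taken from the original post.

```python
def lazy_walk(adj, x, steps):
    """Iterate x' = 1/2 (A + I) x, where A spreads each vertex's mass evenly over its neighbors."""
    for _ in range(steps):
        nxt = [0.5 * xi for xi in x]             # half of the mass stays put
        for j, neighbors in enumerate(adj):
            share = 0.5 * x[j] / len(neighbors)  # the other half is split among neighbors
            for i in neighbors:
                nxt[i] += share
        x = nxt
    return x


# Hypothetical graph: a triangle 0-1-2 with a pendant vertex 3 attached to 2.
adj = [[1, 2], [0, 2], [0, 1, 3], [2]]

limit = lazy_walk(adj, [1.0, 0.0, 0.0, 0.0], steps=200)

total_degree = sum(len(neighbors) for neighbors in adj)
by_degree = [len(neighbors) / total_degree for neighbors in adj]
print(by_degree)   # [0.25, 0.25, 0.375, 0.125]
print(limit)       # agrees with by_degree up to floating-point error
```

Starting from any other vertex gives the same limit, which is why the source of the information is no longer visible after enough steps.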
But what if we actually wanted the vertex that contains the information initially to play an important role? One way to model that is to not stall on any vertex, but let the random walkers jump back to this specific vertex with a given probability (i.e. the teleport probability, alpha).
Personalized PageRank
It turns out that this is exactly what “Personalized PageRank” is all about. It models the distribution of rank, given that the distance random walkers (the paper calls them random surfers) can travel from their source (the source is often referred to as “seed”) is determined by alpha. In fact, the expected walk length is 1/alpha. The formula now becomes x’ = (1 - alpha) A x + alpha E. Here, alpha is a constant between 0 and 1, and E is the vector containing the source of information, i.e. in our case it is all zero, except for the red vertex where our information starts to spread.
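A minimal sketch of this update as a fixed-point iteration, in plain Python. It assumes the same convention as before (A spreads each vertex's mass evenly over its neighbors), and the graph, the seed choice and the function name are hypothetical. The seed is deliberately placed on the lowest-degree vertex to show how the result differs from the degree distribution.

```python
def personalized_pagerank(adj, seed, alpha, steps=100):
    """Iterate x' = (1 - alpha) A x + alpha E, with E the indicator vector of the seed."""
    n = len(adj)
    e = [0.0] * n
    e[seed] = 1.0
    x = e[:]                                     # start with all mass on the seed
    for _ in range(steps):
        nxt = [alpha * ei for ei in e]           # teleport mass returns to the seed
        for j, neighbors in enumerate(adj):
            share = (1 - alpha) * x[j] / len(neighbors)
            for i in neighbors:
                nxt[i] += share                  # the rest moves along edges
        x = nxt
    return x


# Hypothetical graph: a triangle 0-1-2 with a pendant vertex 3 attached to 2.
adj = [[1, 2], [0, 2], [0, 1, 3], [2]]

ppr = personalized_pagerank(adj, seed=3, alpha=0.5)
print(ppr)
```

With alpha = 0.5 the seed keeps more than half of the total rank, even though vertex 3 has the lowest degree in this graph.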
In this animation, alpha is fixed to 1/2 to allow comparison with the lazy random walks. This is pretty high, though, with the result that our information indeed remains close to the seed vertex. In many cases we will want the random walkers to travel farther. Below is an animation for alpha = 0.1.
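To get a feeling for how far walkers travel under the two settings, the walk length can be estimated by simulation. The sketch below assumes that a walker teleports with probability alpha at every step, independently of where it is, and it counts the teleport step itself as part of the walk. Under that counting convention the mean is 1/alpha. The function name, sample size and random seed are arbitrary choices.

```python
import random


def mean_walk_length(alpha, walks=20000, seed=0):
    """Average number of steps per walk, up to and including the teleport step."""
    rng = random.Random(seed)
    total = 0
    for _ in range(walks):
        steps = 1
        while rng.random() >= alpha:   # with probability 1 - alpha, keep walking
            steps += 1
        total += steps
    return total / walks


short = mean_walk_length(0.5)   # close to 1 / 0.5 = 2
long = mean_walk_length(0.1)    # close to 1 / 0.1 = 10
print(short, long)
```

With alpha = 0.5 a walker takes about two steps on average before returning to the seed, while alpha = 0.1 lets it take about ten.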
Conclusion
Reposted from:
https://www.r-bloggers.com/from-random-walks-to-personalized-pagerank/