Implementing the Alias Sampling Algorithm in Python
Source: Internet  Edited by: 程序博客网  Date: 2024/05/22 11:32
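The LINE paper uses the alias sampling algorithm as an optimization. Its source code is written in C++; here I reimplement it in Python to reinforce my understanding.
Someone online has already implemented this algorithm in C++ on its own, separate from LINE, and tested it, so that is worth reading first.
The principle behind the alias algorithm is covered in my previous blog post, which also walks through the C++ source, so the Python below carries only brief comments.
Python code: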
import numpy as np

u = []  # source vertex of each edge
v = []  # target vertex of each edge
w = []  # weight of each edge


def readdata(path="weight.txt"):
    # Test edge list: each line is "u v weight".
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            parts = line.split()
            if len(parts) < 3:
                continue  # skip blank or malformed lines
            u.append(int(parts[0]))
            v.append(int(parts[1]))
            w.append(float(parts[2]))
    return len(w)  # number of edges


def initAliasTable():
    edg_num = len(w)
    alias = [-1] * edg_num
    prob = [1.0] * edg_num  # entries never paired below stay at 1.0
    w_sum = sum(w)
    # Scale the weights so that the average entry is exactly 1.
    norm_prob = [w[i] * edg_num / w_sum for i in range(edg_num)]
    small = [i for i in range(edg_num) if norm_prob[i] < 1.0]
    large = [i for i in range(edg_num) if norm_prob[i] >= 1.0]
    # Pair one under-full entry with one over-full entry at a time.
    while small and large:
        small_cur = small.pop()
        large_cur = large.pop()
        prob[small_cur] = norm_prob[small_cur]
        alias[small_cur] = large_cur
        # The large entry gives up the mass needed to fill the small one.
        norm_prob[large_cur] += norm_prob[small_cur] - 1.0
        if norm_prob[large_cur] < 1.0:
            small.append(large_cur)
        else:
            large.append(large_cur)
    return prob, alias


def sampleanedg(prob, alias):
    rand1 = np.random.uniform(0, 1)
    rand2 = np.random.uniform(0, 1)
    k = int(rand1 * len(prob))  # pick a table slot uniformly
    # Keep slot k with probability prob[k], otherwise take its alias.
    return k if rand2 < prob[k] else alias[k]


if __name__ == '__main__':
    edg_num = readdata()
    prob, alias = initAliasTable()
    print("prob=", prob)
    print("alias=", alias)
    for _ in range(60):
        edg_cur = sampleanedg(prob, alias)
        print("Sampled edge number:", edg_cur + 1)
        # print("Sampled edge:", u[edg_cur], v[edg_cur], w[edg_cur])
In a run of 60 samples, 18 were the edge with the largest weight: the larger an edge's weight, the more likely it is to be sampled.
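Sixty draws give only a rough impression, so here is a minimal sketch of how the claim could be checked more carefully. It builds an alias table for a small weight list and compares observed sampling frequencies with the normalized weights. The weights `[1, 2, 3, 4]`, the helper names `build_alias_table`, `sample`, and `empirical_frequencies`, and the fixed seed are all assumptions made for illustration; they are not taken from the LINE source or from the test file used above. The sketch uses the standard-library `random` module rather than NumPy so it runs without extra dependencies.

```python
import random
from collections import Counter


def build_alias_table(weights):
    """Return (prob, alias) tables for a list of positive weights."""
    n = len(weights)
    total = sum(weights)
    scaled = [wt * n / total for wt in weights]  # average entry is 1
    prob = [1.0] * n   # slots that are never paired keep probability 1
    alias = [-1] * n
    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]
    while small and large:
        s = small.pop()
        l = large.pop()
        prob[s] = scaled[s]
        alias[s] = l
        scaled[l] += scaled[s] - 1.0  # l donates mass to fill slot s
        (small if scaled[l] < 1.0 else large).append(l)
    return prob, alias


def sample(prob, alias, rng):
    """Draw one index: pick a slot uniformly, then keep it or take its alias."""
    k = rng.randrange(len(prob))
    return k if rng.random() < prob[k] else alias[k]


def empirical_frequencies(weights, draws, seed=0):
    """Sample `draws` times and return the observed frequency of each index."""
    rng = random.Random(seed)
    prob, alias = build_alias_table(weights)
    counts = Counter(sample(prob, alias, rng) for _ in range(draws))
    return [counts[i] / draws for i in range(len(weights))]


if __name__ == "__main__":
    weights = [1, 2, 3, 4]  # hypothetical edge weights, not the author's data
    freqs = empirical_frequencies(weights, draws=200_000)
    total = sum(weights)
    for i, (wt, f) in enumerate(zip(weights, freqs)):
        print(f"edge {i + 1}: expected {wt / total:.3f}, observed {f:.3f}")
```

With enough draws, the observed frequencies should settle close to each weight divided by the total weight, which is the property the 60-sample run above hints at.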