Jaccard Coefficient (Jaccard Similarity)

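Given two sets s and t, the Jaccard coefficient is the size of their intersection divided by the size of their union:

JC(s,t) = |s∩t| / |s∪t|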
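A small worked example may help here. The sets below are made up for illustration and are not from the original post; the coefficient is computed as intersection size over union size.

```latex
s = \{a, b, c\}, \qquad t = \{b, c, d\}

s \cap t = \{b, c\}, \qquad s \cup t = \{a, b, c, d\}

JC(s,t) = \frac{|s \cap t|}{|s \cup t|} = \frac{2}{4} = 0.5
```

The value always lies between 0 (no shared elements) and 1 (identical sets).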


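A Java implementation is shown below. Each input string is split on spaces, and the resulting words are treated as the elements of the set: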


package ruc.database.similarity;

import java.util.ArrayList;
import java.util.List;

public class JaccardCoefficient {

    public static float jc(String s, String t) {
        String[] sSplit = s.split(" ");
        String[] tSplit = t.split(" ");

        // calculate intersection
        List<String> intersection = new ArrayList<String>();
        for (int i = 0; i < sSplit.length; i++) {
            for (int j = 0; j < tSplit.length; j++) {
                if (!intersection.contains(sSplit[i])) { // no duplicates
                    if (sSplit[i].equals(tSplit[j])) { // token appears in both
                        intersection.add(sSplit[i]);
                        break;
                    }
                }
            }
        }

        // calculate union
        List<String> union = new ArrayList<String>();
        if (sSplit.length > tSplit.length) { // add the larger array first
            for (int i = 0; i < sSplit.length; i++)
                if (!union.contains(sSplit[i]))
                    union.add(sSplit[i]);
            for (int i = 0; i < tSplit.length; i++)
                if (!union.contains(tSplit[i]))
                    union.add(tSplit[i]);
        } else {
            for (int i = 0; i < tSplit.length; i++)
                if (!union.contains(tSplit[i]))
                    union.add(tSplit[i]);
            for (int i = 0; i < sSplit.length; i++)
                if (!union.contains(sSplit[i]))
                    union.add(sSplit[i]);
        }

        return ((float) intersection.size()) / ((float) union.size());
    }
}
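The same computation can also be sketched with java.util.Set, which removes duplicates automatically and avoids the repeated contains() scans over a list. This is only an illustrative variant, not the original author's code: the class name JaccardSetSketch and the choice to return 1.0 for two empty sets are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class JaccardSetSketch {

    // Hypothetical helper (not from the original post):
    // Jaccard coefficient of two token sets, |s ∩ t| / |s ∪ t|.
    public static float jc(Set<String> s, Set<String> t) {
        Set<String> union = new HashSet<String>(s);
        union.addAll(t);
        if (union.isEmpty()) {
            // Assumption: two empty sets are treated as identical.
            return 1.0f;
        }
        Set<String> intersection = new HashSet<String>(s);
        intersection.retainAll(t); // keep only the elements also present in t
        return (float) intersection.size() / (float) union.size();
    }

    // Same tokenization as the listing above: split on single spaces.
    public static float jc(String s, String t) {
        return jc(new HashSet<String>(Arrays.asList(s.split(" "))),
                  new HashSet<String>(Arrays.asList(t.split(" "))));
    }

    public static void main(String[] args) {
        // Shared tokens: b, c (2). All distinct tokens: a, b, c, d (4).
        System.out.println(jc("a b c", "b c d")); // prints 0.5
    }
}
```

With hash sets, building the union and intersection takes roughly linear time in the number of tokens, whereas the nested loops and list lookups grow quadratically on long inputs.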

