Mining DBLP author collaboration relationships, FP-Growth in practice (1): extracting target information (venues, authors, etc.) from the DBLP dataset

Source: Internet  Editor: 程序博客网  Time: 2024/04/29 14:48



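First, download the DBLP dataset from the official site: http://dblp.uni-trier.de/xml/ . Only dblp.xml.gz is needed. Decompressing it yields a dblp.xml file of more than 1 GB, which is fairly large.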
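Any decompression tool will do for this step. As a minimal Python 3 sketch, assuming the archive was saved as dblp.xml.gz in the working directory, the standard gzip module can stream it to disk without holding the whole file in memory. The helper name gunzip_file is an illustrative choice, not something from the original post.

```python
import gzip
import shutil


def gunzip_file(src_path, dst_path):
    """Stream-decompress the gzip file src_path into dst_path."""
    with gzip.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # copyfileobj copies in chunks, so memory use stays small
        shutil.copyfileobj(src, dst)
```

Calling gunzip_file("dblp.xml.gz", "dblp.xml") would then produce the large dblp.xml used in the rest of this post.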






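Extract a sample from the raw data:

# Copy the first 30 lines of dblp.xml into a small sample file.
# The encoding matches the one declared in the XML header.
with open("dblp.xml", "r", encoding="ISO-8859-1") as r, \
     open("dblpExample.xml", "w", encoding="ISO-8859-1") as w:
    for i in range(30):
        print("extract line", i)
        w.write(r.readline())

Result: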

<?xml version="1.0" encoding="ISO-8859-1"?><!DOCTYPE dblp SYSTEM "dblp.dtd"><dblp><article mdate="2011-01-11" key="journals/acta/Saxena96"><author>Sanjeev Saxena</author><title>Parallel Integer Sorting and Simulation Amongst CRCW Models.</title><pages>607-619</pages><year>1996</year><volume>33</volume><journal>Acta Inf.</journal><number>7</number><url>db/journals/acta/acta33.html#Saxena96</url><ee>http://dx.doi.org/10.1007/BF03036466</ee></article><article mdate="2011-01-11" key="journals/acta/Simon83">...</article>...</dblp>

This turned out not to be useful, because the sample shows only one kind of record. So a different approach is used below:




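Only the following venues are to be extracted: SDM, ICDM, ECML-PKDD, PAKDD, WSDM, DMKD, TKDE, KDD Explorations, ACM Trans. on KDD, CVPR, ICML, NIPS, COLT, CVPR, SIGIR, SIGKDD. That is sixteen entries in total (CVPR is listed twice), covering all data from at least 2000 to the present.

A look at SDM: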

<inproceedings mdate="2014-02-12" key="conf/sdm/HanN08"><author>Shuguo Han</author><author>Wee Keong Ng</author><title>Preemptive Measures against Malicious Party in Privacy-Preserving Data Mining.</title><pages>375-386</pages><year>2008</year><booktitle>SDM</booktitle><ee>http://dx.doi.org/10.1137/1.9781611972788.34</ee><crossref>conf/sdm/2008</crossref><url>db/conf/sdm/sdm2008.html#HanN08</url></inproceedings><inproceedings mdate="2015-12-30" key="conf/sdm/LiGGDZ15"><author>Kang Li</author><author>Jing Gao</author><author>Suxin Guo</author><author>Nan Du</author><author>Aidong Zhang</author><title>Functional Node Detection on Linked Data.</title><pages>1-9</pages><year>2015</year><booktitle>SDM</booktitle><ee>http://dx.doi.org/10.1137/1.9781611974010.1</ee><crossref>conf/sdm/2015</crossref><url>db/conf/sdm/sdm2015.html#LiGGDZ15</url></inproceedings>


A look at ICDM:

<inproceedings mdate="2014-09-17" key="conf/icdm/LazarevicKKKT03"><author>Aleksandar Lazarevic</author><author>Ramdev Kanapady</author><author>Chandrika Kamath</author><author>Vipin Kumar</author><author>Kumar K. Tamma</author><title>Localized Prediction of Continuous Target Variables Using Hierarchical Clustering.</title><pages>139-146</pages><year>2003</year><crossref>conf/icdm/2003</crossref><booktitle>ICDM</booktitle><ee>http://dx.doi.org/10.1109/ICDM.2003.1250913</ee><ee>http://doi.ieeecomputersociety.org/10.1109/ICDM.2003.1250913</ee><url>db/conf/icdm/icdm2003.html#LazarevicKKKT03</url></inproceedings><inproceedings mdate="2014-09-17" key="conf/icdm/CampagnaP09"><author>Andrea Campagna</author><author>Rasmus Pagh</author><title>Finding Associations and Computing Similarity via Biased Pair Sampling.</title><pages>61-70</pages><year>2009</year><booktitle>ICDM</booktitle><ee>http://dx.doi.org/10.1109/ICDM.2009.35</ee><ee>http://doi.ieeecomputersociety.org/10.1109/ICDM.2009.35</ee><crossref>conf/icdm/2009</crossref><url>db/conf/icdm/icdm2009.html#CampagnaP09</url></inproceedings>

A look at ECML-PKDD on its own:

<inproceedings mdate="2013-08-30" key="conf/pkdd/TomasevM13a"><author>Nenad Tomasev</author><author>Dunja Mladenic</author><title>Image Hub Explorer: Evaluating Representations and Metrics for Content-Based Image Retrieval and Object Recognition.</title><pages>637-640</pages><year>2013</year><booktitle>ECML/PKDD (3)</booktitle><ee>http://dx.doi.org/10.1007/978-3-642-40994-3_44</ee><crossref>conf/pkdd/2013-3</crossref><url>db/conf/pkdd/pkdd2013-3.html#TomasevM13a</url></inproceedings><inproceedings mdate="2015-08-30" key="conf/pkdd/BudhathokiV15"><author>Kailash Budhathoki</author><author>Jilles Vreeken</author><title>The Difference and the Norm - Characterising Similarities and Differences Between Databases.</title><pages>206-223</pages><year>2015</year><booktitle>ECML/PKDD (2)</booktitle><ee>http://dx.doi.org/10.1007/978-3-319-23525-7_13</ee><crossref>conf/pkdd/2015-2</crossref><url>db/conf/pkdd/pkdd2015-2.html#BudhathokiV15</url></inproceedings>


A look at PAKDD on its own:

<inproceedings mdate="2008-05-15" key="conf/pakdd/HanN08"><author>Shuguo Han</author><author>Wee Keong Ng</author><title>Privacy-Preserving Linear Fisher Discriminant Analysis.</title><pages>136-147</pages><year>2008</year><booktitle>PAKDD</booktitle><ee>http://dx.doi.org/10.1007/978-3-540-68125-0_14</ee><crossref>conf/pakdd/2008</crossref><url>db/conf/pakdd/pakdd2008.html#HanN08</url></inproceedings><inproceedings mdate="2005-05-18" key="conf/pakdd/BoWJ05"><author>Liefeng Bo</author><author>Ling Wang</author><author>Licheng Jiao</author><title>Training Support Vector Machines Using Greedy Stagewise Algorithm.</title><pages>632-638</pages><year>2005</year><crossref>conf/pakdd/2005</crossref><booktitle>PAKDD</booktitle><ee>http://dx.doi.org/10.1007/11430919_73</ee><url>db/conf/pakdd/pakdd2005.html#BoWJ05</url></inproceedings>

A look at WSDM on its own:

<inproceedings mdate="2011-01-31" key="conf/wsdm/Kawamae11a"><author>Noriaki Kawamae</author><title>Predicting future reviews: sentiment analysis models for collaborative filtering.</title><pages>605-614</pages><year>2011</year><booktitle>WSDM</booktitle><ee>http://doi.acm.org/10.1145/1935826.1935911</ee><crossref>conf/wsdm/2011</crossref><url>db/conf/wsdm/wsdm2011.html#Kawamae11a</url></inproceedings><inproceedings mdate="2015-01-29" key="conf/wsdm/TangCAL15"><author>Jiliang Tang</author><author>Shiyu Chang</author><author>Charu C. Aggarwal</author><author>Huan Liu</author><title>Negative Link Prediction in Social Media.</title><pages>87-96</pages><year>2015</year><booktitle>WSDM</booktitle><ee>http://doi.acm.org/10.1145/2684822.2685295</ee><crossref>conf/wsdm/2015</crossref><url>db/conf/wsdm/wsdm2015.html#TangCAL15</url></inproceedings>

A look at DMKD on its own:

<inproceedings mdate="2003-04-04" key="conf/dmkd/KantarciogluC02"><author>Murat Kantarcioglu</author><author>Chris Clifton</author><title>Privacy-Preserving Distributed Mining of Association Rules on Horizontally Partitioned Data.</title><year>2002</year><booktitle>DMKD</booktitle><ee>http://www.bell-labs.com/user/minos/DMKD02/Papers/kantarcioglu.pdf</ee><url>db/conf/dmkd/dmkd2002.html#KantarciogluC02</url></inproceedings><inproceedings mdate="2003-04-04" key="conf/dmkd/ZhuFAE02"><author>Xingquan Zhu</author><author>Jianping Fan</author><author>Walid G. Aref</author><author>Ahmed K. Elmagarmid</author><title>ClassMiner: Mining Medical Video Content Structure and Events Towards Efficient Access and Scalable Skimming.</title><year>2002</year><booktitle>DMKD</booktitle><ee>http://www.bell-labs.com/user/minos/DMKD02/Papers/zhu.pdf</ee><url>db/conf/dmkd/dmkd2002.html#ZhuFAE02</url></inproceedings>

TKDE, KDD Explorations, and ACM Trans. on KDD are ignored.

A look at CVPR on its own:

<inproceedings mdate="2014-07-31" key="conf/cvpr/BrandOP97"><author>Matthew Brand</author><author>Nuria Oliver</author><author>Alex Pentland</author><title>Coupled hidden Markov models for complex action recognition.</title><pages>994-999</pages><year>1997</year><crossref>conf/cvpr/1997</crossref><booktitle>CVPR</booktitle><ee>http://dx.doi.org/10.1109/CVPR.1997.609450</ee><ee>http://doi.ieeecomputersociety.org/10.1109/CVPR.1997.609450</ee><url>db/conf/cvpr/cvpr1997.html#BrandOP97</url></inproceedings><inproceedings mdate="2014-07-30" key="conf/cvpr/LiGK09"><author>Yan Li</author><author>Leon Gu</author><author>Takeo Kanade</author><title>A robust shape model for multi-view car alignment.</title><pages>2466-2473</pages><year>2009</year><booktitle>CVPR</booktitle><ee>http://dx.doi.org/10.1109/CVPRW.2009.5206799</ee><ee>http://doi.ieeecomputersociety.org/10.1109/CVPRW.2009.5206799</ee><crossref>conf/cvpr/2009</crossref><url>db/conf/cvpr/cvpr2009.html#LiGK09</url></inproceedings>

A look at ICML on its own:

<inproceedings mdate="2013-11-25" key="journals/jmlr/WilsonFT12"><author>Aaron Wilson</author><author>Alan Fern</author><author>Prasad Tadepalli</author><title>Transfer Learning in Sequential Decision Problems: A Hierarchical Bayesian Approach.</title><pages>217-227</pages><booktitle>ICML Unsupervised and Transfer Learning</booktitle><crossref>conf/icml/2011utl</crossref><year>2012</year><ee>http://jmlr.csail.mit.edu/proceedings/papers/v27/wilson12a.html</ee><url>db/journals/jmlr/jmlrp27.html#WilsonFT12</url></inproceedings>
<inproceedings mdate="2013-12-04" key="journals/jmlr/GlowackaDS12"><author>Dorota Glowacka</author><author>Louis Dorard</author><author>John Shawe-Taylor</author><title>Preface.</title><booktitle>ICML On-line Trading of Exploration and Exploitation</booktitle><year>2012</year><crossref>conf/icml/2011otee</crossref><ee>http://jmlr.csail.mit.edu/proceedings/papers/v26/glowacka12a/glowacka12a.pdf</ee><url>db/journals/jmlr/jmlrp26.html#GlowackaDS12</url></inproceedings>


A look at NIPS on its own:

<inproceedings mdate="2013-11-25" key="journals/jmlr/ZhengG10"><author>Cheng Zheng</author><author>Zhi Geng</author><title>Reverse Engineering of Asynchronous Boolean Networks.</title><pages>237-248</pages><booktitle>NIPS Causality: Objectives and Assessment</booktitle><year>2010</year><crossref>conf/nips/2008coa</crossref><ee>http://www.jmlr.org/proceedings/papers/v6/zheng10a.html</ee><url>db/journals/jmlr/jmlrp6.html#ZhengG10</url></inproceedings>
<inproceedings mdate="2013-11-25" key="journals/jmlr/WhiteCL11"><author>Halbert White</author><author>Karim Chalak</author><author>Xun Lu</author><title>Linking Granger Causality and the Pearl Causal Model with Settable Systems.</title><pages>1-29</pages><booktitle>NIPS Mini-Symposium on Causality in Time Series</booktitle><year>2011</year><crossref>conf/nips/2009mscts</crossref><ee>http://www.jmlr.org/proceedings/papers/v12/white11.htm</ee><url>db/journals/jmlr/jmlrp12.html#WhiteCL11</url></inproceedings>

A look at COLT on its own:

<inproceedings mdate="2013-11-25" key="journals/jmlr/BalcanCIW12"><author>Maria-Florina Balcan</author><author>Florin Constantin</author><author>Satoru Iwata</author><author>Lei Wang</author><title>Learning Valuation Functions.</title><pages>4.1-4.24</pages><booktitle>COLT</booktitle><year>2012</year><crossref>conf/colt/2012</crossref><ee>http://www.jmlr.org/proceedings/papers/v23/balcan12b/balcan12b.pdf</ee><url>db/journals/jmlr/jmlrp23.html#BalcanCIW12</url></inproceedings>


A look at SIGIR on its own:

<inproceedings mdate="2012-08-15" key="conf/sigir/RaveendranC12"><author>Gobaan Raveendran</author><author>Charles L. A. Clarke</author><title>Lightweight contrastive summarization for news comment mining.</title><pages>1103-1104</pages><year>2012</year><booktitle>SIGIR</booktitle><ee>http://doi.acm.org/10.1145/2348283.2348490</ee><crossref>conf/sigir/2012</crossref><url>db/conf/sigir/sigir2012.html#RaveendranC12</url></inproceedings><inproceedings mdate="2012-09-13" key="conf/sigir/KraftB84"><author>Donald H. Kraft</author><author>Duncan A. Buell</author><title>Advances in a Bayesian Decision Model of User Stopping Behaviour for Scanning the Output of an Information Retrieval System.</title><pages>421-433</pages><year>1984</year><booktitle>SIGIR</booktitle><url>db/conf/sigir/sigir84.html#KraftB84</url><ee>http://dl.acm.org/citation.cfm?id=636833</ee></inproceedings>


A look at SIGKDD on its own (its <booktitle> is KDD):

<inproceedings mdate="2010-08-09" key="conf/kdd/FeiH10"><author>Hongliang Fei</author><author>Jun Huan</author><title>Boosting with structure information in the functional space: an application to graph classification.</title><pages>643-652</pages><year>2010</year><booktitle>KDD</booktitle><ee>http://doi.acm.org/10.1145/1835804.1835886</ee><crossref>conf/kdd/2010</crossref><url>db/conf/kdd/kdd2010.html#FeiH10</url></inproceedings><inproceedings mdate="2015-08-10" key="conf/kdd/OuCWW015"><author>Mingdong Ou</author><author>Peng Cui</author><author>Fei Wang</author><author>Jun Wang</author><author>Wenwu Zhu 0001</author><title>Non-transitive Hashing with Latent Similarity Components.</title><pages>895-904</pages><year>2015</year><booktitle>KDD</booktitle><ee>http://doi.acm.org/10.1145/2783258.2783283</ee><crossref>conf/kdd/2015</crossref><url>db/conf/kdd/kdd2015.html#OuCWW015</url></inproceedings>


OK, that was a lot of effort going through them one by one. So what was it all for?

First, to determine which tags help us locate these 16 venues:

DBLPContentHandler.pubList = ["article", "inproceedings", "proceedings", "book", "incollection", "phdthesis", "mastersthesis", "www"]
Second, to determine which tags hold the data we are interested in:

DBLPContentHandler.fieldList = ["author", "editor", "title", "booktitle", "pages", "year", "address", "journal", "volume", "number", "month", "url", "ee", "cdrom", "cite", "publisher", "note", "crossref", "isbn", "series", "school", "chapter"]

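The tags marked in red answer the first question; the tags marked in black answer the second.

One more point to note: ICML, for example, has several different <booktitle> values, so the booktitle cannot be matched with a plain equality test: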
      

<booktitle>ICML Unsupervised and Transfer Learning</booktitle>
<booktitle>ICML On-line Trading of Exploration and Exploitation</booktitle>
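One way to handle this, sketched below in Python 3, is to compare only the first whitespace-separated word of the booktitle against the target names. The names CONF_NAMES and match_conference are illustrative. Note that this rule also accepts workshops published under a conference's name, as in the two ICML examples above; whether those should count is a judgement call.

```python
# Target conference names as they appear at the start of <booktitle>.
CONF_NAMES = ("SDM", "ICDM", "ECML/PKDD", "PAKDD", "WSDM", "DMKD",
              "CVPR", "ICML", "NIPS", "COLT", "SIGIR", "KDD")


def match_conference(booktitle):
    """Return the target conference named by the first word, or None."""
    words = booktitle.split()
    if not words:
        return None
    first = words[0]
    return first if first in CONF_NAMES else None
```

Matching on the whole first word, rather than using a string prefix test, avoids accepting an unrelated venue whose name merely begins with the same letters.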




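Below is the code that extracts the two outputs we need: allDB (the records of interest) and authorDB (author information, kept as a separate store):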

#!/usr/bin/env python
# -*- coding:utf-8 -*-
from xml.dom.minidom import parse

fileName = "dblp.xml"
# Conference names, matched against the first word of <booktitle>.
confNameDict = {"SDM": 1, "ICDM": 1, "ECML/PKDD": 1, "PAKDD": 1, "WSDM": 1, "DMKD": 1,
                "CVPR": 1, "ICML": 1, "NIPS": 1, "COLT": 1, "SIGIR": 1, "KDD": 1}
fromYear = "2000"
allList = []     # "confName \t year \t title \t author1|author2|..|authorn"
authorDict = {}  # author: [frequence, yearStart, yearEnd]

if __name__ == "__main__":
    domTree = parse(fileName)
    dblp = domTree.documentElement
    inproceedingsList = dblp.getElementsByTagName("inproceedings")
    for inproceedings in inproceedingsList:
        # Years are four-digit strings, so string comparison orders them correctly.
        year = inproceedings.getElementsByTagName("year")[0]
        yearStr = year.childNodes[0].data
        if yearStr < fromYear:
            continue
        print("yearStr", yearStr, "==" * 20)

        booktitle = inproceedings.getElementsByTagName("booktitle")[0]
        booktitleStr = booktitle.childNodes[0].data
        # Keep only the first word, e.g. for
        # "<booktitle>ICML Unsupervised and Transfer Learning</booktitle>"
        booktitleStr = booktitleStr.split(" ")[0]
        if booktitleStr not in confNameDict:
            continue
        print("booktitleStr", booktitleStr, "^^" * 20)

        allContent = booktitleStr + "\t" + yearStr + "\t"  # confName \t year \t
        title = inproceedings.getElementsByTagName("title")[0]
        titleStr = title.childNodes[0].data
        allContent += titleStr + "\t"  # title \t
        authorList = inproceedings.getElementsByTagName("author")
        for author in authorList:
            authorStr = author.childNodes[0].data
            allContent += authorStr + "|"  # authori|
            if authorStr in authorDict:
                # Known author: bump the count and widen the year range.
                authorDict[authorStr][0] += 1
                if yearStr < authorDict[authorStr][1]:
                    authorDict[authorStr][1] = yearStr
                elif yearStr > authorDict[authorStr][2]:
                    authorDict[authorStr][2] = yearStr
            else:
                authorDict[authorStr] = [1, yearStr, yearStr]
        allList.append(allContent)

    with open("allDB.txt", "w", encoding="utf-8") as wf:
        wf.write("\n".join(allList))

    # Sort authors by [frequence, yearStart, yearEnd], largest first.
    authorList = sorted(authorDict.items(), key=lambda item: item[1], reverse=True)
    with open("authorDB.txt", "w", encoding="utf-8") as wf:
        wf.write("\n".join(
            author + "\t" + str(frequence) + "\t" + yearStart + "\t" + yearEnd
            for author, (frequence, yearStart, yearEnd) in authorList))



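This code parses the file directly with xml.dom, which has to read all the data into memory and then build the DOM tree. I ran it on a server; if you are on a desktop machine, SAX parsing is recommended instead (see: http://www.runoob.com/python/python-xml.html).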
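For readers who want the SAX route, the following Python 3 sketch shows what a streaming version could look like. It is not the author's code: the class name InproceedingsHandler, the CONF_NAMES set, and the record layout are assumptions made for illustration. It keeps only the current record in memory and applies the same first-word booktitle match and year filter as the DOM script. The real dblp.xml relies on dblp.dtd for character entities; how those are resolved depends on parser settings and is not handled here.

```python
import xml.sax

CONF_NAMES = {"SDM", "ICDM", "ECML/PKDD", "PAKDD", "WSDM", "DMKD",
              "CVPR", "ICML", "NIPS", "COLT", "SIGIR", "KDD"}


class InproceedingsHandler(xml.sax.ContentHandler):
    """Collect (conference, year, title, authors) for matching records."""

    FIELDS = ("author", "title", "year", "booktitle")

    def __init__(self, from_year="2000"):
        super().__init__()
        self.from_year = from_year
        self.records = []
        self._record = None  # fields of the <inproceedings> being read
        self._field = None   # name of the field element being read
        self._text = []      # text chunks of the current field

    def startElement(self, name, attrs):
        if name == "inproceedings":
            self._record = {"author": [], "title": "", "year": "", "booktitle": ""}
        elif self._record is not None and name in self.FIELDS:
            self._field = name
            self._text = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._field is not None:
            self._text.append(content)

    def endElement(self, name):
        if self._record is None:
            return
        if name == self._field:
            value = "".join(self._text).strip()
            if name == "author":
                self._record["author"].append(value)
            else:
                self._record[name] = value
            self._field = None
        elif name == "inproceedings":
            rec, self._record = self._record, None
            words = rec["booktitle"].split()
            conf = words[0] if words else ""
            # Four-digit year strings compare correctly as strings.
            if conf in CONF_NAMES and rec["year"] >= self.from_year:
                self.records.append((conf, rec["year"], rec["title"], rec["author"]))
```

A run over the full file would look like handler = InproceedingsHandler() followed by xml.sax.parse("dblp.xml", handler), after which handler.records holds the filtered tuples. For a file this size it may be preferable to write each record out as it is found instead of collecting them in a list.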

Below is the test dataset. (Several attempts to upload it failed.)

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE dblp SYSTEM "dblp.dtd">
<dblp>
<inproceedings mdate="2014-02-12" key="conf/sdm/HanN08"><author>Shuguo Han</author><author>Wee Keong Ng</author><title>Preemptive Measures against Malicious Party in Privacy-Preserving Data Mining.</title><pages>375-386</pages><year>2008</year><booktitle>SDM</booktitle><ee>http://dx.doi.org/10.1137/1.9781611972788.34</ee><crossref>conf/sdm/2008</crossref><url>db/conf/sdm/sdm2008.html#HanN08</url></inproceedings>
<inproceedings mdate="2015-12-30" key="conf/sdm/LiGGDZ15"><author>Kang Li</author><author>Jing Gao</author><author>Suxin Guo</author><author>Nan Du</author><author>Aidong Zhang</author><title>Functional Node Detection on Linked Data.</title><pages>1-9</pages><year>2015</year><booktitle>SDM</booktitle><ee>http://dx.doi.org/10.1137/1.9781611974010.1</ee><crossref>conf/sdm/2015</crossref><url>db/conf/sdm/sdm2015.html#LiGGDZ15</url></inproceedings>
<inproceedings mdate="2014-09-17" key="conf/icdm/LazarevicKKKT03"><author>Aleksandar Lazarevic</author><author>Ramdev Kanapady</author><author>Chandrika Kamath</author><author>Vipin Kumar</author><author>Kumar K. Tamma</author><title>Localized Prediction of Continuous Target Variables Using Hierarchical Clustering.</title><pages>139-146</pages><year>2003</year><crossref>conf/icdm/2003</crossref><booktitle>ICDM</booktitle><ee>http://dx.doi.org/10.1109/ICDM.2003.1250913</ee><ee>http://doi.ieeecomputersociety.org/10.1109/ICDM.2003.1250913</ee><url>db/conf/icdm/icdm2003.html#LazarevicKKKT03</url></inproceedings>
<inproceedings mdate="2014-09-17" key="conf/icdm/CampagnaP09"><author>Andrea Campagna</author><author>Rasmus Pagh</author><title>Finding Associations and Computing Similarity via Biased Pair Sampling.</title><pages>61-70</pages><year>2009</year><booktitle>ICDM</booktitle><ee>http://dx.doi.org/10.1109/ICDM.2009.35</ee><ee>http://doi.ieeecomputersociety.org/10.1109/ICDM.2009.35</ee><crossref>conf/icdm/2009</crossref><url>db/conf/icdm/icdm2009.html#CampagnaP09</url></inproceedings>
<inproceedings mdate="2013-08-30" key="conf/pkdd/TomasevM13a"><author>Nenad Tomasev</author><author>Dunja Mladenic</author><title>Image Hub Explorer: Evaluating Representations and Metrics for Content-Based Image Retrieval and Object Recognition.</title><pages>637-640</pages><year>2013</year><booktitle>ECML/PKDD (3)</booktitle><ee>http://dx.doi.org/10.1007/978-3-642-40994-3_44</ee><crossref>conf/pkdd/2013-3</crossref><url>db/conf/pkdd/pkdd2013-3.html#TomasevM13a</url></inproceedings>
<inproceedings mdate="2015-08-30" key="conf/pkdd/BudhathokiV15"><author>Kailash Budhathoki</author><author>Jilles Vreeken</author><title>The Difference and the Norm - Characterising Similarities and Differences Between Databases.</title><pages>206-223</pages><year>2015</year><booktitle>ECML/PKDD (2)</booktitle><ee>http://dx.doi.org/10.1007/978-3-319-23525-7_13</ee><crossref>conf/pkdd/2015-2</crossref><url>db/conf/pkdd/pkdd2015-2.html#BudhathokiV15</url></inproceedings>
<inproceedings mdate="2008-05-15" key="conf/pakdd/HanN08"><author>Shuguo Han</author><author>Wee Keong Ng</author><title>Privacy-Preserving Linear Fisher Discriminant Analysis.</title><pages>136-147</pages><year>2008</year><booktitle>PAKDD</booktitle><ee>http://dx.doi.org/10.1007/978-3-540-68125-0_14</ee><crossref>conf/pakdd/2008</crossref><url>db/conf/pakdd/pakdd2008.html#HanN08</url></inproceedings>
<inproceedings mdate="2005-05-18" key="conf/pakdd/BoWJ05"><author>Liefeng Bo</author><author>Ling Wang</author><author>Licheng Jiao</author><title>Training Support Vector Machines Using Greedy Stagewise Algorithm.</title><pages>632-638</pages><year>2005</year><crossref>conf/pakdd/2005</crossref><booktitle>PAKDD</booktitle><ee>http://dx.doi.org/10.1007/11430919_73</ee><url>db/conf/pakdd/pakdd2005.html#BoWJ05</url></inproceedings>
<inproceedings mdate="2011-01-31" key="conf/wsdm/Kawamae11a"><author>Noriaki Kawamae</author><title>Predicting future reviews: sentiment analysis models for collaborative filtering.</title><pages>605-614</pages><year>2011</year><booktitle>WSDM</booktitle><ee>http://doi.acm.org/10.1145/1935826.1935911</ee><crossref>conf/wsdm/2011</crossref><url>db/conf/wsdm/wsdm2011.html#Kawamae11a</url></inproceedings>
<inproceedings mdate="2015-01-29" key="conf/wsdm/TangCAL15"><author>Jiliang Tang</author><author>Shiyu Chang</author><author>Charu C. Aggarwal</author><author>Huan Liu</author><title>Negative Link Prediction in Social Media.</title><pages>87-96</pages><year>2015</year><booktitle>WSDM</booktitle><ee>http://doi.acm.org/10.1145/2684822.2685295</ee><crossref>conf/wsdm/2015</crossref><url>db/conf/wsdm/wsdm2015.html#TangCAL15</url></inproceedings>
<inproceedings mdate="2003-04-04" key="conf/dmkd/KantarciogluC02"><author>Murat Kantarcioglu</author><author>Chris Clifton</author><title>Privacy-Preserving Distributed Mining of Association Rules on Horizontally Partitioned Data.</title><year>2002</year><booktitle>DMKD</booktitle><ee>http://www.bell-labs.com/user/minos/DMKD02/Papers/kantarcioglu.pdf</ee><url>db/conf/dmkd/dmkd2002.html#KantarciogluC02</url></inproceedings>
<inproceedings mdate="2003-04-04" key="conf/dmkd/ZhuFAE02"><author>Xingquan Zhu</author><author>Jianping Fan</author><author>Walid G. Aref</author><author>Ahmed K. Elmagarmid</author><title>ClassMiner: Mining Medical Video Content Structure and Events Towards Efficient Access and Scalable Skimming.</title><year>2002</year><booktitle>DMKD</booktitle><ee>http://www.bell-labs.com/user/minos/DMKD02/Papers/zhu.pdf</ee><url>db/conf/dmkd/dmkd2002.html#ZhuFAE02</url></inproceedings>
<inproceedings mdate="2014-07-31" key="conf/cvpr/BrandOP97"><author>Matthew Brand</author><author>Nuria Oliver</author><author>Alex Pentland</author><title>Coupled hidden Markov models for complex action recognition.</title><pages>994-999</pages><year>1997</year><crossref>conf/cvpr/1997</crossref><booktitle>CVPR</booktitle><ee>http://dx.doi.org/10.1109/CVPR.1997.609450</ee><ee>http://doi.ieeecomputersociety.org/10.1109/CVPR.1997.609450</ee><url>db/conf/cvpr/cvpr1997.html#BrandOP97</url></inproceedings>
<inproceedings mdate="2014-07-30" key="conf/cvpr/LiGK09"><author>Yan Li</author><author>Leon Gu</author><author>Takeo Kanade</author><title>A robust shape model for multi-view car alignment.</title><pages>2466-2473</pages><year>2009</year><booktitle>CVPR</booktitle><ee>http://dx.doi.org/10.1109/CVPRW.2009.5206799</ee><ee>http://doi.ieeecomputersociety.org/10.1109/CVPRW.2009.5206799</ee><crossref>conf/cvpr/2009</crossref><url>db/conf/cvpr/cvpr2009.html#LiGK09</url></inproceedings>
<inproceedings mdate="2013-11-25" key="journals/jmlr/WilsonFT12"><author>Aaron Wilson</author><author>Alan Fern</author><author>Prasad Tadepalli</author><title>Transfer Learning in Sequential Decision Problems: A Hierarchical Bayesian Approach.</title><pages>217-227</pages><booktitle>ICML Unsupervised and Transfer Learning</booktitle><crossref>conf/icml/2011utl</crossref><year>2012</year><ee>http://jmlr.csail.mit.edu/proceedings/papers/v27/wilson12a.html</ee><url>db/journals/jmlr/jmlrp27.html#WilsonFT12</url></inproceedings>
<inproceedings mdate="2013-12-04" key="journals/jmlr/GlowackaDS12"><author>Dorota Glowacka</author><author>Louis Dorard</author><author>John Shawe-Taylor</author><title>Preface.</title><booktitle>ICML On-line Trading of Exploration and Exploitation</booktitle><year>2012</year><crossref>conf/icml/2011otee</crossref><ee>http://jmlr.csail.mit.edu/proceedings/papers/v26/glowacka12a/glowacka12a.pdf</ee><url>db/journals/jmlr/jmlrp26.html#GlowackaDS12</url></inproceedings>
<inproceedings mdate="2013-11-25" key="journals/jmlr/ZhengG10"><author>Cheng Zheng</author><author>Zhi Geng</author><title>Reverse Engineering of Asynchronous Boolean Networks.</title><pages>237-248</pages><booktitle>NIPS Causality: Objectives and Assessment</booktitle><year>2010</year><crossref>conf/nips/2008coa</crossref><ee>http://www.jmlr.org/proceedings/papers/v6/zheng10a.html</ee><url>db/journals/jmlr/jmlrp6.html#ZhengG10</url></inproceedings>
<inproceedings mdate="2013-11-25" key="journals/jmlr/WhiteCL11"><author>Halbert White</author><author>Karim Chalak</author><author>Xun Lu</author><title>Linking Granger Causality and the Pearl Causal Model with Settable Systems.</title><pages>1-29</pages><booktitle>NIPS Mini-Symposium on Causality in Time Series</booktitle><year>2011</year><crossref>conf/nips/2009mscts</crossref><ee>http://www.jmlr.org/proceedings/papers/v12/white11.htm</ee><url>db/journals/jmlr/jmlrp12.html#WhiteCL11</url></inproceedings>
<inproceedings mdate="2013-11-25" key="journals/jmlr/BalcanCIW12"><author>Maria-Florina Balcan</author><author>Florin Constantin</author><author>Satoru Iwata</author><author>Lei Wang</author><title>Learning Valuation Functions.</title><pages>4.1-4.24</pages><booktitle>COLT</booktitle><year>2012</year><crossref>conf/colt/2012</crossref><ee>http://www.jmlr.org/proceedings/papers/v23/balcan12b/balcan12b.pdf</ee><url>db/journals/jmlr/jmlrp23.html#BalcanCIW12</url></inproceedings>
<inproceedings mdate="2012-08-15" key="conf/sigir/RaveendranC12"><author>Gobaan Raveendran</author><author>Charles L. A. Clarke</author><title>Lightweight contrastive summarization for news comment mining.</title><pages>1103-1104</pages><year>2012</year><booktitle>SIGIR</booktitle><ee>http://doi.acm.org/10.1145/2348283.2348490</ee><crossref>conf/sigir/2012</crossref><url>db/conf/sigir/sigir2012.html#RaveendranC12</url></inproceedings>
<inproceedings mdate="2012-09-13" key="conf/sigir/KraftB84"><author>Donald H. Kraft</author><author>Duncan A. Buell</author><title>Advances in a Bayesian Decision Model of User Stopping Behaviour for Scanning the Output of an Information Retrieval System.</title><pages>421-433</pages><year>1984</year><booktitle>SIGIR</booktitle><url>db/conf/sigir/sigir84.html#KraftB84</url><ee>http://dl.acm.org/citation.cfm?id=636833</ee></inproceedings>
<inproceedings mdate="2010-08-09" key="conf/kdd/FeiH10"><author>Hongliang Fei</author><author>Jun Huan</author><title>Boosting with structure information in the functional space: an application to graph classification.</title><pages>643-652</pages><year>2010</year><booktitle>KDD</booktitle><ee>http://doi.acm.org/10.1145/1835804.1835886</ee><crossref>conf/kdd/2010</crossref><url>db/conf/kdd/kdd2010.html#FeiH10</url></inproceedings>
<inproceedings mdate="2015-08-10" key="conf/kdd/OuCWW015"><author>Mingdong Ou</author><author>Peng Cui</author><author>Fei Wang</author><author>Jun Wang</author><author>Wenwu Zhu 0001</author><title>Non-transitive Hashing with Latent Similarity Components.</title><pages>895-904</pages><year>2015</year><booktitle>KDD</booktitle><ee>http://doi.acm.org/10.1145/2783258.2783283</ee><crossref>conf/kdd/2015</crossref><url>db/conf/kdd/kdd2015.html#OuCWW015</url></inproceedings>
</dblp>


Partial contents of allDB.txt (fields are tab-separated):

SDM	2008	Preemptive Measures against Malicious Party in Privacy-Preserving Data Mining.	Shuguo Han|Wee Keong Ng|
SDM	2015	Functional Node Detection on Linked Data.	Kang Li|Jing Gao|Suxin Guo|Nan Du|Aidong Zhang|
ICDM	2003	Localized Prediction of Continuous Target Variables Using Hierarchical Clustering.	Aleksandar Lazarevic|Ramdev Kanapady|Chandrika Kamath|Vipin Kumar|Kumar K. Tamma|
ICDM	2009	Finding Associations and Computing Similarity via Biased Pair Sampling.	Andrea Campagna|Rasmus Pagh|
ECML/PKDD	2013	Image Hub Explorer: Evaluating Representations and Metrics for Content-Based Image Retrieval and Object Recognition.	Nenad Tomasev|Dunja Mladenic|
ECML/PKDD	2015	The Difference and the Norm - Characterising Similarities and Differences Between Databases.	Kailash Budhathoki|Jilles Vreeken|
PAKDD	2008	Privacy-Preserving Linear Fisher Discriminant Analysis.	Shuguo Han|Wee Keong Ng|
PAKDD	2005	Training Support Vector Machines Using Greedy Stagewise Algorithm.	Liefeng Bo|Ling Wang|Licheng Jiao|
WSDM	2011	Predicting future reviews: sentiment analysis models for collaborative filtering.	Noriaki Kawamae|
WSDM	2015	Negative Link Prediction in Social Media.	Jiliang Tang|Shiyu Chang|Charu C. Aggarwal|Huan Liu|
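Since the goal of this series is to mine co-authorship with FP-Growth, each allDB.txt line eventually has to become a list of authors (one transaction per paper). The Python 3 sketch below shows one way to read the file back, assuming four tab-separated fields per line, no tabs inside titles, and UTF-8 encoding. The function names parse_alldb_line and load_transactions are illustrative and do not come from the original post.

```python
def parse_alldb_line(line):
    """Split one allDB.txt line into (conf, year, title, authors)."""
    conf, year, title, authors = line.rstrip("\n").split("\t")
    # The writer appends "|" after every author, so drop the empty tail piece.
    return conf, year, title, [a for a in authors.split("|") if a]


def load_transactions(path):
    """Return one author list per paper, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [parse_alldb_line(line)[3] for line in f if line.strip()]
```

With the sample above, the WSDM 2011 line would yield a single-author transaction and the SDM 2015 line a five-author one.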

Partial contents of authorDB.txt (author, paper count, first year, last year; tab-separated):

Shuguo Han	2	2008	2008
Wee Keong Ng	2	2008	2008
Jilles Vreeken	1	2015	2015
Aidong Zhang	1	2015	2015
Jing Gao	1	2015	2015
Suxin Guo	1	2015	2015
Fei Wang	1	2015	2015
Shiyu Chang	1	2015	2015
Nan Du	1	2015	2015
Wenwu Zhu 0001	1	2015	2015
Peng Cui	1	2015	2015
Huan Liu	1	2015	2015
Kang Li	1	2015	2015
Mingdong Ou	1	2015	2015
Charu C. Aggarwal	1	2015	2015
Jun Wang	1	2015	2015
Jiliang Tang	1	2015	2015
Kailash Budhathoki	1	2015	2015
Nenad Tomasev	1	2013	2013
Dunja Mladenic	1	2013	2013






Reference: http://ju.outofmemory.cn/entry/137734
