TopicModelCode

Topic Models C++

This is a C++ implementation of topic models with variational inference.
It includes LDA, supervised LDA, HDP, supervised HDP, online HDP, and online SHDP.
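
For readers new to variational inference, the per-document update for plain LDA described in Blei et al. (2003) can be sketched as below. This is a standalone illustration and not code from this package: the names `digamma` and `inferGamma` are hypothetical, the topic-word matrix `beta` is assumed to be given and strictly positive, and a fixed iteration count stands in for a proper convergence check.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Digamma function for x > 0: upward recurrence, then an asymptotic series.
double digamma(double x) {
    assert(x > 0.0);
    double result = 0.0;
    while (x < 6.0) {  // psi(x) = psi(x + 1) - 1/x
        result -= 1.0 / x;
        x += 1.0;
    }
    const double inv = 1.0 / x;
    const double inv2 = inv * inv;
    // psi(x) ~ ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    result += std::log(x) - 0.5 * inv
              - inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0));
    return result;
}

// Variational update for one document:
//   phi[n][k] is proportional to beta[k][w_n] * exp(digamma(gamma[k]))
//   gamma[k]  = alpha + sum over n of phi[n][k]
// words: word ids of the document; beta[k][w]: probability of word w in topic k.
std::vector<double> inferGamma(const std::vector<int>& words,
                               const std::vector<std::vector<double>>& beta,
                               double alpha, int iterations) {
    const std::size_t K = beta.size();
    const std::size_t N = words.size();
    std::vector<double> gamma(K, alpha + static_cast<double>(N) / static_cast<double>(K));
    std::vector<double> phi(K);
    for (int it = 0; it < iterations; ++it) {
        std::vector<double> next(K, alpha);
        for (std::size_t n = 0; n < N; ++n) {
            double norm = 0.0;
            for (std::size_t k = 0; k < K; ++k) {
                phi[k] = beta[k][static_cast<std::size_t>(words[n])] * std::exp(digamma(gamma[k]));
                norm += phi[k];
            }
            // Normalize phi over topics and accumulate it into the new gamma.
            for (std::size_t k = 0; k < K; ++k) next[k] += phi[k] / norm;
        }
        gamma = next;
    }
    return gamma;
}
```

Because each word's `phi` sums to one over topics, the returned `gamma` always sums to `K * alpha + N`, which is a convenient sanity check when experimenting with the sketch.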

Download the code here

Please cite [Bibtex]


Install:

1. This code requires gcc 4.8.
    If you use Ubuntu 12.04 and do not have gcc 4.8, this link may be helpful.
    You may also want to put
          export CC=gcc-4.8
          export CXX=g++-4.8
    in your .bashrc file :)
2. This code depends on the matrix manipulation library Buola.
    Download Buola here
    Buola is a very nice matrix manipulation library implemented by Xavi Gratal.
   To install Buola:
            cd minibuola
            mkdir build
            cd build
            cmake .. 
            make -j5
            sudo make install

3. This code depends on GSL (the GNU Scientific Library).
    For information about GSL, click here.

4. Woohoo! Now you can compile Topic Models C++
   in the standard way:
        mkdir build && cd build
        cmake ..
        make


Play with Topic Models

A 3-class subset of the KTH action data, for fun.
This data is preprocessed with bag-of-STIP.
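
Assuming bag-of-STIP follows the usual bag-of-words recipe (each space-time interest point descriptor is assigned to its nearest codebook entry, and a video becomes a histogram of codeword counts), the counting step can be sketched as below. The function name `bagOfWords` is hypothetical, the codeword ids are assumed to be computed already, and the on-disk layout of Train.dat is not described here, so this sketch makes no claim about it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Count how often each codeword occurs in one video.
// codewordIds: one codebook index per detected interest point (assumed input).
std::vector<int> bagOfWords(const std::vector<int>& codewordIds,
                            std::size_t vocabularySize) {
    std::vector<int> counts(vocabularySize, 0);
    for (int id : codewordIds) {
        // Every id must be a valid index into the codebook.
        assert(id >= 0 && static_cast<std::size_t>(id) < vocabularySize);
        ++counts[static_cast<std::size_t>(id)];
    }
    return counts;
}
```

The resulting count vector plays the role of a document's word counts in a topic model.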

To check the options:
./TopicModel --help


Example 1: SLDA
./TopicModel --slda --alpha 0.1 --corpus_name KTH --data YOURPATH/KTH/Train.dat --label YOURPATH/KTH/ImgLabel.txt --test YOURPATH/KTH/Test.dat --shuffle --num_classes 3 -k 30 --truth YOURPATH/KTH/GroundTruth.txt --seed 2
The result will be:
0 0 2 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 1 1 1 1 1 1 2 1 1 1 2 1 1 1 1 1 1 2 1 2 2 2 2 2 2 2 2 1 1 2 2 2 2 2 0 2 2 2 
accuracy:0.84745762711864403016
Example 2: SHDP
./TopicModel  --corpus_name KTH --data YOURPATH/KTH/Train.dat --label YOURPATH/KTH/ImgLabel.txt --test YOURPATH/KTH/Test.dat --shuffle --num_classes 3  -k 80 -t 20 --truth YOURPATH/KTH/GroundTruth.txt --seed 2

The result will be:
0 0 0 2 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 1 1 2 2 2 2 2 0 0 2 2 
accuracy:0.864407
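
The accuracy printed under each result is presumably the fraction of test items whose predicted label matches the corresponding entry in GroundTruth.txt. Each example prints 59 labels, and the reported values agree with 50/59 ≈ 0.8475 and 51/59 ≈ 0.8644. A minimal sketch of that computation follows; the function name `accuracy` is hypothetical and not taken from the TopicModel source.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fraction of positions where the predicted label equals the true label.
// Both vectors must be non-empty and of equal length.
double accuracy(const std::vector<int>& predicted, const std::vector<int>& truth) {
    assert(!predicted.empty() && predicted.size() == truth.size());
    std::size_t correct = 0;
    for (std::size_t i = 0; i < predicted.size(); ++i) {
        if (predicted[i] == truth[i]) ++correct;
    }
    return static_cast<double>(correct) / static_cast<double>(predicted.size());
}
```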

To use LDA, pass --lda.
To use HDP, pass --hdp.
In these cases the label file is no longer needed.

onlineSHDP and onlineHDP need a large amount of data to converge, so they do not work on the small KTH data used as the example here.


References
LDA:  D. M. Blei, A. Y. Ng, and M. I. Jordan. Latent Dirichlet Allocation. Journal of Machine Learning Research, 3:993–1022, 2003. 

SLDA: C. Wang, D. M. Blei, and L. Fei-Fei. Simultaneous image classification and annotation. In CVPR, 2009. 

HDP: Y. W. Teh, M. I. Jordan, M. J. Beal, and D. M. Blei. Hierarchical Dirichlet processes. Journal of the American Statistical Association, 101(476):1566–1581, 2006.

SHDP & onlineSHDP: C. Zhang, C. H. Ek, X. Gratal, F. Pokorny, and H. Kjellström. Supervised Hierarchical Dirichlet Process with Variational Inference. In ICCV, 2013.
PS: The supplement of this paper gives the computation of the bound and the update equations in detail. Recommended for beginners.

OnlineHDP: C. Wang, J. Paisley, and D. Blei. Online variational inference for the Hierarchical Dirichlet Process. In AISTATS, 2011.
 
