Abstracts


 

Combination of Ant Colony Optimization and Bayesian Classification for Feature Selection in a Bioinformatics Dataset

Received March 31, 2009; Accepted June 14, 2009; Published June 15, 2009. Full text:

 

http://www.omicsonline.com/ArchiveJCSB/2009/June/03/JCSB2.186.php?email1=

 

 

 

2008

Application of ant colony optimization for feature selection in text categorization

Abstract

Feature selection is commonly used to reduce the dimensionality of datasets with tens or hundreds of thousands of features. A major problem in text categorization is the high dimensionality of the feature space; therefore, feature selection is the most important step in text categorization. This paper presents a novel feature selection algorithm based on ant colony optimization (ACO). ACO is inspired by observations of real ants searching for the shortest paths to food sources. The proposed algorithm is easy to implement and, because it uses a simple classifier, its computational complexity is very low. Its performance is compared with that of the information gain and CHI algorithms on the task of feature selection in the Reuters-21578 dataset. Simulation results on the Reuters-21578 dataset show the superiority of the proposed algorithm.
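The abstract does not give the algorithm's details, so the following is only a minimal sketch of how an ACO-style search over feature subsets can work, not the authors' method. It assumes one pheromone value per feature, a fixed subset size, and an update that evaporates all pheromone and then reinforces the best subset of each iteration. The names `aco_feature_selection`, `fitness`, `heuristic`, `toy_fitness` and all parameter defaults are hypothetical. The paper scores subsets with a simple classifier; a toy scoring function stands in for it here.

```python
import random


def aco_feature_selection(n_features, subset_size, fitness, heuristic=None,
                          n_ants=10, n_iterations=20, alpha=1.0, beta=1.0,
                          rho=0.2, seed=0):
    """Search for a good feature subset with a simple ant colony scheme.

    fitness(subset) must return a finite, non-negative score (higher is
    better) and must not depend on the order of the features in `subset`.
    heuristic, if given, must hold one positive desirability per feature.
    Returns (best_subset, best_score).
    """
    rng = random.Random(seed)
    heuristic = heuristic or [1.0] * n_features  # assumption: uniform prior
    pheromone = [1.0] * n_features               # assumption: uniform start
    best_subset, best_score = None, float("-inf")

    for _ in range(n_iterations):
        iteration_best, iteration_score = None, float("-inf")
        for _ in range(n_ants):
            # Solution construction: each ant adds one feature at a time,
            # chosen with probability proportional to
            # pheromone^alpha * heuristic^beta among the unused features.
            remaining = list(range(n_features))
            subset = []
            while len(subset) < subset_size and remaining:
                weights = [pheromone[f] ** alpha * heuristic[f] ** beta
                           for f in remaining]
                chosen = rng.choices(remaining, weights=weights, k=1)[0]
                subset.append(chosen)
                remaining.remove(chosen)
            score = fitness(subset)
            if score > iteration_score:
                iteration_best, iteration_score = subset, score

        if iteration_score > best_score:
            best_subset, best_score = sorted(iteration_best), iteration_score

        # Pheromone update: evaporate everywhere, then deposit on the
        # features of this iteration's best subset.
        pheromone = [(1.0 - rho) * p for p in pheromone]
        for f in iteration_best:
            pheromone[f] += max(iteration_score, 0.0)

    return best_subset, best_score


# Toy stand-in for classifier accuracy: count how many selected features
# belong to a made-up set of "relevant" features.
RELEVANT = {0, 3, 7}


def toy_fitness(subset):
    return float(len(RELEVANT.intersection(subset)))


best, score = aco_feature_selection(n_features=10, subset_size=3,
                                    fitness=toy_fitness)
print(best, score)
```

In a real experiment `fitness` would train and evaluate a classifier on the selected features, which is where almost all of the running time goes. A cheap classifier therefore keeps the whole search inexpensive.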

 

 

2009

Text feature selection using ant colony optimization

 

Feature selection and feature extraction are the most important steps in classification systems. Feature selection is commonly used to reduce the dimensionality of datasets with tens or hundreds of thousands of features, which would otherwise be impossible to process further. One problem in which feature selection is essential is text categorization. A major problem in text categorization is the high dimensionality of the feature space; therefore, feature selection is the most important step in text categorization. Many methods currently exist for text feature selection. To improve the performance of text categorization, we present a novel feature selection algorithm based on ant colony optimization (ACO). ACO is inspired by observations of real ants searching for the shortest paths to food sources. The proposed algorithm is easy to implement and, because it uses a simple classifier, its computational complexity is very low. Its performance is compared with that of a genetic algorithm, information gain, and CHI on the task of feature selection in the Reuters-21578 dataset. Simulation results on the Reuters-21578 dataset show the superiority of the proposed algorithm.
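The two statistical baselines named in the abstract, information gain and CHI, score each term independently of the others. As an illustration only, the sketch below computes both from a 2x2 term/category contingency table. It assumes the common one-category-versus-rest formulation, which may differ from the exact definitions used in the paper. The function names and the argument convention `a, b, c, d` are hypothetical.

```python
import math


def chi_square(a, b, c, d):
    """Chi-square statistic for one term and one category.

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category that lack the term
    d: documents outside the category that lack the term
    """
    n = a + b + c + d
    denominator = (a + c) * (b + d) * (a + b) * (c + d)
    if denominator == 0:
        return 0.0  # a row or column is empty, so the term tells us nothing
    return n * (a * d - c * b) ** 2 / denominator


def entropy(counts):
    """Shannon entropy in bits of a distribution given as raw counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((k / total) * math.log2(k / total) for k in counts if k > 0)


def information_gain(a, b, c, d):
    """Reduction in category entropy from knowing whether the term occurs."""
    n = a + b + c + d
    if n == 0:
        return 0.0
    h_category = entropy([a + c, b + d])
    h_given_term = ((a + b) / n) * entropy([a, b]) \
        + ((c + d) / n) * entropy([c, d])
    return h_category - h_given_term


# A term that appears in every document of the category and nowhere else.
print(chi_square(10, 0, 0, 10), information_gain(10, 0, 0, 10))
# A term that is spread evenly and carries no information.
print(chi_square(5, 5, 5, 5), information_gain(5, 5, 5, 5))
```

Filter methods like these rank terms one at a time and keep the top scorers. A subset search such as ACO or a genetic algorithm instead scores whole feature subsets.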

 

2009

Keyword Combination Extraction in Text Categorization Based on Ant Colony Optimization

Malacca, Malaysia
December 4-7, 2009
ISBN: 978-0-7695-3879-2
Zi-jun Yu, Wei-gang Wu, Jing Xiao, Jun Zhang, Rui-Zhang Huang, Ou Liu, "Keyword Combination Extraction in Text Categorization Based on Ant Colony Optimization," 2009 International Conference of Soft Computing and Pattern Recognition, pp. 430-435, 2009.
 
 
BibTeX
@inproceedings{10.1109/SoCPaR.2009.90,
  author = {Zi-jun Yu and Wei-gang Wu and Jing Xiao and Jun Zhang and Rui-Zhang Huang and Ou Liu},
  title = {Keyword Combination Extraction in Text Categorization Based on Ant Colony Optimization},
  booktitle = {2009 International Conference of Soft Computing and Pattern Recognition},
  year = {2009},
  isbn = {978-0-7695-3879-2},
  pages = {430--435},
  doi = {10.1109/SoCPaR.2009.90},
  publisher = {IEEE Computer Society},
  address = {Los Alamitos, CA, USA},
}
 
 
Abstract

Due to the increasing number of documents in digital form, automated text categorization (TC) has become more and more promising over the last ten years. A TC system can automatically assign a document to the most suitable category, but the reason for such an assignment is usually unknown to users. To make a TC system interpretable, it is necessary to select a group of keywords, termed a keyword combination, to describe each text category. In this paper, we propose a novel algorithm, keyword combination extraction based on ant colony optimization (KCEACO), to search for the optimal keyword combination of a target category. By extending traditional feature selection techniques, an evaluation function is designed for evaluating a keyword combination. This function takes into account the relationships among different keywords. Experimental results show that KCEACO can efficiently find the optimal keyword combination from a large number of candidate combinations.

Keywords: ant colony optimization; concept learning; feature selection; keyword combination extraction; text categorization
 
 
Expert Systems with Applications
Volume 36, Issue 3, Part 2, April 2009, Pages 6843-6853

doi:10.1016/j.eswa.2008.08.022
Copyright © 2008 Elsevier Ltd. All rights reserved.


Text feature selection using ant colony optimization

Mehdi Hosseinzadeh Aghdam (corresponding author), Nasser Ghasem-Aghaee and Mohammad Ehsan Basiri

Department of Computer Engineering, Faculty of Engineering, University of Isfahan, Hezar Jerib Avenue, Esfahan, Iran


Available online 12 August 2008.


Keywords: Feature selection; Ant colony optimization; Genetic algorithm; Text categorization

Article Outline

1. Introduction
2. Feature selection approaches
3. Ant colony optimization (ACO)
3.1. Ant colony optimization for feature selection
3.1.1. Graph representation
3.1.2. Heuristic desirability
3.1.3. Pheromone update rule
3.1.4. Solution construction
4. Proposed feature selection algorithm
5. Genetic algorithm (GA)
5.1. Genetic algorithm for feature selection
6. Statistical approaches
6.1. Information gain (IG)
6.2. χ² statistic (CHI)
7. Experimental results
7.1. Dataset
7.2. Feature extraction
7.3. Performance measure
7.4. Results
8. Conclusion
Acknowledgements
References
