Segmentation Reading List
Word Segmentation and Word Discovery
Reference & Comment
1. Ogawa, Yasushi; Matsuda, Toru
1999.
Overlapping statistical segmentation for effective indexing of Japanese text.
Information Processing & Management, Volume 35, Issue 4, pp. 463-480.
2. Kohlmorgen, J. and Lemm, S.
2001.
A Dynamic HMM for On-line Segmentation of Sequential Data.
To appear in Proceedings of NIPS-2001.
Not very relevant to word segmentation.
3. Unsupervised Learning of Word Segmentation Rules with Genetic Algorithms and Inductive Logic Programming. 2001.
Dimitar Kazakov, Suresh Manandhar.
Machine Learning, 43(1/2):121-162, April 2001. (C) Kluwer Academic Publishers
Good, but it is aimed at morphology (recommended).
4. A Statistical Model for Word Discovery in Transcribed Speech. 2001.
Anand Venkataraman. Computational Linguistics, Volume 27, Number 3, pages 351-379, 2001.
5. Sun Maosong, Shen Dayang, and Huang Changning,
1997.
Cseg & tag1.0: A Practical Word Segmenter and POS Tagger for Chinese Texts,
Fifth Conference on Applied Natural Language Processing, Washington, DC, USA, pp. 119-126, March 31 - April 3, 1997.
A survey of supervised approaches.
6. Tom B. Y. Lai, Sun Maosong, Benjamin K. Tsou, S. Caesar Lun, 1997.
Chinese Word Segmentation and Part-of-Speech Tagging in One Step,
Proceedings of ROCLING X International Conference 1997, Research on Computational Linguistics, Taipei, Taiwan, China, August 22-24, pp. 229-236, 1997.
Divide-and-conquer strategy.
7. W. J. Teahan. Text Classification and Segmentation Using Minimum Cross-Entropy.
In Proceedings of the International Conference on Content-based Multimedia Information Access (RIAO 2000), pages 943-961. C.I.D.-C.A.S.I.S, Paris, France, 2000.
ISBN 2-905450-07-X.
Same as the next entry.
8. W. J. Teahan, Y. Wen, R. McNab, and I. H. Witten. A Compression-based Algorithm for Chinese Word Segmentation.
Computational Linguistics, 26(3):375-393, 2000.
ISSN 0891-2017.
Supervised word segmentation; shortest-path algorithm framework.
9. A. Stolcke & E. Shriberg
(1996).
Automatic linguistic segmentation of conversational speech.
Proc. Intl. Conf. on Spoken Language Processing, vol. 2, pp. 1005-1008, Philadelphia, PA.
10. A. Stolcke, E. Shriberg, et al.
(1998).
Automatic Detection of Sentence Boundaries and Disfluencies based on Recognized Words.
Proc. Intl. Conf. on Spoken Language Processing, vol. 5, pp. 2247-2250, Sydney, Australia.
11. Deb Roy
2000.
A Computational Model of Word Learning from Multimodal Sensory Input.
International conference of Cognitive Modeling, Groningen, Netherlands, March 2000
12. Michael R. Brent and Xiaopeng Tao
2001.
Chinese Text Segmentation With MBDP-1: Making the Most of Training Corpora.
ACL 2001.
I did not really follow this one; it does not seem very good.
13. Ando, R. K. and Lee, L. 2000.
Mostly-Unsupervised Statistical Segmentation of Japanese: Application to Kanji.
ANLP-NAACL.
Mutual-information framework; worth borrowing from (recommended).
14. Baker, D., Hofmann, T., McCallum, A. and Yang, Y. A Hierarchical Probabilistic Model for Novelty Detection in Text.
Unpublished manuscript.
Has little to do with word segmentation.
15. Brand, M. 1999.
Structure learning in conditional probability models via an entropic prior and parameter extinction.
In Neural Computation, vol. 11, pages 1155-1182.
Journal version of the next entry.
16. M. Brand, 1998.
An entropic estimator for structure discovery.
To appear, NIPS98
Not very relevant to word segmentation, but it is superb, good beyond words (strongly recommended!)
17. M. Brand, 1999,
Pattern discovery via entropy minimization.
To appear, Uncertainty99 (AI & Statistics)
Same as the previous entry.
18. Brent, M. 1999.
An efficient, probabilistically sound algorithm for segmentation and word discovery.
Machine Learning, 34, 71-106.
19. Brent, M. R. & T. A. Cartwright.
1996.
Distributional regularity and phonotactic constraints are useful for segmentation.
In Computational Approaches to Language Acquisition, ed. Michael Brent. Cambridge, MA, MIT Press.
20. Brent, M. R.
1999.
Speech segmentation and word discovery: A computational perspective.
Trends in Cognitive Science, 3, 294-301.
21. Dahan and Brent, M.
1999.
On the discovery of novel word-like units from utterances: An artificial-language study with implications for native-language acquisition.
In Journal of Experimental Psychology: General, Vol. 128, pp. 165-185.
22. Brown, E. K., Miller, J.
1991.
Syntax: A Linguistic Introduction to Sentence Structure.
Publisher: HarperCollins, London.
23. Jing-Shin Chang and Keh-Yih Su,
1997,
An Unsupervised Iterative Method for Chinese New Lexicon Extraction,
In International Journal of Computational Linguistics & Chinese Language Processing.
Very poor and long-winded. It is just EM, so why make it so complicated?
24. Chang, Jing-Shin, Yi-Chung Lin and Keh-Yih Su. 1995.
Automatic Construction of a Chinese Electronic Dictionary.
Proceedings of the Third Workshop on Very Large Corpora, pp. 107-120, MIT, June, 1995.
Essentially the same as the previous entry.
25. Brian Clarkson and Alex Pentland. 1999.
Unsupervised clustering of ambulatory audio and video.
In International Conference on Acoustics, Speech and Signal Processing, volume VI, pages 3037-3040. IEEE, 1999.
26. Deligne, S. and Bimbot, F.
1995.
Language Modeling by Variable Length Sequences: Theoretical Formulation and Evaluation of Multigrams.
ICASSP, 1995.
27. S. Deligne, F. Yvon, and F. Bimbot.
1995.
Variable-length sequence matching for phonetic transcription using joint multigrams.
In EUROSPEECH.
28. Deligne, S.; Yvon, F.; and Bimbot, F.
1996.
Introducing statistical dependencies and structural constraints in variable-length sequence models.
In Miclet, L., and de la Higuera, C., eds., Grammatical Inference: Learning Syntax from Sentences, Lecture Notes in Artificial Intelligence 1147. Springer. 156-167.
29. de Marken, C.
1995.
The Unsupervised Acquisition of a Lexicon from Continuous Speech.
Technical Report A.I. Memo No. 1558, AI Lab., MIT. Cambridge, Massachusetts.
30. Ge, X., Pratt, W. and Smyth, P.
1999.
Discovering Chinese Words from Unsegmented Text.
SIGIR-99, pages 271-272.
EM framework. The experimental results reported in the paper are very good, but they still need to be verified in practice (recommended).
31. Goldsmith, J. 2001.
Unsupervised Learning of the Morphology of a Natural Language.
To appear in Computational Linguistics 2001.
32. A. Hanjalic, R.L. Lagendijk, J. Biemond.
1999.
Automatically Segmenting Movies into Logical Story Units.
In D.P. Huijsmans, A.W.M. Smeulders (eds.): Lecture Notes in Computer Science 1614: Visual Information and Information Systems, ISBN 3-540-66079-8, pages 229-236, Springer Verlag 1999 (Proceedings of the Third International Conference VISUAL '99, Amsterdam (NL), June 1999)
33. Hua, Y.
2000.
Unsupervised word induction using MDL criterion.
ISCSL2000, Beijing.
Pretty good; it combines the EM framework with MDL (recommended).
34. Kit, C. and Wilks, Y.
1999.
Unsupervised Learning of Word Boundary with Description Length Gain.
In Proceedings CoNLL99 ACL Workshop. Bergen.
Novel, but flawed. It could be used to initialize EM (recommended).
35. Kit, C. 2000.
Unsupervised Lexical Learning as Inductive Inference.
PhD thesis, University of Sheffield, UK, 2000.
36. Ponte, J. M. and Croft, W. B.
1996.
Useg: A retargetable word segmentation procedure for information retrieval.
In Symposium on Document Analysis and Information Retrieval 96 (SDAIR).
37. Peng, Fuchun and Schuurmans, Dale
2001.
Self-supervised Chinese Word Segmentation.
The 4th International Symposium on Intelligent Data Analysis (IDA2001), September, 2001, Lisbon, Portugal.
38. Peng, Fuchun and Schuurmans, Dale
2001.
A Hierarchical EM Approach to Word Segmentation,
To appear in Proceedings of the Sixth Natural Language Processing Pacific Rim Symposium (NLPRS 2001), Nov. 2001, Tokyo, Japan.
EM framework, but the idea is rather convoluted.
39. Sproat, R. and Shih, C. 1990.
A statistical method for finding word boundaries in Chinese text.
Computer Processing of Chinese and Oriental Languages, 4:336-351.
40. Zhao, J., Gao, J., Chang, E. and Li, M.
2000.
Lexicon optimization for Chinese language modeling.
International Symposium Conference on Spoken Language Processing, Beijing.
41. Su, K., Wu, M., & Chang, J.
1994.
A Corpus-Based Approach to Automatic Compound Extraction.
ACL Proceedings: 32nd Annual Meeting of the Association for Computational Linguistics, (Las Cruces, NM, June 1994), ACL, Morristown, NJ, pp. 242-247.
42. Wu, M.-W. and K.-Y. Su,
1993.
Corpus-based Automatic Compound Extraction with Mutual Information and Relative Frequency Count.
Proceedings of ROCLING VI, pp. 207-216, Nantou, Taiwan, ROC, Sep. 1993.
43. Chen, K., & Chen, H.
1994.
Extracting Noun Phrases from Large-Scale Texts: A Hybrid Approach and Its Automatic Evaluation.
ACL Proceedings: 32nd Annual Meeting of the Association for Computational Linguistics, (Las Cruces, NM, June 1994), ACL, Morristown, NJ, pp. 234-241.
44. Jin, Wanying.
1992.
Chinese Segmentation and its Disambiguation.
MCCS-92-227, Computing Research Laboratory, New Mexico State University, Las Cruces, New Mexico.
45. Kok-Wee Gan, Martha Palmer, Kim-Teng Lua
1996.
A Statistically Emergent Approach for Language Processing: Application to Modeling Context Effects in Ambiguous Chinese Word Boundary Perception.
Computational Linguistics, Volume 22, 531-553, 1996.
46. Sun Maosong, Shen Dayang, Benjamin K. Tsou
1998.
Chinese Word Segmentation without Using Lexicon and Hand-crafted Training Data.
COLING-ACL 1998: 1265-1271