Lucene.Net 2.9 with the Chinese Academy of Sciences word segmenter (ICTCLAS, .NET version): DEMO
Source: Internet  Editor: 程序博客网  Time: 2024/06/05 04:33
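The .NET version of ICTCLAS (the Chinese Academy of Sciences word segmenter) used here is the port developed by Lü Zhenyu (吕震宇) from version 1.0.
Lucene.Net 2.9
Interface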
ICTCLASAnalyzer.cs
using System.IO;
using Lucene.Net.Analysis;

namespace Demo
{
    public class ICTCLASAnalyzer : Analyzer
    {
        // Path to the directory that holds the ICTCLAS dictionary files
        private string dictPath;

        public ICTCLASAnalyzer(string dictPath)
        {
            this.dictPath = dictPath;
        }

        // Hands the text of every field to the ICTCLAS-backed tokenizer
        public override TokenStream TokenStream(string fieldName, TextReader reader)
        {
            return new ICTCLASTokenizer(reader, dictPath);
        }
    }
}
ICTCLASTokenizer.cs
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
using SharpICTCLAS;

namespace Demo
{
    class ICTCLASTokenizer : Tokenizer
    {
        int nKind = 2;               // second argument passed to WordSegment.Segment
        List<WordResult[]> result;   // segmentation results; only result[0] is used
        int startIndex = 0;
        int endIndex = 0;
        int i = 1;                   // starts at 1, so the first entry of result[0] is skipped

        /// <summary>The sentence to segment</summary>
        private string sentence;

        /// <summary>Constructs a tokenizer for this Reader.</summary>
        public ICTCLASTokenizer(TextReader reader, string DictPath)
        {
            this.input = reader;
            sentence = input.ReadToEnd();
            // Line breaks are removed before segmentation, so the token offsets
            // below refer to the text without "\r\n", not to the raw input.
            sentence = sentence.Replace("\r\n", "");
            //string DictPath = @"E:\TestDemo\lucene.net+2.9.2+实现索引生成,修改,查询,删除实例\Demo\WordSegmentDate\";
            //string DictPath = Path.Combine(Environment.CurrentDirectory, "Data") + Path.DirectorySeparatorChar;
            //Console.WriteLine("Initializing the dictionary, please wait");
            WordSegment wordSegment = new WordSegment();
            wordSegment.InitWordSegment(DictPath);
            result = wordSegment.Segment(sentence, nKind);
        }

        /// <summary>
        /// Performs the segmentation step: returns the next token in the stream,
        /// or null when the stream is exhausted.
        /// </summary>
        public override Token Next()
        {
            // The bounds skip the first and last entries of result[0]
            // (presumably the segmenter's sentence start/end markers).
            if (i < result[0].Length - 1)
            {
                string word = result[0][i].sWord;
                // Lucene end offsets are exclusive: one past the last character.
                // The original used startIndex + word.Length - 1, which is one short.
                endIndex = startIndex + word.Length;
                Token token = new Token(word, startIndex, endIndex);
                startIndex = endIndex;   // the next word starts where this one ended
                i++;
                return token;
            }
            return null;
        }
    }
}
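The offset bookkeeping in Next() can be checked on its own, without Lucene.Net or SharpICTCLAS. Below is a minimal sketch, assuming the segmenter returns words that concatenate back to the input with no gaps. `OffsetSketch` and `ComputeOffsets` are hypothetical names used only for illustration and are not part of either library. It follows Lucene's convention that a token's end offset is one past its last character.

```csharp
using System;
using System.Collections.Generic;

public static class OffsetSketch
{
    // Hypothetical helper: returns { start, end } pairs (end exclusive)
    // for words that are assumed to be contiguous in the source text.
    public static List<int[]> ComputeOffsets(IList<string> words)
    {
        var offsets = new List<int[]>();
        int start = 0;
        foreach (string word in words)
        {
            int end = start + word.Length;   // exclusive end offset
            offsets.Add(new[] { start, end });
            start = end;                     // next word begins where this one ended
        }
        return offsets;
    }

    public static void Main()
    {
        var words = new List<string> { "中科院", "分词", "系统" };
        foreach (int[] o in ComputeOffsets(words))
        {
            Console.WriteLine(o[0] + "-" + o[1]);   // prints 0-3, 3-5, 5-7 on separate lines
        }
    }
}
```

With exclusive end offsets, `text.Substring(start, end - start)` recovers each word, which is what highlighters that consume these offsets generally rely on.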
Demo address: