Abstract
The system runs on the Windows platform and is written in VB under the Microsoft .NET Framework. It is a text classification system built on "non-standard terms" and a "typical-case dictionary". Its classification strategy follows a psychological principle of how the human brain understands natural language: "people always judge from the most familiar and most typical examples they know; only when that approach fails do they turn to the notion of frequency, and the frequency they use is a very simple one." The article describes the subsystem modules in detail, including word segmentation and statistics, lexicon (word database) design, parent-category term matching, and subcategory term matching.
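The two-stage strategy in the abstract (typical cases first, simple frequency only as a fallback) can be illustrated with a minimal sketch. This is not the authors' VB.NET implementation: it is written in Python, and the function name `classify`, the dictionary layouts, and the pre-tokenized input are all assumptions made for illustration. The original system performs its own word segmentation, which is not reproduced here.

```python
from collections import Counter

def classify(tokens, typical_cases, category_terms):
    """Hypothetical two-stage classifier.

    tokens         -- list of already-segmented words (segmentation is assumed done)
    typical_cases  -- dict mapping a typical-case term to its category (assumed layout)
    category_terms -- dict mapping a category to a set of its terms (assumed layout)
    """
    # Stage 1: judge by the most typical example. The first token found
    # in the typical-case dictionary decides the category outright.
    for token in tokens:
        if token in typical_cases:
            return typical_cases[token]

    # Stage 2: no typical case matched, so fall back to a very simple
    # frequency count of how many tokens belong to each category.
    counts = Counter()
    for token in tokens:
        for category, terms in category_terms.items():
            if token in terms:
                counts[category] += 1

    if not counts:
        return None  # nothing matched in either stage
    return counts.most_common(1)[0][0]


# Usage example with made-up dictionaries.
typical = {"offside": "sports"}
terms = {"finance": {"bank", "loan", "stock"}, "sports": {"match", "team"}}
print(classify(["the", "offside", "call"], typical, terms))
print(classify(["bank", "loan", "team"], typical, terms))
```

Checking the typical-case dictionary before any counting mirrors the stated principle that frequency is consulted only when the familiar-example judgment does not succeed.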
Source
《现代情报》
CSSCI
2010, No. 1, pp. 159-161 (3 pages)
Journal of Modern Information
Keywords
knowledge management
text classification
feature selection
VB.NET