Abstract
Unlike conventional word-form-based approaches to the automatic classification of Chinese texts, this paper takes into account the linguistic information of words and the correlation between word concepts when selecting text features. It proposes two feature selection methods, one based on semantic information and the other on concept attributes, and builds a classification model. Experiments show that the two improved methods give the classification system relatively high accuracy.
Source
《计算机工程与应用》
CSCD
PKU Core Journal (北大核心)
2003, No. 12, pp. 106-109 (4 pages)
Computer Engineering and Applications
Keywords
Text classification
Feature selection
Semantics
Concept attributes