Abstract
This paper proposes a method for improving the generalization performance of Chinese text classifiers. A text classifier is generally obtained by applying a machine learning method to a training collection of texts. We introduce common-sense knowledge about the semantic invariance of text and fuse it into the text classifier, which yields a method for improving the classifier. Combining this method with a Support Vector Machine, we design and implement the improved text classifier. Experiments on Chinese text classification show that the method improves the generalization performance of the text classifier.
Source
Journal of Chinese Information Processing (《中文信息学报》)
CSCD
Peking University Core Journal (北大核心)
2002, Issue 2, pp. 7-13 (7 pages)
Funding
National Natural Science Foundation of China (60073019)
Major Program of the National Natural Science Foundation of China (69790080)