Abstract
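Based on the Chinese news information classification system, this paper explores automatic classification methods for the Chinese News Information Classification and Code. The initial topic words for classification are obtained according to the characteristics of the Chinese News Information Classification and Code and the rules that initial topic words must satisfy.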
The traditional classification method for the Chinese News Information Classification and Code is not suitable for text classification. In this paper, we investigate automatic classification methods based on the Chinese News Information Classification and Code. According to the characteristics of the Chinese News Information Classification and Code and the rules that the initial topic words and phrases must satisfy, the initial topic words and phrases are extracted with the help of information provided by the classification system. The feature vector of the Chinese News Information Classification and Code is constructed from the initial topic words and phrases, and automatic text classification is implemented. The results are discussed by sampling analysis, and the classification precision is 72%.
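The pipeline described above (topic words per category, a feature vector built from them, then classification) can be illustrated with a minimal sketch. This is not the authors' implementation: the category codes, the topic words, the binary weighting, and the use of cosine similarity are all assumptions made for illustration, and the input is assumed to be already segmented into tokens.

```python
import math
from collections import Counter

# Hypothetical category codes and topic words; NOT taken from the paper
# or from the Chinese News Information Classification and Code itself.
CATEGORY_TOPIC_WORDS = {
    "sports": ["football", "match", "league", "athlete"],
    "economy": ["market", "bank", "stock", "trade"],
}


def category_vector(topic_words):
    """Build a feature vector (word -> weight) from a category's topic words.

    Binary weights are an assumption; the paper's weighting is not specified here.
    """
    return {word: 1.0 for word in topic_words}


def cosine(doc_counts, cat_vec):
    """Cosine similarity between a document's term counts and a category vector."""
    dot = sum(doc_counts[word] * weight for word, weight in cat_vec.items())
    doc_norm = math.sqrt(sum(c * c for c in doc_counts.values()))
    cat_norm = math.sqrt(sum(w * w for w in cat_vec.values()))
    if doc_norm == 0 or cat_norm == 0:
        return 0.0
    return dot / (doc_norm * cat_norm)


def classify(tokens, category_topic_words=CATEGORY_TOPIC_WORDS):
    """Return (best_category_or_None, scores) for a pre-tokenized news text."""
    doc_counts = Counter(tokens)  # missing words count as 0
    scores = {
        code: cosine(doc_counts, category_vector(words))
        for code, words in category_topic_words.items()
    }
    best = max(scores, key=scores.get)
    # If no topic word matched at all, leave the text unclassified.
    return (best if scores[best] > 0 else None), scores


label, scores = classify(["the", "football", "match", "ended"])
```

In this sketch a text that shares no topic word with any category is left unclassified rather than forced into a class; whether the original system does the same is not stated in the abstract.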
Source
《太原理工大学学报》
CAS
Peking University Core Journal (北大核心)
2010, Issue 4, pp. 402-405, 411 (5 pages in total)
Journal of Taiyuan University of Technology
Funding
National Natural Science Foundation of China (60663008)
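Project of the Print Media Language Branch of the National Language Resources Monitoring and Research Center: "Research on Methods for Building a Classified Corpus Based on the 'Chinese News Information Classification and Code'"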
Keywords
text classification
Chinese News Information Classification and Code
news texts