Abstract
This paper addresses short-text classification of electric power customer complaint records. It introduces the K-nearest neighbor (KNN) algorithm and the central vector algorithm, and improves on certain drawbacks of the KNN classification algorithm, mainly by incorporating the idea of the central vector method. The improved KNN algorithm, the central vector algorithm, and the traditional KNN algorithm are then compared experimentally on Chinese text. The results show that, compared with the traditional KNN algorithm, the improved algorithm is better suited to classifying electric power customer complaint records, which supports its validity and feasibility.
Source
《上海电力学院学报》
CAS
2017, Issue 6, pp. 597-600 (4 pages)
Journal of Shanghai University of Electric Power
Keywords
text classification
central vector method
K-Nearest Neighbor algorithm
similarity