Abstract
Since the beginning of the 21st century, the number of agricultural information websites has grown rapidly. To serve farmers and agricultural researchers, agricultural information needs to be classified, because classification makes the information easier to retrieve and manage. Among the many available classification methods, centroid-based classification is relatively simple and effective. The computation of the center vector is the core of centroid-based classification, and the experiments in this paper aim to identify a more efficient center vector computation method that improves classification accuracy. At present, the center vector of a text class is mostly obtained as the simple arithmetic mean of the feature vectors of the texts in that class. A center vector computed this way often suffers from model bias, so it does not yield good classification results. To address this problem, the sum method, the mean method, and the normalization method are used to compute the center vector, and the three are compared experimentally. The results show that the normalization method performs well on precision, recall, and the F1 measure.
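The three center vector computations named in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so the definitions here are assumptions: the sum method adds the class's document vectors, the mean method divides that sum by the number of documents, and the normalization method rescales the sum to unit length. The function names `centroid` and `classify`, the dot-product similarity, and the toy data are all hypothetical and are not taken from the paper.

```python
import math


def centroid(vectors, method):
    """Compute a class center vector from that class's document vectors.

    method is "sum", "mean", or "normalized" (labels chosen for this sketch).
    """
    dim = len(vectors[0])
    total = [sum(v[i] for v in vectors) for i in range(dim)]
    if method == "sum":
        return total
    if method == "mean":
        return [x / len(vectors) for x in total]
    if method == "normalized":
        # Assumed definition: scale the summed vector to unit Euclidean length.
        norm = math.sqrt(sum(x * x for x in total))
        return [x / norm for x in total] if norm > 0 else total
    raise ValueError(f"unknown method: {method}")


def classify(doc, centroids):
    """Return the label whose center vector has the largest dot product with doc."""
    def score(label):
        return sum(a * b for a, b in zip(doc, centroids[label]))
    return max(centroids, key=score)


# Toy two-term feature vectors; class names and weights are made up.
training = {
    "crops": [[1.0, 0.0], [0.8, 0.2], [0.9, 0.1], [0.7, 0.3]],
    "livestock": [[0.1, 0.9]],
}
doc = [0.4, 0.6]

for method in ("sum", "mean", "normalized"):
    centroids = {label: centroid(vecs, method) for label, vecs in training.items()}
    print(method, "->", classify(doc, centroids))
```

With a dot-product score, the summed center vector grows with the number of training documents, so the larger class can win even when the document points toward the smaller one. The mean and normalized variants remove that size effect in different ways. Under cosine similarity the three variants would rank classes identically, since they differ only by a positive scale factor, so the similarity measure matters when comparing them.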
Source
Computer Technology and Development (《计算机技术与发展》)
2016, No. 8, pp. 146-151 (6 pages)
Funding
High-Tech Research and Development Program of the Xinjiang Uygur Autonomous Region (2015X0103)
Keywords
agricultural information
classification
centroid-based method
center vector
text feature vector