Abstract
Conventional methods for automatically classifying archival information text mainly rely on a Bloom two-dimensional classification matrix to annotate classification features, which results in low classification evaluation scores. To address this, an automatic classification method for archival information text based on an improved K-Nearest Neighbor (KNN) algorithm is proposed: classification features are first extracted from the archival information text, and the improved KNN algorithm is then used to optimize the classification workflow, achieving automatic classification of archival information text. Experimental results show that the method attains high weighted precision (weighted-P), weighted recall (weighted-R), and weighted F-score (weighted-F), indicating that it classifies well and has some practical value.
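The abstract names its three evaluation metrics but does not define them. As a minimal sketch, assuming the common support-weighted definitions (each class's precision, recall, and F-score averaged with weights proportional to the number of true samples in that class), they could be computed as follows. The function name `weighted_scores` and the toy labels are hypothetical and are not taken from the paper; the paper's own formulas may differ.

```python
from collections import Counter


def weighted_scores(y_true, y_pred):
    """Return support-weighted precision, recall and F-score.

    Assumption: "weighted" means each class is weighted by its share
    of the true labels. The abstract does not state the exact formula.
    """
    n = len(y_true)
    support = Counter(y_true)      # true samples per class
    predicted = Counter(y_pred)    # predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    w_p = w_r = w_f = 0.0
    for label, count in support.items():
        # Per-class precision, recall and F-score (0.0 when undefined).
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / count
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = count / n
        w_p += weight * p
        w_r += weight * r
        w_f += weight * f
    return w_p, w_r, w_f


# Toy labels invented for illustration only (not data from the paper).
y_true = ["personnel", "personnel", "finance", "finance", "medical", "medical"]
y_pred = ["personnel", "finance", "finance", "finance", "medical", "personnel"]
wp, wr, wf = weighted_scores(y_true, y_pred)
print(f"weighted-P={wp:.4f} weighted-R={wr:.4f} weighted-F={wf:.4f}")
```

Under this definition, weighted recall equals overall accuracy, so weighted-P and weighted-F are the values that show how errors are distributed across classes.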
Author
PAN Guoyang (潘国炀), Zhejiang Provincial Hospital of Chinese Medicine, Hangzhou, Zhejiang 310006, China
Source
Information & Computer (《信息与电脑》)
2024, No. 4, pp. 71-73 (3 pages)
Keywords
archival information
text
automatic classification