Abstract
In duplication checking of scientific research projects, the conventional method computes the similarity of each project component (such as the project name and the application form) separately, combines these values by weighting into a comprehensive similarity, and then judges whether the projects are similar. This makes it hard to distinguish projects that show some measurable similarity but are not actually similar. This paper proposes a comprehensive similarity calculation method for scientific research projects based on correction by a minimum value and the Min-Max method. First, a minimum similarity value is set for the project components, and any component whose similarity falls below that minimum is excluded from the comprehensive similarity calculation. Next, the component similarities are corrected using the Min-Max method, and the corrected similarities are weighted to obtain the comprehensive similarity of the projects. Experimental results show that the corrected method substantially improves the accuracy of the comprehensive similarity for projects that appear similar but are not.
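The two-step correction described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so several things here are assumptions: the function names `corrected_similarity` and `comprehensive_similarity` are hypothetical, the Min-Max step is assumed to rescale a component similarity from the interval [minimum, 1] onto [0, 1], the weights are assumed not to be renormalized after a component is excluded, and all sample values are invented for illustration.

```python
from typing import Dict


def corrected_similarity(s: float, s_min: float, s_max: float = 1.0) -> float:
    """Rescale a component similarity from [s_min, s_max] onto [0, 1].

    Assumption: this is one plausible reading of "Min-Max correction";
    the paper's exact formula is not stated in the abstract.
    """
    if s_max <= s_min:
        raise ValueError("s_max must be greater than s_min")
    return (s - s_min) / (s_max - s_min)


def comprehensive_similarity(
    sims: Dict[str, float],
    weights: Dict[str, float],
    minimums: Dict[str, float],
) -> float:
    """Weighted sum of corrected component similarities.

    Components whose similarity is below their minimum are skipped,
    so they contribute nothing to the total (weights are assumed to
    stay as given, without renormalization).
    """
    total = 0.0
    for name, s in sims.items():
        if s < minimums[name]:
            continue  # below the minimum: excluded from the calculation
        total += weights[name] * corrected_similarity(s, minimums[name])
    return total


# Hypothetical component similarities, weights, and minimum values.
sims = {"title": 0.5, "application": 0.2}
weights = {"title": 0.4, "application": 0.6}
minimums = {"title": 0.3, "application": 0.3}

score = comprehensive_similarity(sims, weights, minimums)
plain = sum(weights[k] * sims[k] for k in sims)  # uncorrected weighted sum
print(f"corrected: {score:.3f}, plain weighted: {plain:.3f}")
```

With these invented inputs the application component falls below its minimum and is dropped, and the title similarity is rescaled, so the corrected score (about 0.114) is well below the plain weighted sum (0.32). That is the intended effect for projects that look similar on the surface but are not.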
Authors
杜军
谭鹏
陈曦
李俊
马继涛
DU Jun; TAN Peng; CHEN Xi; LI Jun; MA Ji-tao (Yunnan Academy of Scientific & Technical Information, Kunming 650051, China)
Source
《云南民族大学学报(自然科学版)》
CAS
2023, Issue 6, pp. 759-763 (5 pages)
Journal of Yunnan Minzu University: Natural Sciences Edition
Funding
Yunnan Provincial Science and Technology Plan Project (2018DA006, 2019DA002).
Keywords
similarity
science and technology project
scientific research project
Min-Max
minimum value