Abstract
To address the detection of approximately duplicate records in large-scale databases, a detection method (QPSO-LSSVM) is proposed that combines quantum particle swarm optimization (QPSO) with the least squares support vector machine (LSSVM). First, the similarity values of the record fields are calculated; then QPSO is used to optimize the LSSVM parameters and construct the approximately duplicate record detection model; finally, simulation experiments are carried out on concrete data sets. The simulation results show that QPSO-LSSVM improves both the accuracy and the efficiency of duplicate record detection, and that it is an effective approximately duplicate record detection algorithm.
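The first step of the method, computing field-level similarity values for a pair of records, can be sketched as follows. The abstract does not say which similarity measure the authors use, so this sketch assumes a generic string ratio from Python's standard `difflib`; the function names `field_similarity` and `similarity_vector` and the sample field names are hypothetical. The resulting vector is the kind of input a classifier such as LSSVM could take to label a record pair as duplicate or non-duplicate.

```python
from difflib import SequenceMatcher


def field_similarity(a: str, b: str) -> float:
    """Return a similarity in [0, 1] for two field values.

    Assumption: case and repeated whitespace are ignored; the paper's
    actual similarity measure is not given in the abstract.
    """
    a_norm = " ".join(a.lower().split())
    b_norm = " ".join(b.lower().split())
    if not a_norm and not b_norm:
        return 1.0  # two empty fields are treated as identical
    return SequenceMatcher(None, a_norm, b_norm).ratio()


def similarity_vector(rec_a: dict, rec_b: dict, fields: list) -> list:
    """Build one similarity value per field for a pair of records."""
    return [
        field_similarity(str(rec_a.get(f, "")), str(rec_b.get(f, "")))
        for f in fields
    ]


# Hypothetical records, used only to illustrate the call.
r1 = {"name": "John  Smith", "city": "Zhengzhou"}
r2 = {"name": "john smith", "city": "Zhengzhou"}
features = similarity_vector(r1, r2, ["name", "city"])
```

Each pair of records yields one fixed-length feature vector, so the number of fields determines the input dimension of the detection model.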
Source
《计算机科学》
CSCD
PKU Core Journal
2012, No. 11, pp. 157-159, 190 (4 pages)
Computer Science
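Funding
Supported by the Science and Technology Research Project of the Henan Provincial Department of Science and Technology (112102210199)
and the Basic and Frontier Research Project of the Henan Provincial Department of Science and Technology (112300410201)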
Keywords
Quantum particle swarm optimization
Least squares support vector machine
Approximately duplicate record
Detection