
A multi-strategy combined address matching algorithm  Cited by: 6
Abstract To address the ambiguous segmentation of address elements and the low matching rate and accuracy of existing address matching algorithms, a multi-strategy combined address matching algorithm (MSC) is proposed. A bidirectional maximum matching word segmentation algorithm is first used to extract the ambiguous address elements. A first round of disambiguation is then performed using an address-element feature-word dictionary together with a standard address database, followed by a second round using Chinese word segmentation based on sequence labeling. Each resulting address element is matched against the database and a similarity matching score is computed. Finally, weights are assigned according to the importance of each address element, and the weighted sum gives the total matching score. The results show that the algorithm outperforms other traditional address matching algorithms, improving both the matching rate and the accuracy of address matching.
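The abstract names two steps that are easy to illustrate: bidirectional maximum matching to flag ambiguous segmentations, and a weighted sum of per-element similarity scores. The following is only a minimal sketch of those two ideas, not the authors' implementation. The dictionary, the sample address, the element names, the weights, and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions made for illustration; the abstract does not specify any of them.

```python
from difflib import SequenceMatcher

# Hypothetical toy dictionary of address elements (not taken from the paper).
DICTIONARY = {"南京市", "北京", "东路", "京东路"}
MAX_LEN = max(len(word) for word in DICTIONARY)


def forward_max_match(text, words=DICTIONARY, max_len=MAX_LEN):
    """Scan left to right, taking the longest dictionary word at each position."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # Fall back to a single character when nothing in the dictionary fits.
            if size == 1 or piece in words:
                tokens.append(piece)
                i += size
                break
    return tokens


def backward_max_match(text, words=DICTIONARY, max_len=MAX_LEN):
    """Scan right to left, taking the longest dictionary word ending at each position."""
    tokens, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in words:
                tokens.append(piece)
                j -= size
                break
    return tokens[::-1]


def is_ambiguous(text):
    """A segmentation is flagged as ambiguous when the two scans disagree."""
    return forward_max_match(text) != backward_max_match(text)


def weighted_score(parsed, standard, weights):
    """Weighted sum of per-element similarities (SequenceMatcher is a stand-in)."""
    total = 0.0
    for element, weight in weights.items():
        a, b = parsed.get(element, ""), standard.get(element, "")
        similarity = SequenceMatcher(None, a, b).ratio() if a and b else 0.0
        total += weight * similarity
    return total


if __name__ == "__main__":
    address = "南京市北京东路"
    print(forward_max_match(address))
    print(backward_max_match(address))
    print(is_ambiguous(address))

    # Hypothetical weights: finer-grained elements count more.
    weights = {"city": 0.2, "road": 0.5, "number": 0.3}
    parsed = {"city": "南京市", "road": "北京东路", "number": "22号"}
    standard = {"city": "南京市", "road": "北京东路", "number": "22号"}
    print(weighted_score(parsed, standard, weights))
```

In this sketch the two scans disagree on the sample address, which is the kind of case the abstract's two disambiguation rounds would then have to resolve; that resolution step (feature-word dictionary plus sequence labeling) is not modeled here.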
Authors WU Rui (吴睿); LONG Hua (龙华); XIONG Xin (熊新); PENG Yi (彭艺) (Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650000, Yunnan, China)
Source Journal of Henan Polytechnic University (Natural Science) (《河南理工大学学报(自然科学版)》), indexed in CAS and the Peking University Core Journals list, 2019, No. 5, pp. 124-129 (6 pages)
Funding National Natural Science Foundation of China (61761025)
Keywords multi-strategy; address matching; sequence labeling; weight; matching score