Abstract
Chinese address parsing is the core problem in Chinese address matching (geocoding). The approaches in common use today are either rule-based or built on conditional random fields (CRF). This paper instead combines a bidirectional gated recurrent unit network (BiGRU) from deep learning with a CRF to segment Chinese addresses. It also replaces the widely used hierarchical address model and four-tag (four word-position) labeling scheme with an address model based on spatial relationships and a five-tag labeling scheme. Comparative experiments with a rule-based model, CRF, BiGRU+SoftMax, and BiGRU+CRF show that the proposed BiGRU+CRF model, paired with the new spatial-relationship address model and labeling scheme, achieves the best address parsing results of the four.
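The two ideas the abstract relies on, word-position tagging and sequence-level decoding, can be illustrated with a minimal sketch. It rests on several assumptions: the abstract does not define the paper's five-tag scheme or its spatial-relationship address model, so the code uses the common four-tag B/M/E/S scheme that the paper takes as its baseline. The sample address, the emission scores (standing in for BiGRU outputs), and the function names `to_bmes`, `greedy`, and `viterbi` are all hypothetical. A trained CRF learns real-valued transition scores, whereas this sketch simplifies them to hard allowed/forbidden constraints.

```python
from typing import Dict, List

TAGS = ["B", "M", "E", "S"]  # begin, middle, end, single-character element

# Transitions that can occur in a well-formed B/M/E/S sequence.
ALLOWED = {"B": {"M", "E"}, "M": {"M", "E"}, "E": {"B", "S"}, "S": {"B", "S"}}


def to_bmes(segments: List[str]) -> List[str]:
    """Turn segmented address elements into one tag per character."""
    tags: List[str] = []
    for seg in segments:
        if len(seg) == 1:
            tags.append("S")
        else:
            tags.extend(["B"] + ["M"] * (len(seg) - 2) + ["E"])
    return tags


def greedy(emissions: List[Dict[str, float]]) -> List[str]:
    """Softmax-style decoding: pick the best tag at each position independently."""
    return [max(TAGS, key=lambda t: em[t]) for em in emissions]


def viterbi(emissions: List[Dict[str, float]]) -> List[str]:
    """CRF-style decoding: best-scoring path that respects ALLOWED transitions."""
    if not emissions:
        return []
    neg_inf = float("-inf")
    # A sequence can only start with B or S.
    score = {t: (emissions[0][t] if t in ("B", "S") else neg_inf) for t in TAGS}
    back: List[Dict[str, str]] = []
    for em in emissions[1:]:
        new_score, pointers = {}, {}
        for cur in TAGS:
            best_prev, best = None, neg_inf
            for prev in TAGS:
                if cur in ALLOWED[prev] and score[prev] > best:
                    best_prev, best = prev, score[prev]
            new_score[cur] = best + em[cur] if best_prev is not None else neg_inf
            pointers[cur] = best_prev
        score = new_score
        back.append(pointers)
    # A sequence can only end with E or S.
    last = max(("E", "S"), key=lambda t: score[t])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical segmented address: province / city / district.
gold = to_bmes(["山东省", "济南市", "历下区"])

# Made-up per-character scores for a three-character element.
scores = [
    {"B": 2.0, "M": 0.0, "E": 0.0, "S": 1.0},
    {"B": 1.5, "M": 1.0, "E": 0.2, "S": 0.0},
    {"B": 0.0, "M": 0.0, "E": 2.0, "S": 0.5},
]
greedy_path = greedy(scores)    # B, B, E: "B" followed by "B" is not a valid sequence
viterbi_path = viterbi(scores)  # B, M, E: the best path that is valid
```

With these scores, independent per-position decoding produces an ill-formed tag sequence while the constrained decoder recovers a valid one. This is the usual motivation for placing a CRF layer, rather than a softmax, on top of a recurrent encoder.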
Authors
刘现印
李玉琳
尹斌
田沁
LIU Xianyin; LI Yulin; YIN Bin; TIAN Qin (Shandong Provincial Land Surveying and Mapping Institute, Jinan 250013, China; Shenzhen Research Center of Digital City Engineering, Shenzhen, Guangdong 518034, China; Key Laboratory of Urban Land Resource Monitoring and Simulation, Ministry of Natural Resources, Shenzhen, Guangdong 518034, China)
Source
《测绘科学》
CSCD
Peking University Core Journal (北大核心)
2021, Issue 8, pp. 165-171, 212 (8 pages)
Science of Surveying and Mapping
Funding
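Shandong Province Major Science and Technology Innovation Project (2019JZZY020103)
Shandong Province "13th Five-Year Plan" Basic Surveying and Mapping Planning Fund (201605097)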