CRFs-based approach to recognition of Chinese address element (基于条件随机场的中文地址要素识别方法)

Cited by: 20
Abstract: Because Chinese addresses are named inconsistently and because of the characteristics of the Chinese language, recognizing Chinese address elements is a key technique in geocoding. Traditional methods based on feature-character matching and dictionary (gazetteer) matching struggle to handle the diversity of address element names. Drawing on natural language processing (NLP) techniques, the paper constructs an address element annotation set and designs a Chinese address element recognition method based on conditional random fields (CRFs). Experiments show that, compared with a rule-based method built on feature characters, the CRF-based method improves recognition performance considerably. Because the CRF model generalizes well, the method is more broadly applicable, and it is particularly suited to batch parsing of large-scale address data and to fast geocoding in mass-market location-based services (LBS).
Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, Peking University Core Journal, 2010, No. 13, pp. 129-131 (3 pages)
Funding: National Natural Science Foundation of China, No. 40971231
Keywords: geocoding; Chinese address element; natural language processing; conditional random fields
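
To make the idea of an address element annotation set more concrete, the following is a minimal sketch of how a hand-segmented address might be turned into per-character tags and context features, the usual input shape for a linear-chain CRF. The element types (CITY, DISTRICT, ROAD, NUM), the B-/I- tagging scheme, the feature templates, and the helper names `to_bio_tags` and `char_features` are all assumptions for illustration; the abstract does not state the paper's actual tag set or features, and no CRF is trained here.

```python
from typing import Dict, List, Tuple

# Hypothetical element types; the paper's real annotation set is not given in the abstract.
Segment = Tuple[str, str]  # (text, element_type)


def to_bio_tags(segments: List[Segment]) -> List[Tuple[str, str]]:
    """Expand labeled segments into (character, tag) pairs with B-/I- prefixes."""
    tagged = []
    for text, element_type in segments:
        for i, ch in enumerate(text):
            prefix = "B" if i == 0 else "I"
            tagged.append((ch, f"{prefix}-{element_type}"))
    return tagged


def char_features(chars: List[str], i: int) -> Dict[str, str]:
    """Build assumed context-window features for the character at position i."""
    prev_ch = chars[i - 1] if i > 0 else "<BOS>"
    next_ch = chars[i + 1] if i < len(chars) - 1 else "<EOS>"
    return {
        "char": chars[i],
        "prev": prev_ch,
        "next": next_ch,
        "prev+char": prev_ch + chars[i],
        "char+next": chars[i] + next_ch,
    }


# An illustrative address, segmented by hand into labeled elements.
segments = [
    ("北京市", "CITY"),
    ("海淀区", "DISTRICT"),
    ("中关村大街", "ROAD"),
    ("27号", "NUM"),
]

tagged = to_bio_tags(segments)
chars = [ch for ch, _ in tagged]
tags = [tag for _, tag in tagged]
features = [char_features(chars, i) for i in range(len(chars))]
```

Here `features` and `tags` are parallel sequences, one entry per character. A CRF trained on many such pairs learns boundaries from context rather than from a fixed list of feature characters, which is the property the abstract credits for handling diverse address names.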