Abstract
Geographic information and data are important components of the objective knowledge world, so automatically extracting positional relationships between geographic entities from large volumes of unstructured text is of considerable value. This paper proposes a method for acquiring such relationships based on a semantic grammar; the method accurately extracts compound positional relationships among multiple geographic entities from web page text. First, a semantic grammar, GeoRSG (Geographical Relationship Semantic Grammar), is designed to capture how positional relationships are expressed in written Chinese. GeoRSG encodes a hierarchical classification of positional relationships between geographic entities and uses rules to describe their linguistic expressions in text. Then a parser, the GeoRSG Parser, is implemented; it parses text with GeoRSG and outputs positional-relationship knowledge in predicate form. In experiments on 1000 sentences, the method extracted 81 ternary and 816 binary positional relationships between geographic entities, with a precision of 88.85%.
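To make the idea of rule-based extraction into predicate form concrete, here is a minimal Python sketch. It is not the GeoRSG grammar or the GeoRSG Parser, whose actual rules are not given in this abstract. The gazetteer `ENTITIES`, the three patterns in `RULES`, and the relation names `between`, `adjacent`, and `inside` are all hypothetical, chosen only to show how a binary and a ternary relationship might be read off a Chinese sentence.

```python
import re

# Hypothetical mini-gazetteer; a real system would need proper entity recognition.
ENTITIES = ["北京", "天津", "河北", "渤海"]
ENTITY = "(" + "|".join(map(re.escape, ENTITIES)) + ")"

# Hypothetical rules (relation name, pattern). These are NOT the GeoRSG rules.
RULES = [
    # "A位于B和C之间" (A lies between B and C) -> between(A, B, C)
    ("between", re.compile(ENTITY + "位于" + ENTITY + "(?:和|与)" + ENTITY + "之间")),
    # "A与B相邻" (A is adjacent to B) -> adjacent(A, B)
    ("adjacent", re.compile(ENTITY + "(?:与|和)" + ENTITY + "(?:相邻|接壤)")),
    # "A位于B境内" (A lies within B) -> inside(A, B)
    ("inside", re.compile(ENTITY + "位于" + ENTITY + "(?:境内|内)")),
]


def extract_relations(sentence):
    """Return tuples (relation, arg1, arg2, ...) for every rule that matches."""
    found = []
    for name, pattern in RULES:
        for match in pattern.finditer(sentence):
            found.append((name,) + match.groups())
    return found


def to_predicate(relation):
    """Render a relation tuple as a predicate-style string."""
    return f"{relation[0]}({', '.join(relation[1:])})"


if __name__ == "__main__":
    for text in ["天津位于北京和渤海之间", "北京与河北相邻"]:
        for rel in extract_relations(text):
            print(to_predicate(rel))
```

A flat pattern list like this has no hierarchy of relationship classes, which the abstract says GeoRSG does model; the sketch only illustrates the step from surface pattern to predicate.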
Source
《计算机科学》
CSCD
PKU Core Journal (北大核心)
2016, No. 7, pp. 208-216 (9 pages)
Computer Science
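Funding
Supported by the National Natural Science Foundation of China (91224006, 61173063, 61203284) and a Ministry of Science and Technology project (201303107).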
Keywords
Positional relationships between geographic entities
Semantic grammar
Knowledge extraction (knowledge acquisition from text)