Abstract
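This paper explores a method for parsing temporal information in Chinese text that combines trigger words with a rule-based model. By analyzing and summarizing how temporal information is described in Chinese text, a temporal vocabulary dictionary and a library of temporal expression patterns are constructed, and algorithms for temporal information extraction, normalized expression and semantic reasoning are designed, which together implement the parsing of temporal information in Chinese text. Experimental results show that the precision, recall and F1 value of temporal information extraction from Chinese text are 75.00%, 88.24% and 40.54%, respectively. The parsed information provides a data source for the dynamic association updating and real-time mining and analysis of ubiquitous spatio-temporal information. Organized organically and interactively with spatial data, it can present in real time the spatio-temporal evolution and distribution characteristics of geographic phenomena and objects, thereby pushing geographic information services such as geographic information retrieval and LBS toward dynamic, multidimensional development.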
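The combination of a trigger-word dictionary with rule patterns, followed by normalization, can be illustrated with a minimal sketch. It is not the paper's implementation: the dictionary `RELATIVE_DAYS`, the pattern `ABSOLUTE_DATE`, and the function `extract_temporal` are hypothetical names chosen for illustration, and the sketch covers only full calendar dates and five relative-day words, resolved against an assumed reference date.

```python
import re
from datetime import date, timedelta

# Hypothetical trigger-word dictionary: relative-day words and their day offsets.
RELATIVE_DAYS = {"前天": -2, "昨天": -1, "今天": 0, "明天": 1, "后天": 2}

# Hypothetical pattern library with two rules: one for absolute dates such as
# 2014年11月5日, and one built from the trigger words above.
ABSOLUTE_DATE = re.compile(r"(\d{4})年(\d{1,2})月(\d{1,2})日")
RELATIVE_DAY = re.compile("|".join(RELATIVE_DAYS))


def extract_temporal(text, reference):
    """Return (surface string, ISO date) pairs in order of appearance.

    `reference` is the date against which relative expressions are resolved,
    for example the publication date of the text (an assumption of this sketch).
    """
    found = []
    for m in ABSOLUTE_DATE.finditer(text):
        year, month, day = (int(g) for g in m.groups())
        try:
            normalized = date(year, month, day).isoformat()
        except ValueError:
            continue  # skip impossible dates such as month 13
        found.append((m.start(), m.group(0), normalized))
    for m in RELATIVE_DAY.finditer(text):
        resolved = reference + timedelta(days=RELATIVE_DAYS[m.group(0)])
        found.append((m.start(), m.group(0), resolved.isoformat()))
    found.sort()  # restore text order across the two rules
    return [(surface, iso) for _, surface, iso in found]


result = extract_temporal("2014年11月5日发生地震,昨天余震不断。", date(2014, 11, 6))
print(result)  # [('2014年11月5日', '2014-11-05'), ('昨天', '2014-11-05')]
```

Keeping the trigger words in a dictionary separate from the regular expressions mirrors the split between a vocabulary dictionary and a pattern library: new words can be added without touching the rules.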
Unstructured geographic information in text is growing rapidly, yet most of it is rarely put to effective use. Current research focuses on extracting spatial information from text, such as place names and spatial relations, while the rich temporal information is ignored. Borrowing ideas from natural language processing, this paper investigates an approach to interpreting temporal information in Chinese text. First, trigger words and expression patterns are summarized from the linguistic characteristics of temporal information in Chinese text. Second, based on this knowledge base, an algorithm for extracting temporal information from Chinese text is discussed. Finally, a standardization and reasoning algorithm for non-standard and relative temporal information is proposed. The experimental results show that the proposed method achieves precision, recall and F1 values of 75.00%, 88.24% and 40.54%, respectively.
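For readers relating the three reported figures to one another, the sketch below applies the usual harmonic-mean definition of F1, F1 = 2PR / (P + R), to the stated precision and recall. The function name `f1_score` is hypothetical, and the abstract does not say which F1 formula the authors used, so this is only an arithmetic check under that assumption.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (the usual F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


p, r = 0.7500, 0.8824            # precision and recall as stated in the abstract
f1 = f1_score(p, r)              # standard F1
half = p * r / (p + r)           # the same expression without the factor 2

print(f"{f1 * 100:.2f}%")        # 81.08%
print(f"{half * 100:.2f}%")      # 40.54%
```

Under this definition the stated precision and recall correspond to an F1 of about 81.08%, while the stated 40.54% coincides with PR / (P + R), that is, the same expression without the factor 2.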
Source
《地理与地理信息科学》
CSCD
Peking University Core Journal (北大核心)
2014, Issue 6, pp. 1-6, F0002 (7 pages in total)
Geography and Geo-Information Science
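Funding
National 863 Program project "Ubiquitous Spatial Information Association Updating and Topic-Oriented Spatio-temporal Information Mining" (2012AA12A403-3)
National Natural Science Foundation of China, Young Scientists Fund (41401451)
Fundamental Research Funds for the Central Universities (JZ2014HGBZ0064)
National Natural Science Foundation of China (40971231)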
Keywords
extraction of temporal information
time vocabulary dictionary
normalized expression
temporal reasoning
Chinese text