
A Method of Geographic Entity Significance Ranking with Tour Guide Speeches for Scenic Spots (面向导游词的景区地理实体显著性排序方法)

Cited by: 1
Abstract: Geographic entity significance ranking is an important part of research on natural-language-oriented hierarchical scene cognition and geospatial expression. Tour guide speeches, a natural-language form that systematically describes the environment, attractions, and important resources of a specific scenic spot, contain a large number of geographic entities with differing degrees of significance. Traditional entity ranking methods, however, are not tailored to this task and ignore the role of geospatial information, so they struggle with the unstructured or semi-structured geospatial features specific to geographic entities. To address this problem, the paper proposes a Geographic Entity Significance Ranking (GESR) model for scenic spots based on tour guide speeches. First, drawing on the spatial cognition results in tour guide speeches, the objective ranking function is constructed from five extracted features: geographic entity frequency, clustering coefficient, a feature based on co-occurrence relationships, spatial topological relations, and ambiguity levels of morphological descriptions. Second, linear weighted weak learners are generated iteratively from the sample error distribution using stochastic gradient descent, which strengthens learning on difficult samples. Finally, weighted averaging and reduced-error pruning yield the boosted strong learner, which serves as the ranking model. Experiments on Chinese tour guide speeches show that: 1) compared with three baseline methods, the GESR model achieves the best ranking performance, with a normalized discounted cumulative gain (NDCG) of 0.8841 and an area under the curve (AUC) of 0.7579; 2) spatial topological relations and ambiguity levels of morphological descriptions are the features with the most significant influence on the GESR model; 3) compared with popular attention (the degree of public interest in an entity), the GESR model better reflects the geospatial features of geographic entities in tour guide speeches.
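The training pipeline summarized in the abstract (iteratively fitted linear weak learners, weighted averaging, reduced-error pruning) can be illustrated roughly as follows. This is only a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not give the update rules, so the re-weighting formula, the learner weight `alpha`, the squared-error pruning criterion, and all function names below are hypothetical choices made for illustration.

```python
import random

def predict_linear(w, x):
    """Score one entity: dot product of weights and its feature vector."""
    return sum(wi * xi for wi, xi in zip(w, x))

def train_weak_learner(X, y, dist, rng, epochs=30, lr=0.1):
    """Fit one linear scorer by SGD, drawing samples according to `dist`."""
    w = [0.0] * len(X[0])
    idx = list(range(len(X)))
    for _ in range(epochs * len(X)):
        i = rng.choices(idx, weights=dist, k=1)[0]
        err = predict_linear(w, X[i]) - y[i]
        w = [wj - lr * err * xj for wj, xj in zip(w, X[i])]
    return w

def boost(X, y, rounds=5, seed=0):
    """Return a list of (alpha, weights) pairs; update rules are assumptions."""
    rng = random.Random(seed)
    n = len(X)
    dist = [1.0 / n] * n  # sample error distribution, initially uniform
    ensemble = []
    for _ in range(rounds):
        w = train_weak_learner(X, y, dist, rng)
        errs = [abs(predict_linear(w, x) - t) for x, t in zip(X, y)]
        max_err = max(errs) or 1.0
        rel = [e / max_err for e in errs]  # relative errors in [0, 1]
        avg = sum(d * r for d, r in zip(dist, rel))
        alpha = max(1.0 - avg, 1e-6)  # assumed: lower error, larger weight
        ensemble.append((alpha, w))
        # Assumed re-weighting: harder samples get more mass next round.
        dist = [d * (1.0 + r) for d, r in zip(dist, rel)]
        total = sum(dist)
        dist = [d / total for d in dist]
    return ensemble

def ensemble_score(ensemble, x):
    """Weighted average of the weak learners' scores."""
    total_alpha = sum(a for a, _ in ensemble)
    return sum(a * predict_linear(w, x) for a, w in ensemble) / total_alpha

def prune(ensemble, X_val, y_val):
    """Reduced-error pruning: drop learners whose removal does not hurt."""
    def mse(ens):
        return sum((ensemble_score(ens, x) - t) ** 2
                   for x, t in zip(X_val, y_val)) / len(X_val)
    kept = list(ensemble)
    improved = True
    while improved and len(kept) > 1:
        improved = False
        base = mse(kept)
        for k in range(len(kept)):
            trial = kept[:k] + kept[k + 1:]
            if mse(trial) <= base:
                kept, improved = trial, True
                break
    return kept
```

In this sketch each row of `X` would hold the feature values of one geographic entity and `y` its annotated significance; entities are then ranked by `ensemble_score` in descending order.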
Authors: WU Yue (吴越); ZHANG Ling (张翎); LONG Yi (龙毅) (School of Geography, Nanjing Normal University / Key Laboratory of Virtual Geographic Environment of Ministry of Education, Nanjing 210023; Jiangsu Center for Collaborative Innovation in Geographical Information Resource Development and Application, Nanjing 210023, China)
Source: Geography and Geo-Information Science (《地理与地理信息科学》), CSCD, Peking University Core Journal, 2022, No. 3, pp. 9-16 (8 pages)
Funding: National Natural Science Foundation of China project "Research on the Fusion Mechanism of Multimodal Geographic Information and Its Key Technologies" (42171403).
Keywords: tour guide speech; geographic entity; significance; entity ranking; spatial topological relations; ambiguity levels of morphological descriptions
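The abstract reports ranking quality as normalized discounted cumulative gain (NDCG). As a reminder of how that metric is computed, here is a minimal sketch. It assumes the linear-gain form of DCG with a log2 position discount; the abstract does not say which gain formulation the authors used, and the function names are illustrative only.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain of relevance grades given in ranked order."""
    # Position 0 is the top of the ranking; its discount is log2(2) = 1.
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(relevances))

def ndcg(predicted_scores, true_relevances):
    """NDCG of the ranking induced by sorting on predicted_scores (descending)."""
    order = sorted(range(len(predicted_scores)),
                   key=lambda i: predicted_scores[i], reverse=True)
    ranked = [true_relevances[i] for i in order]
    ideal_dcg = dcg(sorted(true_relevances, reverse=True))
    # Assumed convention: return 0.0 when no item has positive relevance.
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0
```

A ranking that orders entities exactly by their true significance scores 1.0; any misordering of entities with different grades lowers the value.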
