Abstract
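Tourism is one of the main sources of income in Tibetan regions. However, the Internet currently lacks intelligent service systems for Tibetan-language tourism information, and Tibetan-language texts introducing scenic spots are also very scarce. By contrast, Chinese-language tourism websites carry a large amount of information, but the scenic spots they cover differ from site to site, the introductory texts are long, and different sites emphasize different aspects of the same scenic spot. To help speakers of different languages learn about scenic spots quickly and accurately, this paper first acquires 8 kinds of attribute knowledge related to scenic spots in the Chinese-language tourism domain, using a BLSTM neural network model, Wikipedia, and web crawlers. A Chinese-Tibetan tourism-domain dictionary, built with Wikipedia-based and other methods, is then used to transfer the acquired Chinese knowledge to Tibetan, with an average translation coverage rate of 70.44%. Finally, a Chinese-Tibetan bilingual knowledge graph for the tourism domain is constructed.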
Tourism is one of the main economic sources in the Tibetan region. However, there is no intelligent service system for Tibetan tourism information on the Internet, and introductory texts on Tibetan attractions are also rare. In contrast, Chinese tourism websites have a large amount of information and contain different attractions. To facilitate the understanding of knowledge related to attractions, this paper first uses the BLSTM neural network model to acquire 11 kinds of attribute knowledge related to scenic spots in the Chinese tourism field. Through a Chinese-Tibetan tourism dictionary, the acquired Chinese knowledge is transferred to Tibetan, and the translation coverage rate is 70.44%. Finally, a Chinese-Tibetan bilingual tourism knowledge graph is constructed.
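The dictionary-based transfer step and its coverage figure can be illustrated with a minimal Python sketch. The abstract does not define how translation coverage is computed, so this sketch assumes it is the share of terms in the Chinese triples that have an entry in the Chinese-Tibetan dictionary. The names `transfer_triples` and `coverage_rate`, the triple layout, and the sample entries are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Optional, Tuple

# A knowledge triple: (entity, attribute, value). This layout is an assumption.
Triple = Tuple[str, str, str]


def transfer_triples(
    triples: List[Triple], zh_bo_dict: Dict[str, str]
) -> List[Tuple[Optional[str], ...]]:
    """Translate every term of every triple by exact dictionary lookup.

    Terms without a dictionary entry become None instead of being guessed.
    """
    return [tuple(zh_bo_dict.get(term) for term in triple) for triple in triples]


def coverage_rate(triples: List[Triple], zh_bo_dict: Dict[str, str]) -> float:
    """Fraction of terms (counted with repetition) found in the dictionary.

    Assumed definition; the paper may measure coverage differently.
    """
    terms = [term for triple in triples for term in triple]
    if not terms:
        return 0.0
    hits = sum(1 for term in terms if term in zh_bo_dict)
    return hits / len(terms)


# Made-up sample data. The dictionary values are placeholder tokens,
# not real Tibetan translations.
sample_triples: List[Triple] = [
    ("布达拉宫", "位于", "拉萨"),
    ("布达拉宫", "门票", "示例值"),
]
sample_dict: Dict[str, str] = {
    "布达拉宫": "bo_potala",
    "拉萨": "bo_lhasa",
    "门票": "bo_ticket",
}

if __name__ == "__main__":
    print(transfer_triples(sample_triples, sample_dict))
    print(f"coverage: {coverage_rate(sample_triples, sample_dict):.2%}")
```

Exact-match lookup keeps the sketch simple. A real system would likely need segmentation or partial matching for multi-word values, which is outside what the abstract describes.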
Authors
FENG Xiaolan (冯小兰); ZHAO Xiaobing (赵小兵)
School of Information Engineering, Minzu University of China, Beijing 100081, China
Source
Journal of Chinese Information Processing (《中文信息学报》)
CSCD
PKU Core Journal (北大核心)
2019, Issue 11, pp. 64-72 (9 pages)
Funding
Key Project of the State Language Commission (ZDI135-39)