Abstract
To address topic detection in Russian news texts, this paper uses automatic morphological analysis and named entity recognition of Russian text as auxiliary steps, and designs an ontology-based method for describing Russian news texts and topic information and for computing their similarity. The Single-pass algorithm is then used to run topic detection experiments on Russian texts. A comparison of the topic detection results obtained with the vector space model (VSM) and with the ontology model shows that the latter achieves relatively higher accuracy and effectiveness.
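As a rough illustration of the Single-pass clustering step named in the abstract, the sketch below assigns each document, in arrival order, to the most similar existing topic or opens a new one. It is only a minimal sketch under stated assumptions: it uses plain cosine similarity over term counts (closer to the VSM baseline than to the ontology-based similarity, whose definition is not given in this record), and the names `cosine`, `single_pass`, and the `threshold` value are hypothetical, not taken from the paper.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def single_pass(docs, threshold=0.3):
    """Cluster tokenized documents in one pass.

    docs: list of token lists (assumed already lemmatized).
    threshold: assumed similarity cutoff for joining a topic.
    Returns a list of topics, each a list of document indices.
    """
    topics = []  # each topic: {"centroid": Counter, "members": [indices]}
    for i, tokens in enumerate(docs):
        vec = Counter(tokens)
        best, best_sim = None, 0.0
        for topic in topics:
            sim = cosine(vec, topic["centroid"])
            if sim > best_sim:
                best, best_sim = topic, sim
        if best is not None and best_sim >= threshold:
            best["members"].append(i)
            # Summed counts serve as an unnormalized centroid;
            # cosine similarity is unaffected by the scale.
            best["centroid"].update(vec)
        else:
            topics.append({"centroid": Counter(vec), "members": [i]})
    return [t["members"] for t in topics]
```

An ontology-based variant would presumably swap `cosine` for a similarity function over the ontology descriptions of a news item and a topic, leaving the single-pass loop unchanged.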
Authors
YUAN Wei (原伟)
TANG Liang (唐亮)
YI Mian-zhu (易绵竹)
Affiliations: Post-Doctoral Research Station of Shanghai International Studies University, Shanghai 200083, China; Information Engineering University, Luoyang 471003, Henan, China
Source
Journal of Shandong University (Natural Science) (《山东大学学报(理学版)》)
Indexed in: CAS, CSCD, Peking University Core Journals (北大核心)
2018, No. 9, pp. 49-54, 61 (7 pages in total)
Funding
National Social Science Fund of China (14CYY051, 18BYY235)
China Postdoctoral Science Foundation, General Program (2017M610268, 2018T110403)
Keywords
Russian
ontology
topic detection