Abstract
For hot topics on BBS forums in online public opinion monitoring, this paper proposes a multi-level keyword topic detection algorithm based on given topic keywords. A hierarchical topic model is built from primary and secondary keywords, extended with synonyms and variant word forms. The method also incorporates named entity recognition and a temporal relationship between topics that follows a proximity principle and a periodicity principle. Tests show that the algorithm is more effective than traditional text-clustering detection algorithms.
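The abstract gives no implementation details, so the following is only a minimal sketch of how a two-level keyword topic model with synonym and variant expansion might be matched against a post. The class `TopicModel`, the functions `score` and `matches`, the weights, and the threshold are all hypothetical choices made for illustration and are not taken from the paper. Named entity recognition and the temporal relationship between topics are not modeled here.

```python
from dataclasses import dataclass


@dataclass
class TopicModel:
    """Hypothetical two-level topic model (all names and defaults are assumptions)."""
    name: str
    primary: dict[str, set[str]]    # primary keyword -> its synonyms / variant forms
    secondary: dict[str, set[str]]  # secondary keyword -> its synonyms / variant forms
    primary_weight: float = 2.0     # assumed weight, not from the paper
    secondary_weight: float = 1.0   # assumed weight, not from the paper
    threshold: float = 3.0          # assumed decision threshold


def _hits(text: str, keywords: dict[str, set[str]]) -> int:
    """Count keywords that occur in the text directly or through any expanded form."""
    return sum(
        1
        for keyword, variants in keywords.items()
        if any(form in text for form in {keyword} | variants)
    )


def score(post: str, topic: TopicModel) -> float:
    """Score a post against a topic; a post with no primary keyword scores zero."""
    text = post.lower()
    primary_hits = _hits(text, topic.primary)
    if primary_hits == 0:
        return 0.0
    secondary_hits = _hits(text, topic.secondary)
    return topic.primary_weight * primary_hits + topic.secondary_weight * secondary_hits


def matches(post: str, topic: TopicModel) -> bool:
    """Decide whether the post belongs to the topic under the assumed threshold."""
    return score(post, topic) >= topic.threshold


# Illustrative topic; the keywords are made up for this example.
fuel_topic = TopicModel(
    name="fuel price",
    primary={"fuel price": {"gas price", "petrol price"}},
    secondary={"increase": {"hike", "rise"}, "station": {"pump"}},
)
```

Plain substring matching is used so that the same sketch would also work on unsegmented Chinese text; a real system would more likely match against the output of a word segmenter.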
Source
《广东石油化工学院学报》
2017, No. 3, pp. 41-45 (5 pages)
Journal of Guangdong University of Petrochemical Technology