Abstract
Traditional topic evolution tracking methods are mostly based on topic models, which are weak at extracting and representing text semantics. Building on word embeddings, this paper combines LDA with an attention-enhanced Siamese BiLSTM network to propose a text proximity model, PDRBL, which combines explicit and implicit similarity and is used to determine the tense of a topic during its evolution. Based on PDRBL, the paper defines six topic evolution tenses and their judgment methods, and then proposes the PDRBL-based topic evolution tracking method TETP. Experiments show that the proposed models achieve better or comparable performance in terms of precision, recall, and F1 score, and can effectively capture topic evolution paths.
Authors
GONG Xiaokang (龚晓康)
YING Wenhao (应文豪)
WANG Jun (王骏)
GONG Shengrong (龚声蓉)
GONG Xiaokang; YING Wenhao; WANG Jun; GONG Shengrong (School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China; School of Computer Science and Engineering, Changshu Institute of Technology, Changshu, Jiangsu 215500, China; School of Communication and Information Engineering, Shanghai University, Shanghai 200444, China)
Source
Journal of Chinese Information Processing (《中文信息学报》)
CSCD
PKU Core Journal (北大核心)
2022, Issue 2, pp. 93-103 (11 pages)
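Funding
National Key R&D Program of China (2018YFB1004901)
Natural Science Foundation of Jiangsu Province (BK20161268)
Humanities and Social Sciences Foundation of the Ministry of Education of China (18YJCZH229)
National Natural Science Foundation of China (61972059)
Jiangsu Province Education Science "13th Five-Year Plan" Project (X-a/2018/08)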