Personal event detection method based on text mining in social media
Abstract: Users' social media posts contain their past personal experiences and latent life patterns, and studying these patterns is valuable for predicting users' future behavior and for personalized recommendation. Using collected Weibo data, 11 types of events were defined and a three-stage pipeline system was proposed for personal event detection. Built on the BERT (Bidirectional Encoder Representations from Transformers) pre-trained model, the three stages use BERT+BiLSTM+Attention, BERT+FullConnect and BERT+BiLSTM+CRF, respectively. From each Weibo text, the system extracts whether the text contains a defined event, which event types it contains, and the elements of each event. The elements are Subject (subject of the event), Object (event element), Time (time the event occurred), Place (place where the event occurred) and Tense (tense of the event). This information is used to explore how events change along a user's personal timeline in order to predict personal events. Experiments were conducted on a collected dataset of real users' Weibo posts, with comparisons against classification algorithms such as logistic regression, naive Bayes, random forest and decision tree. The results show that the BERT+BiLSTM+Attention, BERT+FullConnect and BERT+BiLSTM+CRF methods achieve the highest F1-score in their respective stages, verifying the effectiveness of the proposed method. Finally, a user's personal event timeline was built and visualized from the extracted events and their time information.
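The control flow of the three-stage pipeline described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the three stage functions are toy keyword rules standing in for the BERT+BiLSTM+Attention, BERT+FullConnect and BERT+BiLSTM+CRF models. The event type names (`travel`, `graduation`), the `PersonalEvent` class, the function names and the `YYYY-MM-DD: text` post format are all assumptions made for the example, since the abstract does not list the 11 event types or the data format.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Element roles named in the abstract.
ELEMENT_ROLES = ("Subject", "Object", "Time", "Place", "Tense")


@dataclass
class PersonalEvent:
    """Hypothetical container for one extracted event."""
    text: str
    event_type: str
    elements: Dict[str, str] = field(default_factory=dict)


def run_pipeline(
    posts: List[str],
    detect: Callable[[str], bool],
    classify: Callable[[str], List[str]],
    extract: Callable[[str, str], Dict[str, str]],
) -> List[PersonalEvent]:
    """Chain the three stages: detection, type classification, element extraction."""
    events = []
    for post in posts:
        if not detect(post):                      # stage 1: does the post contain an event?
            continue
        for event_type in classify(post):         # stage 2: which event types?
            elements = extract(post, event_type)  # stage 3: which elements?
            events.append(PersonalEvent(post, event_type, elements))
    return events


def build_timeline(events: List[PersonalEvent]) -> List[PersonalEvent]:
    """Order events that carry a Time element; ISO dates sort correctly as strings."""
    dated = [e for e in events if e.elements.get("Time")]
    return sorted(dated, key=lambda e: e.elements["Time"])


# Toy stand-ins for the neural models (assumptions, for illustration only).
TOY_KEYWORDS = {"travel": "flew to", "graduation": "graduated"}


def toy_detect(post: str) -> bool:
    return any(keyword in post for keyword in TOY_KEYWORDS.values())


def toy_classify(post: str) -> List[str]:
    return [etype for etype, keyword in TOY_KEYWORDS.items() if keyword in post]


def toy_extract(post: str, event_type: str) -> Dict[str, str]:
    date, _, _body = post.partition(": ")  # assumed "YYYY-MM-DD: text" format
    return {"Subject": "I", "Time": date, "Tense": "past"}


posts = [
    "2022-06-30: I graduated from university",
    "2022-03-01: I flew to Harbin",
    "2021-12-31: nice weather today",
]
events = run_pipeline(posts, toy_detect, toy_classify, toy_extract)
timeline = build_timeline(events)
for event in timeline:
    print(event.elements["Time"], event.event_type)
```

Passing the stages in as callables keeps the pipeline structure independent of the models, so each toy rule could be swapped for a trained classifier or sequence labeler without changing `run_pipeline`.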
Authors: XIAO Rui; LIU Mingyi; TU Zhiying; WANG Zhongjie (Faculty of Computing, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China)
Source: Journal of Computer Applications (《计算机应用》), CSCD, Peking University Core Journal, 2022, No. 11, pp. 3513-3519 (7 pages)
Funding: National Natural Science Foundation of China (61772155).
Keywords: social media; personal event; event detection; BERT (Bidirectional Encoder Representations from Transformers) model; personal event timeline