
Design and Implementation of a Real-Time Text Profile System Based on Structured Streaming
Abstract: To address the timeliness and accuracy problems of profile systems in big-data environments, this paper proposes the design and implementation of a real-time profile system based on Structured Streaming. The Canal component performs incremental subscription to the user behavior log system, the Kafka message middleware ingests the real-time data stream, and the Structured Streaming real-time computing framework analyzes and processes users' real-time data to characterize their real-time interests. An improved TF-IDF algorithm improves the accuracy and reliability of the text profile system, and the good interoperability between Structured Streaming and static data is used to reduce the real-time computing load and improve system response speed.
Source: Industrial Control Computer (《工业控制计算机》), 2022, No. 11, pp. 114-116, 118 (4 pages)
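Funding: National Key Research and Development Program of China (2021YFB2900800); Science and Technology Commission of Shanghai Municipality projects (20511102400, 20ZR1420900).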
Keywords: Structured Streaming; big data; profile system; TF-IDF
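
The abstract describes scoring a user's real-time text with TF-IDF while leaning on static data to lighten the streaming workload. The record does not give the paper's improved TF-IDF formula, so the following is only a minimal sketch of the standard, smoothed TF-IDF under one assumed division of labor: the IDF table is precomputed offline from historical documents (the static side), and each incoming event only needs a term-frequency count plus a lookup. The function names, the sample corpus, and the sample event are all hypothetical, and plain Python stands in for the Structured Streaming job.

```python
import math
from collections import Counter


def build_idf(corpus):
    """Compute a smoothed IDF for every term in a tokenized corpus (the static side)."""
    n_docs = len(corpus)
    doc_freq = Counter()
    for tokens in corpus:
        doc_freq.update(set(tokens))  # count each term once per document
    # Smoothed IDF: log((1 + N) / (1 + df)) + 1, which is always positive.
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def score_event(tokens, idf, default_idf):
    """Score one incoming text event: term frequency times the looked-up IDF."""
    if not tokens:
        return {}
    counts = Counter(tokens)
    total = len(tokens)
    return {t: (c / total) * idf.get(t, default_idf) for t, c in counts.items()}


def top_interests(tokens, idf, default_idf, k=3):
    """Return the k highest-scoring terms, breaking ties alphabetically."""
    scores = score_event(tokens, idf, default_idf)
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Hypothetical historical corpus, standing in for the static data set.
history = [
    ["spark", "streaming", "kafka"],
    ["spark", "sql", "join"],
    ["kafka", "canal", "binlog"],
    ["spark", "tfidf", "profile"],
]
idf_table = build_idf(history)
# IDF assigned to a term never seen in the historical corpus (df = 0).
unseen_idf = math.log(1 + len(history)) + 1.0

# Hypothetical tokenized text from one real-time user event.
event = ["spark", "canal", "canal", "profile"]
interests = top_interests(event, idf_table, unseen_idf, k=2)
```

Keeping the IDF table static means the per-event work is a single pass over the event's tokens; in a Structured Streaming job the same idea would typically be expressed as a join between the stream and a static DataFrame, though whether the paper does it this way is not stated in the abstract.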