Abstract
In big data environments, user profiles built from massive data sets must be both accurate and up to date. To address these two requirements, this paper studies real-time user profiling and proposes a real-time user profile system architecture based on stream computing. The overall structure of a user profile system is first analyzed as a whole. The message queue middleware Kafka collects user data from different dimensions in real time, and big data analysis and machine learning techniques are used to build a relatively accurate, multi-dimensional user profile tag system and user profile model. The Flink framework and data mining techniques then process the multi-source streaming data in real time to analyze users in depth and mine their characteristics and needs, producing accurate user profiles that support precise, personalized information services. The architecture builds comprehensive, high-precision user profiles, and its results offer high timeliness and accuracy, so that user needs can be understood quickly and accurately and data can be used to serve both users and business development.
Authors
姜红玉
汪朋
封雷
JIANG Hong-yu; WANG Peng; FENG Lei (The 15th Research Institute of China Electronics Technology Group Corporation, Beijing 100083, China)
Source
《计算机技术与发展》
2020, No. 7, pp. 186-193 (8 pages)
Computer Technology and Development
Funding
Key Scientific Research Project of China Electronics Technology Group Corporation (JY201802850).