Abstract
Traditional user attribute inference methods rely mainly on machine learning and statistical learning, and they ignore both the overall representation of a user and the correlation between inference tasks. This paper proposes a user attribute inference method based on a multi-task ensemble model. It exploits the structure of doc2vec, adding a document vector to represent each user as a whole, which avoids the limitations of manual feature extraction. To infer multiple attributes of a user, the paper proposes a multi-task ensemble inference framework based on association learning: after each attribute is identified separately, every user is given a multi-attribute representation. This strengthens the overall user representation and, at the same time, establishes associations among the attributes, improving the discriminative power of single-task learning. Model ensemble techniques are then used to complete the association learning among attributes, improving accuracy and generalization, while as few models as possible are combined to keep the model efficient at run time. In experimental comparisons on several data sets, the method shows some advantage over other algorithms.
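The idea of letting separately inferred attributes inform one another can be illustrated with a minimal two-stage sketch. Everything here is an assumption made for illustration, not the paper's implementation: the user vectors are taken as already computed (for example by doc2vec), a nearest-centroid classifier stands in for whatever base models the authors used, and the names `fit_multitask`, `augment`, and `predict_multitask` are hypothetical.

```python
from collections import defaultdict

def centroid_fit(X, y):
    """Nearest-centroid stand-in classifier: returns {label: mean vector}."""
    sums, counts = {}, defaultdict(int)
    for x, label in zip(X, y):
        if label not in sums:
            sums[label] = [0.0] * len(x)
        sums[label] = [s + v for s, v in zip(sums[label], x)]
        counts[label] += 1
    return {lab: [s / counts[lab] for s in vec] for lab, vec in sums.items()}

def centroid_predict(model, x):
    """Return the label whose centroid is closest to x (squared distance)."""
    def dist(label):
        return sum((a - b) ** 2 for a, b in zip(x, model[label]))
    return min(sorted(model), key=dist)

def one_hot(label, labels):
    return [1.0 if label == other else 0.0 for other in labels]

def augment(x, task, stage1, label_sets):
    """Append one-hot stage-1 predictions of every *other* attribute to x."""
    extra = []
    for other in sorted(stage1):
        if other != task:
            predicted = centroid_predict(stage1[other], x)
            extra += one_hot(predicted, label_sets[other])
    return list(x) + extra

def fit_multitask(X, Y):
    """Y maps attribute name -> list of labels, aligned with the rows of X."""
    tasks = sorted(Y)
    # Stage 1: one independent classifier per attribute.
    stage1 = {t: centroid_fit(X, Y[t]) for t in tasks}
    label_sets = {t: sorted(stage1[t]) for t in tasks}
    # Stage 2: retrain each attribute on vectors enriched with the
    # stage-1 predictions of the remaining attributes.
    stage2 = {}
    for t in tasks:
        X_aug = [augment(x, t, stage1, label_sets) for x in X]
        stage2[t] = centroid_fit(X_aug, Y[t])
    return stage1, stage2, label_sets

def predict_multitask(models, x):
    stage1, stage2, label_sets = models
    return {t: centroid_predict(stage2[t], augment(x, t, stage1, label_sets))
            for t in stage2}

# Toy user vectors (assumed to be precomputed) and two attributes.
X = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
Y = {"gender": ["f", "f", "m", "m"], "age": ["young", "young", "old", "old"]}
models = fit_multitask(X, Y)
print(predict_multitask(models, [0.7, 0.3]))
```

In this sketch the second stage plays the role of the association step: each attribute's classifier sees the other attributes' first-stage guesses as extra features, so only two models per attribute are involved.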
Authors
Zhao Yu (赵宇); Li Jiayi (李佳艺); Wang Li (王莉)
Department of Computer Science and Technology, Taiyuan University of Technology, Jinzhong, 030600, China; College of Big Data, Taiyuan University of Technology, Jinzhong, 030600, China
Source
Journal of Data Acquisition and Processing (《数据采集与处理》)
CSCD
Peking University Core Journal (北大核心)
2018, Issue 2, pp. 334-342 (9 pages)
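Funding
Supported by the National High Technology Research and Development Program of China ("863" Program) (2014AA015204)
Supported by the Natural Science Foundation of Shanxi Province (201703D421013)
Supported by a Network Data Science key laboratory project of the Institute of Computing Technology, Chinese Academy of Sciences (CASNDST20140X)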