Abstract
E-commerce applications produce large amounts of user rating data, which carry rich information about users' opinions and preferences. To infer user preference accurately from such data, a method was proposed for constructing and performing inference on a latent variable model (i.e., a Bayesian network with a latent variable) oriented to user preference discovery from rating data. First, to address the sparseness of rating data, the unobserved values were filled in by a Biased Matrix Factorization (BMF) model. Second, a latent variable was used to represent user preference, and a method for constructing the latent variable model based on Mutual Information (MI), maximal semi-clique and the Expectation Maximization (EM) algorithm was given. Finally, a Gibbs sampling based algorithm for probabilistic inference over the latent variable model and for user preference discovery was given. Experimental results show that, compared with collaborative filtering, the latent variable model describes the dependence relationships among related attributes in rating data and their uncertainties more effectively, and thus infers user preference more accurately.
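The first step named in the abstract, filling unobserved ratings with a biased matrix factorization, can be illustrated with a minimal sketch. It assumes the common formulation r_ui ≈ mu + b_u + b_i + p_u · q_i trained by stochastic gradient descent; the function names (`fit_bmf`, `predict`, `fill_missing`), the hyperparameters and the 1 to 5 rating scale are illustrative assumptions, not details taken from the paper.

```python
import random


def fit_bmf(ratings, n_users, n_items, k=2, lr=0.02, reg=0.02, epochs=300, seed=0):
    """Fit r_ui ~ mu + b_u + b_i + p_u . q_i by SGD on observed (user, item, rating) triples.

    All hyperparameter defaults are assumptions chosen for a toy example.
    """
    rng = random.Random(seed)
    mu = sum(r for _, _, r in ratings) / len(ratings)  # global mean rating
    bu = [0.0] * n_users                               # user biases
    bi = [0.0] * n_items                               # item biases
    P = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]  # user factors
    Q = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]  # item factors
    data = list(ratings)
    for _ in range(epochs):
        rng.shuffle(data)
        for u, i, r in data:
            pred = mu + bu[u] + bi[i] + sum(P[u][f] * Q[i][f] for f in range(k))
            err = r - pred
            # Gradient steps with L2 regularisation on biases and factors.
            bu[u] += lr * (err - reg * bu[u])
            bi[i] += lr * (err - reg * bi[i])
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return mu, bu, bi, P, Q


def predict(model, u, i, lo=1.0, hi=5.0):
    """Predict one rating and clip it to the assumed rating scale [lo, hi]."""
    mu, bu, bi, P, Q = model
    raw = mu + bu[u] + bi[i] + sum(p * q for p, q in zip(P[u], Q[i]))
    return min(hi, max(lo, raw))


def fill_missing(ratings, n_users, n_items, **kwargs):
    """Return a dense user-by-item matrix: observed ratings kept, the rest predicted."""
    model = fit_bmf(ratings, n_users, n_items, **kwargs)
    observed = {(u, i): r for u, i, r in ratings}
    return [
        [observed.get((u, i), predict(model, u, i)) for i in range(n_items)]
        for u in range(n_users)
    ]


if __name__ == "__main__":
    # Toy data: (user index, item index, rating); three of nine cells are missing.
    toy = [(0, 0, 5), (0, 1, 4), (1, 0, 4), (1, 2, 1), (2, 1, 5), (2, 2, 2)]
    for row in fill_missing(toy, n_users=3, n_items=3):
        print([round(v, 2) for v in row])
```

The dense matrix produced this way is the kind of input that the later steps (structure learning with mutual information, EM parameter estimation, Gibbs sampling) would consume; those steps are not sketched here.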
Authors
高艳
岳昆
武浩
付晓东
刘惟一
GAO Yan, YUE Kun, WU Hao, FU Xiaodong, LIU Weiyi (School of Information Science and Engineering, Yunnan University, Kunming Yunnan 650504, China; Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming Yunnan 650500, China)
Source
《计算机应用》
CSCD
Peking University Core Journal
2017, Issue 2, pp. 360-366 (7 pages)
Journal of Computer Applications
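Funding
National Natural Science Foundation of China (61472345, 61562090, 61462056)
Applied Basic Research Program of Yunnan Province (2014FA023, 2014FA028)
Program for Reserve Talents of Young and Middle-aged Academic and Technical Leaders of Yunnan Province (2012HB004)
Young Talents Training Program of Yunnan University (XT412003)
Innovation Team Training Program of Yunnan University (XT412011)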
Keywords
user preference
rating data
Bayesian network
latent variable model
probabilistic inference
biased matrix factorization