
Rating Matrix Completion Algorithm Based on the Number of Clusters (基于聚类数的评分矩阵恢复算法)
Abstract: Rating matrices are high-dimensional, sparse, and low-rank, and low-rank matrix completion is the main approach to studying them. For these algorithms, different ranks of the rating matrix yield different recovery accuracy. However, there is currently no theory for estimating the rank of a rating matrix, which limits the application of these algorithms. This paper analyzes theoretically the relationship between the number of user clusters and the rank of the rating matrix, gives a method for computing the number of user clusters, and on this basis proposes the Clusters Number Rank-1 Matrix Completion (CN-R1MC) algorithm for recovering rating matrices. Experiments on several recommender-system data sets show that the number of user clusters approximates the rank of the rating matrix well, which plays an important role in improving the recovery accuracy of the rating matrix. The proposed algorithm has good practical value.
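Authors: Liu Bo (刘波), He Xiping (何希平)
Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, Peking University Core Journal, 2015, No. 21, pp. 6-11, 47 (7 pages)
Funding: National Natural Science Foundation of China, Young Scientists Fund (No. 61402063); Chongqing Municipal Education Commission Science and Technology Project (No. KJ1400612, No. KJ130709); Chongqing Technology and Business University Project (No. 20135609)
Keywords: rating matrix; low-rank matrix completion; rank-one matrix; number of user clusters; singular value decomposition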
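
The abstract describes completing a rating matrix with rank-one terms once a target rank is known. The sketch below illustrates that general idea only; it is not the authors' CN-R1MC algorithm, whose details are not given in this record. It uses a generic rank-one pursuit loop (top singular pair of the observed residual, then a least-squares refit of the weights), and the function name `rank_one_matrix_pursuit` and the parameter `rank` are hypothetical. In the paper's setting `rank` would be set to the estimated number of user clusters; how that number is computed is not reproduced here, so it is simply passed in.

```python
import numpy as np


def rank_one_matrix_pursuit(ratings, mask, rank):
    """Approximate a partially observed matrix by a sum of `rank` rank-one terms.

    ratings: (m, n) float array; values at unobserved positions are ignored.
    mask:    (m, n) boolean array, True where a rating is observed.
    rank:    number of rank-one terms (assumed here to be supplied by the
             caller, e.g. an estimated number of user clusters).
    """
    observed = ratings[mask]
    bases = []                             # rank-one basis matrices found so far
    estimate = np.zeros(ratings.shape)
    for _ in range(rank):
        # Residual on observed entries only; unobserved entries are set to zero.
        residual = np.where(mask, ratings - estimate, 0.0)
        u, s, vt = np.linalg.svd(residual, full_matrices=False)
        if s[0] < 1e-12:                   # observed entries already fitted
            break
        bases.append(np.outer(u[:, 0], vt[0]))
        # Refit all weights by least squares on the observed entries.
        design = np.stack([b[mask] for b in bases], axis=1)
        weights, *_ = np.linalg.lstsq(design, observed, rcond=None)
        estimate = np.tensordot(weights, np.stack(bases), axes=1)
    return estimate


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    truth = rng.normal(size=(50, 3)) @ rng.normal(size=(3, 40))  # synthetic rank-3 data
    seen = rng.random(truth.shape) < 0.5                          # half the entries observed
    completed = rank_one_matrix_pursuit(truth, seen, rank=3)
    print("RMSE on unobserved entries:",
          np.sqrt(np.mean((completed - truth)[~seen] ** 2)))
```

The least-squares refit guarantees that the fit on observed entries never gets worse as terms are added; accuracy on unobserved entries depends on the data and on how well `rank` matches the true rank, which is the quantity the abstract proposes to estimate from the number of user clusters.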