
An Improved LSH/MinHash Collaborative Filtering Algorithm (cited by: 5)
Abstract: In recent years, many recommender systems based on collaborative filtering have been applied successfully. However, as the numbers of users and items in a system keep growing, the cost of similarity computation rises sharply, and the scalability of collaborative filtering recommender systems becomes an increasingly prominent problem. This paper proposes an improved LSH/MinHash algorithm based on approximate nearest neighbor search and applies it to the clustering of library resources, in order to cluster high-dimensional, large-volume data within reasonable time complexity, reduce the amount of similarity computation, and improve the scalability of the algorithm. Experiments show that the algorithm achieves relatively high efficiency and accuracy.
Affiliation: Business School, Hohai University
Source: Computer and Modernization, 2013, No. 12, pp. 19-22, 26 (5 pages)
Keywords: library; personalized recommendation; collaborative filtering; LSH
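
The abstract describes using LSH/MinHash to avoid computing every pairwise similarity. The record does not give the details of the paper's improved variant, so the following is only a minimal sketch of the standard textbook scheme (MinHash signatures plus banding), not the authors' algorithm. The borrowing data, the user labels, and all function names here are hypothetical and chosen for illustration.

```python
import random
from collections import defaultdict
from itertools import combinations

PRIME = (1 << 61) - 1  # large Mersenne prime for hashes h(x) = (a*x + b) mod PRIME


def make_hash_params(num_hashes, seed=0):
    """Draw (a, b) pairs; a is non-zero so each hash is a bijection mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
            for _ in range(num_hashes)]


def minhash_signature(item_ids, params):
    """MinHash signature of a non-empty set of non-negative integer item ids."""
    return tuple(min((a * x + b) % PRIME for x in item_ids) for a, b in params)


def lsh_candidate_pairs(signatures, bands, rows):
    """Split signatures into bands; users sharing any band bucket are candidates."""
    candidates = set()
    for band in range(bands):
        buckets = defaultdict(list)
        for user, sig in signatures.items():
            buckets[sig[band * rows:(band + 1) * rows]].append(user)
        for users in buckets.values():
            candidates.update(combinations(sorted(users), 2))
    return candidates


def jaccard(a, b):
    """Exact Jaccard similarity, computed only for candidate pairs."""
    return len(a & b) / len(a | b)


# Hypothetical borrowing records: user -> set of borrowed book ids.
loans = {
    "u1": {1, 2, 3, 4, 5},
    "u2": {1, 2, 3, 4, 5},
    "u3": {100, 101, 102},
}
params = make_hash_params(num_hashes=20)
signatures = {u: minhash_signature(items, params) for u, items in loans.items()}
pairs = lsh_candidate_pairs(signatures, bands=5, rows=4)
similar = {p: jaccard(loans[p[0]], loans[p[1]]) for p in pairs}
```

In this sketch, exact similarity is evaluated only for pairs that collide in at least one band, which is where the reduction in similarity computation comes from. The `bands` and `rows` values trade recall against the number of candidates and are arbitrary here.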

References (10)

  • 1. Sarwar M B, Karypis G, Konstan A J, et al. Item-based collaborative filtering recommendation algorithms [C]// Proceedings of the 10th International Conference on World Wide Web. 2001: 285-295.
  • 2. Sarwar M B, Karypis G, Konstan A J, et al. Application of dimensionality reduction in recommender system: A case study [C]// WebKDD Workshop at the ACM SIGKDD. 2000.
  • 3. McGinty L, Smyth B. Adaptive selection: An analysis of critiquing and preference-based feedback in conversational recommender systems [J]. International Journal of Electronic Commerce, 2006, 11(2): 35-57.
  • 4. Gaede V, Günther O. Multidimensional access methods [J]. ACM Computing Surveys, 1998, 30(2): 170-231.
  • 5. Rajaraman A, Ullman J D. Mining of Massive Datasets [M]. Cambridge University Press, 2010.
  • 6. Broder A Z. On the resemblance and containment of documents [C]// Proceedings of the Compression and Complexity of Sequences 1997. 1997: 21-29.
  • 7. Charikar M S. Similarity estimation techniques from rounding algorithms [C]// Proceedings of the 34th Annual ACM Symposium on Theory of Computing. 2002: 380-388.
  • 8. Huang Weihuang, Li Guoliang, Feng Jianhua. An efficient data source selection method [J]. Journal of Frontiers of Computer Science and Technology, 2010, 4(10): 890-898. (cited by: 1)
  • 9. Li Xiaoguang. Research on join-based efficient graph clustering methods [D]. Shenyang: Liaoning University, 2012.
  • 10. Cai Heng, Li Zhoujun, Sun Jian, Li Yang. Fast retrieval of Chinese text based on LSH [J]. Computer Science, 2009, 36(8): 201-204. (cited by: 13)


