Abstract
To improve the efficiency of similarity search in online bookstores and to reduce time and storage costs so that the search scales to large information networks, ProductP-Rank, an optimized similarity search algorithm based on the basic idea of P-Rank, is proposed. Existing similarity search algorithms are analyzed and compared, and the accuracy and complexity of similarity computation are identified as the main difficulties they face. A customer-product shopping network is built from the purchase relationships between consumers and books. The 1-hop similarity matrix is pre-computed offline; for a given query, the 2-hop similarity between the query and each item is then computed online from the pre-computed 1-hop matrix. Experimental results show that the storage cost and pre-computation time of ProductP-Rank are clearly lower than those of P-Rank, with little loss of effectiveness and a low online query time.
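The offline/online split described in the abstract can be illustrated with a minimal sketch. It is not the ProductP-Rank formula, which the abstract does not give: the cosine-style co-purchase similarity, the function names (one_hop_similarity, two_hop_scores, recommend) and the toy purchase list are all assumptions made for illustration only.

```python
import numpy as np

def one_hop_similarity(purchases, n_customers, n_books):
    """Offline step (assumed form): cosine similarity between books,
    computed from the customers who bought them."""
    A = np.zeros((n_customers, n_books))
    for customer, book in purchases:
        A[customer, book] = 1.0
    co = A.T @ A                      # co[i, j] = customers who bought both i and j
    norms = np.sqrt(np.diag(co))
    norms[norms == 0] = 1.0           # avoid division by zero for unsold books
    return co / np.outer(norms, norms)

def two_hop_scores(S1, query):
    """Online step (assumed form): propagate the query's 1-hop row
    through the pre-computed matrix once more."""
    return S1[query] @ S1

def recommend(S1, query, k):
    """Return the k books with the highest 2-hop score, excluding the query."""
    scores = two_hop_scores(S1, query)
    order = [int(b) for b in np.argsort(-scores) if b != query]
    return order[:k]

# Hypothetical toy data: (customer, book) purchase pairs.
purchases = [(0, 0), (0, 1), (1, 1), (1, 2), (2, 2), (2, 3)]
S1 = one_hop_similarity(purchases, n_customers=3, n_books=4)
scores = two_hop_scores(S1, query=0)
top = recommend(S1, query=0, k=3)
```

In this toy network, books 0 and 2 share no customer, so their 1-hop similarity is zero, while the 2-hop score links them through book 1. Only the 1-hop matrix has to be stored; the 2-hop row is computed per query.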
Source
《计算机工程与设计》
Peking University Core Journal (北大核心)
2015, No. 10, pp. 2849-2855 (7 pages)
Computer Engineering and Design
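Funding
National Natural Science Foundation of China (61202376)
Shanghai Publishing and Media Research Institute
Shanghai Publishing and Printing College Bid Project Fund (SAYB1410)
Shanghai University Young Teacher Training Program (ZZSLG14021)
Shanghai Education Foundation Chenguang Program (10CG49)
Shanghai Municipal Education Commission Scientific Research Innovation Fund (13YZ075)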