
Similarity Search over Print-on-demand Platform (按需印刷平台中的相似搜索研究)
Abstract: Objective: To study the efficiency of similarity search on print-on-demand (POD) platforms. Methods: A "user-product" relation graph was built from the purchase relationships between users and products, and the similarity between products was measured from the structure of this graph. Building on P-Rank, an efficient similarity search method, POD-Rank, was proposed that splits the computation into two steps. In the first step, the similarity between users is computed offline from the "user-product" relation; in the second step, the similarity between the query product and each candidate product is computed online from the user similarities. To further reduce the response time of online query processing, an optimized online query processing algorithm was proposed that skips unnecessary accumulation operations on zero values. Results: The pre-computation time and storage costs of POD-Rank were significantly lower than those of P-Rank, with little loss of effectiveness and short online query times. Conclusion: The similarity computation time of POD-Rank was only 0.03% of that of P-Rank, and its similarity matrix was only 0.06% of the size of P-Rank's, while its effectiveness was close to that of P-Rank. The method can therefore meet the demands of processing large-scale product data on POD platforms.
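The two-step idea in the abstract can be illustrated with a small Python sketch. It is not the paper's POD-Rank algorithm, whose exact formulas are not given here. The offline step below assumes a simple Jaccard overlap as a stand-in for user similarity, and the online step assumes a SimRank-style normalized sum with a hypothetical `decay` constant. All function names and the sample purchases are illustrative. Only the overall shape follows the abstract: user similarity is precomputed, product similarity is computed at query time, and zero-valued user pairs are never stored or accumulated.

```python
from collections import defaultdict
from itertools import combinations


def build_index(purchases):
    """Index (user, product) purchase pairs in both directions."""
    by_user, by_product = defaultdict(set), defaultdict(set)
    for user, product in purchases:
        by_user[user].add(product)
        by_product[product].add(user)
    return by_user, by_product


def offline_user_similarity(by_user):
    """Offline step. ASSUMPTION: Jaccard overlap of purchased products,
    used here only as a stand-in for the paper's user similarity.
    Only nonzero pairs are stored, so the result stays sparse."""
    sim = defaultdict(dict)
    for u, v in combinations(sorted(by_user), 2):
        shared = len(by_user[u] & by_user[v])
        if shared:
            score = shared / len(by_user[u] | by_user[v])
            sim[u][v] = score
            sim[v][u] = score
    return sim


def online_product_similarity(query, by_product, user_sim, decay=0.8):
    """Online step. ASSUMPTION: SimRank-style normalized sum over buyer pairs.
    Iterating the sparse rows of user_sim skips zero-valued pairs entirely."""
    buyers_q = by_product.get(query, set())
    scores = {}
    for product, buyers_p in by_product.items():
        if product == query:
            continue
        total = 0.0
        for u in buyers_q:
            if u in buyers_p:
                total += 1.0  # a user is fully similar to itself
            for v, s in user_sim.get(u, {}).items():  # nonzero entries only
                if v in buyers_p:
                    total += s
        if total:
            scores[product] = decay * total / (len(buyers_q) * len(buyers_p))
    # Highest score first; ties broken by product name for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical purchase log, for illustration only.
purchases = [
    ("alice", "mug"), ("alice", "tshirt"),
    ("bob", "mug"), ("bob", "poster"),
    ("carol", "poster"), ("dave", "calendar"),
]
by_user, by_product = build_index(purchases)
user_sim = offline_user_similarity(by_user)
ranked = online_product_similarity("mug", by_product, user_sim)
```

Storing only nonzero user pairs is what lets the online loop avoid touching zero entries. A dense user-by-user matrix would give the same scores but would spend most of its time adding zeros.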
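Affiliation: University of Shanghai for Science and Technology
Source: Packaging Engineering (《包装工程》), 2015, No. 23, pp. 135-139 (5 pages). Indexed in: CAS, CSCD, PKU Core (北大核心)
Funding: Scientific Research Innovation Project of the Shanghai Municipal Education Commission (15ZZ074); Shanghai Young University Teachers Training Funding Program (ZZSLG14021); Tender Project of the Shanghai Publishing and Media Research Institute (SAYB1410); Doctoral Start-up Fund of the University of Shanghai for Science and Technology (1D-14-309-001)
Keywords: print-on-demand; P-Rank; similarity search; "user-product" relation graph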

References (17)

  • 1. PI Yangxue, HE Yongzhi. A Printing ERP Solution Based on Data Chain Management[J]. Packaging Engineering, 2014, 35(11): 122-127.
  • 2. DING Yi, SU Jie, ZHAO Lijuan. On the Combined Application of ERP Systems and Excel in Packaging Enterprise Management[J]. Packaging Engineering, 2012, 33(11): 135-138.
  • 3. KONG Lingjun, JIANG Zhongmin, GAO Xueling. Comprehensive Evaluation of Non-uniformity in Digital Prints[J]. Packaging Engineering, 2014, 35(1): 114-119.
  • 4. JEH G, WIDOM J. SimRank: A Measure of Structural-context Similarity[C]//Proceedings of KDD, 2002: 538-543.
  • 5. ZHAO P, HAN J, SUN Y. P-Rank: A Comprehensive Structural Similarity Measure over Information Networks[C]//Proceedings of CIKM, 2009: 553-562.
  • 6. ZHANG M, HU H, HE Z, et al. Top-k Similarity Search in Heterogeneous Information Networks with X-star Network Schema[J]. Expert Systems with Applications, 2015, 42(2): 699-712.
  • 7. LI C, HAN J, HE G. Fast Computation of SimRank for Static and Dynamic Information Networks[C]//Proceedings of EDBT, 2010.
  • 8. ZHAO X, XIAO C, LIN C, et al. A Partition-based Approach to Structure Similarity Search[J]. VLDB, 2013, 7(3): 169-180.
  • 9. LI P, LIU H, XU J, et al. Fast Single-pair SimRank Computation[C]//Proceedings of SDM, 2010: 571-582.
  • 10. CAI Y, CONG G, JIA X, et al. Efficient Algorithms for Computing Link Based Similarity in Real World Networks[C]//Proceedings of ICDM, 2009: 734-739.
