Abstract
User search and recommendation in Chinese question-and-answer communities currently relies mainly on character matching and makes little use of users' historical behavior. This paper proposes a real-time user recommendation algorithm based on latent semantic indexing (LSI) that fuses search keywords with community users' historical behavior information and recommends, in real time, high-quality users who are genuinely relevant to the query. For keyword search, the approach moves beyond the traditional character-matching framework by incorporating users' historical behavior into retrieval, which avoids the recommendation errors caused by the sparse information that character matching alone provides. Unlike traditional applications of LSI, the system exploits two properties of LSI, its ability to uncover the latent semantics of words and its dimensionality reduction of the vector space, and applies them to real-time user recommendation so that community users can be recommended more efficiently. The algorithm is trained and tested on Zhihu data. Experiments on real data show that its recommendations improve markedly on Zhihu's existing recommendations, and that the historical behavior of the recommended users matches the search keywords.
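The general idea of LSI-based user retrieval can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the features, weighting, or quality scoring used, so the toy term-by-user count matrix, the function names `build_lsi` and `recommend_users`, and the choice of `k` are all assumptions made for illustration. The sketch uses a truncated SVD, folds the query into the reduced space, and ranks users by cosine similarity.

```python
import numpy as np

# Hypothetical toy data: rows are terms, columns are users.
# Each entry counts how often a term appears in that user's history.
vocab = ["python", "pandas", "numpy", "cooking", "recipe", "baking"]
users = ["user_a", "user_b", "user_c"]
term_user = np.array([
    [2, 0, 0],  # python
    [1, 1, 0],  # pandas
    [0, 2, 0],  # numpy
    [0, 0, 3],  # cooking
    [0, 0, 1],  # recipe
    [0, 0, 1],  # baking
], dtype=float)

def build_lsi(matrix, k):
    """Truncated SVD: keep the k largest singular triplets."""
    U, s, Vt = np.linalg.svd(matrix, full_matrices=False)
    # Rows of V_k are the users' coordinates in the latent space.
    return U[:, :k], s[:k], Vt[:k].T

def recommend_users(query_vec, U_k, s_k, V_k, top_n=2):
    """Fold the query into the latent space and rank users by cosine."""
    q_hat = (U_k.T @ query_vec) / s_k  # standard LSI fold-in
    norms = np.linalg.norm(V_k, axis=1) * np.linalg.norm(q_hat)
    sims = (V_k @ q_hat) / (norms + 1e-12)  # guard against zero norms
    order = np.argsort(-sims)[:top_n]
    return order, sims

# Query containing only the term "numpy".
query = np.zeros(len(vocab))
query[vocab.index("numpy")] = 1.0

U_k, s_k, V_k = build_lsi(term_user, k=2)
order, sims = recommend_users(query, U_k, s_k, V_k, top_n=2)
for idx in order:
    print(users[idx], round(float(sims[idx]), 3))
```

In this toy example `user_a` never used the term "numpy", so pure character matching would miss that user, yet the latent space places `user_a` next to `user_b` because both used "pandas". The decomposition can be computed offline, leaving only the inexpensive fold-in and similarity step at query time, which is one plausible reading of how LSI supports real-time recommendation.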
Authors
HE Zi-jian (何子健)
LI Jia-min (李嘉敏)
LI Qiu-rui (李秋锐)
YU Jun-hui (余俊辉)
ZHENG Yuan-jun (郑圆君)
LI Xiang-ru (李乡儒)
School of Mathematical Sciences, South China Normal University, Guangzhou 510631, China
Source
Computer Technology and Development (《计算机技术与发展》)
2018, No. 7, pp. 73-77, 82 (6 pages)
Funding
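National Natural Science Foundation of China (61273248, 61075033)
National Undergraduate Innovation and Entrepreneurship Training Program (201610574033)
Special Funds for the Cultivation of Guangdong College Students' Scientific and Technological Innovation (Climbing Program Special Funds) (pdjh2017b0139)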
Keywords
Zhihu (知乎)
latent semantic indexing
real-time recommendation
information fusion