Abstract
With the emergence of microblogging services, personalized search over microblogs has become a new research focus. Topic models, an effective tool for modeling the topics of text, have been widely applied to personalized search. However, microblog data is topically diverse and highly time-sensitive, and traditional personalized search algorithms based on topic models struggle with the topic coarseness that arises in such data. To address this problem, this paper combines a topic model with a language model to build a two-layer user interest model consisting of a topic-class layer and a word layer. The model reflects the topics a user is interested in while avoiding topic coarseness, and it is used to personalize microblog search. In addition, to account for the real-time nature of microblog data, a time factor is incorporated into the microblog search model, and a time-sensitive personalized re-ranking algorithm for search results is proposed. The proposed model is compared with a traditional personalized search model based on topic models; experimental results show that it is better suited to microblog search and better meets users' needs.
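The abstract does not give the model's formulas, so the following is only a minimal sketch of how a two-layer interest profile and a time-sensitive re-ranking step might fit together. Everything in it is an assumption made for illustration rather than the paper's method: the names `Post`, `UserInterest`, and `rerank`, the add-one smoothed unigram word layer, the linear mixing weight `alpha`, and the exponential decay with `half_life_hours`. Topic-class labels are assumed to come from a separately trained topic model.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class Post:
    text: str
    topic: str        # hypothetical topic-class label, e.g. assigned by a topic model
    age_hours: float  # time elapsed since the post was published
    relevance: float  # score from the base, non-personalized search


class UserInterest:
    """Illustrative two-layer profile: topic-class weights plus per-topic word counts."""

    def __init__(self, history):
        # history: list of (topic, text) pairs taken from the user's own posts
        self.topic_counts = Counter(topic for topic, _ in history)
        self.word_counts = {}
        for topic, text in history:
            self.word_counts.setdefault(topic, Counter()).update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}

    def topic_weight(self, topic):
        # Topic layer: share of the user's history that falls in this topic class.
        total = sum(self.topic_counts.values())
        return self.topic_counts[topic] / total if total else 0.0

    def word_score(self, topic, text):
        # Word layer (assumed form): geometric mean of add-one smoothed
        # unigram probabilities under the user's language model for this topic.
        counts = self.word_counts.get(topic, Counter())
        words = text.lower().split()
        if not words:
            return 0.0
        denom = sum(counts.values()) + len(self.vocab) + 1
        log_p = sum(math.log((counts[w] + 1) / denom) for w in words)
        return math.exp(log_p / len(words))

    def score(self, post):
        # Combine both layers: topic interest scaled by word-level fit.
        return self.topic_weight(post.topic) * self.word_score(post.topic, post.text)


def rerank(posts, user, alpha=0.5, half_life_hours=24.0):
    """Re-rank posts by mixing base relevance with user interest, then decaying by age."""

    def final_score(post):
        decay = 0.5 ** (post.age_hours / half_life_hours)  # assumed exponential decay
        mixed = (1 - alpha) * post.relevance + alpha * user.score(post)
        return mixed * decay

    return sorted(posts, key=final_score, reverse=True)
```

In this sketch the word layer is what separates two posts in the same topic class, which is one way to read the abstract's claim about avoiding topic coarseness. The decay term pushes older posts down regardless of how well they match the user's interests.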
Source
Microcomputer Applications (《微型电脑应用》)
2017, Issue 2, pp. 46-50 (5 pages)
Keywords
User interest model
Time
Re-ranking
Microblog