Abstract
Topic models can support pseudo-relevance feedback in query expansion. The main shortcoming of the classic topic model is that the topic level has to be assumed. To build, over the result set returned by a search engine, a topic level that is close to user intent, and to establish a mapping between document words and topics, social annotation is introduced into the classic LDA (latent Dirichlet allocation) model. The result is a three-level topic model based on the relationships among topics, tags, and document words, which is then used to select query expansion terms for pseudo-relevance feedback. Experimental results show that the expansion terms extracted by the model can describe the semantics of the tags and, once the model is applied to pseudo-relevance feedback, cover the query. In most cases the NDCG of the result list is higher than that of basic pseudo-relevance feedback and of result set clustering.
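The abstract reports its comparison in terms of NDCG without giving the formula. As a minimal sketch of one common formulation (linear gain with a log2 rank discount), and not necessarily the exact variant used in the paper, the metric can be computed as follows. The function names `dcg` and `ndcg` and the cutoff parameter `k` are illustrative assumptions.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of a ranked list of graded relevance scores.

    Assumption: linear gain (rel) and a log2(rank + 1) discount; other
    variants, such as a gain of 2**rel - 1, are also in common use.
    """
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))


def ndcg(relevances, k=None):
    """DCG of the ranking divided by the DCG of the ideal ordering.

    `relevances` holds the judged relevance of each result in ranked order;
    `k` optionally truncates the evaluation to the top k results.
    """
    ranked = relevances if k is None else relevances[:k]
    # The ideal ranking sorts all judged results by descending relevance.
    ideal = sorted(relevances, reverse=True)[:len(ranked)]
    ideal_dcg = dcg(ideal)
    # With no relevant results the ideal DCG is zero; define NDCG as 0.0.
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0
```

Under this formulation, a list already sorted by descending relevance scores 1.0, and any ordering that pushes relevant results further down scores lower, which is what makes the metric suitable for comparing an expanded query against a baseline on the same judged results.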
Source
Journal of Northeastern University (Natural Science)
EI
CAS
CSCD
PKU Core (北大核心)
2013, Issue 3, pp. 348-351 (4 pages)
Funding
Supported by the Natural Science Foundation of Liaoning Province (20102060)