Abstract
Term weights are widely used in information retrieval models. To address the term-independence assumption of the traditional bag-of-words model, this paper applies a term-weighting method based on word importance to the Markov network query expansion model. The method first builds a graph-of-words for each document, then derives the term co-occurrence matrix and the inter-term transition probability matrix from that graph, and finally computes each term's weight with a Markov chain calculation. The resulting term weights are substituted into the Markov network query expansion model. Experiments on 5 standard datasets show that the Markov network query expansion model based on word importance achieves better retrieval results than the traditional bag-of-words approach.
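The weighting pipeline described in the abstract (graph-of-words, co-occurrence matrix, transition probability matrix, Markov chain) can be sketched roughly as follows. The abstract does not state the window size, whether edges are directed, or how the chain is solved, so this sketch assumes an undirected sliding-window graph, row normalisation, and a damped power iteration in the style of PageRank. All function names and parameters (`build_cooccurrence`, `transition_matrix`, `stationary_weights`, `term_weights`, `window`, `damping`) are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def build_cooccurrence(tokens, window=3):
    """Count co-occurrences of distinct terms inside a sliding window.

    Assumption: a window of `window` tokens that includes the current token,
    so window=3 links each token to the next two. Edges are undirected.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for i, term in enumerate(tokens):
        for j in range(i + 1, min(i + window, len(tokens))):
            other = tokens[j]
            if other != term:
                counts[term][other] += 1.0
                counts[other][term] += 1.0
    return counts


def transition_matrix(counts, vocab):
    """Row-normalise co-occurrence counts into transition probabilities."""
    n = len(vocab)
    matrix = []
    for a in vocab:
        row = counts.get(a, {})
        total = sum(row.values())
        if total == 0:
            # Isolated term: assume a uniform jump to every term.
            matrix.append([1.0 / n] * n)
        else:
            matrix.append([row.get(b, 0.0) / total for b in vocab])
    return matrix


def stationary_weights(matrix, damping=0.85, tol=1e-10, max_iter=1000):
    """Approximate the chain's stationary distribution by power iteration.

    Assumption: a damping term is mixed in so the chain is guaranteed to
    converge; the paper may solve the chain differently.
    """
    n = len(matrix)
    weights = [1.0 / n] * n
    for _ in range(max_iter):
        updated = [
            (1.0 - damping) / n
            + damping * sum(weights[i] * matrix[i][j] for i in range(n))
            for j in range(n)
        ]
        if sum(abs(u - w) for u, w in zip(updated, weights)) < tol:
            return updated
        weights = updated
    return weights


def term_weights(tokens, window=3, damping=0.85):
    """Map each distinct term in one document to its importance weight."""
    if not tokens:
        return {}
    vocab = sorted(set(tokens))
    counts = build_cooccurrence(tokens, window)
    matrix = transition_matrix(counts, vocab)
    return dict(zip(vocab, stationary_weights(matrix, damping)))
```

Calling `term_weights("markov network query expansion markov network term weight".split())` returns one weight per distinct term, summing to 1. In a retrieval setting these weights would stand in for raw term frequencies when scoring or expanding a query, which is the substitution the abstract describes at a high level.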
Source
《计算机与现代化》
2017, No. 11, pp. 89-94 (6 pages)
Computer and Modernization