Abstract
This paper systematically studies how two kinds of relationship evidence, namely the distance and the order relationships between query terms and candidates within a document, affect the precision of expert search algorithms. Within the probabilistic language model framework, it first proposes an order kernel function to model order-relationship evidence. It then proposes two probabilistic frameworks that model the different kinds of relationship evidence in a unified way, and uses comparative experiments on standard TREC datasets to explore the feasibility of combining the two kinds of evidence for expert search. The experimental results show that distance evidence and order evidence achieve comparable performance gains over the baseline, and that an appropriate combination of the two performs better than either one used alone.
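The idea of weighting candidate-term associations by distance and by order can be illustrated with a small sketch. This is not the paper's formulation: the Gaussian distance kernel, the two-valued order kernel, the parameter names (`sigma`, `cand_first_weight`, `lam`), and the two combination modes (product and linear mixture) are all assumptions chosen only to show the general shape of such a model.

```python
import math

def distance_kernel(term_pos, cand_pos, sigma=5.0):
    """Gaussian kernel (assumed form): weight decays with token distance."""
    return math.exp(-((term_pos - cand_pos) ** 2) / (2 * sigma ** 2))

def order_kernel(term_pos, cand_pos, cand_first_weight=0.7):
    """Hypothetical order kernel: weight depends only on whether the
    candidate mention precedes or follows the query term."""
    return cand_first_weight if cand_pos < term_pos else 1.0 - cand_first_weight

def make_combined_kernel(mode="product", lam=0.5):
    """Two illustrative ways to combine the kernels (both are assumptions)."""
    def kernel(term_pos, cand_pos):
        d = distance_kernel(term_pos, cand_pos)
        o = order_kernel(term_pos, cand_pos)
        if mode == "product":
            return d * o
        return lam * d + (1.0 - lam) * o  # linear mixture
    return kernel

def score_candidates(tokens, query_terms, candidates, kernel):
    """Sum kernel weights over all (query term, candidate) position pairs
    in one document, then normalise across candidates."""
    positions = {}
    for i, tok in enumerate(tokens):
        positions.setdefault(tok, []).append(i)
    raw = {}
    for cand in candidates:
        raw[cand] = sum(
            kernel(t, c)
            for q in query_terms
            for t in positions.get(q, [])
            for c in positions.get(cand, [])
        )
    total = sum(raw.values())
    return {c: (s / total if total > 0 else 0.0) for c, s in raw.items()}

if __name__ == "__main__":
    doc = "alice works on information retrieval and bob studies databases".split()
    query = ["information", "retrieval"]
    people = ["alice", "bob"]
    for name, k in [("distance", distance_kernel),
                    ("order", order_kernel),
                    ("product", make_combined_kernel("product"))]:
        print(name, score_candidates(doc, query, people, k))
```

In this toy document the distance kernel favours the candidate mentioned closer to the query terms, while the order kernel favours the candidate mentioned before them, so the two kinds of evidence can disagree and a combined kernel has to trade them off.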
Source
《计算机应用研究》
CSCD
PKU Core (Peking University Core Journals)
2010, No. 11, pp. 4040-4043, 4047 (5 pages in total)
Application Research of Computers
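Funding
National Natural Science Foundation of China (90924026)
National High Technology Research and Development Program of China ("863" Program) (2008AA01Z121, 2007AA01Z338)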
Keywords
probabilistic language model
expert search
relationship evidence
kernel function
unified framework