Abstract
The core problem of information retrieval is to retrieve, from a document collection, the subset of documents most relevant to the user, and to order the results by relevance with a ranking algorithm; the quality of the ranking algorithm therefore directly affects retrieval efficiency. The RLR algorithm improves on the regularized empirical risk model and greatly reduces computational complexity. By allowing errors within a set tolerance, this paper adopts the symmetric ε-insensitive logistic loss function as the loss function, gives some special properties that this function satisfies, and uses them to improve the RLR algorithm. Experiments show that the new algorithm is effective for text ranking.
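The abstract names the symmetric ε-insensitive logistic loss but does not state its formula. As a rough illustration only, the sketch below uses one form of this loss found in the regression literature, log(1 + e^(δ-ε)) + log(1 + e^(-δ-ε)), where δ is the prediction error and ε the allowed tolerance. This form, the function names, and the sample values are assumptions and may differ from the definition used in the paper.

```python
import math


def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def sym_eps_insensitive_log_loss(delta, eps):
    """Assumed form of the symmetric eps-insensitive logistic loss.

    delta: prediction error (predicted score minus target score).
    eps:   tolerance; errors with |delta| well below eps cost little.
    """
    # One smooth "hinge" per side of the tolerance band.
    return softplus(delta - eps) + softplus(-delta - eps)


if __name__ == "__main__":
    eps = 1.0
    for delta in (-3.0, -1.0, 0.0, 1.0, 3.0):
        print(delta, sym_eps_insensitive_log_loss(delta, eps))
```

Under this assumed form the loss is smooth, symmetric in δ, smallest at δ = 0, and grows roughly like |δ| - ε for large errors, which is the behaviour one would expect from a tolerance band around the target.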
Source
《云南民族大学学报(自然科学版)》
CAS
2010, No. 1, pp. 52-55 (4 pages)
Journal of Yunnan Minzu University: Natural Sciences Edition
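Funding
National Natural Science Foundation of China (60903131)
Scientific Research Fund of the Yunnan Provincial Department of Education (07Z40092)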
Keywords
information retrieval
ranking
margin
RLR algorithm