Abstract
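Exact retrieval of XML on the Web is very expensive. To reduce this cost, the notion of XML query relaxation is defined, and three primitive relaxation operations are applied to the user's query to generate a set of relaxed queries. To quantify how trustworthy a relaxation is, a relaxation loss rate is defined. Borrowing the TFIDF scoring idea from traditional information retrieval, a TFIDF scoring formula for XML Web retrieval based on document statistics and relaxation loss is given, and the algorithm is implemented. A series of experiments shows that the method achieves constant recall and relatively high precision in XML Web data retrieval.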
To reduce the cost of precise XML retrieval on the Web, a query expression relaxation strategy is proposed and a formula for computing the information loss rate is defined. On this basis, a twig scoring method for fuzzy XML retrieval based on Xeluster is proposed, and the algorithm has been implemented. The experimental results show that the method is effective and is characterized by constant recall and high precision.
Source
《计算机技术与发展》
2008, No. 10, pp. 53-56, 60 (5 pages)
Computer Technology and Development
Funding
Hunan Provincial Education Research Fund (05C671)