Abstract
In XML (Extensible Markup Language) retrieval, taking into account where a term occurs in a document helps improve retrieval performance. A common approach assigns a weight to each tag in the XML document and folds the weight of the tag enclosing a term into the term weight model. In most related work, however, tag weights are set manually, which is laborious and subjective. This paper proposes a tag weight model based on topic generalization strength, with which tag weights are calculated automatically. Experimental results show that the model effectively improves XML retrieval quality.
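As an illustration of the common tag-weighting approach the abstract refers to, here is a minimal Python sketch in which each occurrence of a term is scaled by the weight of its enclosing tag. The tag names, the weight values in `TAG_WEIGHTS`, and the function `weighted_term_frequency` are assumptions made for illustration only. The weights are hand-picked, which is the manual setting the abstract criticizes. This is not the paper's topic generalization model, which the abstract does not specify.

```python
import xml.etree.ElementTree as ET

# Hypothetical, manually chosen tag weights (illustrative values only).
TAG_WEIGHTS = {"title": 3.0, "abstract": 2.0, "p": 1.0}
DEFAULT_WEIGHT = 1.0  # used for any tag not listed above


def weighted_term_frequency(xml_text: str, term: str) -> float:
    """Sum occurrences of `term`, each scaled by the weight of its enclosing tag."""
    root = ET.fromstring(xml_text)
    term = term.lower()
    score = 0.0
    for node in root.iter():
        # Count only text directly inside the node; tail text is ignored for brevity.
        words = (node.text or "").lower().split()
        score += TAG_WEIGHTS.get(node.tag, DEFAULT_WEIGHT) * words.count(term)
    return score


doc = (
    "<article><title>XML retrieval</title>"
    "<p>Retrieval of XML data with XML tags</p></article>"
)
print(weighted_term_frequency(doc, "xml"))  # 1 hit in <title> * 3.0 + 2 hits in <p> * 1.0 = 5.0
```

A model that computes tag weights automatically would replace the hand-written `TAG_WEIGHTS` table with values derived from the collection, leaving the scoring loop unchanged.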
Source
《计算机科学与探索》
CSCD
2010, No. 8, pp. 723-730 (8 pages)
Journal of Frontiers of Computer Science and Technology
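Funding
National Natural Science Foundation of China, Nos. 60803105 and 60763001
Science and Technology Project of the Education Department of Jiangxi Province, No. GJJ08508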