Abstract
Many relevance-ranking methods currently exist for relational databases. Object-level retrieval integrates information scattered across multiple tuples into a complete result. Because each object is unique, objects differ not only in the keywords they contain but also in their attribute values. The method presented here therefore collects statistics on the distribution of attribute values among the objects that contain a keyword, uses information entropy to assign a weight to each attribute, and from these weights computes each object's relevance score for a single keyword. Finally, the per-keyword scores are summed for each object to give its final ranking score.
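The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's formulas, so everything specific here is an assumption made for illustration: objects are modeled as plain attribute-to-value dictionaries, a keyword "matches" by case-insensitive substring test, an attribute's weight is taken to be proportional to the Shannon entropy of its values among the matching objects, and an object's per-keyword score is the summed weight of the attributes in which the keyword occurs. The function names (`entropy`, `attribute_weights`, `rank`) are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    total = len(values)
    return sum(-(c / total) * math.log2(c / total)
               for c in Counter(values).values())


def attribute_weights(matching):
    """Weight each attribute by its share of the total entropy (assumed rule)."""
    attrs = sorted({a for obj in matching for a in obj})
    ent = {a: entropy([obj.get(a) for obj in matching]) for a in attrs}
    total = sum(ent.values())
    if total == 0:
        # No attribute varies among the matching objects: fall back to uniform.
        return {a: 1 / len(attrs) for a in attrs}
    return {a: e / total for a, e in ent.items()}


def occurs(value, keyword):
    """Case-insensitive substring match (assumed matching rule)."""
    return keyword.lower() in str(value).lower()


def rank(objects, keywords):
    """Sum per-keyword scores for each object and sort by descending score."""
    scores = {name: 0.0 for name in objects}
    for kw in keywords:
        # Only objects containing the keyword contribute to the statistics.
        matching = {n: o for n, o in objects.items()
                    if any(occurs(v, kw) for v in o.values())}
        if not matching:
            continue
        weights = attribute_weights(list(matching.values()))
        for name, obj in matching.items():
            scores[name] += sum(w for a, w in weights.items()
                                if occurs(obj.get(a, ""), kw))
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data, invented for this example only.
objects = {
    "p1": {"title": "entropy ranking", "venue": "VLDB", "year": "2010"},
    "p2": {"title": "entropy coding", "venue": "VLDB", "year": "2011"},
    "p3": {"title": "graph search", "venue": "ICDE", "year": "2012"},
}
ranked = rank(objects, ["entropy", "ranking"])
```

Whether high-entropy attributes should receive larger or smaller weights is a design choice the abstract leaves open; the proportional rule above is only one plausible reading.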
Source
Computer Science (《计算机科学》)
CSCD
Peking University Core Journal (北大核心)
2013, No. 3, pp. 219-224 (6 pages)
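Funding
Supported by the General Program of the National Natural Science Foundation of China (61073057, 60972090) and the Fundamental Research Funds for the Central Universities (2011JC007).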
Keywords
Relational databases, Object-level, Ranking algorithm, Attribute value, Information entropy