Abstract
Each domain contains a large number of deep web data sources, and retrieving every one of them to gather the information needed for integration would entail an enormous workload; data source selection techniques were developed to address this problem. Entities in the medical domain are linked by rich association relationships, and effectively integrating this association information can help promote healthy living. To improve the efficiency of information integration for entity associations in the medical domain, this paper proposes a data source selection method based on entity association features. First, an entity association matrix summary is constructed from the entity weights and link information in the entity association graph. Second, a method for calculating data source relevance is proposed based on the intent of entity association queries. Extensive experiments on a domain dataset show that the proposed method achieves relatively high accuracy and recall and can provide effective support for information integration in the medical domain.
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
Peking University Core Journals (北大核心)
2016, No. 10, pp. 135-140 (6 pages)
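Funding
National Natural Science Foundation of China (No. 61462037, No. 61262033, No. 61563016)
Natural Science Foundation of Jiangxi Province (No. 20142BAB217014, No. 20142BAB207009)
Science and Technology Project of the Jiangxi Provincial Department of Education (No. GJJ13303)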
Keywords
data source selection
summary
medicine
biomedical literature
entity association