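Abstract: With the rapid development of Internet technology, more and more unstructured data is flooding into people's lives, and building efficient indexes for this data poses a great challenge. Key-Value databases have attracted wide attention for their simple structure and high scalability, and have become an important component of massive data storage systems. Because Key-Value systems demand high throughput, and Flash-based solid state drives (SSDs) offer very high random read performance, building Key-Value systems on SSDs has become a major research topic in massive data storage. Since Flash has characteristics such as out-of-place updates and limited lifetime, SSD-based Key-Value systems must be optimized specifically for Flash. Building on SkimpyStash, an SSD-based Key-Value system, this paper proposes a new Key-Value system, the low latency store (LLStore). LLStore uses memory-mapped files to reduce I/O requests to the SSD. In addition, to address the inefficient compaction strategy in SkimpyStash, an improved method is proposed that greatly reduces query time at the cost of a small increase in memory overhead. In performance comparison experiments against the original system, LLStore achieves a speedup of at least 12% in average query time.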
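The memory-mapped file idea mentioned in the abstract can be illustrated with a minimal sketch. It is not LLStore's actual design: the record layout (two little-endian length fields followed by key and value bytes), the in-memory `index` dictionary, and the function names `append_record`, `read_record`, and `demo` are all assumptions made for illustration. The point is only that, once the log file is mapped, lookups become memory reads served through the page cache rather than explicit read calls per query.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical record header: key length and value length, 4 bytes each.
HEADER = struct.Struct("<II")


def append_record(f, key: bytes, value: bytes) -> int:
    """Append one record to the log file and return its byte offset."""
    f.seek(0, os.SEEK_END)
    offset = f.tell()
    f.write(HEADER.pack(len(key), len(value)) + key + value)
    return offset


def read_record(mm, offset: int):
    """Read the record stored at `offset` from a memory-mapped log."""
    klen, vlen = HEADER.unpack_from(mm, offset)
    start = offset + HEADER.size
    key = bytes(mm[start:start + klen])
    value = bytes(mm[start + klen:start + klen + vlen])
    return key, value


def demo():
    index = {}  # assumed in-memory index: key -> offset in the log
    with tempfile.TemporaryFile() as f:
        for k, v in [(b"alpha", b"1"), (b"beta", b"22")]:
            index[k] = append_record(f, k, v)
        f.flush()
        # Map the whole file read-only; lookups are now slices of `mm`.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return {k: read_record(mm, off)[1] for k, off in index.items()}


if __name__ == "__main__":
    print(demo())
```

A real SSD-backed store would also need to handle remapping as the log grows, hash collisions, and Flash-friendly write patterns, none of which this sketch attempts.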
Funding: Supported by the National Natural Science Foundation of China (60573091), the Natural Science Foundation of Beijing (4073035), and the Key Project of the Ministry of Education of China (03044).
Abstract: To help users access the information they want, much research has been devoted to Deep Web (i.e., Web database) integration. We focus on query translation, an important part of Deep Web integration. Our aim is to automatically construct a set of constraint mapping rules so that the system can use them to translate a query from the integrated interface to the Web database interfaces. We construct a concept hierarchy for the attributes of the query interfaces and, in particular, store the synonyms and the types (e.g., Number, Text) for every concept. At the same time, we construct data hierarchies for some concepts where necessary. We then present an algorithm that generates the constraint mapping rules from these hierarchies. The approach scales well for such applications and, being domain independent, can be extended easily from one domain to another. Experimental results show its effectiveness and efficiency.
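As a rough illustration of the idea of translating a query through concepts that carry synonyms and types, the following sketch may help. It is not the paper's algorithm: the flat `CONCEPTS` list (standing in for a real concept hierarchy), the example labels, and the functions `find_concept`, `build_rules`, and `translate` are all hypothetical, and data hierarchies are omitted entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    name: str
    kind: str            # value type, e.g. "Text" or "Number"
    synonyms: frozenset  # lower-cased labels that denote this concept


# Hypothetical concepts for a book-search domain.
CONCEPTS = [
    Concept("title", "Text", frozenset({"title", "book title", "name"})),
    Concept("price", "Number", frozenset({"price", "cost"})),
]


def find_concept(label):
    """Return the concept whose synonym set contains the label, if any."""
    key = label.strip().lower()
    for concept in CONCEPTS:
        if key in concept.synonyms:
            return concept
    return None


def build_rules(integrated_labels, target_labels):
    """Map each integrated attribute to the target attribute of the same concept."""
    target_by_concept = {}
    for label in target_labels:
        concept = find_concept(label)
        if concept is not None:
            target_by_concept[concept.name] = label
    rules = {}
    for label in integrated_labels:
        concept = find_concept(label)
        if concept is not None and concept.name in target_by_concept:
            rules[label] = (target_by_concept[concept.name], concept.kind)
    return rules


def translate(query, rules):
    """Rewrite a query on the integrated interface for one target interface."""
    out = {}
    for attr, value in query.items():
        if attr not in rules:
            continue  # the target interface has no matching attribute
        target, kind = rules[attr]
        if kind == "Number" and not isinstance(value, (int, float)):
            continue  # skip constraints whose value does not fit the type
        out[target] = value
    return out


if __name__ == "__main__":
    rules = build_rules(["Title", "Price"], ["Book Title", "Cost"])
    print(translate({"Title": "Dune", "Price": 10, "Year": 1965}, rules))
```

In this sketch, attributes with no counterpart on the target interface (here `Year`) are simply dropped; a fuller system would need a policy for such constraints.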
Funding: Supported by the National Natural Science Foundation of China under Grant No. 60072027.
Abstract: This paper presents an algorithm for eliminating extreme values and reducing the estimation variance of an integrated trispectrum under low signal-to-noise ratio and short data sample conditions. Simulation results obtained with this algorithm are analyzed and compared with those of the conventional power spectrum and integrated trispectrum methods.