摘要
时态索引作为一种高效管理和检索时态数据的有效手段,一直是时态数据领域的研究热点.提出了一种基于时序分区的时态索引技术TPindex.首先将海量时态数据的时态属性映射到二维平面上,对平面上的"有效时间"点进行采样处理,通过使用自上而下,自左而右的时序分区方法将平面划分成若干个均匀的区域.其次,使用基于拟序关系的线序划分算法对每个分区中的数据构建数据结构,并建立基于"有效时间戳"的全区索引,实现"一次一集合"的数据查询操作.再次,还提出了使用分文件存储线序索引的模式将分区线序索引磁盘化,同时可以结合多线程技术并行处理数据,充分利用现代化硬件资源以满足海量数据下的高性能需求,提高索引性能.另一方面,我们还研究了海量时态数据下TPindex的增量式更新操作.最后,设计相应的仿真实验,通过与现有的代表性工作进行对比评估,验证了所提出方法的有效性和实用价值.
Temporal index is one of key methods for temporal data managements and retrieval,which has been a hotspot in the field of temporal data.This paper presents a temporal index technique TPindex which is based on a temporal timing partition method.Firstly,the temporal attributes of massive amount of temporal data is mapped to a two-dimensional plane and the"Valid Time"points in this plane are sampled for timing partition.A"form up to down and form left to right"timing partition method is used to divide the plane into several balanced temporal areas and whole-partition index would be established at the same time.Once the steps above are completed,temporal data can be dynamically indexed by its querying schema of"one time,one set".Secondly,the TPindex would build data structures through using"linear order partition"algorithm based on quasi-order relation for the data in each temporal area.Besides,a"Separated Files Model Index"based on disks and multi-threading parallel process technique that can be combined are proposed to make full use of modern hardware resources to meet the high performance needs under high-volume data,leading to better performance with index.On the other hand,the incremental updating algorithm was also studied.Finally,the corresponding simulation experiments are designed to compare with the current representative work to verify the feasibility and validity of the proposed algorithm.
作者
杨佐希
汤娜
汤庸
潘明明
李丁丁
叶小平
YANG Zuo-Xi;TANG Na;TANG Yong;PAN Ming-Ming;LI Ding-Ding;YE Xiao-Ping(School of Computer Science,South China Normal University,Guangzhou 510631,China)
出处
《软件学报》
EI
CSCD
北大核心
2020年第11期3519-3539,共21页
Journal of Software
基金
国家自然科学基金(61772211,U181120009)
广州市产学研协同创新重大专项(201704020203)
广东省应用型科技研发专项(2016B010124008)。
关键词
时态索引
时序分区
拟序关系
海量数据
并行化
temporal index
timing partition
quasi-order relation
high-volume data
parallelize