Abstract
Cache invalidation during XML data filtering reduces filtering efficiency, so reducing it is important for improving the performance of XML filtering based on deterministic finite automata (DFA). This paper improves the existing Lazy DFA execution algorithm by introducing the concept of a frequently accessed area. A state transition counter is added to each state in the cache, and an access threshold on that counter is used to screen the cached states: states whose counter exceeds the threshold are placed in the frequently accessed area. Experiments show that this access mechanism reduces the time spent searching among the large number of cached states during state transitions, and thereby improves the time performance of filtering and querying.
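The counter-and-threshold mechanism described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the class name `FrequentAreaCache`, its methods, the integer state ids, and the choice to count every lookup of a state are all hypothetical. Because Python dictionaries already give constant-time lookup, the sketch shows only the bookkeeping (count, compare with the threshold, promote), not the speedup reported in the paper.

```python
from typing import Dict, Optional

Transitions = Dict[str, int]  # symbol -> next state id (hypothetical layout)


class FrequentAreaCache:
    """Cache of DFA states with a separate 'frequently accessed area'."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        self.cache: Dict[int, Transitions] = {}     # all cached states
        self.frequent: Dict[int, Transitions] = {}  # promoted states
        self.counters: Dict[int, int] = {}          # per-state transition counter

    def add_state(self, state: int, transitions: Transitions) -> None:
        """Store a state in the cache with its counter at zero."""
        self.cache[state] = transitions
        self.counters.setdefault(state, 0)

    def step(self, state: int, symbol: str) -> Optional[int]:
        """Follow one transition, searching the frequent area first."""
        transitions = self.frequent.get(state)
        if transitions is None:
            transitions = self.cache.get(state)
            if transitions is None:
                # Assumption: a lazy DFA would construct the state here.
                return None
        self.counters[state] += 1
        # Promote the state once its counter exceeds the threshold.
        if self.counters[state] > self.threshold and state not in self.frequent:
            self.frequent[state] = transitions
        return transitions.get(symbol)


# Usage: two states that alternate on the input "ababab".
fa = FrequentAreaCache(threshold=2)
fa.add_state(0, {"a": 1})
fa.add_state(1, {"b": 0})
current = 0
for sym in "ababab":
    current = fa.step(current, sym)
```

In this run each state is looked up three times, so with a threshold of 2 both states end up in the frequent area. In a real filter the threshold would presumably be tuned so that only a small fraction of cached states is promoted.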
Source
Journal of Harbin Engineering University (《哈尔滨工程大学学报》)
EI
CAS
CSCD
PKU Core (北大核心)
2011, No. 3, pp. 328-333 (6 pages)
Funding
National 863 Program of China (2007AA012401)
Fundamental Research Funds for the Central Universities (HEUCF100606)
Keywords
deterministic finite automaton (DFA)
XML
state transition
frequent access nodes