In uncertain data management, lineages are often used for probability computation of result tuples. However, most of existing works focus on tuple level lineage, which results in imprecise data derivation. Besides, co...In uncertain data management, lineages are often used for probability computation of result tuples. However, most of existing works focus on tuple level lineage, which results in imprecise data derivation. Besides, correlations among attributes cannot be captured. In this paper, for base tuples with multiple uncertain attributes, we define attribute level annotation to annotate each attribute. Utilizing these annotations to generate lineages of result tuples can realize more precise derivation. Simultaneously,they can be used for dependency graph construction. Utilizing dependency graph, we can represent not only constraints on schemas but also correlations among attributes. Combining the dependency graph and attribute level lineage, we can correctly compute probabilities of result tuples and precisely derivate data. In experiments, comparing lineage on tuple level and attribute level, it shows that our method has advantages on derivation precision and storage cost.展开更多
基金Supported by the Key Program of National Natural Science Foundation of China(61232002)The National Natural Science Foundation of China(61202033)+2 种基金The Program for Innovative Research Team of Wuhan(2014070504020237)The Ph.D.Seed Foundation of Wuhan University(2012211020207)The Science and Technology Support Program of Hubei Province(2015BAA127)
文摘In uncertain data management, lineages are often used for probability computation of result tuples. However, most of existing works focus on tuple level lineage, which results in imprecise data derivation. Besides, correlations among attributes cannot be captured. In this paper, for base tuples with multiple uncertain attributes, we define attribute level annotation to annotate each attribute. Utilizing these annotations to generate lineages of result tuples can realize more precise derivation. Simultaneously,they can be used for dependency graph construction. Utilizing dependency graph, we can represent not only constraints on schemas but also correlations among attributes. Combining the dependency graph and attribute level lineage, we can correctly compute probabilities of result tuples and precisely derivate data. In experiments, comparing lineage on tuple level and attribute level, it shows that our method has advantages on derivation precision and storage cost.