Funding: Supported by the National Natural Science Foundation of China (No. 70231010/70321001) and the Bilateral Scientific and Technological Cooperation between China and Flanders (No. 174B0201).
Abstract: This paper addresses the problem of data redundancy under the extended-possibility-based model. Based on the information gain in data classification, a measure called relation redundancy is proposed to evaluate the degree to which a given relation is redundant as a whole. The properties of relation redundancy are also investigated. This new measure is useful in dealing with data redundancy.
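As a rough illustration of the classical information gain that the abstract cites as the basis for its measure, the following Python sketch computes the entropy reduction obtained by splitting class labels on one attribute. It is not the paper's relation redundancy measure and does not model the extended-possibility-based framework; the function names and the toy rows are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attr, target):
    """Entropy of `target` minus its expected entropy after splitting on `attr`."""
    base = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[attr], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


# Hypothetical toy relation: "dept" fully determines "grade".
rows = [
    {"dept": "A", "grade": "high"},
    {"dept": "A", "grade": "high"},
    {"dept": "B", "grade": "low"},
    {"dept": "B", "grade": "low"},
]
gain = information_gain(rows, "dept", "grade")
```

In this toy relation the attribute fully determines the class, so the gain equals the full entropy of the class labels; an attribute that tells nothing about the class would give a gain of zero.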
Abstract: Data redundancy exists in both traditional databases and existing temporal databases, and the volume of temporal data is growing rapidly. We propose a compressed storage strategy for temporal data that incorporates existing compression techniques to reduce redundancy during storage. Temporal data in slow-acting domains and in momentary-acting domains are accessed using the independent clock method and the mutual clock method, respectively. We also propose a grid storage strategy to cope with the rapid growth of temporal data.
Funding: Supported by the National Natural Science Foundation of China (70371015).
Abstract: Association rule mining is an important issue in data mining. This paper proposes a method based on binary representation that efficiently generates candidate frequent itemsets and their support counts using only simple operations such as "and", "or" and "xor". Applying this idea to the existing distributed association rule mining algorithm FDM yields the improved algorithm BFDM. Theoretical analysis and experiments show that BFDM is effective and efficient.
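The idea of counting support with bitwise operations can be sketched as follows. This is a minimal illustration under stated assumptions, not the BFDM algorithm itself: each transaction is encoded as an integer bitmask, and a candidate itemset is contained in a transaction when a bitwise "and" of the two masks returns the candidate mask. The item names, helper functions and toy transactions are hypothetical, and candidates are enumerated by brute force rather than with any pruning.

```python
from itertools import combinations

# Hypothetical item universe; each item is assigned one bit position.
ITEMS = ["bread", "milk", "eggs"]
BIT = {item: 1 << i for i, item in enumerate(ITEMS)}


def encode(itemset):
    """Encode an iterable of items as an integer bitmask using bitwise 'or'."""
    mask = 0
    for item in itemset:
        mask |= BIT[item]
    return mask


def support(candidate_mask, transaction_masks):
    """Count transactions whose mask contains every bit of the candidate."""
    return sum(1 for t in transaction_masks if (t & candidate_mask) == candidate_mask)


def frequent_itemsets(transactions, min_support):
    """Return {itemset tuple: support count} for itemsets meeting min_support."""
    masks = [encode(t) for t in transactions]
    result = {}
    for size in range(1, len(ITEMS) + 1):
        for combo in combinations(ITEMS, size):
            count = support(encode(combo), masks)
            if count >= min_support:
                result[combo] = count
    return result


transactions = [
    {"bread", "milk"},
    {"bread", "eggs"},
    {"bread", "milk", "eggs"},
    {"milk"},
]
frequent = frequent_itemsets(transactions, min_support=2)
```

Because each containment check is a single integer operation, the support count of a candidate needs one pass over the transaction masks with no set comparisons.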
Abstract: This paper proposes a security policy model for mandatory access control in a class B1 database management system with tuple-level labeling. The relation hierarchical data model is extended to a multilevel relation hierarchical data model. Based on this model, the concept of upper-lower layer relational integrity is presented after the covert channels caused by database integrity are analyzed and eliminated. Two SQL statements are extended to handle polyinstantiation in the multilevel secure environment. The system is based on the multilevel relation hierarchical data model and can store and manipulate, in an integrated way, multilevel complex objects (e.g., multilevel spatial data) and multilevel conventional data (e.g., integers, real numbers and character strings).
Abstract: We present the results of an analysis of average annual Caspian Sea levels obtained from ground and satellite observations, together with the corresponding solar activity characteristics, magnetic field data, and length of day. The spectra of these processes were investigated and approximation models were built. Previously assumed statistical relationships between space-geophysical processes and Caspian Sea level (CSL) changes were confirmed. A close connection was revealed between the low-frequency models of the solar and geomagnetic activity parameters and the CSL changes. Predictions extending into the next decades showed a high probability of an increase in the CSL and a decrease in the compared space-geophysical parameters.
Abstract: This paper presents recent progress in our project to estimate near real-time electric fields and currents in the ionosphere using our computer system, the Geospace Environment Data Analysis System (GEDAS). We describe a new technique in which data from ground magnetometers are collected by the system and used as input for the KRM and AMIE programs to calculate the distribution of ionospheric electric fields and currents, as well as other ionospheric parameters such as electric potential patterns. One of the goals of this project is to specify ionospheric processes. Examples of the near real-time calculation and the data flow of our scheme are presented.
Funding: Supported by the National High-Tech Research and Development (863) Program of China (No. 2012AA012609).
Abstract: The rapid growth of structured data has presented new technological challenges in the research fields of big data and relational databases. In this paper, we present Banian, an efficient system for managing and analyzing petabyte-scale (PB-level) structured data. Banian overcomes the storage structure limitations of relational databases and effectively integrates interactive query with large-scale storage management. It provides a uniform query interface for cross-platform datasets and thus shows favorable compatibility and scalability. Banian's system architecture mainly includes three layers: (1) a storage layer using HDFS for the distributed storage of massive data; (2) a scheduling and execution layer employing the splitting and scheduling technology of parallel databases; and (3) an application layer providing a cross-platform query interface and supporting standard SQL. We evaluate Banian using PB-level Internet data and the TPC-H benchmark. The results show that, compared with Hive, Banian improves query performance by up to 30 times and achieves better scalability and concurrency.
Abstract: This paper attempts to determine the vertical distribution of relative humidity (RH) at the 850, 700 and 500 hPa levels from satellite-derived radiation parameters (i.e., albedo, outgoing longwave fluxes, absorbed solar radiation and net radiation). For this purpose, multiple regression equations are derived from MONEX-79 upsonde and dropsonde data over the Arabian Sea for the period 11-20 June 1979. Satellite-estimated RH fields are compared with ECMWF RH fields obtained from FGGE level ⅢB data. The RMS error and error variance of the satellite-estimated RH fields are found to be smaller than those of the ECMWF fields. Satellite-estimated isohygric patterns agree well with the cloudiness patterns from the GOES satellite, whereas ECMWF isohygric patterns show little resemblance to them. The results suggest that satellite-estimated RH fields could be more useful than ECMWF RH fields and can be used with some confidence in NWP models.
Abstract: Background: The "National" Health Insurance (NHI) in Taiwan, China is a single-payer system that was introduced in 1995 to provide universal health care. Three stakeholders are involved in Taiwan's NHI, which can be seen as a triangular governance regime between the Bureau of "National" Health Insurance (BNHI), the insured and providers. Accordingly, this study assesses the efficiency of the different production processes that occur among these stakeholders in Taiwan's NHI system. Methods: A two-stage relational Data Envelopment Analysis (DEA) model is adopted to investigate the sub-process efficiencies of the health care resources held by 23 cities and counties in stages I and II, where the outputs of the first stage serve as the inputs of the second. The dataset was collected from the annual reports published by the Department of Health, Taiwan, China. Results: Under the proposed framework, the efficiency of the whole process can be obtained from the product of productivity and allocative efficiency. Ten DMUs are efficient in either stage I or stage II, with only two DMUs being efficient in both sub-processes. Conclusion: The relational DEA model not only demonstrates the physical relationship between the whole process and the sub-process components, but also produces reliable outcomes in efficiency measurement among different stakeholders in Taiwan's NHI system.
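The decomposition stated in the results, where the efficiency of the whole process is the product of two component efficiencies, can be illustrated with a short Python sketch. It assumes the two components correspond to the stage I and stage II scores. The DMU names and scores below are made up for illustration and are not taken from the study, and solving the DEA linear programs that would produce such scores is outside this sketch.

```python
def overall_efficiency(stage1, stage2):
    """Whole-process efficiency as the product of the two stage efficiencies."""
    return stage1 * stage2


def classify(stage1, stage2, tol=1e-9):
    """Label a DMU by which stages reach an efficiency score of 1."""
    eff1 = abs(stage1 - 1.0) <= tol
    eff2 = abs(stage2 - 1.0) <= tol
    if eff1 and eff2:
        return "efficient in both stages"
    if eff1:
        return "efficient in stage I only"
    if eff2:
        return "efficient in stage II only"
    return "inefficient in both stages"


# Hypothetical stage scores (stage I, stage II) for three DMUs.
scores = {
    "DMU_A": (1.0, 1.0),
    "DMU_B": (1.0, 0.8),
    "DMU_C": (0.9, 0.5),
}
report = {
    name: (overall_efficiency(s1, s2), classify(s1, s2))
    for name, (s1, s2) in scores.items()
}
```

Under this product form, a DMU is efficient overall only when both stage scores equal 1, which matches the abstract's distinction between DMUs efficient in one stage and the two that are efficient in both.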