To solve the query processing correctness problem for semantic-based relational data integration, the semantics of SPARQL (SPARQL Protocol and RDF Query Language) queries is defined. In the course of query rewriting, all relevant tables are found and decomposed into minimal connectable units. Minimal connectable units are joined according to semantic queries to produce semantically correct query plans. Algorithms for query rewriting and transforming are presented, and their computational complexity is discussed. In the worst case, the query decomposing algorithm finishes in O(n²) time and the query rewriting algorithm requires O(nm) time. The performance of the algorithms is verified by experiments; the results show that when the length of a query is less than 8, the query processing algorithms provide satisfactory performance.
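The rewriting step described above — grouping relevant tables into minimal connectable units and joining the units into a correct plan — can be sketched as follows. This is only an illustration of the idea, not the paper's algorithm: the predicates, tables, key names, and the `MAPPINGS` dictionary are all hypothetical.

```python
# Hypothetical predicate -> (table, column) mappings for the relational
# sources; invented names, sketching the flavor of the rewriting step.
MAPPINGS = {
    "ex:name":  ("person", "name"),
    "ex:dept":  ("person", "dept_id"),
    "ex:title": ("dept",   "title"),
}
PRIMARY_KEY = {"person": "id", "dept": "id"}

def decompose(patterns):
    """One minimal connectable unit per (subject variable, table)."""
    units = {}
    for s, p, o in patterns:
        table, column = MAPPINGS[p]
        units.setdefault((s, table), []).append((column, o))
    return units

def rewrite(patterns):
    """Join units whenever one unit's object variable is another's subject."""
    units = decompose(patterns)
    alias = {key: f"t{i}" for i, key in enumerate(sorted(units))}
    joins = []
    for (s, table), cols in units.items():
        for column, obj in cols:
            for s2, table2 in units:
                if obj == s2:  # shared variable -> equi-join on t2's key
                    joins.append(f"{alias[(s, table)]}.{column} = "
                                 f"{alias[(s2, table2)]}.{PRIMARY_KEY[table2]}")
    tables = ", ".join(f"{t} {alias[(s, t)]}" for s, t in sorted(units))
    where = " WHERE " + " AND ".join(joins) if joins else ""
    return f"SELECT * FROM {tables}{where}"
```

Given the triple patterns `(?p ex:name ?n)`, `(?p ex:dept ?d)`, `(?d ex:title ?t)`, the sketch produces two units and one equi-join between them.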
With the rapid development of the Web, more and more Web databases are available for users to access. At the same time, job searchers often have difficulties in first finding the right sources and then querying over them, so an integrated job search system over Web databases has become a Web application in high demand. Based on this consideration, we build a deep Web data integration system that supports unified access to multiple job Web sites as a job meta-search engine. In this paper, the architecture of the system is given first, and then the key components of the system are introduced.
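The meta-search idea can be sketched as fanning one user query out to per-site adapters, normalizing the results into a mediated schema, and de-duplicating. The site adapters, field names, and duplicate key below are invented for illustration; they are not the system's actual components.

```python
# Stand-ins for wrappers around two job Web sites' search interfaces.
def site_a(query):
    return [{"jobTitle": "Java Engineer", "firm": "Acme", "city": "Wuhan"}]

def site_b(query):
    return [{"position": "Java Engineer", "company": "Acme", "location": "Wuhan"},
            {"position": "DBA", "company": "Beta", "location": "Hefei"}]

# Per-site field mappings into the mediated schema (title, company, city).
ADAPTERS = [
    (site_a, {"title": "jobTitle", "company": "firm", "city": "city"}),
    (site_b, {"title": "position", "company": "company", "city": "location"}),
]

def meta_search(query):
    seen, results = set(), []
    for fetch, fields in ADAPTERS:
        for raw in fetch(query):
            rec = {k: raw[v] for k, v in fields.items()}
            key = (rec["title"], rec["company"])  # naive duplicate key
            if key not in seen:
                seen.add(key)
                results.append(rec)
    return results
```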
Currently, ocean data portals are being developed around the world based on Geographic Information Systems (GIS) as a source of ocean data and information. However, given the relatively high temporal frequency and the intrinsic spatial nature of ocean data and information, no current GIS software is adequate to deal effectively and efficiently with spatiotemporal data. Furthermore, while existing ocean data portals are generally designed to meet the basic needs of a broad range of users, they are sometimes very complicated for general audiences, especially for those without training in GIS. In this paper, a new technical architecture for an ocean data integration and service system is put forward that consists of four layers: the operation layer; the extract, transform, and load (ETL) layer; the data warehouse layer; and the presentation layer. The integration technology based on XML, ontology, and a spatiotemporal data organization scheme for the data warehouse layer is then discussed. In addition, the ocean observing data service technology realized in the presentation layer is discussed in detail, including the development of the web portal and the ocean data sharing platform. The application to the Taiwan Strait shows that the technology studied in this paper can facilitate sharing, access, and use of ocean observation data. The paper is based on an ongoing research project to develop an ocean observing information system for the Taiwan Strait that will facilitate the prevention of ocean disasters.
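A toy pass through the ETL layer described above: extract raw station records, transform them (unit conversion plus a quality check), and load them into a warehouse table keyed by station. The field names, the unit conversion, and the range-check rule are assumptions for illustration, not the system's actual scheme.

```python
RAW = [  # extract: rows as delivered by a (hypothetical) observing system
    {"station": "XMN01", "time": "2012-07-01T00:00Z", "sst_f": 82.4},
    {"station": "XMN01", "time": "2012-07-01T01:00Z", "sst_f": 300.0},  # bad reading
]

def transform(rows):
    out = []
    for r in rows:
        sst_c = round((r["sst_f"] - 32) * 5 / 9, 2)   # Fahrenheit -> Celsius
        if -2.0 <= sst_c <= 40.0:                     # crude plausibility check
            out.append({"station": r["station"], "time": r["time"], "sst_c": sst_c})
    return out

def load(rows, warehouse):
    for r in rows:
        warehouse.setdefault(r["station"], []).append((r["time"], r["sst_c"]))
    return warehouse

warehouse = load(transform(RAW), {})
```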
In e-commerce, multidimensional data analysis for OLAP (on-line analytical processing) based on Web data needs to integrate various data sources, such as XML (extensible markup language) data and relational data, on the conceptual level. A conceptual data description approach to the multidimensional data model was presented in order to conduct multidimensional data analysis of OLAP for multiple subjects. The UML (unified modeling language) galaxy diagram, describing the multidimensional structure of the conceptually integrated data, was constructed. The approach was illustrated using a case of a 2_roots UML galaxy diagram that takes one retailer and several suppliers of PC products into consideration.
In e-commerce, multidimensional data analysis based on Web data needs to integrate various data sources, such as XML data and relational data, on the conceptual level. A conceptual data description approach to the multidimensional data model, the UML galaxy diagram, is presented in order to conduct multidimensional data analysis for multiple subjects. The approach is illustrated using a case of a 2_roots UML galaxy diagram that considers marketing analysis of TV products involving one retailer and several suppliers.
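The "2_roots" structure in the two abstracts above is a galaxy (fact constellation) schema: two fact tables — here a retailer's sales and a supplier's shipments — sharing dimension tables. The sketch below illustrates that shape with invented tables and measures; it is not the papers' model, only the schema idea.

```python
# Dimensions shared by the two "roots" (fact tables).
product_dim = {1: {"name": "TV-42", "brand": "X"}, 2: {"name": "TV-55", "brand": "Y"}}
time_dim = {"2005Q1": {"year": 2005, "quarter": 1}}

# Root 1: retailer sales fact.  Root 2: supplier shipment fact.
sales_fact = [{"product": 1, "time": "2005Q1", "units_sold": 120}]
supply_fact = [{"product": 1, "time": "2005Q1", "units_shipped": 150},
               {"product": 2, "time": "2005Q1", "units_shipped": 80}]

def rollup(fact, measure):
    """Aggregate one fact table's measure per product name, drilling
    across the product dimension the two roots share."""
    totals = {}
    for row in fact:
        name = product_dim[row["product"]]["name"]
        totals[name] = totals.get(name, 0) + row[measure]
    return totals
```

Because both facts reference the same `product_dim`, retailer and supplier measures can be compared side by side per product — the point of the galaxy layout.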
Guyana's capacity to address the impacts of climate change on its coastal environment requires the ability to monitor, quantify, and understand coastal change over the short, medium, and long term. Understanding the drivers of change in the coastal and marine environment can be achieved through accurate measurement and critical analyses of morphologies, flows, processes, and responses. This manuscript presents a strategy developed to create a central resource, database, and web-based platform to integrate data and information on the drivers of, and the changes within, Guyana's coastal and marine environment. The strategy involves four complementary work packages: data collection, development of a platform for data integration, application of the data to coastal change analyses, and consultation with stakeholders. The last aims to assess the role of the integrated data systems in supporting strategic governance and sustainable decision-making. It is hoped that the output of this strategy will support the country's climate-focused agencies, organisations, decision-makers, and researchers in their tasks and endeavours.
Accurately evaluating the lifespan of the Printed Circuit Board (PCB) in airborne equipment is an essential issue for aircraft design and operation in the marine atmospheric environment. This paper presents a novel evaluation method that fuses Accelerated Degradation Testing (ADT) data, degradation data, and life data of small samples based on the uncertain degradation process. An uncertain life model of the PCB in airborne equipment is constructed by employing an uncertain distribution that considers the acceleration factor of multiple environmental conditions such as temperature, humidity, and salinity. In addition, a degradation process model of the PCB is constructed by employing an uncertain process that fuses ADT data and field data, in which the performance characteristics of dynamic cumulative change are included. Based on minimizing the pth sample moments, an integrated method for parameter estimation of the PCB in airborne equipment is proposed by fusing the multi-source data of life, degradation, and ADT. An engineering case illustrates the effectiveness and advantage of the proposed method.
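The moment-based estimation step can be illustrated with a far simpler model than the paper's uncertain process: fit an exponential life model by grid-searching the rate that minimizes the relative pth-power gap between model moments and sample moments. The exponential model, the simulated data, and the search grid are all invented; this shows only the shape of "minimizing the pth sample moments", not the paper's method.

```python
import math, random

random.seed(7)
lifetimes = [random.expovariate(1 / 500.0) for _ in range(200)]  # simulated hours

def sample_moment(data, k):
    return sum(x ** k for x in data) / len(data)

def fit_rate(data, p=2):
    """Exponential life model: E[T^k] = k! / lam**k for k = 1, 2.
    Grid-search lam minimizing the relative p-th power moment gap."""
    m = [sample_moment(data, k) for k in (1, 2)]
    model = lambda lam: [1 / lam, 2 / lam ** 2]
    best, best_loss = None, float("inf")
    for lam in (x / 1e5 for x in range(50, 500)):  # 5e-4 .. 5e-3 per hour
        loss = sum(abs(mo - sa) ** p / sa ** p for mo, sa in zip(model(lam), m))
        if loss < best_loss:
            best, best_loss = lam, loss
    return best
```

With the true rate 1/500 = 0.002, the fitted rate lands close to it up to sampling noise.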
Plant morphogenesis relies on precise gene expression programs at the proper time and position, which are orchestrated by transcription factors (TFs) in intricate regulatory networks in a cell-type-specific manner. Here we introduce a comprehensive single-cell transcriptomic atlas of Arabidopsis seedlings. This atlas is the result of meticulous integration of 63 previously published scRNA-seq datasets, addressing batch effects and conserving biological variance. This integration spans a broad spectrum of tissues, including both below- and above-ground parts. Utilizing a rigorous approach for cell type annotation, we identified 47 distinct cell types or states, largely expanding our current view of plant cell compositions. We systematically constructed cell-type-specific gene regulatory networks and uncovered key regulators that act in a coordinated manner to control cell-type-specific gene expression. Taken together, our study not only offers an extensive plant cell atlas that serves as a valuable resource, but also provides molecular insights into gene-regulatory programs that vary across cell types.
Genome-wide association mapping studies (GWAS) based on Big Data are a potential approach to improve marker-assisted selection in plant breeding. The number of available phenotypic and genomic data sets in which medium-sized populations of several hundred individuals have been studied is rapidly increasing. Combining these data and using them in GWAS could increase both the power of QTL discovery and the accuracy of estimating the underlying genetic effects, but is hindered by data heterogeneity and lack of interoperability. In this study, we used genomic and phenotypic data sets focusing on Central European winter wheat populations evaluated for heading date. We explored strategies for integrating these data and the resulting potential for GWAS. Establishing interoperability between data sets was greatly aided by some overlapping genotypes and a linear relationship between the different phenotyping protocols, resulting in high-quality integrated phenotypic data. In this context, genomic prediction proved to be a suitable tool to study the relevance of interactions between genotypes and experimental series, which was low in our case. Contrary to expectations, fewer marker-trait associations were found in the larger combined data set than in the individual experimental series. However, the predictive power based on the marker-trait associations of the integrated data set was higher across data sets. The results therefore show that integrating medium-sized data sets into Big Data is an approach to increase the power to detect QTL in GWAS. The results encourage further efforts to standardize and share data in the plant breeding community.
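The calibration step the abstract relies on — a linear relationship between phenotyping protocols, anchored by overlapping genotypes — can be sketched with ordinary least squares: fit protocol B's heading-date readings onto protocol A's scale using genotypes scored under both, then convert B-only records. The genotype names and values below are invented.

```python
# Heading-date scores for genotypes measured under BOTH protocols.
overlap_a = {"G1": 152.0, "G2": 158.0, "G3": 149.0, "G4": 161.0}  # protocol A
overlap_b = {"G1": 50.5, "G2": 53.6, "G3": 49.1, "G4": 55.0}      # protocol B

def fit_line(xs, ys):
    """Closed-form simple linear regression: returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

genotypes = sorted(overlap_a)
slope, intercept = fit_line([overlap_b[g] for g in genotypes],
                            [overlap_a[g] for g in genotypes])

def to_scale_a(b_value):
    """Convert a protocol-B-only phenotype onto protocol A's scale."""
    return slope * b_value + intercept
```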
With the rapid development of information technology, IoT devices play a huge role in physiological health data detection. The exponential growth of medical data requires us to reasonably allocate storage space between cloud servers and edge nodes. The storage capacity of edge nodes close to users is limited, so hotspot data should be stored in edge nodes as much as possible to ensure response timeliness and access hit rate. However, current schemes cannot guarantee that every sub-message of a complete data item stored by an edge node meets the requirements of hot data. How to detect and delete redundant data in edge nodes while protecting user privacy and dynamic data integrity has become a challenging problem. Our paper proposes a redundant data detection method that meets privacy protection requirements. By scanning the ciphertext, it determines whether each sub-message of the data in the edge node meets the requirements of hot data. It has the same effect as a zero-knowledge proof and does not reveal user privacy. In addition, for redundant sub-data that do not meet the requirements of hot data, our paper proposes a redundant data deletion scheme that preserves dynamic data integrity. We use a Content Extraction Signature (CES) to generate the signature of the remaining hot data after the redundant data are deleted. The feasibility of the scheme is proved through security analysis and efficiency analysis.
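The edge-node workflow above — test each sub-message against a hotness requirement, drop the cold ones, and re-sign what remains — can be sketched as follows. The access counters, threshold, and toy ciphertext bytes are invented, and a plain SHA-256 digest stands in for the Content Extraction Signature; the real scheme decides hotness by scanning the ciphertext itself.

```python
import hashlib

HOT_THRESHOLD = 10  # accesses per window (assumed policy)

# One stored data item: encrypted sub-messages with edge-side hit counters.
item = [
    {"cipher": b"\x91\x02", "hits": 42},
    {"cipher": b"\x17\x55", "hits": 3},   # cold -> redundant at the edge
    {"cipher": b"\xa0\x4e", "hits": 27},
]

def prune_cold(subs):
    """Delete sub-messages that do not meet the hot-data requirement."""
    return [s for s in subs if s["hits"] >= HOT_THRESHOLD]

def sign(subs):
    """Digest over the remaining sub-messages (CES stand-in)."""
    h = hashlib.sha256()
    for s in subs:
        h.update(s["cipher"])
    return h.hexdigest()

hot = prune_cold(item)
signature = sign(hot)  # integrity tag covering only the surviving hot data
```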
Cloud computing has emerged as a viable alternative to traditional computing infrastructures, offering various benefits. However, the adoption of cloud storage poses significant risks to data secrecy and integrity. This article presents an effective mechanism to preserve the secrecy and integrity of data stored on the public cloud by leveraging blockchain technology, smart contracts, and cryptographic primitives. The proposed approach utilizes a Solidity-based smart contract as an auditor for maintaining and verifying the integrity of outsourced data. To preserve data secrecy, symmetric encryption systems are employed to encrypt user data before outsourcing it. An extensive performance analysis is conducted to illustrate the efficiency of the proposed mechanism. Additionally, a rigorous assessment is conducted to ensure that the developed smart contract is free from vulnerabilities and to measure its associated running costs. The security analysis of the proposed system confirms that our approach can securely maintain the confidentiality and integrity of cloud storage, even in the presence of malicious entities. The proposed mechanism contributes to enhancing data security in cloud computing environments and can be used as a foundation for developing more secure cloud storage systems.
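The client-side flow above — encrypt before outsourcing, then let an auditor check a recorded digest — can be sketched in a few lines. A SHA-256 keystream in counter mode stands in for a real symmetric cipher (in practice something like AES-GCM would be used), and a local function stands in for the on-chain Solidity auditor; both substitutions are illustrative only.

```python
import hashlib, hmac, os

def keystream(key, nonce, length):
    """Derive a keystream by hashing key || nonce || counter (toy cipher)."""
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def encrypt(key, nonce, plaintext):
    ks = keystream(key, nonce, len(plaintext))
    return bytes(a ^ b for a, b in zip(plaintext, ks))

decrypt = encrypt  # XOR stream cipher: the same operation both ways

key, nonce = os.urandom(32), os.urandom(16)
data = b"patient record #17"
blob = encrypt(key, nonce, data)                 # what goes to the cloud
digest = hashlib.sha256(blob).hexdigest()        # what the auditor stores

def audit(stored_blob, expected_digest):
    """The integrity check the smart-contract auditor would perform."""
    return hmac.compare_digest(hashlib.sha256(stored_blob).hexdigest(),
                               expected_digest)
```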
Currently, there is a growing trend among users to store their data in the cloud. However, the cloud is vulnerable to persistent data corruption risks arising from equipment failures and hacker attacks. Additionally, when users perform file operations, the semantic integrity of the data can be compromised. Ensuring both data integrity and semantic correctness has become a critical issue that requires attention. We introduce a pioneering solution called Sec-Auditor, the first of its kind with the ability to verify data integrity and semantic correctness simultaneously, while maintaining a constant communication cost independent of the audited data volume. Sec-Auditor also supports public auditing, enabling anyone with access to public information to conduct data audits. This feature makes Sec-Auditor highly adaptable to open data environments, such as the cloud. In Sec-Auditor, users are assigned specific rules that are utilized to verify the semantic accuracy of data. Furthermore, users are given the flexibility to update their own rules as needed. We conduct in-depth analyses of the correctness and security of Sec-Auditor. We also compare several important security attributes with existing schemes, demonstrating the superior properties of Sec-Auditor. Evaluation results demonstrate that even for time-consuming file upload operations, our solution is more efficient than the compared one.
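The challenge/response shape of public auditing with constant-size communication can be conveyed with a much-simplified stand-in: the verifier sends a few random block indices plus a nonce, and the prover returns one fixed-size digest regardless of file size. Real schemes such as Sec-Auditor use homomorphic authenticators rather than this plain hash; everything below is an invented illustration of the interaction only.

```python
import hashlib, random

BLOCKS = [f"block-{i}".encode() for i in range(1000)]  # outsourced file

def challenge(n_blocks, sample=5, seed=None):
    """Verifier: pick random block indices and a fresh nonce."""
    rng = random.Random(seed)
    return rng.sample(range(n_blocks), sample), rng.randbytes(16)

def prove(blocks, indices, nonce):
    """Prover: one aggregated digest -- constant-size response."""
    h = hashlib.sha256(nonce)
    for i in indices:
        h.update(blocks[i])
    return h.hexdigest()

def verify(local_blocks, indices, nonce, response):
    """Verifier: recompute over its view of the blocks and compare."""
    return prove(local_blocks, indices, nonce) == response
```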
With the popularization of the Internet and the development of technology, cyber threats are increasing day by day. Threats such as malware, hacking, and data breaches have had a serious impact on cybersecurity. The network security environment in the era of big data is characterized by large amounts of data, high diversity, and high real-time requirements, and traditional security defense methods and tools are unable to cope with complex and changing network security threats. This paper proposes a machine-learning security defense algorithm based on metadata association features that emphasizes control over unauthorized users through privacy, integrity, and availability. A user model is established, and the mapping between the user model and the metadata of each data source is generated. By analyzing the user model and its corresponding mapping relationship, a query against the user model can be decomposed into queries against the various heterogeneous data sources, and the integration of heterogeneous data sources based on metadata association features can be realized. Customer information is defined and classified, sensitive data are automatically identified and perceived, a behavior audit and analysis platform is built, user behavior trajectories are analyzed, and the construction of a machine-learning customer information security defense system is completed. The experimental results show that when the data volume is 5×10³ bits, the data storage integrity of the proposed method is 92%, the data accuracy is 98%, and the success rate of data intrusion is only 2.6%. It can be concluded that the data storage method in this paper is safe, the data accuracy remains at a high level, and the data disaster recovery performance is good. The method can effectively resist data intrusion and provides high air traffic control security. It not only detects all viruses in user data storage, but also realizes integrated virus processing, further optimizing the security defense effect for user big data.
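The decomposition step described above — splitting a user-model query into per-source queries through a metadata mapping, while flagging sensitive attributes for the audit platform — can be sketched as follows. The attribute names, source names, and sensitivity labels are all invented for illustration.

```python
# Mapping from user-model attributes to (source, table, column) metadata.
USER_MODEL_MAPPING = {
    "customer.name":    ("crm_db", "clients", "full_name"),
    "customer.balance": ("billing_db", "accounts", "balance"),
    "customer.visits":  ("web_logs", "sessions", "count"),
}

SENSITIVE = {"customer.balance"}  # attributes flagged for behavior auditing

def decompose_query(attrs):
    """Group requested user-model attributes into one query per source,
    collecting sensitive attributes for the audit platform."""
    per_source, audited = {}, []
    for attr in attrs:
        source, table, column = USER_MODEL_MAPPING[attr]
        per_source.setdefault(source, []).append((table, column))
        if attr in SENSITIVE:
            audited.append(attr)
    return per_source, audited
```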
Integrated data and energy transfer (IDET) enables electromagnetic waves to transmit wireless energy at the same time as data delivery for low-power devices. In this paper, an energy harvesting modulation (EHM) assisted multi-user IDET system is studied, where all the received signals at the users are exploited for energy harvesting without degrading wireless data transfer (WDT) performance. The joint IDET performance is then analysed theoretically by conceiving a practical time-dependent wireless channel. With the aid of an AO-based algorithm, the average effective data rate among users is maximized while guaranteeing the BER and wireless energy transfer (WET) performance. Simulation results validate and evaluate the IDET performance of the EHM-assisted system, and demonstrate that the optimal number of user clusters and IDET time slots should be allocated in order to improve the WET and WDT performance.
Building model data organization is often programmed to solve a specific problem, resulting in the inability to organize indoor and outdoor 3D scenes in an integrated manner. In this paper, existing building spatial data models are studied, and the characteristics of the Industry Foundation Classes (IFC) building information modeling standard, City Geography Markup Language (CityGML), IndoorGML, and other models are compared and analyzed. CityGML and IndoorGML face challenges in satisfying diverse application scenarios and requirements due to limitations in their expressive capabilities. It is proposed to combine the semantic information of model objects to effectively partition and organize indoor and outdoor spatial 3D model data and to construct the indoor and outdoor data organization mechanism of "chunk-layer-subobject-entrances-area-detail object." This method is verified by proposing a 3D data organization method for indoor and outdoor space and constructing a 3D visualization system based on it.
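The "chunk-layer-subobject-entrances-area-detail object" organization is essentially a fixed-depth tree from outdoor blocks down to indoor detail. A minimal container for it might look like the sketch below; the level ordering follows the paper's naming, while the sample scene content and the one-level-per-step rule are assumptions.

```python
LEVELS = ["chunk", "layer", "subobject", "entrances", "area", "detail"]

class Node:
    """One object in the indoor/outdoor organization hierarchy."""
    def __init__(self, level, name):
        assert level in LEVELS
        self.level, self.name, self.children = level, name, []

    def add(self, child):
        # Children must sit exactly one level below their parent.
        assert LEVELS.index(child.level) == LEVELS.index(self.level) + 1
        self.children.append(child)
        return child

    def find(self, level):
        """Names of all descendants at a level (outdoor-to-indoor walk)."""
        hits = [self.name] if self.level == level else []
        for c in self.children:
            hits += c.find(level)
        return hits

scene = Node("chunk", "campus-block-3")
lobby = scene.add(Node("layer", "floor-1")).add(Node("subobject", "lobby"))
lobby.add(Node("entrances", "gate-A")).add(Node("area", "reception"))
```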
Effective integration and wide sharing of geospatial data is an important and basic premise for facilitating research and applications in geographic information science. However, the semantic heterogeneity of geospatial data is a major problem that significantly hinders geospatial data integration and sharing. Ontologies are regarded as a promising way to solve semantic problems by providing a formalized representation of geographic entities and the relationships between them in a manner understandable to machines. Thus, many efforts have been made to explore ontology-based geospatial data integration and sharing. However, there is a lack of a specialized ontology that provides a unified description of geospatial data. In this paper, focusing on the characteristics of geospatial data, we propose a unified framework for a geospatial data ontology, denoted GeoDataOnt, to establish a semantic foundation for geospatial data integration and sharing. First, we provide a characteristics hierarchy of geospatial data. Next, we analyze the semantic problems for each characteristic of geospatial data. Subsequently, we propose the general framework of GeoDataOnt, targeting these problems according to the characteristics of geospatial data. GeoDataOnt is then divided into multiple modules, and we show a detailed design and implementation for each module. Key limitations and challenges of GeoDataOnt are identified, and broad applications of GeoDataOnt are discussed.
New challenges, including how to share information among heterogeneous devices, appear in data-intensive pervasive computing environments. Data integration is a practical approach in these applications, and dealing with inconsistencies is one of its important problems. In this paper we motivate the problem of resolving data inconsistency for data integration in pervasive environments. We define data quality criteria and expense quality criteria for data sources to resolve data inconsistency. In our solution, data sources that require high expense to obtain data from are first discarded by using the expense quality criteria and a utility function. Since it is difficult to obtain the actual quality of data sources in a pervasive computing environment, we introduce a fuzzy multi-attribute group decision making approach to select the appropriate data sources. The experimental results show that our solution is effective.
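The two-stage selection above can be sketched crisply: first discard sources whose expense exceeds a budget (the utility-function step), then rank the remainder by a weighted sum over quality attributes. A weighted sum is a crisp stand-in for the fuzzy multi-attribute group decision making step, and all source names, scores, weights, and the budget are invented.

```python
SOURCES = {
    "sensor_hub":   {"expense": 0.9, "accuracy": 0.70, "freshness": 0.9},
    "phone_cache":  {"expense": 0.2, "accuracy": 0.85, "freshness": 0.6},
    "cloud_mirror": {"expense": 0.8, "accuracy": 0.95, "freshness": 0.8},
}

EXPENSE_BUDGET = 0.85                       # stage 1: expense cut-off
WEIGHTS = {"accuracy": 0.6, "freshness": 0.4}  # stage 2: quality weights

def select_sources(sources):
    """Drop over-budget sources, then rank the rest by weighted quality."""
    affordable = {n: s for n, s in sources.items()
                  if s["expense"] <= EXPENSE_BUDGET}
    return sorted(affordable,
                  key=lambda n: sum(w * affordable[n][a]
                                    for a, w in WEIGHTS.items()),
                  reverse=True)
```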
In this paper we propose a service-oriented architecture for spatial data integration (SOA-SDI) in the context of a large number of available spatial data sources that physically reside at different places, and we develop web-based GIS systems based on SOA-SDI, allowing client applications to pull in, analyze, and present spatial data from the available spatial data sources. The proposed architecture logically comprises four layers or components: a layer of multiple data provider services, a data integration layer, a layer of back-end services, and a front-end graphical user interface (GUI) for spatial data presentation. On the basis of the four-layered SOA-SDI framework, WebGIS applications can be quickly deployed, which shows that SOA-SDI has the potential to reduce software development effort and shorten the development period.
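The relationship between the provider-service layer and the integration layer can be sketched as follows: each provider service exposes features in a common form, and the integration layer fans a request out and merges the results for the layers above it. The service names, feature records, and bounding-box filter are assumptions for illustration.

```python
def provider_roads(bbox):   # data provider service 1 (hypothetical)
    return [{"layer": "road", "id": "r1", "x": 3.0, "y": 4.0}]

def provider_rivers(bbox):  # data provider service 2 (hypothetical)
    return [{"layer": "river", "id": "v1", "x": 8.0, "y": 1.0},
            {"layer": "river", "id": "v2", "x": 2.0, "y": 2.5}]

PROVIDERS = [provider_roads, provider_rivers]

def integrate(bbox):
    """Integration layer: fan out to providers, keep features inside bbox."""
    xmin, ymin, xmax, ymax = bbox
    merged = []
    for service in PROVIDERS:
        for f in service(bbox):
            if xmin <= f["x"] <= xmax and ymin <= f["y"] <= ymax:
                merged.append(f)
    return merged
```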
Background: More and more high-throughput datasets are available from multiple levels of measurement of gene regulation. The reverse engineering of gene regulatory networks from these data offers a valuable research paradigm to decipher regulatory mechanisms. So far, numerous methods have been developed for reconstructing gene regulatory networks. Results: In this paper, we provide a review of bioinformatics methods for inferring gene regulatory networks from omics data. To achieve precise reconstruction of gene regulatory networks, an intuitive alternative is to integrate the available resources in a rational framework. We also provide computational perspectives on the endeavor of inferring gene regulatory networks from heterogeneous data. We highlight the importance of multi-omics data integration with prior knowledge in gene regulatory network inference. Conclusions: We provide computational perspectives on inferring gene regulatory networks from multiple omics data and present theoretical analyses of existing challenges and possible solutions. We emphasize prior knowledge and data integration in network inference owing to their ability to identify regulatory causality.
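The simplest instance of the network-inference problem reviewed above is co-expression analysis: connect two genes when the absolute Pearson correlation of their expression profiles exceeds a cut-off. The expression matrix and threshold below are invented, and real pipelines layer on prior knowledge and multi-omics integration, but the sketch shows the basic inference step.

```python
import math

EXPRESSION = {                            # gene -> profile across 5 samples
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.9],   # tracks geneA almost linearly
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],   # unrelated
}

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def infer_edges(expr, cutoff=0.9):
    """Undirected edges between gene pairs with |r| >= cutoff."""
    genes = sorted(expr)
    return [(g1, g2) for i, g1 in enumerate(genes) for g2 in genes[i + 1:]
            if abs(pearson(expr[g1], expr[g2])) >= cutoff]
```

Correlation alone cannot orient edges or rule out confounding, which is exactly why the review stresses prior knowledge for identifying regulatory causality.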
Land cover is recognized as one of the fundamental terrestrial datasets required in land system change and other ecosystem-related research across the globe. The regional differentiation and spatiotemporal variation of land cover have a significant impact on the regional natural environment and sustainable socio-economic development. In this context, we reconstructed historical land cover data for Siberia to provide datasets comparable to the land cover datasets of China and other regions. In this paper, the European Space Agency (ESA) Global Land Cover Map (GlobCover), Landsat Thematic Mapper (TM), Enhanced Thematic Mapper (ETM), and Multispectral Scanner (MSS) images, Google Earth images, and other additional data were used to produce land cover datasets for 1975 and 2010 in Siberia. Data evaluation shows that the overall user's accuracy of the 2010 land cover data was 86.96%, higher than that of the ESA GlobCover data for Siberia. The analysis of land cover changes found no major changes in Siberia from 1975 to 2010, with only a few conversions between different natural forest types. The main changes were conversions from deciduous needleleaf forest to deciduous broadleaf forest, from deciduous needleleaf forest to mixed forest, from savannas to deciduous needleleaf forest, etc., indicating that the dominant driving factor of land cover change in Siberia was, to some extent, natural elements rather than human activities, which is very different from China. However, our purpose was not just to produce the land cover datasets for two time periods or to explore the driving factors of land cover change in Siberia; we also paid attention to the significance and application of the datasets in various fields such as global climate change, geopolitics, and cross-border cooperation.
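The per-pixel change analysis behind the 1975-to-2010 comparison above amounts to cross-tabulating two classified rasters into a transition matrix. The class codes and the tiny "rasters" below are invented; the functions only illustrate the bookkeeping, not the paper's actual classification workflow.

```python
# Illustrative class codes: deciduous needleleaf/broadleaf, mixed, savanna.
DNF, DBF, MIX, SAV = "DNF", "DBF", "MIX", "SAV"

raster_1975 = [DNF, DNF, DNF, SAV, MIX, DNF]
raster_2010 = [DNF, DBF, MIX, DNF, MIX, DNF]

def transition_matrix(before, after):
    """Count pixels per (class_before, class_after) pair."""
    matrix = {}
    for b, a in zip(before, after):
        matrix[(b, a)] = matrix.get((b, a), 0) + 1
    return matrix

def changed_fraction(before, after):
    """Share of pixels whose class changed between the two dates."""
    return sum(b != a for b, a in zip(before, after)) / len(before)
```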
Funding: Weaponry Equipment Pre-Research Foundation of PLA Equipment Ministry (No. 9140A06050409JB8102); Pre-Research Foundation of PLA University of Science and Technology (No. 2009JSJ11).
Funding: Supported by the Natural Science Foundation of China (60573091, 60273018), the National Basic Research and Development Program of China (2003CB317000), and the Key Project of the Ministry of Education of China (03044).
Funding: Supported by the National High Technology Research and Development Program of China (863 Program) (Nos. 2009AA12Z225, 2009AA12Z208) and the National Natural Science Foundation of China (No. 61074132).
Funding: This project was supported by the China Postdoctoral Science Foundation (2005037506) and the National Natural Science Foundation of China (70472029).
Abstract: In e-commerce, multidimensional data analysis based on web data requires integrating various data sources, such as XML data and relational data, at the conceptual level. A conceptual data description approach to the multidimensional data model, the UML galaxy diagram, is presented in order to conduct multidimensional data analysis for multiple subjects. The approach is illustrated using a case of a 2_roots UML galaxy diagram covering marketing analysis of TV products involving one retailer and several suppliers.
Funding: We thank the United Nations Development Programme-Indonesia and the Archipelagic & Island States (AIS) Forum for the 2021 Archipelagic & Island States Innovation Challenges Award given for this idea under the Joint Research Programme in Climate Change Mitigation and Adaptation.
Abstract: Guyana's capacity to address the impacts of climate change on its coastal environment requires the ability to monitor, quantify, and understand coastal change over the short, medium, and long term. Understanding the drivers of change in the coastal and marine environment can be achieved through accurate measurement and critical analyses of morphologies, flows, processes, and responses. This manuscript presents a strategy developed to create a central resource, database, and web-based platform to integrate data and information on the drivers of, and changes within, Guyana's coastal and marine environment. The strategy involves four complementary work packages: data collection; development of a platform for data integration; application of the data to coastal change analyses; and consultation with stakeholders. The last package aims to assess the role of the integrated data systems in supporting strategic governance and sustainable decision-making. It is hoped that the output of this strategy will support the country's climate-focused agencies, organisations, decision-makers, and researchers in their tasks and endeavours.
Funding: Supported by the National Natural Science Foundation of China (No. 62073009).
Abstract: Accurately evaluating the lifespan of the printed circuit board (PCB) in airborne equipment is an essential issue for aircraft design and operation in the marine atmospheric environment. This paper presents a novel evaluation method that fuses accelerated degradation testing (ADT) data, degradation data, and life data from small samples based on an uncertain degradation process. An uncertain life model of the PCB in airborne equipment is constructed by employing an uncertain distribution that considers the acceleration factor of multiple environmental conditions such as temperature, humidity, and salinity. In addition, a degradation process model of the PCB is constructed by employing an uncertain process that fuses ADT data and field data, in which the performance characteristics of dynamic cumulative change are included. Based on minimizing the pth sample moments, an integrated method for parameter estimation of the PCB in airborne equipment is proposed by fusing multi-source life, degradation, and ADT data. An engineering case illustrates the effectiveness and advantage of the proposed method.
Funding: Supported by the National Natural Science Foundation of China (No. 32070656); the Nanjing University Deng Feng Scholars Program; the Priority Academic Program Development (PAPD) of Jiangsu Higher Education Institutions; a China Postdoctoral Science Foundation funded project (No. 2022M711563); and the Jiangsu Funding Program for Excellent Postdoctoral Talent (No. 2022ZB50).
Abstract: Plant morphogenesis relies on precise gene expression programs at the proper time and position, orchestrated by transcription factors (TFs) in intricate regulatory networks in a cell-type-specific manner. Here we introduce a comprehensive single-cell transcriptomic atlas of Arabidopsis seedlings. This atlas is the result of meticulous integration of 63 previously published scRNA-seq datasets, addressing batch effects while conserving biological variance. This integration spans a broad spectrum of tissues, including both below- and above-ground parts. Utilizing a rigorous approach to cell type annotation, we identified 47 distinct cell types or states, greatly expanding the current view of plant cell compositions. We systematically constructed cell-type-specific gene regulatory networks and uncovered key regulators that act in a coordinated manner to control cell-type-specific gene expression. Taken together, our study not only offers an extensive plant cell atlas that serves as a valuable resource, but also provides molecular insights into gene-regulatory programs that vary across cell types.
Funding: This work received funding within the Wheat BigData Project (German Federal Ministry of Food and Agriculture, FKZ 2818408B18).
Abstract: Genome-wide association mapping studies (GWAS) based on Big Data are a potential approach to improving marker-assisted selection in plant breeding. The number of available phenotypic and genomic data sets in which medium-sized populations of several hundred individuals have been studied is rapidly increasing. Combining these data and using them in GWAS could increase both the power of QTL discovery and the accuracy of estimating the underlying genetic effects, but this is hindered by data heterogeneity and a lack of interoperability. In this study, we used genomic and phenotypic data sets, focusing on Central European winter wheat populations evaluated for heading date. We explored strategies for integrating these data and, subsequently, the resulting potential for GWAS. Establishing interoperability between data sets was greatly aided by some overlapping genotypes and a linear relationship between the different phenotyping protocols, resulting in high-quality integrated phenotypic data. In this context, genomic prediction proved to be a suitable tool for studying the relevance of interactions between genotypes and experimental series, which was low in our case. Contrary to expectations, fewer marker-trait associations were found in the larger combined data than in the individual experimental series. However, the predictive power based on the marker-trait associations of the integrated data set was higher across data sets. The results therefore show that integrating medium-sized data sets into Big Data is an approach to increasing the power to detect QTL in GWAS. The results encourage further efforts to standardize and share data in the plant breeding community.
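The protocol-harmonization step mentioned above (a linear relationship between phenotyping protocols, anchored by overlapping genotypes) can be sketched as an ordinary least-squares calibration. The heading-date values below are invented for illustration; the actual series and protocols are not reproduced here.

```python
# Phenotypes of the SAME overlapping genotypes scored under two protocols.
overlap_a = [150.0, 155.0, 160.0, 165.0]   # protocol A (e.g., days to heading)
overlap_b = [50.2, 52.7, 55.2, 57.7]       # protocol B (e.g., a scoring scale)

# Simple least-squares fit of B = slope * A + intercept.
n = len(overlap_a)
mean_a = sum(overlap_a) / n
mean_b = sum(overlap_b) / n
slope = sum((a - mean_a) * (b - mean_b) for a, b in zip(overlap_a, overlap_b)) \
        / sum((a - mean_a) ** 2 for a in overlap_a)
intercept = mean_b - slope * mean_a

# Map a new protocol-A phenotype onto the protocol-B scale for integration.
print(round(slope * 158.0 + intercept, 2))
```

With a good linear fit on the overlap set, all series can be expressed on one scale before the combined GWAS; a poor fit would instead signal genotype-by-series interaction.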
Funding: Sponsored by the National Natural Science Foundation of China under Grant Nos. 62172353, 62302114, U20B2046, and 62172115; the Innovation Fund Program of the Engineering Research Center for Integration and Application of Digital Learning Technology of the Ministry of Education (Nos. 1331007 and 1311022); the Natural Science Foundation of the Jiangsu Higher Education Institutions (Grant No. 17KJB520044); and the Six Talent Peaks Project in Jiangsu Province (No. XYDXX-108).
Abstract: With the rapid development of information technology, IoT devices play a huge role in physiological health data detection. The exponential growth of medical data requires us to reasonably allocate storage space between cloud servers and edge nodes. The storage capacity of edge nodes close to users is limited, so hotspot data should be stored in edge nodes as much as possible to ensure response timeliness and a high access hit rate. However, the current scheme cannot guarantee that every sub-message in a complete piece of data stored by an edge node meets the requirements for hot data. How to detect and delete redundant data in edge nodes while protecting user privacy and dynamic data integrity has therefore become a challenging problem. Our paper proposes a redundant data detection method that meets privacy protection requirements. By scanning the ciphertext, it determines whether each sub-message of the data in the edge node meets the requirements for hot data. It has the same effect as a zero-knowledge proof and does not reveal user privacy. In addition, for redundant sub-data that do not meet the hot-data requirements, our paper proposes a redundant data deletion scheme that preserves dynamic data integrity. We use a Content Extraction Signature (CES) to generate a signature over the remaining hot data after the redundant data are deleted. The feasibility of the scheme is demonstrated through security analysis and efficiency analysis.
Abstract: Cloud computing has emerged as a viable alternative to traditional computing infrastructures, offering various benefits. However, the adoption of cloud storage poses significant risks to data secrecy and integrity. This article presents an effective mechanism to preserve the secrecy and integrity of data stored on the public cloud by leveraging blockchain technology, smart contracts, and cryptographic primitives. The proposed approach utilizes a Solidity-based smart contract as an auditor for maintaining and verifying the integrity of outsourced data. To preserve data secrecy, symmetric encryption is employed to encrypt user data before outsourcing it. An extensive performance analysis is conducted to illustrate the efficiency of the proposed mechanism. Additionally, a rigorous assessment is conducted to ensure that the developed smart contract is free from vulnerabilities and to measure its associated running costs. The security analysis of the proposed system confirms that our approach can securely maintain the confidentiality and integrity of cloud storage, even in the presence of malicious entities. The proposed mechanism enhances data security in cloud computing environments and can serve as a foundation for developing more secure cloud storage systems.
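The division of labor described above (symmetric encryption off-chain, integrity verification by an auditor that holds only a digest) can be sketched as follows. The toy XOR stream cipher stands in for a real symmetric scheme such as AES-GCM, and the auditor is reduced to a digest comparison; neither reflects the paper's actual contract code.

```python
import hashlib
import secrets

def encrypt_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR the data with a keyed SHA-256 keystream.
    Illustration only; a production system would use AES-GCM or similar."""
    keystream = bytearray()
    counter = 0
    while len(keystream) < len(data):
        keystream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(b ^ k for b, k in zip(data, keystream))

key = secrets.token_bytes(32)
plaintext = b"sensor log 2024-01-01"

# Client side: encrypt before outsourcing, register a digest with the auditor.
ciphertext = encrypt_xor(key, plaintext)
tag = hashlib.sha256(ciphertext).hexdigest()  # what the on-chain auditor stores

# Audit: re-hash what the storage server returns and compare against the tag.
assert hashlib.sha256(ciphertext).hexdigest() == tag
# Decryption is re-encryption for a stream cipher (XOR is its own inverse).
assert encrypt_xor(key, ciphertext) == plaintext
print("integrity verified")
```

The key property shown is that the auditor never needs the key or the plaintext, only the ciphertext digest, so secrecy and integrity checking remain separated.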
Funding: This research was supported by the Qinghai Provincial High-End Innovative and Entrepreneurial Talents Project.
Abstract: Currently, there is a growing trend among users to store their data in the cloud. However, the cloud is vulnerable to persistent data corruption risks arising from equipment failures and hacker attacks. Additionally, when users perform file operations, the semantic integrity of the data can be compromised. Ensuring both data integrity and semantic correctness has become a critical issue that requires attention. We introduce a pioneering solution called Sec-Auditor, the first of its kind with the ability to verify data integrity and semantic correctness simultaneously while maintaining a constant communication cost independent of the audited data volume. Sec-Auditor also supports public auditing, enabling anyone with access to public information to conduct data audits. This feature makes Sec-Auditor highly adaptable to open data environments such as the cloud. In Sec-Auditor, users are assigned specific rules that are utilized to verify the accuracy of data semantics. Furthermore, users are given the flexibility to update their own rules as needed. We conduct in-depth analyses of the correctness and security of Sec-Auditor. We also compare several important security attributes with those of existing schemes, demonstrating the superior properties of Sec-Auditor. Evaluation results demonstrate that even for time-consuming file upload operations, our solution is more efficient than the comparison scheme.
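The user-assigned semantic rules mentioned above can be illustrated with a minimal sketch. The rule representation, field names, and predicates are assumptions for illustration, since the abstract does not specify Sec-Auditor's actual rule language.

```python
# Hypothetical semantic rules: each maps a field to a predicate that a stored
# record must satisfy. Users could update this table as their needs change.
rules = {
    "temperature": lambda v: -50.0 <= v <= 60.0,      # plausible sensor range
    "timestamp":   lambda v: v.endswith("Z"),          # require UTC timestamps
}

def semantically_correct(record: dict) -> bool:
    """Check every rule whose field is present in the record."""
    return all(check(record[field]) for field, check in rules.items()
               if field in record)

good = {"temperature": 21.5,  "timestamp": "2024-03-01T00:00:00Z"}
bad  = {"temperature": 999.0, "timestamp": "2024-03-01T00:00:00Z"}
print(semantically_correct(good), semantically_correct(bad))  # True False
```

In the real system this check runs inside the audit protocol alongside the cryptographic integrity proof, rather than over plaintext records as shown here.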
Funding: This work was supported by the National Natural Science Foundation of China (U2133208, U20A20161).
Abstract: With the popularization of the Internet and the development of technology, cyber threats are increasing day by day. Threats such as malware, hacking, and data breaches have had a serious impact on cybersecurity. The network security environment in the era of big data is characterized by large data volumes, high diversity, and demanding real-time requirements. Traditional security defense methods and tools are unable to cope with complex and changing network security threats. This paper proposes a machine-learning security defense algorithm based on metadata association features, emphasizing control over unauthorized users through privacy, integrity, and availability. A user model is established, and a mapping between the user model and the metadata of the data sources is generated. By analyzing the user model and its corresponding mapping relationship, a query over the user model can be decomposed into queries over the various heterogeneous data sources, realizing the integration of heterogeneous data sources based on metadata association features. We define and classify customer information, automatically identify and perceive sensitive data, build a behavior audit and analysis platform, analyze user behavior trajectories, and complete the construction of a machine-learning customer information security defense system. The experimental results show that when the data volume is 5×10³ bits, the data storage integrity of the proposed method is 92%, the data accuracy is 98%, and the success rate of data intrusion is only 2.6%. It can be concluded that the data storage method in this paper is safe, the data accuracy is always at a high level, and the data disaster recovery performance is good. This method can effectively resist data intrusion and provides high air traffic control security. It can not only detect all viruses in user data storage but also realize integrated virus processing, further optimizing the security defense effect on user big data.
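The decomposition of a user-model query into per-source queries via a metadata mapping might look like the following sketch; the mapping entries, source names, and query strings are hypothetical, not the paper's actual mapping scheme.

```python
# Hypothetical mapping from user-model fields to (source, native query) pairs.
mapping = {
    "patient.heart_rate": ("iot_db",  "SELECT hr FROM vitals WHERE pid = ?"),
    "patient.name":       ("crm_api", "GET /customers/{pid}/name"),
}

def decompose(fields):
    """Group the native queries needed for the requested fields by source,
    producing one sub-query plan per heterogeneous data source."""
    plan = {}
    for f in fields:
        source, query = mapping[f]
        plan.setdefault(source, []).append(query)
    return plan

plan = decompose(["patient.heart_rate", "patient.name"])
print(sorted(plan))  # ['crm_api', 'iot_db']
```

The integration layer would then execute each sub-plan against its source and join the results back into the user-model view.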
Funding: Supported in part by the MOST Major Research and Development Project (Grant No. 2021YFB2900204); the National Natural Science Foundation of China (NSFC) (Grant Nos. 62201123, 62132004, and 61971102); the China Postdoctoral Science Foundation (Grant No. 2022TQ0056); the Sichuan Science and Technology Program (Grant No. 2022YFH0022); the Sichuan Major R&D Project (Grant No. 22QYCX0168); and the Municipal Government of Quzhou (Grant No. 2022D031).
Abstract: Integrated data and energy transfer (IDET) enables electromagnetic waves to transmit wireless energy at the same time as data delivery for low-power devices. In this paper, an energy harvesting modulation (EHM) assisted multi-user IDET system is studied, where all the signals received at the users are exploited for energy harvesting without degrading wireless data transfer (WDT) performance. The joint IDET performance is then analysed theoretically by conceiving a practical time-dependent wireless channel. With the aid of an AO-based algorithm, the average effective data rate among users is maximized while ensuring the BER and wireless energy transfer (WET) performance. Simulation results validate and evaluate the IDET performance of the EHM-assisted system, and demonstrate that the optimal number of user clusters and IDET time slots should be allocated in order to improve the WET and WDT performance.
Abstract: Building model data organization is often programmed to solve a specific problem, resulting in an inability to organize indoor and outdoor 3D scenes in an integrated manner. In this paper, existing building spatial data models are studied, and the characteristics of the building information modeling standard IFC (Industry Foundation Classes), City Geography Markup Language (CityGML), Indoor Geography Markup Language (IndoorGML), and other models are compared and analyzed. CityGML and IndoorGML face challenges in satisfying diverse application scenarios and requirements due to limitations in their expressive capabilities. We propose combining the semantic information of model objects to effectively partition and organize indoor and outdoor spatial 3D model data, and we construct an indoor and outdoor data organization mechanism of "chunk-layer-subobject-entrances-area-detail object." This method is verified by proposing a 3D data organization method for indoor and outdoor space and constructing a 3D visualization system based on it.
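The "chunk-layer-subobject-entrances-area-detail object" hierarchy could be modeled as nested types; the class names, fields, and sample scene below are illustrative assumptions rather than the paper's actual data structures.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class DetailObject:            # finest-grained indoor feature
    name: str

@dataclass
class Area:                    # an indoor region containing detail objects
    name: str
    details: List[DetailObject] = field(default_factory=list)

@dataclass
class Entrance:                # links outdoor space to indoor areas
    name: str
    linked_areas: List[str] = field(default_factory=list)

@dataclass
class SubObject:               # a single building with its indoor model
    name: str
    entrances: List[Entrance] = field(default_factory=list)
    areas: List[Area] = field(default_factory=list)

@dataclass
class Layer:                   # thematic layer inside a spatial chunk
    name: str
    subobjects: List[SubObject] = field(default_factory=list)

@dataclass
class Chunk:                   # top-level tile of the outdoor scene
    name: str
    layers: List[Layer] = field(default_factory=list)

lobby = Area("lobby", [DetailObject("front desk")])
tower = SubObject("office tower", [Entrance("main gate", ["lobby"])], [lobby])
scene = Chunk("block-07", [Layer("buildings", [tower])])
print(scene.layers[0].subobjects[0].areas[0].details[0].name)  # front desk
```

The entrance level is what stitches the two domains together: a viewer can traverse from an outdoor chunk, through a building's entrance, into its indoor areas without switching data models.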
Funding: This work was supported by the Strategic Priority Research Program of the Chinese Academy of Sciences [grant number XDA23100100]; the National Natural Science Foundation of China [grant numbers 41771430 and 41631177]; and the China Scholarship Council [grant number 201804910732].
Abstract: Effective integration and wide sharing of geospatial data are an important, basic premise for facilitating research and applications in geographic information science. However, the semantic heterogeneity of geospatial data is a major problem that significantly hinders geospatial data integration and sharing. Ontologies are regarded as a promising way to solve semantic problems by providing a formalized representation of geographic entities and the relationships between them in a manner understandable to machines. Thus, many efforts have been made to explore ontology-based geospatial data integration and sharing. However, there is a lack of a specialized ontology that provides a unified description of geospatial data. In this paper, focusing on the characteristics of geospatial data, we propose a unified framework for a geospatial data ontology, denoted GeoDataOnt, to establish a semantic foundation for geospatial data integration and sharing. First, we provide a hierarchy of geospatial data characteristics. Next, we analyze the semantic problems associated with each characteristic. Subsequently, we propose the general framework of GeoDataOnt, targeting these problems according to the characteristics of geospatial data. GeoDataOnt is then divided into multiple modules, and we present a detailed design and implementation for each module. Key limitations and challenges of GeoDataOnt are identified, and broad applications of GeoDataOnt are discussed.
Funding: Supported by the National Natural Science Foundation of China under Grant No. 60970010; the National Basic Research 973 Program of China under Grant No. 2009CB320705; and the Specialized Research Fund for the Doctoral Program of Higher Education of China under Grant No. 20090073110026.
Abstract: New challenges, including how to share information across heterogeneous devices, appear in data-intensive pervasive computing environments. Data integration is a practical approach in these applications, and dealing with inconsistencies is one of its important problems. In this paper we motivate the problem of resolving data inconsistency for data integration in pervasive environments. We define data quality criteria and expense quality criteria for data sources to resolve data inconsistency. In our solution, data sources from which data are expensive to obtain are first discarded using the expense quality criteria and a utility function. Since it is difficult to obtain the actual quality of data sources in a pervasive computing environment, we then introduce a fuzzy multi-attribute group decision-making approach to selecting the appropriate data sources. The experimental results show that our solution is effective.
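The two-step selection described above, discarding high-expense sources via a utility cutoff and then scoring the remainder on quality attributes, can be sketched as follows. The attributes, weights, and cutoff are invented, and the paper's fuzzy group aggregation is replaced here by a simple weighted sum.

```python
# Hypothetical per-source attributes, each normalized to [0, 1].
sources = {
    "S1": {"expense": 0.9, "accuracy": 0.8, "freshness": 0.7},
    "S2": {"expense": 0.2, "accuracy": 0.6, "freshness": 0.9},
    "S3": {"expense": 0.3, "accuracy": 0.9, "freshness": 0.5},
}

EXPENSE_CUTOFF = 0.8                            # step 1: utility-based filter
weights = {"accuracy": 0.6, "freshness": 0.4}   # step 2: quality weighting

# Step 1: discard sources that are too expensive to query.
candidates = {n: a for n, a in sources.items() if a["expense"] < EXPENSE_CUTOFF}

# Step 2: rank the survivors by a weighted quality score.
scores = {n: sum(w * a[attr] for attr, w in weights.items())
          for n, a in candidates.items()}
best = max(scores, key=scores.get)
print(best, round(scores[best], 2))  # S3 0.74
```

The fuzzy group decision-making step in the paper would replace the crisp weighted sum with fuzzy attribute values aggregated across multiple decision makers, but the filter-then-rank structure is the same.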
Funding: Supported by the Research Fund of the Key GIS Lab of the Education Ministry (No. 200610).
Abstract: In this paper we propose a service-oriented architecture for spatial data integration (SOA-SDI) in the context of a large number of available spatial data sources physically located at different places, and we develop web-based GIS systems based on SOA-SDI, allowing client applications to pull in, analyze, and present spatial data from those sources. The proposed architecture logically includes four layers or components: a layer of multiple data provider services, a data integration layer, a layer of backend services, and a front-end graphical user interface (GUI) for spatial data presentation. On the basis of the four-layered SOA-SDI framework, WebGIS applications can be quickly deployed, which shows that SOA-SDI has the potential to reduce software development effort and shorten the development period.
Funding: Thanks are due to the three anonymous reviewers for their constructive comments. This work was partially supported by the National Natural Science Foundation of China (Nos. 61572287 and 61533011), the Shandong Provincial Key Research and Development Program (2018GSF118043), the Natural Science Foundation of Shandong Province, China (ZR2015FQ001), the Fundamental Research Funds of Shandong University (Nos. 2015QY001 and 2016JC007), and the Scientific Research Foundation for the Returned Overseas Chinese Scholars, Ministry of Education of China.
Abstract: Background: More and more high-throughput datasets are available from multiple levels of measurement of gene regulation. The reverse engineering of gene regulatory networks from these data offers a valuable research paradigm for deciphering regulatory mechanisms. So far, numerous methods have been developed for reconstructing gene regulatory networks. Results: In this paper, we provide a review of bioinformatics methods for inferring gene regulatory networks from omics data. To achieve precise reconstruction of gene regulatory networks, an intuitive alternative is to integrate the available resources in a rational framework. We also provide computational perspectives on the endeavor of inferring gene regulatory networks from heterogeneous data, and highlight the importance of integrating multi-omics data with prior knowledge in gene regulatory network inference. Conclusions: We provide computational perspectives on inferring gene regulatory networks from multiple omics data and present theoretical analyses of existing challenges and possible solutions. We emphasize prior knowledge and data integration in network inference owing to their ability to identify regulatory causality.
Funding: Under the auspices of the National Natural Science Foundation of China (No. 41271416) and the Strategic Priority Research Program of the Chinese Academy of Sciences (No. XDA05090310).
Abstract: Land cover is recognized as one of the fundamental terrestrial datasets required in land system change and other ecosystem-related research across the globe. The regional differentiation and spatiotemporal variation of land cover have a significant impact on the regional natural environment and sustainable socio-economic development. In this context, we reconstructed historical land cover data for Siberia to provide datasets comparable to the land cover datasets for China and elsewhere. In this paper, the European Space Agency (ESA) Global Land Cover Map (GlobCover), Landsat Thematic Mapper (TM), Enhanced Thematic Mapper (ETM), and Multispectral Scanner (MSS) images, Google Earth images, and other additional data were used to produce land cover datasets for 1975 and 2010 in Siberia. Data evaluation shows that the overall user's accuracy of the 2010 land cover data was 86.96%, higher than that of the ESA GlobCover data for Siberia. The analysis of land cover changes found no major changes in Siberia from 1975 to 2010, with only a few conversions between natural forest types. The main changes are conversions from deciduous needleleaf forest to deciduous broadleaf forest, from deciduous needleleaf forest to mixed forest, and from savannas to deciduous needleleaf forest, indicating that the dominant driving factor of land cover change in Siberia was, to some extent, natural elements rather than human activities, which is very different from China. However, our purpose was not just to produce land cover datasets for two time periods or to explore the driving factors of land cover changes in Siberia; we also paid attention to the significance and application of the datasets in various fields such as global climate change, geopolitics, and cross-border cooperation.