With the rapid development of the Web, more and more Web databases are available for users to access. At the same time, job seekers often have difficulty first finding the right sources and then querying them, so an integrated job search system over Web databases has become a Web application in high demand. With this in mind, we build a deep Web data integration system that gives users unified access to multiple job Web sites, acting as a job meta-search engine. In this paper, the architecture of the system is given first, and then the key components of the system are introduced.
In e-commerce, multidimensional data analysis for OLAP (on-line analytical processing) based on Web data requires integrating various data sources, such as XML (extensible markup language) data and relational data, at the conceptual level. A conceptual data description approach for the multidimensional data model was presented in order to conduct multidimensional OLAP analysis for multiple subjects. The UML (unified modeling language) galaxy diagram, which describes the multidimensional structure of the integrated data at the conceptual level, was constructed. The approach was illustrated using a case of a 2_roots UML galaxy diagram that considers one retailer and several suppliers of PC products.
In e-commerce, multidimensional data analysis based on Web data requires integrating various data sources, such as XML data and relational data, at the conceptual level. A conceptual data description approach to the multidimensional data model, the UML galaxy diagram, is presented in order to conduct multidimensional data analysis for multiple subjects. The approach is illustrated using a case of a 2_roots UML galaxy diagram that considers the marketing analysis of TV products involving one retailer and several suppliers.
Guyana’s capacity to address the impacts of climate change on its coastal environment requires the ability to monitor, quantify and understand coastal change over the short, medium and long term. Understanding the drivers of change in the coastal and marine environment can be achieved through the accurate measurement and critical analysis of morphologies, flows, processes and responses. This manuscript presents a strategy developed to create a central resource, database and web-based platform to integrate data and information on the drivers and the changes within Guyana’s coastal and marine environment. The strategy involves four complementary work packages: data collection, development of a platform for data integration, application of the data for coastal change analyses, and consultation with stakeholders. The last aims to assess the role of the integrated data systems in supporting strategic governance and sustainable decision-making. It is hoped that the output of this strategy will support the country’s climate-focused agencies, organisations, decision-makers, and researchers in their tasks and endeavours.
Plant morphogenesis relies on precise gene expression programs at the proper time and position, which are orchestrated by transcription factors (TFs) in intricate regulatory networks in a cell-type-specific manner. Here we introduce a comprehensive single-cell transcriptomic atlas of Arabidopsis seedlings. This atlas is the result of meticulous integration of 63 previously published scRNA-seq datasets, addressing batch effects and conserving biological variance. This integration spans a broad spectrum of tissues, including both below- and above-ground parts. Utilizing a rigorous approach for cell type annotation, we identified 47 distinct cell types or states, largely expanding our current view of plant cell compositions. We systematically constructed cell-type-specific gene regulatory networks and uncovered key regulators that act in a coordinated manner to control cell-type-specific gene expression. Taken together, our study not only offers an extensive plant cell atlas that serves as a valuable resource, but also provides molecular insights into gene-regulatory programs that vary among cell types.
Genome-wide association mapping studies (GWAS) based on Big Data are a potential approach to improve marker-assisted selection in plant breeding. The number of available phenotypic and genomic data sets in which medium-sized populations of several hundred individuals have been studied is rapidly increasing. Combining these data and using them in GWAS could increase both the power of QTL discovery and the accuracy of estimation of underlying genetic effects, but is hindered by data heterogeneity and lack of interoperability. In this study, we used genomic and phenotypic data sets, focusing on Central European winter wheat populations evaluated for heading date. We explored strategies for integrating these data and, subsequently, the resulting potential for GWAS. Establishing interoperability between data sets was greatly aided by some overlapping genotypes and a linear relationship between the different phenotyping protocols, resulting in high-quality integrated phenotypic data. In this context, genomic prediction proved to be a suitable tool to study the relevance of interactions between genotypes and experimental series, which was low in our case. Contrary to expectations, fewer associations between markers and traits were found in the larger combined data than in the individual experimental series. However, the predictive power based on the marker-trait associations of the integrated data set was higher across data sets. Therefore, the results show that the integration of medium-sized data sets into Big Data is an approach to increase the power to detect QTL in GWAS. The results encourage further efforts to standardize and share data in the plant breeding community.
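The role of overlapping genotypes and a linear relationship between phenotyping protocols can be illustrated with a small sketch. It is not the study's pipeline: the function names `fit_linear` and `harmonize`, the dictionary layout (genotype name to phenotype value) and the ordinary least-squares fit are all assumptions made for illustration.

```python
def fit_linear(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def harmonize(series_a, series_b):
    """Map series_b onto the scale of series_a using shared genotypes.

    Both arguments are hypothetical dicts: genotype name -> phenotype value.
    Genotypes already present in series_a keep their original values.
    """
    shared = sorted(set(series_a) & set(series_b))
    if len(shared) < 2:
        raise ValueError("need at least two overlapping genotypes")
    slope, intercept = fit_linear(
        [series_b[g] for g in shared],
        [series_a[g] for g in shared],
    )
    merged = dict(series_a)
    for genotype, value in series_b.items():
        merged.setdefault(genotype, slope * value + intercept)
    return merged
```

With only a handful of shared genotypes the fitted line is sensitive to outliers, so a real integration would likely inspect the fit before trusting the converted values.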
With the rapid development of information technology, IoT devices play a huge role in physiological health data detection. The exponential growth of medical data requires us to allocate storage space reasonably between cloud servers and edge nodes. The storage capacity of edge nodes close to users is limited, so hotspot data should be stored in edge nodes as much as possible to ensure response timeliness and access hit rate. However, the current scheme cannot guarantee that every sub-message of a complete data item stored by the edge node meets the requirements of hot data. How to detect and delete redundant data in edge nodes while protecting user privacy and the dynamic integrity of the data has become a challenging problem. Our paper proposes a redundant data detection method that meets the privacy protection requirements. By scanning the ciphertext, it determines whether each sub-message of the data in the edge node meets the requirements of hot data. It has the same effect as a zero-knowledge proof and does not reveal the privacy of users. In addition, for redundant sub-data that do not meet the requirements of hot data, our paper proposes a redundant data deletion scheme that preserves the dynamic integrity of the data. We use Content Extraction Signature (CES) to generate the signature of the remaining hot data after the redundant data is deleted. The feasibility of the scheme is proved through security and efficiency analyses.
Cloud computing has emerged as a viable alternative to traditional computing infrastructures, offering various benefits. However, the adoption of cloud storage poses significant risks to data secrecy and integrity. This article presents an effective mechanism to preserve the secrecy and integrity of data stored on the public cloud by leveraging blockchain technology, smart contracts, and cryptographic primitives. The proposed approach utilizes a Solidity-based smart contract as an auditor for maintaining and verifying the integrity of outsourced data. To preserve data secrecy, symmetric encryption systems are employed to encrypt user data before outsourcing it. An extensive performance analysis is conducted to illustrate the efficiency of the proposed mechanism. Additionally, a rigorous assessment is conducted to ensure that the developed smart contract is free from vulnerabilities and to measure its associated running costs. The security analysis of the proposed system confirms that our approach can securely maintain the confidentiality and integrity of cloud storage, even in the presence of malicious entities. The proposed mechanism contributes to enhancing data security in cloud computing environments and can be used as a foundation for developing more secure cloud storage systems.
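The auditing idea described above, recording a fingerprint of the encrypted data and later checking the stored copy against it, can be sketched as follows. This is an illustration under stated assumptions, not the article's contract: `IntegrityLedger` is a hypothetical in-memory stand-in for the on-chain auditor, SHA-256 is an assumed choice of digest, and the ciphertext is taken as opaque bytes that were already encrypted by some symmetric scheme.

```python
import hashlib
import hmac


class IntegrityLedger:
    """Hypothetical in-memory stand-in for an on-chain integrity auditor."""

    def __init__(self):
        self._digests = {}  # file id -> hex digest of the outsourced ciphertext

    def register(self, file_id: str, ciphertext: bytes) -> str:
        """Record the digest of the ciphertext at upload time."""
        digest = hashlib.sha256(ciphertext).hexdigest()
        self._digests[file_id] = digest
        return digest

    def verify(self, file_id: str, ciphertext: bytes) -> bool:
        """Return True only if the ciphertext matches the recorded digest."""
        expected = self._digests.get(file_id)
        if expected is None:
            return False  # nothing was registered under this id
        actual = hashlib.sha256(ciphertext).hexdigest()
        return hmac.compare_digest(expected, actual)
```

Hashing the ciphertext rather than the plaintext means the auditor never needs the encryption key, which keeps the secrecy and integrity concerns separate.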
Currently, there is a growing trend among users to store their data in the cloud. However, the cloud is vulnerable to persistent data corruption risks arising from equipment failures and hacker attacks. Additionally, when users perform file operations, the semantic integrity of the data can be compromised. Ensuring both data integrity and semantic correctness has become a critical issue that requires attention. We introduce a pioneering solution called Sec-Auditor, the first of its kind with the ability to verify data integrity and semantic correctness simultaneously, while maintaining a constant communication cost independent of the audited data volume. Sec-Auditor also supports public auditing, enabling anyone with access to public information to conduct data audits. This feature makes Sec-Auditor highly adaptable to open data environments, such as the cloud. In Sec-Auditor, users are assigned specific rules that are used to verify the correctness of data semantics. Furthermore, users are given the flexibility to update their own rules as needed. We conduct in-depth analyses of the correctness and security of Sec-Auditor. We also compare several important security attributes with existing schemes, demonstrating the superior properties of Sec-Auditor. Evaluation results demonstrate that even for time-consuming file upload operations, our solution is more efficient than the scheme it is compared with.
With the popularization of the Internet and the development of technology, cyber threats are increasing day by day. Threats such as malware, hacking, and data breaches have had a serious impact on cybersecurity. The network security environment in the era of big data is characterized by large amounts of data, high diversity, and high real-time requirements. Traditional security defense methods and tools have been unable to cope with complex and changing network security threats. This paper proposes a machine-learning security defense algorithm based on metadata association features. It emphasizes control over unauthorized users through privacy, integrity, and availability. The user model is established and the mapping between the user model and the metadata of the data source is generated. By analyzing the user model and its corresponding mapping relationship, a query on the user model can be decomposed into queries on various heterogeneous data sources, and the integration of heterogeneous data sources based on the metadata association characteristics can be realized. Customer information is defined and classified, sensitive data are automatically identified and perceived, a behavior audit and analysis platform is built, user behavior trajectories are analyzed, and the construction of a machine learning customer information security defense system is completed. The experimental results show that when the data volume is 5×10³ bit, the data storage integrity of the proposed method is 92%, the data accuracy is 98%, and the success rate of data intrusion is only 2.6%. It can be concluded that the data storage method in this paper is safe, the data accuracy is always at a high level, and the data disaster recovery performance is good. This method can effectively resist data intrusion and has high air traffic control security. It can not only detect all viruses in user data storage, but also realize integrated virus processing and further optimize the security defense effect for user big data.
Integrated data and energy transfer (IDET) enables electromagnetic waves to transmit wireless energy at the same time as data delivery for low-power devices. In this paper, an energy harvesting modulation (EHM) assisted multi-user IDET system is studied, where all the received signals at the users are exploited for energy harvesting without degrading the wireless data transfer (WDT) performance. The joint IDET performance is then analysed theoretically by conceiving a practical time-dependent wireless channel. With the aid of the AO-based algorithm, the average effective data rate among users is maximized while ensuring the BER and the wireless energy transfer (WET) performance. Simulation results validate and evaluate the IDET performance of the EHM assisted system, and also demonstrate that the optimal number of user clusters and IDET time slots should be allocated in order to improve the WET and WDT performance.
Building model data organization is often programmed to solve a specific problem, resulting in the inability to organize indoor and outdoor 3D scenes in an integrated manner. In this paper, existing building spatial data models are studied, and the characteristics of building information modeling standards (IFC), city geographic modeling language (CityGML), indoor modeling language (IndoorGML), and other models are compared and analyzed. CityGML and IndoorGML models face challenges in satisfying diverse application scenarios and requirements due to limitations in their expression capabilities. It is proposed to combine the semantic information of the model objects to effectively partition and organize the indoor and outdoor spatial 3D model data and to construct the indoor and outdoor data organization mechanism of “chunk-layer-subobject-entrances-area-detail object.” This method is verified by proposing a 3D data organization method for indoor and outdoor space and constructing a 3D visualization system based on it.
Effective integration and wide sharing of geospatial data is an important and basic premise to facilitate the research and applications of geographic information science. However, the semantic heterogeneity of geospatial data is a major problem that significantly hinders geospatial data integration and sharing. Ontologies are regarded as a promising way to solve semantic problems by providing a formalized representation of geographic entities and relationships between them in a manner understandable to machines. Thus, many efforts have been made to explore ontology-based geospatial data integration and sharing. However, there is a lack of a specialized ontology that would provide a unified description for geospatial data. In this paper, with a focus on the characteristics of geospatial data, we propose a unified framework for geospatial data ontology, denoted GeoDataOnt, to establish a semantic foundation for geospatial data integration and sharing. First, we provide a characteristics hierarchy of geospatial data. Next, we analyze the semantic problems for each characteristic of geospatial data. Subsequently, we propose the general framework of GeoDataOnt, targeting these problems according to the characteristics of geospatial data. GeoDataOnt is then divided into multiple modules, and we show a detailed design and implementation for each module. Key limitations and challenges of GeoDataOnt are identified, and broad applications of GeoDataOnt are discussed.
New challenges, including how to share information across heterogeneous devices, appear in data-intensive pervasive computing environments. Data integration is a practical approach for these applications, and dealing with inconsistencies is one of its important problems. In this paper we motivate the problem of resolving data inconsistency for data integration in pervasive environments. We define data quality criteria and expense quality criteria for data sources to resolve data inconsistency. In our solution, data sources from which data are expensive to obtain are first discarded by using the expense quality criteria and a utility function. Since it is difficult to obtain the actual quality of data sources in a pervasive computing environment, we then introduce a fuzzy multi-attribute group decision making approach to select the appropriate data sources. The experimental results show that our solution is effective.
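The two-stage selection described above can be sketched roughly as follows. This is a simplified illustration, not the paper's method: a plain expense threshold stands in for the expense criteria and utility function, and averaging per-expert ratings stands in for the fuzzy multi-attribute group decision making step. The names `Source`, `select_sources`, `max_expense` and `top_k` are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Source:
    name: str
    expense: float        # cost of obtaining data, in arbitrary units
    expert_scores: list   # per-expert quality ratings in [0, 1]


def select_sources(sources, max_expense, top_k):
    """Drop sources that are too expensive, then rank the rest by mean rating."""
    # Stage 1: discard sources whose expense exceeds the budget.
    affordable = [s for s in sources if s.expense <= max_expense]
    # Stage 2: aggregate the experts' opinions (simple mean as a stand-in).
    ranked = sorted(affordable, key=lambda s: mean(s.expert_scores), reverse=True)
    return [s.name for s in ranked[:top_k]]
```

Filtering on expense first keeps the more costly group-decision step confined to sources that could actually be used.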
Background: More and more high-throughput datasets are available from multiple levels of measuring gene regulation. The reverse engineering of gene regulatory networks from these data offers a valuable research paradigm to decipher regulatory mechanisms. So far, numerous methods have been developed for reconstructing gene regulatory networks. Results: In this paper, we provide a review of bioinformatics methods for inferring gene regulatory networks from omics data. To achieve the precise reconstruction of gene regulatory networks, an intuitive alternative is to integrate these available resources in a rational framework. We also provide computational perspectives on the endeavor of inferring gene regulatory networks from heterogeneous data. We highlight the importance of multi-omics data integration with prior knowledge in gene regulatory network inference. Conclusions: We provide computational perspectives on inferring gene regulatory networks from multiple omics data and present theoretical analyses of existing challenges and possible solutions. We emphasize prior knowledge and data integration in network inference owing to their ability to identify regulatory causality.
To construct mediators for data integration systems that integrate structured and semi-structured data, and to facilitate the reformulation and decomposition of queries, the presented system uses the XML processing language (XPL) for the mediator. With XPL, it is easy to construct mediators for XML-based data integration, and the work done in the mediator can be accelerated.
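What query decomposition in a mediator involves can be sketched in a few lines. This sketch is written in Python rather than XPL and does not reflect the presented system: the mapping table, the source names `xml_catalog` and `sql_store`, and the function `decompose` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical mapping: global-schema attribute -> (source, local attribute).
MAPPING = {
    "title": ("xml_catalog", "book/title"),
    "price": ("sql_store", "products.price"),
    "stock": ("sql_store", "products.qty"),
}


def decompose(attributes, mapping=MAPPING):
    """Group the requested global attributes into one subquery per source."""
    subqueries = defaultdict(list)
    for attr in attributes:
        if attr not in mapping:
            raise KeyError(f"no source provides attribute {attr!r}")
        source, local = mapping[attr]
        subqueries[source].append(local)
    return dict(subqueries)
```

Each resulting entry would then be reformulated into the query language of its source (for example a path expression for the XML source and SQL for the relational one) before the mediator merges the answers.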
Inflammatory bowel disease (IBD) is a complex disease with variability in genetic, environmental, and lifestyle factors affecting disease presentation and course. Precision medicine has the potential to play a crucial role in managing IBD by tailoring treatment plans based on the heterogeneity of clinical and temporal variability of patients. Precision medicine is a population-based approach to managing IBD by integrating environmental, genomic, epigenomic, transcriptomic, proteomic, and metabolomic factors. It is a recent and rapidly developing field of medicine. The widespread adoption of precision medicine worldwide has the potential to result in the early detection of diseases, optimal utilization of healthcare resources, enhanced patient outcomes, and, ultimately, improved quality of life for individuals with IBD. Though precision medicine is promising in terms of better quality of patient care, inadequacies exist in the ongoing research. There is discordance in study conduct and in data collection, utilization, interpretation, and analysis. This review aims to describe the current literature on precision medicine, its multiomics approach, and future directions for its application in IBD.
An 8×10 GHz receiver optical sub-assembly (ROSA) consisting of an 8-channel arrayed waveguide grating (AWG) and an 8-channel PIN photodetector (PD) array is designed and fabricated based on silica hybrid integration technology. Multimode output waveguides in the silica AWG with 2% refractive index difference are used to obtain flat-top spectra. The output waveguide facet is polished to a 45° bevel to redirect the light into the mesa-type PIN PD, which simplifies the packaging process. The experimental results show that the single-channel 1 dB bandwidth of the AWG ranges from 2.12 nm to 3.06 nm, the ROSA responsivity ranges from 0.097 A/W to 0.158 A/W, and the 3 dB bandwidth is up to 11 GHz. The ROSA is promising for eight-lane WDM transmission systems in data center interconnection.
Blast furnace (BF) ironmaking is the most typical “black box” process, and its complexity and uncertainty bring forth great challenges for furnace condition judgment and BF operation. Rich data resources for BF ironmaking are available, and the rapid development of data science and intelligent technology will provide an effective means to solve the uncertainty problem in the BF ironmaking process. This work focused on the application of artificial intelligence technology in BF ironmaking. The current intelligent BF ironmaking technology was summarized and analyzed from five aspects: BF data management, the analyses of time delay and correlation, the prediction of BF key variables, the evaluation of BF status, and the multi-objective intelligent optimization of BF operations. Solutions and suggestions were offered for the problems in the current progress, and an outlook on future prospects and technological breakthroughs was added. To effectively improve the BF data quality, we comprehensively considered the data problems and the characteristics of algorithms and selected the data processing method scientifically. For analyzing important BF characteristics, the effect of the delay was eliminated to ensure an accurate logical relationship between the BF parameters and economic indicators. As for BF parameter prediction and BF status evaluation, a BF intelligence model that integrates data information and process mechanism was built to effectively achieve the accurate prediction of BF key indexes and the scientific evaluation of BF status. During the optimization of BF parameters, low risk, low cost, and high return were used as the optimization criteria, and while pursuing the optimization effect, the feasibility and site operation cost were considered comprehensively. This work will help increase the process operator’s overall awareness and understanding of intelligent BF technology. Additionally, combining big data technology with the process will improve the practicality of data models in actual production and promote the application of intelligent technology in BF ironmaking.
Integrated data and energy transfer (IDET) is capable of simultaneously delivering on-demand data and energy to low-power Internet of Everything (IoE) devices. We propose a multi-carrier IDET transceiver relying on superposition waveforms consisting of multi-sinusoidal signals for wireless energy transfer (WET) and orthogonal frequency-division multiplexing (OFDM) signals for wireless data transfer (WDT). The outdated channel state information (CSI) in aging channels is employed by the transmitter to shape IDET waveforms. Under the constraints of transmission power and the WDT requirement, the amplitudes and phases of the IDET waveform at the transmitter and the power splitter at the receiver are jointly optimised to maximise the average direct current (DC) over a limited number of transmission frames in the presence of carrier frequency offset (CFO). For the amplitude optimisation, the original non-convex problem can be transformed into a reversed geometric programming problem, which can then be solved effectively with existing tools. As for the phase optimisation, the artificial bee colony (ABC) algorithm is invoked in order to deal with the non-convexity. Iteration between the amplitude optimisation and phase optimisation yields our joint design. Numerical results demonstrate the advantage of our joint design for IDET waveform shaping in the presence of the CFO and the outdated CSI.
Funding: Supported by the Natural Science Foundation of China (60573091, 60273018), the National Basic Research and Development Program of China (2003CB317000), and the Key Project of the Ministry of Education of China (03044).
Funding: This project was supported by the China Postdoctoral Science Foundation (2005037506) and the National Natural Science Foundation of China (70472029).
Funding: We appreciate the United Nations Development Programme-Indonesia and the Archipelagic & Island States (AIS) Forum for the 2021 Archipelagic & Island States Innovation Challenges Award given for this idea on the Joint Research Programme in Climate Change Mitigation and Adaptation.
Funding: Supported by the National Natural Science Foundation of China (No. 32070656), the Nanjing University Deng Feng Scholars Program, the Priority Academic Program Development (PAPD) of Jiangsu Higher Education Institutions, the China Postdoctoral Science Foundation funded project (No. 2022M711563), and the Jiangsu Funding Program for Excellent Postdoctoral Talent (No. 2022ZB50).
Funding: Funded within the Wheat BigData Project (German Federal Ministry of Food and Agriculture, FKZ2818408B18).
Abstract: Genome-wide association mapping studies (GWAS) based on Big Data are a potential approach to improve marker-assisted selection in plant breeding. The number of available phenotypic and genomic data sets in which medium-sized populations of several hundred individuals have been studied is rapidly increasing. Combining these data and using them in GWAS could increase both the power of QTL discovery and the accuracy of estimation of underlying genetic effects, but is hindered by data heterogeneity and lack of interoperability. In this study, we used genomic and phenotypic data sets, focusing on Central European winter wheat populations evaluated for heading date. We explored strategies for integrating these data and subsequently the resulting potential for GWAS. Establishing interoperability between data sets was greatly aided by some overlapping genotypes and a linear relationship between the different phenotyping protocols, resulting in high-quality integrated phenotypic data. In this context, genomic prediction proved to be a suitable tool to study the relevance of interactions between genotypes and experimental series, which was low in our case. Contrary to expectations, fewer associations between markers and traits were found in the larger combined data than in the individual experimental series. However, the predictive power based on the marker-trait associations of the integrated data set was higher across data sets. Therefore, the results show that the integration of medium-sized data sets into Big Data is an approach to increase the power to detect QTL in GWAS. The results encourage further efforts to standardize and share data in the plant breeding community.
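The harmonization step mentioned in this abstract (overlapping genotypes plus a linear relationship between phenotyping protocols) can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the genotype labels, and the choice of ordinary least squares are assumptions made only for illustration.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept (illustrative only)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values are all identical; cannot fit a line")
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def harmonize(series_a, series_b):
    """Map series_b onto the scale of series_a using the genotypes they share.

    Both arguments are hypothetical dicts {genotype_id: heading_date_score}.
    Genotypes present in series_a keep their original value.
    """
    shared = sorted(set(series_a) & set(series_b))
    if len(shared) < 2:
        raise ValueError("need at least two overlapping genotypes")
    slope, intercept = fit_line(
        [series_b[g] for g in shared], [series_a[g] for g in shared]
    )
    merged = dict(series_a)
    for genotype, value in series_b.items():
        if genotype not in merged:
            merged[genotype] = slope * value + intercept
    return merged


# Hypothetical data: series A in days of the year, series B on a different scale.
series_a = {"G1": 150.0, "G2": 154.0, "G3": 158.0}
series_b = {"G1": 10.0, "G2": 12.0, "G3": 14.0, "G4": 16.0}
merged = harmonize(series_a, series_b)
```

With more than a handful of overlapping genotypes, the residuals of the fit would give a rough indication of whether the linear assumption holds for a given pair of series.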
Funding: Sponsored by the National Natural Science Foundation of China under Grant Nos. 62172353, 62302114, U20B2046 and 62172115; the Innovation Fund Program of the Engineering Research Center for Integration and Application of Digital Learning Technology of the Ministry of Education (Nos. 1331007 and 1311022); the Natural Science Foundation of the Jiangsu Higher Education Institutions (Grant No. 17KJB520044); and the Six Talent Peaks Project in Jiangsu Province (No. XYDXX-108).
Abstract: With the rapid development of information technology, IoT devices play a huge role in physiological health data detection. The exponential growth of medical data requires us to reasonably allocate storage space between cloud servers and edge nodes. The storage capacity of edge nodes close to users is limited, so hotspot data should be stored in edge nodes as much as possible to ensure response timeliness and access hit rate. However, the current scheme cannot guarantee that every sub-message of a complete data item stored by the edge node meets the requirements of hot data. How to detect and delete redundant data in edge nodes while protecting user privacy and dynamic data integrity has become a challenging problem. Our paper proposes a redundant data detection method that meets the privacy protection requirements. By scanning the ciphertext, it determines whether each sub-message of the data in the edge node meets the requirements of hot data. It has the same effect as a zero-knowledge proof and does not reveal users' private information. In addition, for redundant sub-data that does not meet the requirements of hot data, our paper proposes a redundant data deletion scheme that preserves the dynamic integrity of the data. We use a Content Extraction Signature (CES) to generate the signature of the remaining hot data after the redundant data is deleted. The feasibility of the scheme is demonstrated through security analysis and efficiency analysis.
Abstract: Cloud computing has emerged as a viable alternative to traditional computing infrastructures, offering various benefits. However, the adoption of cloud storage poses significant risks to data secrecy and integrity. This article presents an effective mechanism to preserve the secrecy and integrity of data stored on the public cloud by leveraging blockchain technology, smart contracts, and cryptographic primitives. The proposed approach utilizes a Solidity-based smart contract as an auditor for maintaining and verifying the integrity of outsourced data. To preserve data secrecy, symmetric encryption systems are employed to encrypt user data before outsourcing it. An extensive performance analysis is conducted to illustrate the efficiency of the proposed mechanism. Additionally, a rigorous assessment is conducted to ensure that the developed smart contract is free from vulnerabilities and to measure its associated running costs. The security analysis of the proposed system confirms that our approach can securely maintain the confidentiality and integrity of cloud storage, even in the presence of malicious entities. The proposed mechanism contributes to enhancing data security in cloud computing environments and can be used as a foundation for developing more secure cloud storage systems.
Funding: This research was supported by the Qinghai Provincial High-End Innovative and Entrepreneurial Talents Project.
Abstract: Currently, there is a growing trend among users to store their data in the cloud. However, the cloud is vulnerable to persistent data corruption risks arising from equipment failures and hacker attacks. Additionally, when users perform file operations, the semantic integrity of the data can be compromised. Ensuring both data integrity and semantic correctness has become a critical issue that requires attention. We introduce a pioneering solution called Sec-Auditor, the first of its kind with the ability to verify data integrity and semantic correctness simultaneously, while maintaining a constant communication cost independent of the audited data volume. Sec-Auditor also supports public auditing, enabling anyone with access to public information to conduct data audits. This feature makes Sec-Auditor highly adaptable to open data environments, such as the cloud. In Sec-Auditor, users are assigned specific rules that are utilized to verify the accuracy of data semantics. Furthermore, users are given the flexibility to update their own rules as needed. We conduct in-depth analyses of the correctness and security of Sec-Auditor. We also compare several important security attributes with existing schemes, demonstrating the superior properties of Sec-Auditor. Evaluation results demonstrate that even for time-consuming file upload operations, our solution is more efficient than the scheme it is compared with.
Funding: This work was supported by the National Natural Science Foundation of China (U2133208, U20A20161).
Abstract: With the popularization of the Internet and the development of technology, cyber threats are increasing day by day. Threats such as malware, hacking, and data breaches have had a serious impact on cybersecurity. The network security environment in the era of big data is characterized by large amounts of data, high diversity, and high real-time requirements. Traditional security defense methods and tools have been unable to cope with complex and changing network security threats. This paper proposes a machine-learning security defense algorithm based on metadata association features. It emphasizes control over unauthorized users through privacy, integrity, and availability. A user model is established, and the mapping between the user model and the metadata of the data sources is generated. By analyzing the user model and its corresponding mapping relationship, a query on the user model can be decomposed into queries on the various heterogeneous data sources, and the integration of heterogeneous data sources based on metadata association characteristics can be realized. The method defines and classifies customer information, automatically identifies and perceives sensitive data, builds a behavior audit and analysis platform, and analyzes user behavior trajectories, completing the construction of a machine learning customer information security defense system. The experimental results show that when the data volume is 5×10^3 bit, the data storage integrity of the proposed method is 92%, the data accuracy is 98%, and the success rate of data intrusion is only 2.6%. It can be concluded that the data storage method in this paper is safe, the data accuracy is always at a high level, and the data disaster recovery performance is good. This method can effectively resist data intrusion and offers high air traffic control security. It can not only detect all viruses in user data storage, but also realize integrated virus processing and further optimize the security defense effect for user big data.
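The query decomposition described in this abstract (a query on the user model is split into queries on heterogeneous sources via a model-to-metadata mapping) can be pictured with a small sketch. The mapping table, source names, and field names below are entirely hypothetical; the abstract does not specify the authors' data structures.

```python
from collections import defaultdict

# Hypothetical mapping: user-model field -> (data source, native field name).
MAPPING = {
    "customer_name": ("crm_db", "cust.full_name"),
    "phone": ("crm_db", "cust.phone_no"),
    "last_login": ("audit_log", "events.login_ts"),
}


def decompose(query_fields, mapping=MAPPING):
    """Split a list of user-model fields into one field list per data source."""
    per_source = defaultdict(list)
    for field in query_fields:
        if field not in mapping:
            raise KeyError(f"no metadata mapping for user-model field {field!r}")
        source, native_field = mapping[field]
        per_source[source].append(native_field)
    return dict(per_source)


plan = decompose(["customer_name", "last_login", "phone"])
```

Each entry of `plan` could then be turned into a source-specific query; how the per-source results are joined back together is outside the scope of this sketch.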
Funding: Supported in part by the MOST Major Research and Development Project (Grant No. 2021YFB2900204), the National Natural Science Foundation of China (NSFC) (Grant Nos. 62201123, 62132004, 61971102), and the China Postdoctoral Science Foundation (Grant No. 2022TQ0056), and in part by the Sichuan Science and Technology Program (Grant No. 2022YFH0022), the Sichuan Major R&D Project (Grant No. 22QYCX0168), and the Municipal Government of Quzhou (Grant No. 2022D031).
Abstract: Integrated data and energy transfer (IDET) enables electromagnetic waves to transmit wireless energy at the same time as data delivery for low-power devices. In this paper, an energy harvesting modulation (EHM) assisted multi-user IDET system is studied, where all the received signals at the users are exploited for energy harvesting without degrading wireless data transfer (WDT) performance. The joint IDET performance is then analysed theoretically by conceiving a practical time-dependent wireless channel. With the aid of the AO-based algorithm, the average effective data rate among users is maximized while ensuring the BER and the wireless energy transfer (WET) performance. Simulation results validate and evaluate the IDET performance of the EHM-assisted system and also demonstrate that the optimal number of user clusters and IDET time slots should be allocated in order to improve the WET and WDT performance.
Abstract: Building model data organization is often programmed to solve a specific problem, resulting in the inability to organize indoor and outdoor 3D scenes in an integrated manner. In this paper, existing building spatial data models are studied, and the characteristics of the building information modeling standard (IFC), city geographic modeling language (CityGML), indoor modeling language (IndoorGML), and other models are compared and analyzed. CityGML and IndoorGML models face challenges in satisfying diverse application scenarios and requirements due to limitations in their expression capabilities. It is proposed to combine the semantic information of the model objects to effectively partition and organize the indoor and outdoor spatial 3D model data and to construct the indoor and outdoor data organization mechanism of “chunk-layer-subobject-entrances-area-detail object.” This method is verified by proposing a 3D data organization method for indoor and outdoor space and constructing a 3D visualization system based on it.
Funding: This work was supported by the Strategic Priority Research Program of the Chinese Academy of Sciences [grant number XDA23100100], the National Natural Science Foundation of China [grant numbers 41771430 and 41631177], and the China Scholarship Council [grant number 201804910732].
Abstract: Effective integration and wide sharing of geospatial data is an important and basic premise to facilitate the research and applications of geographic information science. However, the semantic heterogeneity of geospatial data is a major problem that significantly hinders geospatial data integration and sharing. Ontologies are regarded as a promising way to solve semantic problems by providing a formalized representation of geographic entities and relationships between them in a manner understandable to machines. Thus, many efforts have been made to explore ontology-based geospatial data integration and sharing. However, there is a lack of a specialized ontology that would provide a unified description for geospatial data. In this paper, with a focus on the characteristics of geospatial data, we propose a unified framework for geospatial data ontology, denoted GeoDataOnt, to establish a semantic foundation for geospatial data integration and sharing. First, we provide a characteristics hierarchy of geospatial data. Next, we analyze the semantic problems for each characteristic of geospatial data. Subsequently, we propose the general framework of GeoDataOnt, targeting these problems according to the characteristics of geospatial data. GeoDataOnt is then divided into multiple modules, and we show a detailed design and implementation for each module. Key limitations and challenges of GeoDataOnt are identified, and broad applications of GeoDataOnt are discussed.
Funding: Supported by the National Natural Science Foundation of China under Grant No. 60970010, the National Basic Research 973 Program of China under Grant No. 2009CB320705, and the Specialized Research Fund for the Doctoral Program of Higher Education of China under Grant No. 20090073110026.
Abstract: New challenges, including how to share information on heterogeneous devices, appear in data-intensive pervasive computing environments. Data integration is a practical approach to these applications. Dealing with inconsistencies is one of the important problems in data integration. In this paper we motivate the problem of resolving data inconsistency for data integration in pervasive environments. We define data quality criteria and expense quality criteria for data sources to resolve data inconsistency. In our solution, data sources from which data is expensive to obtain are first discarded by using the expense quality criteria and a utility function. Since it is difficult to obtain the actual quality of data sources in a pervasive computing environment, we introduce a fuzzy multi-attribute group decision making approach to select the appropriate data sources. The experimental results show that our solution is effective.
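The first stage described in this abstract (discarding expensive data sources using expense criteria and a utility function) can be sketched as follows. The specific criteria (latency and energy), the weighted-sum utility, and the threshold are assumptions chosen for illustration; the abstract does not state the authors' actual criteria or function, and the fuzzy group decision stage is not covered here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """Hypothetical data source with two illustrative expense criteria."""
    name: str
    latency_ms: float
    energy_mj: float


def expense_utility(src, max_latency_ms, max_energy_mj, w_latency=0.5, w_energy=0.5):
    """Weighted-sum utility in [0, 1]; cheaper sources score higher."""
    latency_score = max(0.0, 1.0 - src.latency_ms / max_latency_ms)
    energy_score = max(0.0, 1.0 - src.energy_mj / max_energy_mj)
    return w_latency * latency_score + w_energy * energy_score


def filter_sources(sources, threshold, max_latency_ms, max_energy_mj):
    """Keep only the sources whose expense utility reaches the threshold."""
    return [
        s for s in sources
        if expense_utility(s, max_latency_ms, max_energy_mj) >= threshold
    ]


candidates = [Source("sensor_a", 100.0, 10.0), Source("sensor_b", 900.0, 90.0)]
kept = filter_sources(candidates, threshold=0.5, max_latency_ms=1000.0, max_energy_mj=100.0)
```

The surviving sources would then be passed to the quality-based selection stage.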
Funding: Thanks are due to the three anonymous reviewers for their constructive comments. This work was partially supported by the National Natural Science Foundation of China (Nos. 61572287 and 61533011), the Shandong Provincial Key Research and Development Program (2018GSF118043), the Natural Science Foundation of Shandong Province, China (ZR2015FQ001), the Fundamental Research Funds of Shandong University (Nos. 2015QY001 and 2016JC007), and the Scientific Research Foundation for the Returned Overseas Chinese Scholars, Ministry of Education of China.
Abstract: Background: More and more high-throughput datasets are available from multiple levels of measuring gene regulation. The reverse engineering of gene regulatory networks from these data offers a valuable research paradigm to decipher regulatory mechanisms. So far, numerous methods have been developed for reconstructing gene regulatory networks. Results: In this paper, we provide a review of bioinformatics methods for inferring gene regulatory networks from omics data. To achieve the precise reconstruction of gene regulatory networks, an intuitive alternative is to integrate these available resources in a rational framework. We also provide computational perspectives on the endeavor of inferring gene regulatory networks from heterogeneous data. We highlight the importance of multi-omics data integration with prior knowledge in gene regulatory network inference. Conclusions: We provide computational perspectives on inferring gene regulatory networks from multiple omics data and present theoretical analyses of existing challenges and possible solutions. We emphasize prior knowledge and data integration in network inference owing to their ability to identify regulatory causality.
Abstract: To construct mediators for data integration systems that integrate structured and semi-structured data, and to facilitate the reformulation and decomposition of queries, the presented system uses the XML processing language (XPL) for the mediator. With XPL, it is easy to construct mediators for XML-based data integration, and it can accelerate the work in the mediator.
Abstract: Inflammatory bowel disease (IBD) is a complex disease with variability in genetic, environmental, and lifestyle factors affecting disease presentation and course. Precision medicine has the potential to play a crucial role in managing IBD by tailoring treatment plans based on the heterogeneity of clinical and temporal variability of patients. Precision medicine is a population-based approach to managing IBD by integrating environmental, genomic, epigenomic, transcriptomic, proteomic, and metabolomic factors. It is a recent and rapidly developing field of medicine. The widespread adoption of precision medicine worldwide has the potential to result in the early detection of diseases, optimal utilization of healthcare resources, enhanced patient outcomes, and, ultimately, improved quality of life for individuals with IBD. Though precision medicine is promising in terms of better quality of patient care, inadequacies exist in the ongoing research. There is discordance in study conduct and in data collection, utilization, interpretation, and analysis. This review aims to describe the current literature on precision medicine, its multiomics approach, and future directions for its application in IBD.
Funding: Supported by the National High Technology Research and Development Program of China under Grant No. 2015AA016902, the National Natural Science Foundation of China under Grant Nos. 61435013 and 61405188, and the K.C. Wong Education Foundation.
Abstract: An 8×10 GHz receiver optical sub-assembly (ROSA) consisting of an 8-channel arrayed waveguide grating (AWG) and an 8-channel PIN photodetector (PD) array is designed and fabricated based on silica hybrid integration technology. Multimode output waveguides in the silica AWG with 2% refractive index difference are used to obtain flat-top spectra. The output waveguide facet is polished to a 45° bevel to change the light propagation direction into the mesa-type PIN PD, which simplifies the packaging process. The experimental results show that the single-channel 1 dB bandwidth of the AWG ranges from 2.12 nm to 3.06 nm, the ROSA responsivity ranges from 0.097 A/W to 0.158 A/W, and the 3 dB bandwidth is up to 11 GHz. It is promising for application in eight-lane WDM transmission systems for data center interconnection.
Funding: Financially supported by the General Program of the National Natural Science Foundation of China (No. 52274326), the Fundamental Research Funds for the Central Universities (Nos. 2125018 and 2225008), and the China Baowu Low Carbon Metallurgy Innovation Foundation (BWLCF202109).
Abstract: Blast furnace (BF) ironmaking is the most typical “black box” process, and its complexity and uncertainty bring forth great challenges for furnace condition judgment and BF operation. Rich data resources for BF ironmaking are available, and the rapid development of data science and intelligent technology will provide an effective means to solve the uncertainty problem in the BF ironmaking process. This work focused on the application of artificial intelligence technology in BF ironmaking. The current intelligent BF ironmaking technology was summarized and analyzed from five aspects: BF data management, the analyses of time delay and correlation, the prediction of BF key variables, the evaluation of BF status, and the multi-objective intelligent optimization of BF operations. Solutions and suggestions were offered for the problems in the current progress, and some outlooks for future prospects and technological breakthroughs were added. To effectively improve the BF data quality, we comprehensively considered the data problems and the characteristics of algorithms and selected the data processing method scientifically. For analyzing important BF characteristics, the effect of the delay was eliminated to ensure an accurate logical relationship between the BF parameters and economic indicators. As for BF parameter prediction and BF status evaluation, a BF intelligence model that integrates data information and process mechanism was built to effectively achieve the accurate prediction of BF key indexes and the scientific evaluation of BF status. During the optimization of BF parameters, low risk, low cost, and high return were used as the optimization criteria, and while pursuing the optimization effect, the feasibility and site operation cost were considered comprehensively. This work will help increase the process operator’s overall awareness and understanding of intelligent BF technology.
Additionally, combining big data technology with the process will improve the practicality of data models in actual production and promote the application of intelligent technology in BF ironmaking.
Funding: Financial support from the Natural Science Foundation of China (Nos. 61971102, 62132004), the MOST Major Research and Development Project (No. 2021YFB2900204), the Sichuan Science and Technology Program (No. 2022YFH0022), and the Key Research and Development Program of Zhejiang Province (No. 2022C01093).
Abstract: Integrated data and energy transfer (IDET) is capable of simultaneously delivering on-demand data and energy to low-power Internet of Everything (IoE) devices. We propose a multi-carrier IDET transceiver relying on superposition waveforms consisting of multi-sinusoidal signals for wireless energy transfer (WET) and orthogonal frequency-division multiplexing (OFDM) signals for wireless data transfer (WDT). The outdated channel state information (CSI) in aging channels is employed by the transmitter to shape IDET waveforms. Under the constraints of transmission power and the WDT requirement, the amplitudes and phases of the IDET waveform at the transmitter and the power splitter at the receiver are jointly optimised to maximise the average direct current (DC) over a limited number of transmission frames in the presence of carrier frequency offset (CFO). For the amplitude optimisation, the original non-convex problem can be transformed into a reversed geometric programming problem, which can then be effectively solved with existing tools. As for the phase optimisation, the artificial bee colony (ABC) algorithm is invoked to deal with the non-convexity. Iteration between the amplitude optimisation and phase optimisation yields our joint design. Numerical results demonstrate the advantage of our joint design for IDET waveform shaping in the presence of the CFO and the outdated CSI.