Plant morphogenesis relies on precise gene expression programs at the proper time and position, orchestrated by transcription factors (TFs) in intricate regulatory networks in a cell-type-specific manner. Here we introduce a comprehensive single-cell transcriptomic atlas of Arabidopsis seedlings. This atlas is the result of meticulous integration of 63 previously published scRNA-seq datasets, addressing batch effects and conserving biological variance. This integration spans a broad spectrum of tissues, including both below- and above-ground parts. Utilizing a rigorous approach for cell type annotation, we identified 47 distinct cell types or states, largely expanding our current view of plant cell compositions. We systematically constructed cell-type-specific gene regulatory networks and uncovered key regulators that act in a coordinated manner to control cell-type-specific gene expression. Taken together, our study not only offers extensive plant cell atlas exploration that serves as a valuable resource, but also provides molecular insights into gene-regulatory programs that vary across different cell types.
Bioinformatic analysis of large and complex omics datasets has become increasingly useful in modern-day biology by providing a great depth of information, with its application to neuroscience termed neuroinformatics. Data mining of omics datasets has enabled the generation of new hypotheses based on differentially regulated biological molecules associated with disease mechanisms, which can be tested experimentally for improved diagnostic and therapeutic targeting of neurodegenerative diseases. Importantly, integrating multi-omics data using a systems bioinformatics approach will advance the understanding of the layered and interactive network of biological regulation that exchanges systemic knowledge to facilitate the development of a comprehensive human brain profile. In this review, we first summarize data mining studies utilizing datasets from individual types of omics analysis, including epigenetics/epigenomics, transcriptomics, proteomics, metabolomics, lipidomics, and spatial omics, pertaining to Alzheimer's disease, Parkinson's disease, and multiple sclerosis. We then discuss multi-omics integration approaches, including independent biological integration and unsupervised integration methods, for more intuitive and informative interpretation of the biological data obtained across different omics layers. We further assess studies that integrate multi-omics in data mining, which provide convoluted biological insights and offer proof-of-concept propositions towards systems bioinformatics in the reconstruction of brain networks. Finally, we recommend a combination of high-dimensional bioinformatics analysis with experimental validation to achieve translational neuroscience applications including biomarker discovery, therapeutic development, and elucidation of disease mechanisms. We conclude by providing future perspectives and opportunities in applying integrative multi-omics and systems bioinformatics to achieve precision phenotyping of neurodegenerative diseases and towards personalized medicine.
Background: Mastitis caused by multiple factors remains one of the most common and costly diseases of the dairy industry. Multi-omics approaches enable the comprehensive investigation of the complex interactions between multiple layers of information to provide a more holistic view of disease pathogenesis. Therefore, this study investigated the genomic and epigenomic signatures and the possible regulatory mechanisms underlying subclinical mastitis by integrating RNA sequencing data (mRNA and lncRNA), small RNA sequencing data (miRNA) and DNA methylation sequencing data of milk somatic cells from 10 healthy cows and 20 cows with naturally occurring subclinical mastitis caused by Staphylococcus aureus or Staphylococcus chromogenes. Results: Functional investigation of the data sets through gene set analysis uncovered 3,458 biological process GO terms and 170 KEGG pathways with altered activities during subclinical mastitis, provided further insights into subclinical mastitis and revealed the involvement of multi-omics signatures in the altered immune responses and impaired mammary gland productivity during subclinical mastitis. Abundant genomic and epigenomic signatures with significant alterations related to subclinical mastitis were observed, including 30,846, 2,552, 1,276 and 57 differential methylation haplotype blocks (dMHBs), differentially expressed genes (DEGs), lncRNAs (DELs) and miRNAs (DEMs), respectively. Next, 5 factors presenting the principal variation of differential multi-omics signatures were identified. The important roles of Factor 1 (DEG, DEM and DEL) and Factor 2 (dMHB and DEM) in the regulation of immune defense and impaired mammary gland functions during subclinical mastitis were revealed. Each of the omics within Factors 1 and 2 explained about 20% of the source of variation in subclinical mastitis. Also, networks of important functional gene sets with the involvement of multi-omics signatures were demonstrated, which contributed to a comprehensive view of the possible regulatory mechanisms underlying subclinical mastitis. Furthermore, multi-omics integration enabled the association of the epigenomic regulatory factors (dMHBs, DELs and DEMs) of altered genes in important pathways, such as the ‘Staphylococcus aureus infection pathway’ and the ‘natural killer cell mediated cytotoxicity pathway’, which provides further insights into mastitis regulatory mechanisms. Moreover, a few multi-omics signatures (14 dMHBs, 25 DEGs, 18 DELs and 5 DEMs) were identified as candidate discriminant signatures with the capacity to distinguish subclinical mastitis cows from healthy cows. Conclusion: The integration of genomic and epigenomic data by multi-omics approaches in this study provided a better understanding of the molecular mechanisms underlying subclinical mastitis and identified multi-omics candidate discriminant signatures for subclinical mastitis, which may ultimately lead to the development of more effective mastitis control and management strategies.
Cloud computing has emerged as a viable alternative to traditional computing infrastructures, offering various benefits. However, the adoption of cloud storage poses significant risks to data secrecy and integrity. This article presents an effective mechanism to preserve the secrecy and integrity of data stored on the public cloud by leveraging blockchain technology, smart contracts, and cryptographic primitives. The proposed approach utilizes a Solidity-based smart contract as an auditor for maintaining and verifying the integrity of outsourced data. To preserve data secrecy, symmetric encryption systems are employed to encrypt user data before outsourcing it. An extensive performance analysis is conducted to illustrate the efficiency of the proposed mechanism. Additionally, a rigorous assessment is conducted to ensure that the developed smart contract is free from vulnerabilities and to measure its associated running costs. The security analysis of the proposed system confirms that our approach can securely maintain the confidentiality and integrity of cloud storage, even in the presence of malicious entities. The proposed mechanism contributes to enhancing data security in cloud computing environments and can be used as a foundation for developing more secure cloud storage systems.
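The audit idea in the abstract above (an auditor keeps a fingerprint of the outsourced data and later checks the stored copy against it) can be illustrated with a minimal Python sketch. This is not the paper's Solidity contract: the `AuditorLedger` class, the `register`/`verify` method names, and the choice of SHA-256 are all assumptions made for illustration, and the data is assumed to have been encrypted by the user beforehand.

```python
import hashlib
import hmac


def digest(ciphertext: bytes) -> str:
    """Return the SHA-256 digest of the (already encrypted) data as a hex string."""
    return hashlib.sha256(ciphertext).hexdigest()


class AuditorLedger:
    """Hypothetical in-memory stand-in for an on-chain auditor.

    It stores one digest per file identifier and never sees the plaintext.
    """

    def __init__(self) -> None:
        self._digests: dict[str, str] = {}

    def register(self, file_id: str, ciphertext: bytes) -> None:
        # Record the fingerprint at upload time.
        self._digests[file_id] = digest(ciphertext)

    def verify(self, file_id: str, ciphertext: bytes) -> bool:
        # Unknown files fail the audit; known files must match the stored digest.
        expected = self._digests.get(file_id)
        if expected is None:
            return False
        return hmac.compare_digest(expected, digest(ciphertext))


ledger = AuditorLedger()
ledger.register("report.bin", b"encrypted-bytes")
print(ledger.verify("report.bin", b"encrypted-bytes"))  # unchanged copy passes
print(ledger.verify("report.bin", b"tampered-bytes"))   # modified copy fails
```

Storing only a digest keeps the auditor's state small and reveals nothing about the plaintext, which is one plausible reason to pair it with client-side encryption.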
Currently, there is a growing trend among users to store their data in the cloud. However, the cloud is vulnerable to persistent data corruption risks arising from equipment failures and hacker attacks. Additionally, when users perform file operations, the semantic integrity of the data can be compromised. Ensuring both data integrity and semantic correctness has become a critical issue that requires attention. We introduce a pioneering solution called Sec-Auditor, the first of its kind with the ability to verify data integrity and semantic correctness simultaneously, while maintaining a constant communication cost independent of the audited data volume. Sec-Auditor also supports public auditing, enabling anyone with access to public information to conduct data audits. This feature makes Sec-Auditor highly adaptable to open data environments, such as the cloud. In Sec-Auditor, users are assigned specific rules that are utilized to verify the accuracy of data semantics. Furthermore, users are given the flexibility to update their own rules as needed. We conduct in-depth analyses of the correctness and security of Sec-Auditor. We also compare several important security attributes with existing schemes, demonstrating the superior properties of Sec-Auditor. Evaluation results demonstrate that even for time-consuming file upload operations, our solution is more efficient than the comparison one.
Integrated data and energy transfer (IDET) enables electromagnetic waves to transmit wireless energy at the same time as data delivery for low-power devices. In this paper, an energy harvesting modulation (EHM) assisted multi-user IDET system is studied, where all the received signals at the users are exploited for energy harvesting without degrading the wireless data transfer (WDT) performance. The joint IDET performance is then analysed theoretically by conceiving a practical time-dependent wireless channel. With the aid of the AO based algorithm, the average effective data rate among users is maximized while ensuring the BER and the wireless energy transfer (WET) performance. Simulation results validate and evaluate the IDET performance of the EHM assisted system, and also demonstrate that the optimal number of user clusters and IDET time slots should be allocated in order to improve the WET and WDT performance.
Microsoft Excel is essential for the End-User Approach (EUA), offering versatility in data organization, analysis, and visualization, as well as widespread accessibility. It fosters collaboration and informed decision-making across diverse domains. Conversely, Python is indispensable for professional programming due to its versatility, readability, extensive libraries, and robust community support. It enables efficient development, advanced data analysis, data mining, and automation, catering to diverse industries and applications. However, one primary issue when using Microsoft Excel with Python libraries is compatibility and interoperability. While Excel is a widely used tool for data storage and analysis, it may not seamlessly integrate with Python libraries, leading to challenges in reading and writing data, especially in complex or large datasets. Additionally, manipulating Excel files with Python may not always preserve formatting or formulas accurately, potentially affecting data integrity. Moreover, dependency on Excel’s graphical user interface (GUI) for automation can limit scalability and reproducibility compared to Python’s scripting capabilities. This paper covers an integration solution that empowers non-programmers to leverage Python’s capabilities within the familiar Excel environment. This enables users to perform advanced data analysis and automation tasks without requiring extensive programming knowledge. Based on feedback solicited from non-programmers who have tested the integration solution, the case study evaluates the solution's ease of implementation, performance, and compatibility of Python with Excel versions.
To enhance the safety of road traffic operations, this paper proposed a model based on stacking integrated learning utilizing American road traffic accident statistics. Initially, the process involved data cleaning, transformation, and normalization. Subsequently, various classification models were constructed, including logistic regression, k-nearest neighbors, gradient boosting, decision trees, AdaBoost, and extra trees models. Evaluation metrics such as accuracy, precision, recall, F1 score, and Hamming loss were employed. Upon analysis, the passive-aggressive classifier model exhibited superior comprehensive indices compared to other models. Based on the model’s output results, an in-depth examination of the factors influencing traffic accidents was conducted. Additionally, measures and suggestions aimed at reducing the incidence of severe traffic accidents were presented. These findings served as a valuable reference for mitigating the occurrence of traffic accidents.
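The evaluation metrics named in the abstract above have simple closed forms, which the following Python sketch computes for a binary, single-label task. The function name `binary_metrics` and the toy labels are hypothetical and are not taken from the paper; the convention of returning 0.0 when a denominator is zero is also an assumption.

```python
def binary_metrics(y_true: list[int], y_pred: list[int]) -> dict[str, float]:
    """Compute accuracy, precision, recall, F1 and Hamming loss for 0/1 labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    n = len(pairs)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / n,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        # For single-label problems this is the fraction of wrong predictions.
        "hamming_loss": (fp + fn) / n,
    }


# Toy example (made-up labels): 1 = severe accident, 0 = not severe.
scores = binary_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
print(scores)
```

In the single-label case the Hamming loss is simply one minus the accuracy; the two only diverge for multi-label outputs.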
To solve the query processing correctness problem for semantic-based relational data integration, the semantics of SPARQL (simple protocol and RDF query language) queries is defined. In the course of query rewriting, all relevant tables are found and decomposed into minimal connectable units. Minimal connectable units are joined according to semantic queries to produce semantically correct query plans. Algorithms for query rewriting and transforming are presented, and their computational complexity is discussed. In the worst case, the query decomposing algorithm finishes in O(n^2) time and the query rewriting algorithm requires O(nm) time. The performance of the algorithms is verified by experiments, and the experimental results show that when the length of the query is less than 8, the query processing algorithms provide satisfactory performance.
Inflammatory bowel disease (IBD) is a complex disease with variability in genetic, environmental, and lifestyle factors affecting disease presentation and course. Precision medicine has the potential to play a crucial role in managing IBD by tailoring treatment plans based on the heterogeneity of clinical and temporal variability of patients. Precision medicine is a population-based approach to managing IBD by integrating environmental, genomic, epigenomic, transcriptomic, proteomic, and metabolomic factors. It is a recent and rapidly developing field of medicine. The widespread adoption of precision medicine worldwide has the potential to result in the early detection of diseases, optimal utilization of healthcare resources, enhanced patient outcomes, and, ultimately, improved quality of life for individuals with IBD. Though precision medicine is promising in terms of better quality of patient care, inadequacies exist in the ongoing research. There is discordance in study conduct and in data collection, utilization, interpretation, and analysis. This review aims to describe the current literature on precision medicine, its multiomics approach, and future directions for its application in IBD.
Data integrity is a critical component of data lifecycle management. Its importance increases even more in a complex and dynamic landscape. Actions like unauthorized access, unauthorized modifications, data manipulations, audit tampering, data backdating, data falsification, phishing and spoofing are no longer restricted to rogue individuals but are also prevalent in systematic organizations and states. Therefore, data security requires strong data integrity measures and associated technical controls. Without a proper customized framework in place, organizations are prone to a high risk of financial, reputational, and revenue losses, bankruptcies, and legal penalties, which we discuss further throughout this paper. We also explore some of the improvised and innovative techniques in product development to better tackle the challenges and requirements of data security and integrity.
Terminal devices deployed in outdoor environments face a thorny power supply problem. The data and energy integrated network (DEIN) is a promising technology to solve this problem, as it simultaneously transfers data and energy through radio frequency signals. State-of-the-art research mostly focuses on theoretical aspects. By contrast, we provide a complete design and implementation of a fully functioning DEIN system with the support of an unmanned aerial vehicle (UAV). The UAV can be dispatched to areas of interest to remotely recharge batteryless terminals, while collecting essential information from them. Then, the UAV uploads the information to remote base stations. Our system verifies the feasibility of the DEIN in practical applications.
Integrated data and energy transfer (IDET) is capable of simultaneously delivering on-demand data and energy to low-power Internet of Everything (IoE) devices. We propose a multi-carrier IDET transceiver relying on superposition waveforms consisting of multi-sinusoidal signals for wireless energy transfer (WET) and orthogonal frequency-division multiplexing (OFDM) signals for wireless data transfer (WDT). The outdated channel state information (CSI) in aging channels is employed by the transmitter to shape IDET waveforms. Under constraints on the transmission power and the WDT requirement, the amplitudes and phases of the IDET waveform at the transmitter and the power splitter at the receiver are jointly optimised to maximise the average direct current (DC) over a limited number of transmission frames in the presence of carrier frequency offset (CFO). For the amplitude optimisation, the original non-convex problem can be transformed into a reversed geometric programming problem, which can then be effectively solved with existing tools. As for the phase optimisation, the artificial bee colony (ABC) algorithm is invoked in order to deal with the non-convexity. Iteration between the amplitude optimisation and phase optimisation yields our joint design. Numerical results demonstrate the advantage of our joint design for IDET waveform shaping in the presence of the CFO and the outdated CSI.
Industrial big data integration and sharing (IBDIS) is of great significance in managing and providing data for big data analysis in manufacturing systems. A novel fog-computing-based IBDIS approach called Fog-IBDIS is proposed in order to integrate and share industrial big data with high raw data security and low network traffic loads by moving the integration task from the cloud to the edge of networks. First, a task flow graph (TFG) is designed to model the data analysis process. The TFG is composed of several tasks, which are executed by the data owners through the Fog-IBDIS platform in order to protect raw data privacy. Second, the function of Fog-IBDIS to enable data integration and sharing is presented in five modules: TFG management, compilation and running control, the data integration model, the basic algorithm library, and the management component. Finally, a case study is presented to illustrate the implementation of Fog-IBDIS, which ensures raw data security by deploying the analysis tasks executed by the data generators, and eases the network traffic load by greatly reducing the volume of transmitted data.
With the rapid development of the Web, more and more Web databases are available for users to access. At the same time, job searchers often have difficulty first finding the right sources and then querying over them. Providing an integrated job search system over Web databases has therefore become a Web application in high demand. Based on this consideration, we build a deep Web data integration system that gives users unified access to multiple job Web sites as a job meta-search engine. In this paper, the architecture of the system is given first, and the key components of the system are then introduced.
This paper analyzes the status of existing resources through extensive research and international cooperation on the basis of four typical global monthly surface temperature datasets: the climate research dataset of the University of East Anglia (CRUTEM3), the dataset of the U.S. National Climatic Data Center (GHCN-V3), the dataset of the U.S. National Aeronautics and Space Administration (GISSTMP), and the Berkeley Earth surface temperature dataset (Berkeley). China's first global monthly temperature dataset over land was developed by integrating the four aforementioned global temperature datasets and several regional datasets from major countries or regions. This dataset contains information from stations worldwide with at least 20 years of records: 9,519 stations for monthly mean temperature, 7,073 for maximum temperature, and 6,587 for minimum temperature. Compared with CRUTEM3 and GHCN-V3, the station density is much higher, particularly for South America, Africa, and Asia. Moreover, data from significantly more stations were available after the year 1990, which dramatically reduced the uncertainty of the estimated global temperature trend during 1990-2011. The integrated dataset can serve as a reliable data source for global climate change research.
Background: Presently, multi-omics data (e.g., genomics, transcriptomics, proteomics, and metabolomics) are available to improve genomic predictors. Omics data not only offer new data layers for genomic prediction but also provide a bridge between organismal phenotypes and genome variation that cannot be readily captured at the genome sequence level. Therefore, using multi-omics data to select feature markers is a feasible strategy to improve the accuracy of genomic prediction. In this study, simultaneously using whole-genome sequencing (WGS) and gene expression level data, four strategies for single-nucleotide polymorphism (SNP) preselection were investigated for genomic predictions in the Drosophila Genetic Reference Panel. Results: Using genomic best linear unbiased prediction (GBLUP) with complete WGS data, the prediction accuracies were 0.208 ± 0.020 (0.181 ± 0.022) for the startle response and 0.272 ± 0.017 (0.307 ± 0.015) for starvation resistance in the female (male) lines. Compared with GBLUP using complete WGS data, neither GBLUP nor the genomic feature BLUP (GFBLUP) improved the prediction accuracy using SNPs preselected from complete WGS data based on the results of genome-wide association studies (GWASs) or transcriptome-wide association studies (TWASs). Furthermore, by using SNPs preselected from the WGS data based on the results of the expression quantitative trait locus (eQTL) mapping of all genes, only the startle response had greater accuracy than GBLUP with the complete WGS data. The best accuracy values in the female and male lines were 0.243 ± 0.020 and 0.220 ± 0.022, respectively. Importantly, by using SNPs preselected based on the results of the eQTL mapping of significant genes from TWAS, both GBLUP and GFBLUP resulted in great accuracy and small bias of genomic prediction. Compared with the GBLUP using complete WGS data, the best accuracy values represented increases of 60.66% and 39.09% for the starvation resistance and 27.40% and 35.36% for startle response in the female and male lines, respectively. Conclusions: Overall, multi-omics data can assist genomic feature preselection and improve the performance of genomic prediction. The new knowledge gained from this study will enrich the use of multi-omics in genomic prediction.
Data protection in databases is critical for any organization, as unauthorized access or manipulation can have severe negative consequences. Intrusion detection systems are essential for keeping databases secure. Advancements in technology will lead to significant changes in the medical field, improving healthcare services through real-time information sharing. However, reliability and consistency still need to be solved. Safeguards against cyber-attacks are necessary due to the risk of unauthorized access to sensitive information and potential data corruption. Disruptions to data items can propagate throughout the database, making it crucial to reverse fraudulent transactions without delay, especially in the healthcare industry, where real-time data access is vital. This research presents a role-based access control architecture for an anomaly detection technique. Additionally, the Structured Query Language (SQL) queries are stored in a new data structure called Pentaplet. These pentaplets allow us to maintain the correlation between SQL statements within the same transaction by employing the transaction-log entry information, thereby increasing detection accuracy, particularly for individuals within the company exhibiting unusual behavior. To identify anomalous queries, this system employs a supervised machine learning technique called Support Vector Machine (SVM). According to experimental findings, the proposed model performed well in terms of detection accuracy, achieving 99.92% through SVM with One Hot Encoding and Principal Component Analysis (PCA).
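One-hot encoding, mentioned in the abstract above as a preprocessing step before SVM classification, turns a categorical value into a binary indicator vector. The Python sketch below shows the encoding on its own. Using the SQL command type as the categorical feature is an assumption made for illustration; the abstract does not list the fields of a Pentaplet, and the function name `one_hot` is hypothetical.

```python
def one_hot(values: list[str]) -> tuple[list[str], list[list[int]]]:
    """Encode categorical values as indicator vectors.

    Returns the sorted category list (the column order) and one row per input.
    """
    categories = sorted(set(values))
    column = {category: i for i, category in enumerate(categories)}

    rows = []
    for value in values:
        row = [0] * len(categories)
        row[column[value]] = 1  # exactly one column is set per row
        rows.append(row)
    return categories, rows


# Hypothetical feature: the command type of each logged SQL statement.
commands = ["SELECT", "UPDATE", "SELECT", "DELETE"]
categories, encoded = one_hot(commands)
print(categories)  # ['DELETE', 'SELECT', 'UPDATE']
print(encoded)     # [[0, 1, 0], [0, 0, 1], [0, 1, 0], [1, 0, 0]]
```

Because one-hot encoding adds one column per category, it can widen the feature matrix considerably, which is a common motivation for following it with a dimensionality reduction step such as PCA.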
As the informatization of resources and environment remote sensing geological surveys deepens, several problems and deficiencies have emerged: (1) the lack of a uniformly planned running environment; (2) inconsistent methods of data integration; and (3) the drawbacks of differing ways of performing data integration. This paper addresses these problems through overall planning and design, and constructs a unified running environment, consistent methods of data integration, and a system structure in order to advance the informatization
Nowadays, numerous applications are associated with the cloud, and user data is collected globally and stored in cloud units. In addition to shared data storage, cloud computing offers multiple advantages for the user through different distribution designs such as hybrid cloud, public cloud, community cloud and private cloud. Though cloud-based computing solutions are highly convenient for users, they also bring a challenge, namely the security of the shared data. Hence, in the current research paper, a blockchain with a data integrity authentication technique is developed for efficient and secure operation with a user authentication process. Blockchain technology is utilized in this study to enable efficient and secure operation, which not only strengthens cloud security but also avoids threats and attacks. Additionally, the data integrity authentication technique is utilized to limit unwanted access to data in the cloud storage unit. The major objective of the proposed technique is to strengthen data security and user authentication in the cloud computing environment. To improve the proposed authentication process, a cuckoo filter and a Merkle Hash Tree (MHT) are utilized. The proposed methodology was validated using a few performance metrics such as processing time, uploading time, downloading time, authentication time, consensus time, waiting time, and initialization time, in addition to storage overhead. The proposed method was compared with conventional cloud security techniques, and the outcomes establish the superiority of the proposed method.
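A Merkle Hash Tree, as named in the abstract above, summarizes many data blocks in a single root hash so that any change to a block changes the root. The following Python sketch computes such a root. It is a generic illustration rather than the paper's construction: SHA-256, the function name `merkle_root`, and the convention of duplicating the last node on odd-sized levels are all assumptions.

```python
import hashlib


def _sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(blocks: list[bytes]) -> bytes:
    """Return the Merkle root of the given data blocks.

    Leaves are the hashes of the blocks; each parent is the hash of its two
    children concatenated. An odd-sized level duplicates its last node
    (one common convention, assumed here).
    """
    if not blocks:
        raise ValueError("at least one block is required")

    level = [_sha256(block) for block in blocks]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [_sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


original = merkle_root([b"block-0", b"block-1", b"block-2"])
tampered = merkle_root([b"block-0", b"block-X", b"block-2"])
print(original.hex())
print(original == tampered)  # any changed block yields a different root
```

A verifier that holds only the root can detect modification of any block, which keeps the amount of trusted state independent of the data size.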
Funding: Supported by the National Natural Science Foundation of China (No. 32070656); the Nanjing University Deng Feng Scholars Program; the Priority Academic Program Development (PAPD) of Jiangsu Higher Education Institutions; a China Postdoctoral Science Foundation funded project (No. 2022M711563); and the Jiangsu Funding Program for Excellent Postdoctoral Talent (No. 2022ZB50).
Funding: Supported by a Lee Kong Chian School of Medicine Dean’s Postdoctoral Fellowship (021207-00001) from Nanyang Technological University (NTU) Singapore and a Mistletoe Research Fellowship (022522-00001) from the Momental Foundation USA. Jialiu Zeng is supported by a Presidential Postdoctoral Fellowship (021229-00001) from NTU Singapore and an Open Fund Young Investigator Research Grant (OF-YIRG) (MOH-001147) from the National Medical Research Council (NMRC) Singapore. Su Bin Lim is supported by the National Research Foundation (NRF) of Korea (Grant Nos.: 2020R1A6A1A03043539, 2020M3A9D8037604, 2022R1C1C1004756) and a grant of the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), funded by the Ministry of Health & Welfare, Republic of Korea (Grant No.: HR22C1734).
Abstract: Bioinformatic analysis of large and complex omics datasets has become increasingly useful in modern-day biology by providing a great depth of information, with its application to neuroscience termed neuroinformatics. Data mining of omics datasets has enabled the generation of new hypotheses based on differentially regulated biological molecules associated with disease mechanisms, which can be tested experimentally for improved diagnostic and therapeutic targeting of neurodegenerative diseases. Importantly, integrating multi-omics data using a systems bioinformatics approach will advance the understanding of the layered and interactive network of biological regulation that exchanges systemic knowledge to facilitate the development of a comprehensive human brain profile. In this review, we first summarize data mining studies utilizing datasets from individual types of omics analysis, including epigenetics/epigenomics, transcriptomics, proteomics, metabolomics, lipidomics, and spatial omics, pertaining to Alzheimer's disease, Parkinson's disease, and multiple sclerosis. We then discuss multi-omics integration approaches, including independent biological integration and unsupervised integration methods, for more intuitive and informative interpretation of the biological data obtained across different omics layers. We further assess studies that integrate multi-omics in data mining, which provide convoluted biological insights and offer a proof-of-concept proposition towards systems bioinformatics in the reconstruction of brain networks. Finally, we recommend a combination of high-dimensional bioinformatics analysis with experimental validation to achieve translational neuroscience applications including biomarker discovery, therapeutic development, and elucidation of disease mechanisms. We conclude by providing future perspectives and opportunities in applying integrative multi-omics and systems bioinformatics to achieve precision phenotyping of neurodegenerative diseases and towards personalized medicine.
Funding: The help and support of the owners of the dairy farms enrolled in this study are gratefully acknowledged. The financial support from the program of the China Scholarship Council during the PhD study of Mengqi Wang in Canada is acknowledged (No. 202008880009).
Abstract: Background: Mastitis caused by multiple factors remains one of the most common and costly diseases of the dairy industry. Multi-omics approaches enable the comprehensive investigation of the complex interactions between multiple layers of information to provide a more holistic view of disease pathogenesis. Therefore, this study investigated the genomic and epigenomic signatures and the possible regulatory mechanisms underlying subclinical mastitis by integrating RNA sequencing data (mRNA and lncRNA), small RNA sequencing data (miRNA) and DNA methylation sequencing data of milk somatic cells from 10 healthy cows and 20 cows with naturally occurring subclinical mastitis caused by Staphylococcus aureus or Staphylococcus chromogenes. Results: Functional investigation of the data sets through gene set analysis uncovered 3458 biological process GO terms and 170 KEGG pathways with altered activities during subclinical mastitis, provided further insights into subclinical mastitis, and revealed the involvement of multi-omics signatures in the altered immune responses and impaired mammary gland productivity during subclinical mastitis. Abundant genomic and epigenomic signatures with significant alterations related to subclinical mastitis were observed, including 30,846, 2,552, 1,276 and 57 differential methylation haplotype blocks (dMHBs), differentially expressed genes (DEGs), lncRNAs (DELs) and miRNAs (DEMs), respectively. Next, 5 factors presenting the principal variation of differential multi-omics signatures were identified. The important roles of Factor 1 (DEG, DEM and DEL) and Factor 2 (dMHB and DEM) in the regulation of immune defense and impaired mammary gland functions during subclinical mastitis were revealed. Each of the omics within Factors 1 and 2 explained about 20% of the source of variation in subclinical mastitis. Also, networks of important functional gene sets with the involvement of multi-omics signatures were demonstrated, which contributed to a comprehensive view of the possible regulatory mechanisms underlying subclinical mastitis. Furthermore, multi-omics integration enabled the association of the epigenomic regulatory factors (dMHBs, DELs and DEMs) of altered genes in important pathways, such as the ‘Staphylococcus aureus infection pathway’ and the ‘natural killer cell mediated cytotoxicity pathway’, which provides further insights into mastitis regulatory mechanisms. Moreover, a few multi-omics signatures (14 dMHBs, 25 DEGs, 18 DELs and 5 DEMs) were identified as candidate discriminant signatures with the capacity to distinguish subclinical mastitis cows from healthy cows. Conclusion: The integration of genomic and epigenomic data by multi-omics approaches in this study provided a better understanding of the molecular mechanisms underlying subclinical mastitis and identified multi-omics candidate discriminant signatures for subclinical mastitis, which may ultimately lead to the development of more effective mastitis control and management strategies.
Abstract: Cloud computing has emerged as a viable alternative to traditional computing infrastructures, offering various benefits. However, the adoption of cloud storage poses significant risks to data secrecy and integrity. This article presents an effective mechanism to preserve the secrecy and integrity of data stored on the public cloud by leveraging blockchain technology, smart contracts, and cryptographic primitives. The proposed approach utilizes a Solidity-based smart contract as an auditor for maintaining and verifying the integrity of outsourced data. To preserve data secrecy, symmetric encryption systems are employed to encrypt user data before outsourcing it. An extensive performance analysis is conducted to illustrate the efficiency of the proposed mechanism. Additionally, a rigorous assessment is conducted to ensure that the developed smart contract is free from vulnerabilities and to measure its associated running costs. The security analysis of the proposed system confirms that our approach can securely maintain the confidentiality and integrity of cloud storage, even in the presence of malicious entities. The proposed mechanism contributes to enhancing data security in cloud computing environments and can be used as a foundation for developing more secure cloud storage systems.
Funding: This research was supported by the Qinghai Provincial High-End Innovative and Entrepreneurial Talents Project.
Abstract: Currently, there is a growing trend among users to store their data in the cloud. However, the cloud is vulnerable to persistent data corruption risks arising from equipment failures and hacker attacks. Additionally, when users perform file operations, the semantic integrity of the data can be compromised. Ensuring both data integrity and semantic correctness has become a critical issue that requires attention. We introduce a pioneering solution called Sec-Auditor, the first of its kind with the ability to verify data integrity and semantic correctness simultaneously, while maintaining a constant communication cost independent of the audited data volume. Sec-Auditor also supports public auditing, enabling anyone with access to public information to conduct data audits. This feature makes Sec-Auditor highly adaptable to open data environments, such as the cloud. In Sec-Auditor, users are assigned specific rules that are utilized to verify the accuracy of data semantics. Furthermore, users are given the flexibility to update their own rules as needed. We conduct in-depth analyses of the correctness and security of Sec-Auditor. We also compare several important security attributes with existing schemes, demonstrating the superior properties of Sec-Auditor. Evaluation results demonstrate that even for time-consuming file upload operations, our solution is more efficient than the scheme it is compared with.
Funding: Supported in part by the MOST Major Research and Development Project (Grant No. 2021YFB2900204), the National Natural Science Foundation of China (NSFC) (Grant No. 62201123, No. 62132004, No. 61971102), and the China Postdoctoral Science Foundation (Grant No. 2022TQ0056), and in part by the financial support of the Sichuan Science and Technology Program (Grant No. 2022YFH0022), the Sichuan Major R&D Project (Grant No. 22QYCX0168), and the Municipal Government of Quzhou (Grant No. 2022D031).
Abstract: Integrated data and energy transfer (IDET) enables electromagnetic waves to transmit wireless energy at the same time as data delivery for low-power devices. In this paper, an energy harvesting modulation (EHM) assisted multi-user IDET system is studied, where all the received signals at the users are exploited for energy harvesting without degrading the wireless data transfer (WDT) performance. The joint IDET performance is then analysed theoretically by conceiving a practical time-dependent wireless channel. With the aid of the AO-based algorithm, the average effective data rate among users is maximized while ensuring the BER and the wireless energy transfer (WET) performance. Simulation results validate and evaluate the IDET performance of the EHM-assisted system, and also demonstrate that the optimal number of user clusters and IDET time slots should be allocated in order to improve the WET and WDT performance.
Abstract: Microsoft Excel is essential for the End-User Approach (EUA), offering versatility in data organization, analysis, and visualization, as well as widespread accessibility. It fosters collaboration and informed decision-making across diverse domains. Conversely, Python is indispensable for professional programming due to its versatility, readability, extensive libraries, and robust community support. It enables efficient development, advanced data analysis, data mining, and automation, catering to diverse industries and applications. However, one primary issue when using Microsoft Excel with Python libraries is compatibility and interoperability. While Excel is a widely used tool for data storage and analysis, it may not seamlessly integrate with Python libraries, leading to challenges in reading and writing data, especially in complex or large datasets. Additionally, manipulating Excel files with Python may not always preserve formatting or formulas accurately, potentially affecting data integrity. Moreover, dependency on Excel’s graphical user interface (GUI) for automation can limit scalability and reproducibility compared to Python’s scripting capabilities. This paper covers an integration solution that empowers non-programmers to leverage Python’s capabilities within the familiar Excel environment. This enables users to perform advanced data analysis and automation tasks without requiring extensive programming knowledge. Based on feedback solicited from non-programmers who have tested the integration solution, the case study evaluates the solution's ease of implementation, performance, and compatibility of Python with Excel versions.
Abstract: To enhance the safety of road traffic operations, this paper proposed a model based on stacking integrated learning utilizing American road traffic accident statistics. Initially, the process involved data cleaning, transformation, and normalization. Subsequently, various classification models were constructed, including logistic regression, k-nearest neighbors, gradient boosting, decision trees, AdaBoost, and extra trees models. Evaluation metrics such as accuracy, precision, recall, F1 score, and Hamming loss were employed. Upon analysis, the passive-aggressive classifier model exhibited superior comprehensive indices compared to other models. Based on the model’s output results, an in-depth examination of the factors influencing traffic accidents was conducted. Additionally, measures and suggestions aimed at reducing the incidence of severe traffic accidents were presented. These findings served as a valuable reference for mitigating the occurrence of traffic accidents.
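The evaluation metrics named in this abstract can be illustrated with a minimal sketch. It assumes a binary, single-label setting (for example, severe versus non-severe accidents), which the abstract does not specify; the function name `classification_metrics` and the `positive` parameter are hypothetical and not taken from the paper. In this single-label setting the Hamming loss is simply the fraction of misclassified samples.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, F1 and Hamming loss for single-label data.

    Assumption: one label per sample, with `positive` marking the class of
    interest. This is an illustrative helper, not the paper's code.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    pairs = list(zip(y_true, y_pred))
    n = len(pairs)

    # Confusion-matrix counts relative to the positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    return {
        "accuracy": correct / n,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        # For single-label data, Hamming loss is the misclassification rate.
        "hamming_loss": (n - correct) / n,
    }
```

For example, `classification_metrics([1, 0, 1, 1, 0], [1, 0, 0, 1, 1])` compares five hypothetical labels with five predictions; a multi-class or multi-label study would need per-class averaging, which this sketch omits.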
Funding: Weaponry Equipment Pre-Research Foundation of PLA Equipment Ministry (No. 9140A06050409JB8102); Pre-Research Foundation of PLA University of Science and Technology (No. 2009JSJ11).
Abstract: To solve the query processing correctness problem for semantic-based relational data integration, the semantics of SPARQL (SPARQL Protocol and RDF Query Language) queries is defined. In the course of query rewriting, all relevant tables are found and decomposed into minimal connectable units. Minimal connectable units are joined according to semantic queries to produce the semantically correct query plans. Algorithms for query rewriting and transforming are presented. The computational complexity of the algorithms is discussed. In the worst case, the query decomposing algorithm finishes in O(n^2) time and the query rewriting algorithm requires O(nm) time. The performance of the algorithms is verified by experiments, and experimental results show that when the length of the query is less than 8, the query processing algorithms provide satisfactory performance.
Abstract: Inflammatory bowel disease (IBD) is a complex disease with variability in genetic, environmental, and lifestyle factors affecting disease presentation and course. Precision medicine has the potential to play a crucial role in managing IBD by tailoring treatment plans based on the heterogeneity of clinical and temporal variability of patients. Precision medicine is a population-based approach to managing IBD by integrating environmental, genomic, epigenomic, transcriptomic, proteomic, and metabolomic factors. It is a recent and rapidly developing field of medicine. The widespread adoption of precision medicine worldwide has the potential to result in the early detection of diseases, optimal utilization of healthcare resources, enhanced patient outcomes, and, ultimately, improved quality of life for individuals with IBD. Though precision medicine is promising in terms of better quality of patient care, inadequacies exist in the ongoing research. There is discordance in study conduct and in data collection, utilization, interpretation, and analysis. This review aims to describe the current literature on precision medicine, its multiomics approach, and future directions for its application in IBD.
Abstract: Data integrity is a critical component of data lifecycle management. Its importance increases even more in a complex and dynamic landscape. Actions like unauthorized access, unauthorized modifications, data manipulations, audit tampering, data backdating, data falsification, phishing and spoofing are no longer restricted to rogue individuals but are also prevalent in systematic organizations and states. Therefore, data security requires strong data integrity measures and associated technical controls in place. Without a proper customized framework in place, organizations are prone to a high risk of financial, reputational, and revenue losses, bankruptcies, and legal penalties, which we discuss throughout this paper. We also explore some of the improvised and innovative techniques in product development to better tackle the challenges and requirements of data security and integrity.
Funding: Partly funded by the Natural Science Foundation of China (No. 61971102 and 62132004), the Sichuan Science and Technology Program (No. 22QYCX0168), and the Municipal Government of Quzhou (Grant No. 2021D003).
Abstract: Terminal devices deployed in outdoor environments face a thorny power supply problem. The data and energy integrated network (DEIN) is a promising technology to solve this problem, as it simultaneously transfers data and energy through radio frequency signals. State-of-the-art research mostly focuses on theoretical aspects. By contrast, we provide a complete design and implementation of a fully functioning DEIN system with the support of an unmanned aerial vehicle (UAV). The UAV can be dispatched to areas of interest to remotely recharge batteryless terminals while collecting essential information from them. Then, the UAV uploads the information to remote base stations. Our system verifies the feasibility of the DEIN in practical applications.
Funding: Financial support of the Natural Science Foundation of China (No. 61971102, 62132004), the MOST Major Research and Development Project (No. 2021YFB2900204), the Sichuan Science and Technology Program (No. 2022YFH0022), and the Key Research and Development Program of Zhejiang Province (No. 2022C01093).
Abstract: Integrated data and energy transfer (IDET) is capable of simultaneously delivering on-demand data and energy to low-power Internet of Everything (IoE) devices. We propose a multi-carrier IDET transceiver relying on superposition waveforms consisting of multi-sinusoidal signals for wireless energy transfer (WET) and orthogonal frequency-division multiplexing (OFDM) signals for wireless data transfer (WDT). The outdated channel state information (CSI) in aging channels is employed by the transmitter to shape IDET waveforms. With the constraints of transmission power and WDT requirement, the amplitudes and phases of the IDET waveform at the transmitter and the power splitter at the receiver are jointly optimised for maximising the average direct current (DC) among a limited number of transmission frames in the presence of carrier frequency offset (CFO). For the amplitude optimisation, the original non-convex problem can be transformed into a reversed geometric programming problem, which can then be effectively solved with existing tools. As for the phase optimisation, the artificial bee colony (ABC) algorithm is invoked in order to deal with the non-convexity. Iteration between the amplitude optimisation and phase optimisation yields our joint design. Numerical results demonstrate the advantage of our joint design for IDET waveform shaping in the presence of the CFO and the outdated CSI.
Funding: This work was supported in part by the National Natural Science Foundation of China (51435009), the Shanghai Sailing Program (19YF1401500), and the Fundamental Research Funds for the Central Universities (2232019D3-34).
Abstract: Industrial big data integration and sharing (IBDIS) is of great significance in managing and providing data for big data analysis in manufacturing systems. A novel fog-computing-based IBDIS approach called Fog-IBDIS is proposed in order to integrate and share industrial big data with high raw data security and low network traffic loads by moving the integration task from the cloud to the edge of networks. First, a task flow graph (TFG) is designed to model the data analysis process. The TFG is composed of several tasks, which are executed by the data owners through the Fog-IBDIS platform in order to protect raw data privacy. Second, the function of Fog-IBDIS to enable data integration and sharing is presented in five modules: TFG management, compilation and running control, the data integration model, the basic algorithm library, and the management component. Finally, a case study is presented to illustrate the implementation of Fog-IBDIS, which ensures raw data security by deploying the analysis tasks executed by the data generators, and eases the network traffic load by greatly reducing the volume of transmitted data.
Funding: Supported by the Natural Science Foundation of China (60573091, 60273018), the National Basic Research and Development Program of China (2003CB317000), and the Key Project of the Ministry of Education of China (03044).
Abstract: With the rapid development of the Web, more and more Web databases are available for users to access. At the same time, job searchers often have difficulty first finding the right sources and then querying over them, so an integrated job search system over Web databases has become a Web application in high demand. Based on this consideration, we build a deep Web data integration system that supports unified access for users to multiple job Web sites as a job meta-search engine. In this paper, the architecture of the system is given first, and the key components in the system are introduced.
Funding: Supported by the China Meteorological Administration Special Public Welfare Research Fund (GYHY201206012, GYHY201406016) and the Climate Change Foundation of the China Meteorological Administration (CCSF201338).
Abstract: This paper analyzes the status of existing resources through extensive research and international cooperation on the basis of four typical global monthly surface temperature datasets, including the climate research dataset of the University of East Anglia (CRUTEM3), the dataset of the U.S. National Climatic Data Center (GHCN-V3), the dataset of the U.S. National Aeronautics and Space Administration (GISSTMP), and the Berkeley Earth surface temperature dataset (Berkeley). China's first global monthly temperature dataset over land was developed by integrating the four aforementioned global temperature datasets and several regional datasets from major countries or regions. This dataset contains information from stations worldwide with at least 20 years of data: 9,519 for monthly mean temperature, 7,073 for maximum temperature, and 6,587 for minimum temperature. Compared with CRUTEM3 and GHCN-V3, the station density is much higher, particularly for South America, Africa, and Asia. Moreover, data from significantly more stations were available after the year 1990, which dramatically reduced the uncertainty of the estimated global temperature trend during 1990-2011. The integrated dataset can serve as a reliable data source for global climate change research.
Funding: Supported by the National Natural Science Foundation of China (31772556), the Local Innovative and Research Teams Project of Guangdong Province (2019BT02N630), grants from the earmarked fund for the China Agriculture Research System (CARS-35), and the Science and Technology Innovation Strategy projects of Guangdong Province (Grant No. 2018B020203002).
Abstract: Background: Presently, multi-omics data (e.g., genomics, transcriptomics, proteomics, and metabolomics) are available to improve genomic predictors. Omics data not only offer new data layers for genomic prediction but also provide a bridge between organismal phenotypes and genome variation that cannot be readily captured at the genome sequence level. Therefore, using multi-omics data to select feature markers is a feasible strategy to improve the accuracy of genomic prediction. In this study, simultaneously using whole-genome sequencing (WGS) and gene expression level data, four strategies for single-nucleotide polymorphism (SNP) preselection were investigated for genomic predictions in the Drosophila Genetic Reference Panel. Results: Using genomic best linear unbiased prediction (GBLUP) with complete WGS data, the prediction accuracies were 0.208 ± 0.020 (0.181 ± 0.022) for the startle response and 0.272 ± 0.017 (0.307 ± 0.015) for starvation resistance in the female (male) lines. Compared with GBLUP using complete WGS data, neither GBLUP nor the genomic feature BLUP (GFBLUP) improved the prediction accuracy using SNPs preselected from complete WGS data based on the results of genome-wide association studies (GWASs) or transcriptome-wide association studies (TWASs). Furthermore, by using SNPs preselected from the WGS data based on the results of the expression quantitative trait locus (eQTL) mapping of all genes, only the startle response had greater accuracy than GBLUP with the complete WGS data. The best accuracy values in the female and male lines were 0.243 ± 0.020 and 0.220 ± 0.022, respectively. Importantly, by using SNPs preselected based on the results of the eQTL mapping of significant genes from TWAS, both GBLUP and GFBLUP resulted in great accuracy and small bias of genomic prediction. Compared with the GBLUP using complete WGS data, the best accuracy values represented increases of 60.66% and 39.09% for starvation resistance and 27.40% and 35.36% for the startle response in the female and male lines, respectively. Conclusions: Overall, multi-omics data can assist genomic feature preselection and improve the performance of genomic prediction. The new knowledge gained from this study will enrich the use of multi-omics in genomic prediction.
Funding: The authors are thankful to the Dean of Scientific Research at Najran University for funding this work under the Research Groups Funding Program, Grant Code (NU/RG/SERC/12/6).
Abstract: Data protection in databases is critical for any organization, as unauthorized access or manipulation can have severe negative consequences. Intrusion detection systems are essential for keeping databases secure. Advancements in technology will lead to significant changes in the medical field, improving healthcare services through real-time information sharing. However, reliability and consistency still need to be solved. Safeguards against cyber-attacks are necessary due to the risk of unauthorized access to sensitive information and potential data corruption. Disruptions to data items can propagate throughout the database, making it crucial to reverse fraudulent transactions without delay, especially in the healthcare industry, where real-time data access is vital. This research presents a role-based access control architecture for an anomaly detection technique. Additionally, the Structured Query Language (SQL) queries are stored in a new data structure called Pentaplet. These pentaplets allow us to maintain the correlation between SQL statements within the same transaction by employing the transaction-log entry information, thereby increasing detection accuracy, particularly for individuals within the company exhibiting unusual behavior. To identify anomalous queries, this system employs a supervised machine learning technique called Support Vector Machine (SVM). According to experimental findings, the proposed model performed well in terms of detection accuracy, achieving 99.92% through SVM with One Hot Encoding and Principal Component Analysis (PCA).
Abstract: As informationization of the Resources & Environment Remote Sensing geological survey deepens, several potential problems and deficiencies have emerged: (1) the lack of a uniformly planned running environment; (2) inconsistent methods of data integration; and (3) the disadvantages of differing ways of performing data integration. This paper solves the above problems through overall planning and design, and constructs a unified running environment, consistent methods of data integration, and a system structure in order to advance the informationization
Abstract: Nowadays, numerous applications are associated with the cloud, and user data is collected globally and stored in cloud units. In addition to shared data storage, cloud computing offers multiple advantages for the user through different distribution designs such as hybrid cloud, public cloud, community cloud and private cloud. Though cloud-based computing solutions are highly convenient to users, they also bring a challenge, i.e., the security of the data shared. Hence, in the current research paper, a blockchain with data integrity authentication technique is developed for efficient and secure operation with a user authentication process. Blockchain technology is utilized in this study to enable efficient and secure operation, which not only strengthens cloud security but also avoids threats and attacks. Additionally, the data integrity authentication technique is utilized to limit unwanted access to data in the cloud storage unit. The major objective of the proposed technique is to strengthen data security and user authentication in the cloud computing environment. To improve the proposed authentication process, a cuckoo filter and a Merkle Hash Tree (MHT) are utilized. The proposed methodology was validated using a few performance metrics such as processing time, uploading time, downloading time, authentication time, consensus time, waiting time and initialization time, in addition to storage overhead. The proposed method was compared with conventional cloud security techniques, and the outcomes establish the superiority of the proposed method.
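As a rough illustration of how a Merkle Hash Tree supports integrity checks on stored data, the sketch below builds a tree over data blocks and verifies one block against the root. It is a generic textbook construction, not the scheme from the paper: the choice of SHA-256, the rule of duplicating the last node on odd-sized levels, and the names `merkle_root`, `merkle_proof` and `verify_block` are all assumptions made for this example.

```python
import hashlib


def _h(data: bytes) -> bytes:
    """SHA-256 digest (the hash function is an assumption of this sketch)."""
    return hashlib.sha256(data).digest()


def _next_level(level):
    """Hash adjacent pairs; an odd-sized level repeats its last node."""
    if len(level) % 2:
        level = level + [level[-1]]
    return [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(blocks):
    """Return the Merkle root of a non-empty list of byte blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    level = [_h(b) for b in blocks]
    while len(level) > 1:
        level = _next_level(level)
    return level[0]


def merkle_proof(blocks, index):
    """Return the sibling path for blocks[index] as (hash, sibling_is_left)."""
    if not 0 <= index < len(blocks):
        raise IndexError("block index out of range")
    level = [_h(b) for b in blocks]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1  # neighbour within the same pair
        proof.append((level[sibling], sibling < index))
        level = _next_level(level)
        index //= 2
    return proof


def verify_block(block, proof, root):
    """Recompute the root from one block and its sibling path."""
    node = _h(block)
    for sibling, sibling_is_left in proof:
        node = _h(sibling + node) if sibling_is_left else _h(node + sibling)
    return node == root
```

In this sketch a data owner would keep only `merkle_root(blocks)`; the storage side returns a block together with `merkle_proof(blocks, i)`, and `verify_block` detects any modified block because the recomputed root no longer matches. A production design would also need to separate leaf and inner-node hashing, which is omitted here.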