Funding: Supported by the National Natural Science Foundation of China, No. 81471308 (to JL); the Stem Cell Clinical Research Project in China, No. CMR-20161129-1003 (to JL); and the Innovation Technology Funding of Dalian in China, No. 2018J11CY025 (to JL).
Abstract: Neural stem cells, which are capable of multi-potential differentiation and self-renewal, have recently been shown to have clinical potential for repairing central nervous system tissue damage. However, the theme trends and knowledge structures for human neural stem cells have not yet been studied bibliometrically. In this study, we retrieved 2742 articles from the PubMed database from 2013 to 2018 using "Neural Stem Cells" as the search term. Co-word analysis was conducted to statistically quantify the characteristics and popular themes of human neural stem cell-related studies. Bibliographic data matrices were generated with the Bibliographic Item Co-Occurrence Matrix Builder. We identified 78 high-frequency Medical Subject Heading (MeSH) terms. A visual matrix was built with the repeated bisection method in gCLUTO software. A social network analysis graph was generated with Ucinet 6.0 software and GraphPad Prism 5 software. The analyses demonstrated that in the 6-year period, hot topics were clustered into five categories. As suggested by the constructed strategic diagram, studies related to cytology and physiology were well-developed, whereas those related to neural stem cell applications, tissue engineering, metabolism and cell signaling, and neural stem cell pathology and virology remained immature. Neural stem cell therapy for stroke and Parkinson’s disease, the genetics of microRNAs and brain neoplasms, as well as neuroprotective agents, Zika virus, Notch receptor, neural crest and embryonic stem cells were identified as emerging hot spots. These undeveloped themes and popular topics are potential points of focus for new studies on human neural stem cells.
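The co-word step described in this abstract can be illustrated with a minimal sketch. It is not the Bibliographic Item Co-Occurrence Matrix Builder itself: the function name `cooccurrence_matrix`, the `min_freq` threshold, and the toy term sets are assumptions introduced here for illustration only.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_matrix(records, min_freq=2):
    """Build a symmetric term co-occurrence matrix.

    records  : list of sets of index terms, one set per article (hypothetical input)
    min_freq : keep only terms that appear in at least this many articles
    Returns (terms, matrix), where matrix[i][j] counts the articles that
    contain both terms[i] and terms[j]; the diagonal holds term frequencies.
    """
    # Count in how many articles each term appears.
    freq = Counter(term for terms in records for term in set(terms))
    terms = sorted(t for t, n in freq.items() if n >= min_freq)
    index = {t: i for i, t in enumerate(terms)}

    matrix = [[0] * len(terms) for _ in terms]
    for record in records:
        present = sorted(t for t in set(record) if t in index)
        for t in present:
            matrix[index[t]][index[t]] += 1
        for a, b in combinations(present, 2):
            matrix[index[a]][index[b]] += 1
            matrix[index[b]][index[a]] += 1
    return terms, matrix


if __name__ == "__main__":
    toy = [
        {"Neural Stem Cells", "Stroke"},
        {"Neural Stem Cells", "Stroke", "MicroRNAs"},
        {"Neural Stem Cells", "MicroRNAs"},
        {"Zika Virus"},
    ]
    names, m = cooccurrence_matrix(toy, min_freq=2)
    for name, row in zip(names, m):
        print(f"{name:20s} {row}")
```

A matrix of this kind is the usual input to the clustering and network steps that the abstract attributes to gCLUTO and Ucinet; the threshold plays the role of the high-frequency term cut-off.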
Abstract: Analysis of gene expression data can help to find time-lagged co-regulation of gene clusters. However, existing methods solve the problem only when the data are discrete. In this paper, we propose an efficient algorithm to identify time-lagged co-regulated gene clusters based on real-valued data.
Funding: Sponsored by the Natural Science Foundation of Shanghai (No. 20ZR1427800), the New Young Teachers Launch Program of Shanghai Jiaotong University, China (No. 20X100040036), the National Natural Science Foundation of China (No. 61971273), and the Development Program in Shaanxi Province of China (No. 2021GY-032).
Abstract: Commercial aircraft crews have trended from five-person crews to dual-pilot crews. Arising from both technological and market demands, Single Pilot Operations (SPO) is considered an important development direction in modern aviation technology. In this paper, starting from Dual-Pilot Operations (DPO), the piloting process, decision-making process and decision-making mode of DPO for commercial aircraft are studied to obtain the operational requirements of SPO. Then, based on the above analysis, the operational mechanism of SPO is studied and the core technology of the SPO mode is proposed. Next, a new closed frequent bicluster mining algorithm named FsCluster is proposed for the optimization of the SPO model, and another efficient bicluster mining algorithm named TsCluster is proposed for the analysis and verification of the SPO model. Finally, a typical flight phase scenario is modelled with Magic System of System and combined with the proposed algorithms for analysis and verification to determine whether the SPO mode can meet the DPO requirements.
Funding: This work was supported by the Ningbo Natural Science Foundation (No. 202003N4057) and the National Natural Science Foundation of China (Nos. 62172336 and 62032018).
Abstract: The massive growth of online commercial data has created demand for automatic recommender systems that benefit both users and merchants. One of the most frequently used recommendation methods is collaborative filtering, but its accuracy is limited by the sparsity of the rating dataset. Most existing collaborative filtering methods consider all features when calculating user/item similarity and ignore much local information. In collaborative filtering, selecting neighbors and determining users’ similarities are the most important parts. For the selection of better neighbors, this study proposes a novel biclustering method based on modified fuzzy adaptive resonance theory. To reflect the similarity between users, a new measure that considers the effect of the number of users’ common items is proposed. Specifically, the proposed biclustering method is first adopted to obtain local similarity and local prediction. Second, item-based collaborative filtering is used to generate global predictions. Finally, the two resultant predictions are fused to obtain a final one. Experimental results demonstrate that the proposed method outperforms state-of-the-art models in several respects on three benchmark datasets.
Funding: This work was supported by the National Natural Science Foundation of China under Grant No. 62072210 and the Project of the Development and Reform Commission of Jilin Province of China under Grant No. 2019C053-6.
Abstract: Unlike traditional clustering analysis, a biclustering algorithm works simultaneously on the two dimensions of samples (rows) and variables (columns). In recent years, biclustering methods have developed rapidly and have been widely applied in biological data analysis, text clustering, recommendation systems and other fields. Traditional clustering algorithms are not well suited to processing high-dimensional and/or large-scale data. At present, most biclustering algorithms are designed for differentially expressed big biological data. However, there is little discussion of bicluster mining on binary data such as miRNA-targeted gene data. Here, we propose a novel biclustering method for miRNA-targeted gene data based on a graph autoencoder, named GAEBic. GAEBic applies a graph autoencoder to capture the similarity of sample sets or variable sets, and adopts a new irregular clustering strategy to mine biclusters with excellent generalization. Based on the miRNA-targeted gene data of soybean, we benchmark several different types of biclustering algorithm and find that GAEBic performs better than Bimax, Bibit and the Spectral Biclustering algorithm in terms of target gene enrichment. This biclustering method achieves comparable performance on the high-throughput miRNA data of soybean, and it can also be used for other species.
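To make the notion of a bicluster in binary data concrete, the sketch below scores a candidate submatrix by its fraction of ones. This is a generic illustration and not the GAEBic method: the function name `bicluster_density`, the toy matrix, and the use of density as a quality score are assumptions made here.

```python
def bicluster_density(matrix, rows, cols):
    """Return the fraction of ones in the submatrix given by rows x cols.

    matrix : list of lists holding 0/1 values (e.g. miRNA-by-gene targeting)
    rows   : row indices of the candidate bicluster
    cols   : column indices of the candidate bicluster
    A value of 1.0 means every selected row targets every selected column.
    """
    cells = [matrix[r][c] for r in rows for c in cols]
    if not cells:
        raise ValueError("a bicluster needs at least one row and one column")
    return sum(cells) / len(cells)


if __name__ == "__main__":
    # Hypothetical 3 x 3 binary matrix: rows are miRNAs, columns are genes.
    toy = [
        [1, 1, 0],
        [1, 1, 0],
        [0, 1, 1],
    ]
    print(bicluster_density(toy, rows=[0, 1], cols=[0, 1]))
    print(bicluster_density(toy, rows=[0, 1, 2], cols=[0, 1]))
```

Selecting rows and columns jointly is what separates this from ordinary clustering: the first call picks a subset of both dimensions that is entirely ones, while adding the third row dilutes the block.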
Funding: Project (No. 2010-0016800) supported by the Basic Science Research Program through the National Research Foundation (NRF) funded by the Ministry of Education, Science and Technology, Korea.
Abstract: Biclustering is a method of grouping objects and attributes simultaneously in order to find multiple hidden patterns. When dealing with a long time series, there is a low possibility of finding meaningful clusters over the whole time sequence. However, by applying a biclustering method, we may find more significant clusters that contain part of the time sequence. This paper proposes a new biclustering algorithm for time series data following an autoregressive moving average (ARMA) model. We assumed the plaid model but modified the algorithm to incorporate the sequential nature of time series data. The maximum likelihood estimation (MLE) method was used to estimate the ARMA coefficients in each bicluster. We applied the proposed method to several synthetic datasets generated from different ARMA orders. Results from the experiments showed that the proposed method compares favorably with other biclustering methods for time series data.
Abstract: Genome-wide association studies have proven to be a powerful approach for identifying risk loci. However, the molecular regulatory mechanisms of complex diseases are still not clearly understood. It is therefore important to consider the interplay between genetic factors and biological networks in elucidating the mechanisms of complex disease pathogenesis. In this paper, we first conducted a genome-wide association analysis using the SNP genotype data and phenotype data provided by Genetic Analysis Workshop 17, in order to filter significant SNPs associated with the diseases. Second, we conducted a bioinformatics analysis of the gene-phenotype association matrix to identify gene modules (biclusters). Third, we performed a KEGG enrichment test of the genes involved in the biclusters to find evidence supporting their functional consensus. This method can be used to better understand complex diseases.
Funding: Supported by the China Scholarship Council; the Guangdong Science and Technology Department under Grant Nos. 2016A010101020, 2016A010101021 and 2016A010101022; and the Guangzhou Science and Information Bureau under Grant No. 201802010033.
Abstract: Microarrays contain a large matrix of information and have been widely used by biologists and biological data scientists to monitor combinations of genes in different organisms. This study mines the coherent patterns in all continuous columns of gene microarray data matrices by developing a time series similarity measure for such patterns, as well as an evaluation function for verifying the proposed algorithm and the corresponding biclusters. Continuous time changes are taken into account in the coherent patterns, and co-expression patterns in time series are searched. In order to use all the common information between sequences, a similarity measure for the coherent patterns in continuous columns is defined. To validate the ability of the similarity measure to mine biological information at continuous time points, an evaluation function is defined to measure biclusters, and an effective algorithm is proposed to mine them. Simulation experiments on synthetic datasets and real gene microarray datasets are conducted to verify the biological significance of the biclusters. The performance of the algorithm is analyzed, and the results show that the algorithm is highly efficient.
Funding: Supported by the National Program on Key Basic Research Project (2014CB744903), the National Natural Science Foundation of China (61673270), the Natural Science Foundation of Shanghai (20ZR1427800), the New Young Teachers Launch Program of Shanghai Jiaotong University (20X100040036), the Shanghai Pujiang Program (16PJD028), the Shanghai Industrial Strengthening Project (GYQJ-2017-5-08), the Shanghai Science and Technology Committee Research Project (17DZ1204304), and the Shanghai Engineering Research Center of Civil Aircraft Flight Testing.
Abstract: With the continuous advancement of avionics systems, the number of crew members has been correspondingly reduced, and Single Pilot Operations (SPO) has attracted widespread attention from scholars. To meet the flight requirements in SPO mode, it is necessary to further strengthen air-ground coordination system integration, but at the same time, resource integration, function fusion, and task synthesis will cause some safety issues. Aimed at the safety problems caused by task synthesis, an efficient differential bicluster mining algorithm, the DFCluster algorithm, is proposed in this paper to discover potential hazardous elements or propagation mechanisms by mining the resource-function matrices. To mine efficiently, several pruning techniques are designed for generating maximal biclusters without candidate maintenance. The experimental results show that the DFCluster algorithm is more efficient than the existing differential biclustering algorithms on artificial datasets and public datasets of different scales. Then, a typical flight scenario is designed based on the SPO air-ground collaborative system architecture and combined with the proposed DFCluster algorithm for task synthesis safety analysis. Based on the mining results, the SPO air-ground collaborative system architecture is modified, which ultimately improves the safety of the SPO system.
Funding: This work was supported by the National Key Research & Development Program of China (2017YFB0202203), the National Natural Science Foundation of China (61472354 and 61672452), and the NSFC-Guangdong Joint Fund (U1611263).
Abstract: In complex multivariate data sets, different features usually include diverse associations with different variables, and different variables are associated within different regions. Therefore, exploring the associations between variables and voxels locally becomes necessary to better understand the underlying phenomena. In this paper, we propose a co-analysis framework based on biclusters, which are two subsets of variables and voxels with close scalar-value relationships, to guide the process of visually exploring multivariate data. We first automatically extract all meaningful biclusters, each of which contains only voxels with a similar scalar-value pattern over a subset of variables. These biclusters are organized according to their variable sets, and the biclusters in each variable set are further grouped by a similarity metric to reduce redundancy and support diversity during visual exploration. Biclusters are visually represented in coordinated views to facilitate interactive exploration of multivariate data from the similarity between biclusters and the correlation of scalar values with different variables. Experiments on several representative multivariate scientific data sets demonstrate the effectiveness of our framework in exploring local relationships among variables, biclusters and scalar values in the data.