Offline reinforcement learning (ORL) aims to learn a rational agent purely from behavior data, without any online interaction. One of the major challenges in ORL is distribution shift, i.e., the mismatch between the knowledge of the learned policy and the reality of the underlying environment. Recent works usually handle this in an overly pessimistic manner, avoiding out-of-distribution (OOD) queries as much as possible, but this can hurt the robustness of the agents at unseen states. In this paper, we propose a simple but effective method to address this issue. The key idea is to enhance the robustness of the new policy learned offline by weakening its confidence in highly uncertain regions. We propose to find those regions by simulating them with modified generative adversarial nets (GANs), such that the generated data not only follow the same distribution as the old experience but are also very difficult to deal with, with regard to the behavior policy or some other reference policy. We then use this information to regularize the ORL algorithm, penalizing overconfident behavior in these regions. Extensive experiments on several publicly available offline RL benchmarks demonstrate the feasibility and effectiveness of the proposed method.
Human limb movement imagery, which can be used in the rehabilitation of limb neural disorders and in brain-controlled external devices, has become a significant control paradigm in the domain of brain-computer interfaces (BCIs). Although numerous pioneering studies have been devoted to motor imagery classification based on electroencephalography (EEG) signals, their performance is somewhat limited by insufficient analysis of the key effective frequency bands of EEG signals. In this paper, we propose a model of multiband decomposition and spectral discriminative analysis for motor imagery classification, called the variational sample-long short-term memory (VS-LSTM) network. Specifically, we first use a channel fusion operator to reduce the number of channels of the raw EEG signal. Then, we use the variational mode decomposition (VMD) model to decompose the EEG signal into six band-limited intrinsic mode functions (BIMFs) for further noise reduction. To select discriminative frequency bands, we calculate the sample entropy (SampEn) value of each frequency band and select the band with the maximum value. Finally, an LSTM model is used to predict the motor imagery class from the frequency band with the largest SampEn value. An open-access public dataset is used to evaluate the effectiveness of the proposed model. In this dataset, 15 subjects performed motor imagery tasks involving elbow flexion/extension, forearm supination/pronation, and hand open/close of the right upper limb. The experimental results show that the average classification result for seven kinds of motor imagery is 76.2% and the average accuracy of binary motor imagery classification (imagery vs. rest) is 96.6%, which outperforms state-of-the-art deep learning-based models. The framework significantly improves the accuracy of motor imagery classification by selecting effective frequency bands. This research is meaningful for BCIs and may inspire end-to-end learning research.
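The band-selection step described in the abstract above (compute SampEn per band, keep the band with the largest value) can be illustrated with a minimal sketch. It is not the authors' implementation: the VMD decomposition is not reproduced, the bands are assumed to be given as plain lists of floats, and the function names `sample_entropy` and `select_band`, the embedding dimension `m`, and the absolute tolerance `r` are illustrative assumptions.

```python
import math


def sample_entropy(signal, m=2, r=0.2):
    """Sample entropy SampEn(m, r) of a 1-D sequence.

    m is the template length and r an absolute tolerance (assumed values).
    Returns float('inf') when no template matches are found.
    """
    n = len(signal)

    def count_matches(length):
        # Use the same n - m starting points for both template lengths.
        templates = [signal[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                # Chebyshev distance between the two templates.
                dist = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if dist <= r:
                    count += 1
        return count

    b = count_matches(m)       # matches of length m
    a = count_matches(m + 1)   # matches of length m + 1
    if a == 0 or b == 0:
        return float("inf")
    return -math.log(a / b)


def select_band(bands, m=2, r=0.2):
    """Index of the band with the largest sample entropy (hypothetical helper)."""
    entropies = [sample_entropy(band, m, r) for band in bands]
    return max(range(len(bands)), key=lambda k: entropies[k])
```

A perfectly regular band yields a SampEn of zero, while an irregular band yields a larger value, so `select_band` favors the less predictable band. In practice the tolerance is often set relative to the standard deviation of the signal; that normalization is left out here for brevity.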
Deep learning serves as a powerful paradigm in many real-world applications; however, its mechanism remains much of a mystery. To gain insights into nonlinear hierarchical deep networks, we theoretically describe the coupled nonlinear learning dynamics of a two-layer neural network with quadratic activations, extending existing results from the linear case. The quadratic activation, although rarely used in practice, shares convexity with the widely used ReLU activation and thus produces similar dynamics. In this work, we focus on a canonical regression problem under the standard normal distribution and use a coupled dynamical system to mimic the gradient descent method in the continuous-time limit; we then use the high-order moment tensor of the normal distribution to simplify these ordinary differential equations. The simplified system yields unexpected fixed points. The existence of these non-global-optimal stable points leads to the existence of saddle points in the loss surface of the quadratic networks. Our analysis shows that there are conserved quantities during the training of the quadratic networks. Such quantities might result in a failed learning process if the network is initialized improperly. Finally, we compare the numerical learning curves with the theoretical one, which reveals two alternately appearing stages of the learning process.
One of the most significant challenges in the neuroscience community is to understand how the human brain works. Recent progress in neuroimaging techniques has validated that it is possible to decode a person's thoughts, memories, and emotions via functional magnetic resonance imaging (fMRI), since it can measure the neural activation of human brains with satisfactory spatiotemporal resolution. However, the unprecedented scale and complexity of fMRI data have created critical computational bottlenecks that require new scientific analytic tools. Given the increasingly important role of machine learning in neuroscience, a great many machine learning algorithms have been presented to analyze brain activities from fMRI data. In this paper, we provide a comprehensive and up-to-date review of machine learning methods for analyzing neural activities from three aspects: brain image functional alignment, brain activity pattern analysis, and visual stimuli reconstruction. In addition, online resources and open research problems on brain pattern analysis are provided for the convenience of future research.
Survival analysis aims to predict the occurrence time of a particular event of interest, which is crucial for the prognosis analysis of diseases. Because of limited study periods and potential loss to follow-up, the observed data inevitably involve some censored instances, which poses a unique challenge that distinguishes survival analysis from general regression problems. In addition, survival analysis suffers from other inherent challenges, such as the high-dimension and small-sample-size problems. To address these challenges, we propose a novel multi-task regression learning model, the prior information guided transductive matrix completion (PigTMC) model, to predict the survival status of new instances. Specifically, we use the multi-label transductive matrix completion framework to leverage the censored instances together with the uncensored instances as training samples, and simultaneously employ a multi-task transductive feature selection scheme to alleviate the overfitting caused by high-dimension and small-sample-size data. In addition, we employ the prior temporal stability of the survival statuses at adjacent time intervals to guide survival analysis. Furthermore, we design an optimization algorithm with guaranteed convergence to solve the proposed PigTMC model. Finally, extensive experiments on real microarray gene expression datasets demonstrate that the proposed model outperforms widely used competing methods.
Self-supervised learning (SSL), a new unsupervised representation learning paradigm in machine learning, has recently received extensive attention and is regarded as the future of machine learning by the Turing Award winner LeCun [1]. SSL learns representations from unlabeled data using "pretext" tasks that provide free supervision, with the aim of performing well on semantic-relation-agnostic downstream (supervised) tasks. It is usually divided into two stages: first, learning representations/features that are as general/invariant as possible with the auto-annotated pretext tasks (the core), and then transferring the learned knowledge to downstream tasks (the ultimate goal) [2].
Single-cell RNA sequencing (scRNA-seq) technology has become an effective tool for high-throughput transcriptomic studies. It circumvents the averaging artifacts of bulk RNA-seq technology, yielding new perspectives on the cellular diversity of potentially superficially homogeneous populations. Although various sequencing techniques have reduced the amplification bias and improved the capture efficiency associated with the low amount of starting material, technical noise and biological variation are inevitably introduced into the experimental process, resulting in high dropout rates that greatly hinder downstream analysis. Considering the bimodal expression pattern and the right-skewed characteristic of normalized scRNA-seq data, we propose a customized autoencoder based on a two-part generalized gamma distribution (AE-TPGG) for scRNA-seq data analysis. It accounts for the mixed discrete-continuous random variables of scRNA-seq data using a two-part model and uses the generalized gamma (GG) distribution to fit the positive, right-skewed continuous data. The adopted autoencoder enables AE-TPGG to capture the inherent relationships between genes. In addition to producing a low-dimensional representation, the AE-TPGG model provides a denoised imputation according to the statistical characteristics of gene expression. Results on real datasets demonstrate that the proposed model is competitive with current imputation methods and improves a diverse set of typical scRNA-seq data analyses.
In the past decade, multimodal neuroimaging and genomic techniques have been increasingly developed. As an interdisciplinary topic, brain imaging genomics is devoted to evaluating and characterizing genetic variants in individuals that influence phenotypic measures derived from structural and functional brain imaging. This technique is capable of revealing, through macroscopic intermediates, the complex mechanisms that lead from the genetic level to cognition and psychiatric disorders in humans. It is well known that machine learning is a powerful tool in data-driven association studies, as it can fully utilize prior knowledge (intercorrelated structure information among imaging and genetic data) for association modelling. In addition, association studies can find associations between risk genes and brain structure or function, which supports a better mechanistic understanding of behaviors or disordered brain functions. In this paper, the related background and fundamental work in imaging genomics are first reviewed. Then, we present the univariate learning approaches for association analysis, summarize the main ideas and modelling in genetic-imaging association studies based on multivariate machine learning, and present methods for joint association analysis and outcome prediction. Finally, this paper discusses some prospects for future work.
Principal component analysis (PCA) is a widely used method for multivariate data analysis that projects the original high-dimensional data onto a low-dimensional subspace with maximum variance. In practice, however, we are more likely to obtain a few compressed sensing (CS) measurements than the complete high-dimensional data, owing to the high cost of data acquisition and storage. In this paper, we propose a novel Bayesian algorithm for learning the solutions of PCA for the original data from these CS measurements alone. To this end, we utilize a generative latent variable model with a structured prior to model both the sparsity of the original data and the effective dimensionality of the latent space. The proposed algorithm enjoys two important advantages: 1) the effective dimensionality of the latent space can be determined automatically, with no need to be pre-specified; 2) the sparsity modeling makes it unnecessary to employ multiple measurement matrices to maintain the original data space, since a single matrix suffices, which makes the algorithm storage efficient. Experimental results on synthetic and real-world datasets show that the proposed algorithm can accurately learn the solutions of PCA for the original data, which can in turn be applied to the reconstruction task with favorable results.
In this paper, we study the partial multi-label (PML) image classification problem, where each image is annotated with a candidate label set consisting of multiple relevant labels and other noisy labels. Existing PML methods typically design a disambiguation strategy that filters out noisy labels by utilizing prior knowledge with extra assumptions, which unfortunately is unavailable in many real tasks. Furthermore, because the objective function for disambiguation is usually elaborately designed on the whole training set, it can hardly be optimized in a deep model with stochastic gradient descent (SGD) on mini-batches. In this paper, for the first time, we propose a deep model for PML to enhance the representation and discrimination ability. On the one hand, we propose a novel curriculum-based disambiguation strategy that progressively identifies ground-truth labels by incorporating the varied difficulties of different classes. On the other hand, consistency regularization is introduced into model training to balance fitting the identified easy labels and exploiting potentially relevant labels. Extensive experimental results on commonly used benchmark datasets show that the proposed method significantly outperforms state-of-the-art (SOTA) methods.
Supervised learning often requires a large number of labeled examples, which becomes a critical bottleneck when manually annotating the class labels is costly. To mitigate this issue, a new framework called pairwise comparison (Pcomp) classification has been proposed, which allows training examples to be only weakly annotated with pairwise comparisons, i.e., which one of two examples is more likely to be positive. The previous study solves Pcomp problems by minimizing the classification error, which may lead to a less robust model because of its sensitivity to the class distribution. In this paper, we propose a robust learning framework for Pcomp data along with a pairwise surrogate loss called Pcomp-AUC. It provides an unbiased estimator that equivalently maximizes AUC without accessing the precise class labels. Theoretically, we prove consistency with respect to AUC and further provide an estimation error bound for the proposed method. Empirical studies on multiple datasets validate the effectiveness of the proposed method.
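To make the two quantities mentioned in the abstract above concrete, the following sketch computes the empirical AUC from known positive and negative scores and a generic pairwise logistic surrogate over score pairs in which the first item is annotated as more likely positive. It does not reproduce the paper's Pcomp-AUC unbiased estimator; the function names and the choice of the logistic surrogate are illustrative assumptions.

```python
import math


def empirical_auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    total = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                total += 1.0
            elif p == n:
                total += 0.5
    return total / (len(pos_scores) * len(neg_scores))


def pairwise_logistic_loss(score_pairs):
    """Mean logistic surrogate over pairs (s, t).

    Each pair holds the model scores of two examples, where the first example
    is annotated as more likely to be positive than the second (assumed input
    layout). The loss shrinks as the margin s - t grows.
    """
    losses = [math.log1p(math.exp(-(s - t))) for s, t in score_pairs]
    return sum(losses) / len(losses)
```

The surrogate replaces the non-differentiable indicator inside the AUC with a smooth penalty on the score margin, which is the general idea behind pairwise AUC optimization.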
Domain adversarial neural network (DANN) methods have been proposed and have attracted much attention recently. In DANNs, a discriminator is trained to discriminate the domain labels of features generated by a generator, whereas the generator attempts to confuse it so that the distributions of the domains are aligned. As a result, DANNs encourage the overall alignment or transfer between domains, while the inter-class discriminative information across domains is not considered. In this paper, we present a Discrimination-Aware Domain Adversarial Neural Network (DA2NN) method that introduces the discriminative information, i.e., the discrepancy of inter-class instances across domains, into deep domain adaptation. DA2NN considers both the alignment within the same class and the separation among different classes across domains in knowledge transfer via multiple discriminators. Empirical results show that DA2NN can achieve better classification performance than the DANN methods.
Deep learning has been explored for airfoil performance prediction in recent years. Compared with expensive CFD simulations and wind tunnel experiments, deep learning models can, with proper means, mitigate such expenses to some extent. Nevertheless, effective training of data-driven deep learning models hinges heavily on the diversity and quantity of the data. In this paper, we present a novel data-augmented generative adversarial network (GAN), daGAN, for rapid and accurate flow field prediction, allowing adaptation to tasks with sparse data. The presented approach consists of two modules: a pre-training module and a fine-tuning module. The pre-training module utilizes a conditional GAN (cGAN) to preliminarily estimate the distribution of the training data. In the fine-tuning module, we propose a novel adversarial architecture with two generators, one of which performs data augmentation so that the complementary data is adequately incorporated to boost the generalization of the model. We use numerical simulation data to verify the generalization of daGAN on airfoils and flow conditions with sparse training data. The results show that daGAN is a promising tool for rapid and accurate evaluation of detailed flow fields without the requirement for large training datasets.
In multi-label learning, it is rather expensive to label instances since they are simultaneously associated with multiple labels. Therefore, active learning, which reduces the labeling cost by actively querying the labels of the most valuable data, becomes particularly important for multi-label learning. A good multi-label active learning algorithm usually consists of two crucial elements: a reasonable criterion to evaluate the gain of querying the label of an instance, and an effective classification model on whose predictions the criterion can be accurately computed. In this paper, we first introduce an effective multi-label classification model that combines label ranking with threshold learning and is trained incrementally to avoid retraining from scratch after every query. Based on this model, we then propose to exploit both uncertainty and diversity in the instance space as well as the label space, and to actively query the instance-label pairs that can improve the classification model the most. Extensive experiments on 20 datasets demonstrate the superiority of the proposed approach over state-of-the-art methods.
1 Introduction and main contributions. Differential transcript usage (DTU) refers to the event that the relative transcript abundance within a gene changes between conditions. To detect DTU, various methods have been proposed, which can be classified into exon-based models and gene-based models. These approaches either cannot estimate the relative transcript abundance or cannot deal properly with the multi-source mapping problem of reads. Besides, few methods currently consider sample-to-sample variability under multiple conditions [1].
Background: Brain networks, which describe the interconnections between brain regions, contain abundant topological information. It is challenging for existing statistical methods (e.g., the t test) to investigate the topological differences of brain networks. Methods: We propose a kernel-based statistical framework for identifying topological differences in brain networks. In our framework, the topological similarities between paired brain networks are measured by graph kernels. The graph kernels are then embedded into the maximum mean discrepancy to calculate a kernel-based test statistic. Based on this test statistic, we adopt conditional Monte Carlo simulation to compute the statistical significance (i.e., P value) and statistical power. We recruited 33 patients with Alzheimer's disease (AD), 33 patients with early mild cognitive impairment (EMCI), 33 patients with late mild cognitive impairment (LMCI), and 33 normal controls (NC) for our experiment. There are no statistical differences in demographic information between the patients and NC. The state-of-the-art statistical methods used for comparison include the t test, t squared test, two-sample permutation test, and non-normal test. Results: We applied the proposed shortest path matched kernel to our framework to investigate the statistical differences of shortest path topological structures in the brain networks of AD and NC. We compared our method with existing state-of-the-art statistical methods on brain network characteristics, including clustering coefficient and functional connection, among EMCI, LMCI, AD, and NC. The results indicate that our framework can capture statistically discriminative shortest path topological structures, such as the shortest path from the right rolandic operculum to the right supplementary motor area (P=0.00314, statistical power=0.803). For clustering coefficient and functional connection, our framework outperforms the state-of-the-art statistical methods, e.g., P=0.0013 and statistical power=0.83 in the analysis of AD and NC. Conclusion: Our proposed kernel-based statistical framework can be used to investigate not only the topological differences of brain networks but also their static characteristics (e.g., clustering coefficient and functional connection).
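The idea of embedding a kernel into the maximum mean discrepancy (MMD) and then assessing significance by resampling can be sketched as follows. This is only an illustration under stated assumptions: a Gaussian kernel on feature vectors stands in for the graph kernels of the abstract, and a plain permutation test stands in for the conditional Monte Carlo simulation. All function names and parameters are hypothetical.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel on vectors (a stand-in for a graph kernel)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd2(xs, ys, kernel=gaussian_kernel):
    """Biased estimate of the squared maximum mean discrepancy."""
    def mean_kernel(group_a, group_b):
        total = sum(kernel(u, v) for u in group_a for v in group_b)
        return total / (len(group_a) * len(group_b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


def permutation_p_value(xs, ys, n_permutations=200, seed=0, kernel=gaussian_kernel):
    """P value of the observed MMD under random relabeling of the pooled samples."""
    rng = random.Random(seed)
    observed = mmd2(xs, ys, kernel)
    pooled = list(xs) + list(ys)
    exceed = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        permuted = mmd2(pooled[:len(xs)], pooled[len(xs):], kernel)
        if permuted >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly positive.
    return (exceed + 1) / (n_permutations + 1)
```

Swapping `gaussian_kernel` for any positive-definite kernel on graphs would keep the rest of the procedure unchanged, which is what makes the MMD formulation convenient for comparing groups of networks.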
Multi-task learning (MTL) aims to improve the performance of a model by transferring and exploiting common knowledge among tasks. Existing MTL works mainly focus on the scenario where the label sets of multiple tasks (MTs) are the same, so that they can be utilized for learning across the tasks. However, the real world presents more general scenarios in which each task has only a small number of training samples and the label sets only partially overlap or do not overlap at all. Learning such MTs is more challenging because less correlation information is available among these tasks. To address this, we propose a framework that learns these tasks by jointly leveraging both the abundant information from a learnt auxiliary big task, which has sufficiently many classes to cover those of all these tasks, and the information shared among the partially overlapped tasks. In our implementation, which uses the neural network architecture of the learnt auxiliary task to learn the individual tasks, the key idea is to utilize the available label information to adaptively prune the hidden-layer neurons of the auxiliary network, constructing a corresponding network for each task, while performing joint learning across the individual tasks. Extensive experimental results demonstrate that the proposed method is highly competitive with state-of-the-art methods.
Multi-task learning (MTL) can boost the performance of individual tasks through mutual learning among multiple related tasks. However, when these tasks have diverse complexities, their corresponding losses in the MTL objective inevitably compete with each other and ultimately bias the learning towards simple tasks rather than complex ones. To address this imbalanced learning problem, we propose a novel MTL method that can equip multiple existing deep MTL model architectures with a sequential cooperative distillation (SCD) module. Specifically, we first introduce an efficient mechanism to measure the similarity between tasks and group similar tasks into the same block, allowing them to learn cooperatively from each other. Based on this, the grouped task blocks are sorted in a queue to determine the learning sequence of the tasks according to their complexities, which are estimated with a defined performance indicator. Finally, distillation between the individual task-specific models and the MTL model is performed block by block, from complex to simple, achieving a balance between competition and cooperation when learning multiple tasks. Extensive experiments demonstrate that our method is significantly more competitive than state-of-the-art methods, ranking first in average performance across multiple datasets, with improvements of 12.95% and 3.72% over OMTL and MTLKD, respectively.
Funding (offline reinforcement learning abstract): Supported by the National Key R&D Program of China under Grant No. 2021ZD0113203 and the National Science Foundation of China under Grant No. 61976115.
Funding (VS-LSTM motor imagery abstract): This work was supported in part by the National Natural Science Foundation of China (Grant Nos. 61876082, 61861130366, 61732006) and the National Key R&D Program of China (2018YFC2001600, 2018YFC2001602).
Funding (quadratic-activation network dynamics abstract): The authors thank the National Natural Science Foundation of China for its support (Grant No. 61672281).
Funding: This work was supported by the National Natural Science Foundation of China (Nos. 61876082, 61861130366, 61732006 and 61902183); the National Key Research and Development Program of China (Nos. 2018YFC2001600, 2018YFC2001602); the Royal Society-Academy of Medical Sciences Newton Advanced Fellowship (No. NAF\R1\180371); and the China Postdoctoral Science Foundation funded project (No. 2019M661831).
Abstract: One of the most significant challenges in the neuroscience community is to understand how the human brain works. Recent progress in neuroimaging techniques has shown that it is possible to decode a person's thoughts, memories, and emotions via functional magnetic resonance imaging (fMRI), since it can measure the neural activation of human brains with satisfactory spatiotemporal resolution. However, the unprecedented scale and complexity of fMRI data have created critical computational bottlenecks that require new scientific analytic tools. Given the increasingly important role of machine learning in neuroscience, a great many machine learning algorithms have been proposed to analyze brain activities from fMRI data. In this paper, we provide a comprehensive and up-to-date review of machine learning methods for analyzing neural activities, covering three aspects: brain image functional alignment, brain activity pattern analysis, and visual stimuli reconstruction. In addition, online resources and open research problems on brain pattern analysis are provided for the convenience of future research.
Funding: This work was supported in part by the National Natural Science Foundation of China (Grant Nos. 61872190, 61772285, 61572263 and 61906098), in part by the Natural Science Foundation of Jiangsu Province (BK20161516), and in part by the Open Fund of MIIT Key Laboratory of Pattern Analysis and Machine Intelligence of NUAA.
Abstract: Survival analysis aims to predict the occurrence time of a particular event of interest, which is crucial for the prognosis analysis of diseases. Because of the limited study period and the potential loss of follow-up, the observed data inevitably involve some censored instances, which poses a unique challenge that distinguishes survival analysis from general regression problems. In addition, survival analysis suffers from other inherent challenges such as the high-dimension and small-sample-size problems. To address these challenges, we propose a novel multi-task regression learning model, the prior information guided transductive matrix completion (PigTMC) model, to predict the survival status of new instances. Specifically, we use the multi-label transductive matrix completion framework to leverage the censored instances together with the uncensored instances as training samples, and simultaneously employ a multi-task transductive feature selection scheme to alleviate the overfitting caused by high-dimension and small-sample-size data. In addition, we employ the prior temporal stability of the survival statuses at adjacent time intervals to guide survival analysis. Furthermore, we design an optimization algorithm with guaranteed convergence to solve the proposed PigTMC model. Finally, extensive experiments on real microarray gene expression datasets demonstrate that the proposed model outperforms previously widely used competing methods.
Funding: This work was supported by the National Natural Science Foundation of China (Grant No. 62076124).
Abstract: Self-supervised learning (SSL), a new unsupervised representation learning paradigm in machine learning, has recently received extensive attention and is regarded as the future of machine learning by the Turing Award winner LeCun [1]. SSL learns representations from unlabeled data using "pretext" tasks that provide free supervision, with the aim of performing well on semantic relation agnostic downstream (supervised) tasks. It usually consists of two stages: first, learning representations/features that are as general/invariant as possible with auto-annotated pretext tasks (the core); then, transferring the learned knowledge to downstream tasks (the ultimate goal) [2].
Funding: This research was supported by the National Natural Science Foundation of China (Grant Nos. 62136004, 61802193); the National Key R&D Program of China (2018YFC2001600, 2018YFC2001602); the Natural Science Foundation of Jiangsu Province (BK20170934); and the Fundamental Research Funds for the Central Universities (NJ2020023).
Abstract: Single-cell RNA sequencing (scRNA-seq) technology has become an effective tool for high-throughput transcriptomic study. It circumvents the averaging artifacts of bulk RNA-seq technology, yielding new perspectives on the cellular diversity of potentially superficially homogeneous populations. Although various sequencing techniques have decreased the amplification bias caused by the low amount of starting material and improved capture efficiency, technical noise and biological variation are inevitably introduced into the experimental process, resulting in high dropout events that greatly hinder downstream analysis. Considering the bimodal expression pattern and the right-skewed characteristic present in normalized scRNA-seq data, we propose a customized autoencoder based on a two-part generalized gamma distribution (AE-TPGG) for scRNA-seq data analysis. It takes the mixed discrete-continuous random variables of scRNA-seq data into account using a two-part model and uses the generalized gamma (GG) distribution to fit the positive and right-skewed continuous data. The adopted autoencoder enables AE-TPGG to capture the inherent relationships between genes. In addition to achieving a low-dimensional representation, the AE-TPGG model provides a denoised imputation according to the statistical characteristics of gene expression. Results on real datasets demonstrate that the proposed model is competitive with current imputation methods and improves a diverse set of typical scRNA-seq data analyses.
Funding: Supported by the National Natural Science Foundation of China (Nos. 62106104, 62136004, 61902183, 61876082, 61861130366 and 61732006); the Project funded by China Postdoctoral Science Foundation (No. 2022T150320); and the National Key Research and Development Program of China (Nos. 2018YFC2001600 and 2018YFC2001602).
Abstract: In the past decade, multimodal neuroimaging and genomic techniques have been increasingly developed. As an interdisciplinary topic, brain imaging genomics is devoted to evaluating and characterizing genetic variants in individuals that influence phenotypic measures derived from structural and functional brain imaging. This technique is capable of revealing the complex mechanisms, via macroscopic intermediates, that lead from the genetic level to cognition and psychiatric disorders in humans. It is well known that machine learning is a powerful tool in data-driven association studies, as it can fully utilize prior knowledge (intercorrelated structure information among imaging and genetic data) for association modelling. In addition, association studies can find associations between risk genes and brain structure or function, enabling a better mechanistic understanding of behaviors or disordered brain functions. In this paper, the related background and fundamental work in imaging genomics are first reviewed. Then, we show the univariate learning approaches for association analysis, summarize the main ideas and modelling in genetic-imaging association studies based on multivariate machine learning, and present methods for joint association analysis and outcome prediction. Finally, this paper discusses some prospects for future work.
Funding: This work was supported by the Key Program of the National Natural Science Foundation of China (NSFC) (Grant No. 61732006).
Abstract: Principal component analysis (PCA) is a widely used method for multivariate data analysis that projects the original high-dimensional data onto a low-dimensional subspace with maximum variance. In practice, however, we are more likely to obtain a few compressed sensing (CS) measurements than the complete high-dimensional data, owing to the high cost of data acquisition and storage. In this paper, we propose a novel Bayesian algorithm for learning the solutions of PCA for the original data from these CS measurements alone. To this end, we utilize a generative latent variable model incorporated with a structure prior to model both the sparsity of the original data and the effective dimensionality of the latent space. The proposed algorithm enjoys two important advantages: 1) the effective dimensionality of the latent space can be determined automatically, with no need to pre-specify it; 2) the sparsity modeling makes it unnecessary to employ multiple measurement matrices to maintain the original data space, since a single one suffices, which is storage efficient. Experimental results on synthetic and real-world datasets show that the proposed algorithm can accurately learn the solutions of PCA for the original data, which can in turn be applied to the reconstruction task with favorable results.
Abstract: In this paper, we study the partial multi-label (PML) image classification problem, where each image is annotated with a candidate label set consisting of multiple relevant labels and other noisy labels. Existing PML methods typically design a disambiguation strategy that filters out noisy labels by utilizing prior knowledge with extra assumptions, which unfortunately is unavailable in many real tasks. Furthermore, because the objective function for disambiguation is usually elaborately designed on the whole training set, it can hardly be optimized in a deep model with stochastic gradient descent (SGD) on mini-batches. In this paper, for the first time, we propose a deep model for PML to enhance the representation and discrimination ability. On the one hand, we propose a novel curriculum-based disambiguation strategy that progressively identifies ground-truth labels by incorporating the varied difficulties of different classes. On the other hand, consistency regularization is introduced for model training to balance fitting the identified easy labels and exploiting potentially relevant labels. Extensive experimental results on commonly used benchmark datasets show that the proposed method significantly outperforms state-of-the-art methods.
Funding: Natural Science Foundation of Jiangsu Province, China (BK20222012, BK20211517); National Key R&D Program of China (2020AAA0107000); National Natural Science Foundation of China (Grant No. 62222605).
Abstract: Supervised learning often requires a large number of labeled examples, which becomes a critical bottleneck when manually annotating class labels is costly. To mitigate this issue, a new framework called pairwise comparison (Pcomp) classification has been proposed, which allows training examples to be only weakly annotated with pairwise comparisons, i.e., which one of two examples is more likely to be positive. The previous study solves Pcomp problems by minimizing the classification error, which may lead to a less robust model because of its sensitivity to the class distribution. In this paper, we propose a robust learning framework for Pcomp data along with a pairwise surrogate loss called Pcomp-AUC. It provides an unbiased estimator to equivalently maximize AUC without accessing the precise class labels. Theoretically, we prove consistency with respect to AUC and further provide an estimation error bound for the proposed method. Empirical studies on multiple datasets validate the effectiveness of the proposed method.
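For orientation, the quantity that the abstract above aims to maximize, AUC, is itself a pairwise statistic: the probability that a positive example is scored above a negative one. The sketch below computes that empirical AUC when labels are known. It is only an illustration of the target metric, not the Pcomp-AUC loss, which by the abstract's description avoids precise labels; the function name `empirical_auc` is hypothetical.

```python
import numpy as np


def empirical_auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = np.asarray(pos_scores, dtype=float)[:, None]  # shape (P, 1)
    neg = np.asarray(neg_scores, dtype=float)[None, :]  # shape (1, N)
    correct = (pos > neg).sum()   # positive scored strictly higher
    ties = (pos == neg).sum()     # equal scores get half credit
    return float((correct + 0.5 * ties) / (pos.size * neg.size))
```

For example, with positive scores `[0.9, 0.8]` and negative scores `[0.1, 0.85]`, three of the four pairs are ordered correctly, so the function returns 0.75.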
Funding: The work was supported by the National Natural Science Foundation of China under Grant Nos. 61876091 and 61772284; the China Postdoctoral Science Foundation under Grant No. 2019M651918; and the Open Foundation of Key Laboratory of Pattern Analysis and Machine Intelligence of Ministry of Industry and Information Technology of China.
Abstract: Domain adversarial neural network (DANN) methods have been proposed and have attracted much attention recently. In DANNs, a discriminator is trained to discriminate the domain labels of features generated by a generator, whereas the generator attempts to confuse it so that the distributions of the domains are aligned. As a result, DANNs encourage alignment or transfer between domains as a whole, while the inter-class discriminative information across domains is not considered. In this paper, we present a Discrimination-Aware Domain Adversarial Neural Network (DA2NN) method that introduces the discriminative information, i.e., the discrepancy of inter-class instances across domains, into deep domain adaptation. DA2NN considers both the alignment within the same class and the separation among different classes across domains in knowledge transfer via multiple discriminators. Empirical results show that DA2NN achieves better classification performance than the DANN methods.
Funding: Supported by the funding of the Key Laboratory of Aerodynamic Noise Control (No. ANCL20190103); the State Key Laboratory of Aerodynamics, China (No. SKLA20180102); the Aeronautical Science Foundation of China (Nos. 2018ZA52002, 2019ZA052011); and the Priority Academic Program Development of Jiangsu Higher Education Institutions, China (PAPD).
Abstract: Deep learning has been explored for airfoil performance prediction in recent years. Compared with expensive CFD simulations and wind tunnel experiments, deep learning models can be leveraged to mitigate such expenses to some extent. Nevertheless, effective training of data-driven deep learning models hinges heavily on the diversity and quantity of the data. In this paper, we present a novel data-augmented Generative Adversarial Network (GAN), daGAN, for rapid and accurate flow field prediction, allowing adaptation to tasks with sparse data. The presented approach consists of two modules: a pre-training module and a fine-tuning module. The pre-training module utilizes a conditional GAN (cGAN) to preliminarily estimate the distribution of the training data. In the fine-tuning module, we propose a novel adversarial architecture with two generators, one of which performs data augmentation, so that the complementary data is adequately incorporated to boost the generalization of the model. We use numerical simulation data to verify the generalization of daGAN on airfoils and flow conditions with sparse training data. The results show that daGAN is a promising tool for rapid and accurate evaluation of detailed flow fields without requiring large amounts of training data.
Funding: This research was supported by the National Natural Science Foundation of China under Grant No. 61906089; the Aerospace Power Funds of China under Grant No. 6141B09050342; the Fundamental Research Funds for the Central Universities of China under Grant No. NE2019104; and the Jiangsu Foundation under Grant No. BK20190408.
Abstract: In multi-label learning, it is rather expensive to label instances since they are simultaneously associated with multiple labels. Therefore, active learning, which reduces the labeling cost by actively querying the labels of the most valuable data, becomes particularly important for multi-label learning. A good multi-label active learning algorithm usually consists of two crucial elements: a reasonable criterion to evaluate the gain of querying the label for an instance, and an effective classification model on whose predictions the criterion can be accurately computed. In this paper, we first introduce an effective multi-label classification model that combines label ranking with threshold learning and is trained incrementally to avoid retraining from scratch after every query. Based on this model, we then propose to exploit both uncertainty and diversity in the instance space as well as the label space, and to actively query the instance-label pairs that can improve the classification model most. Extensive experiments on 20 datasets demonstrate the superiority of the proposed approach over state-of-the-art methods.
Funding: Supported by the National Key R&D Program of China (2018YFC2001600, 2018YFC2001602).
Abstract: 1 Introduction and main contributions. Differential transcript usage (DTU) refers to the event that the relative transcript abundance within a gene changes between conditions. To detect DTU, various methods have been proposed, which can be classified into exon-based models and gene-based models. These approaches either cannot estimate the relative transcript abundance or cannot deal properly with the multi-source mapping problems of reads. Besides, few methods currently consider sample-to-sample variability under multiple conditions [1].
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 61876082, 61732006, and 61861130366); the National Key R&D Program of China (Grant Nos. 2018YFC2001600, 2018YFC2001602, and 2018ZX10201002); and the Royal Society-Academy of Medical Sciences Newton Advanced Fellowship (Grant No. NAF\R1\180371).
Abstract: Background: A brain network, which describes the interconnections between brain regions, contains abundant topological information. It is a challenge for existing statistical methods (e.g., the t test) to investigate the topological differences of brain networks. Methods: We proposed a kernel-based statistical framework for identifying topological differences in brain networks. In our framework, the topological similarities between paired brain networks were measured by graph kernels. Then, the graph kernels were embedded into the maximum mean discrepancy to calculate a kernel-based test statistic. Based on this test statistic, we adopted conditional Monte Carlo simulation to compute the statistical significance (i.e., P value) and statistical power. We recruited 33 patients with Alzheimer's disease (AD), 33 patients with early mild cognitive impairment (EMCI), 33 patients with late mild cognitive impairment (LMCI), and 33 normal controls (NC) in our experiment. There were no statistical differences in demographic information between patients and NC. The compared state-of-the-art statistical methods include the t test, t squared test, two-sample permutation test, and non-normal test. Results: We applied the proposed shortest path matched kernel to our framework to investigate the statistical differences of shortest path topological structures in the brain networks of AD and NC. We compared our method with the existing state-of-the-art statistical methods on brain network characteristics, including clustering coefficient and functional connection, among EMCI, LMCI, AD, and NC. The results indicate that our framework can capture statistically discriminative shortest path topological structures, such as the shortest path from the right rolandic operculum to the right supplementary motor area (P=0.00314, statistical power=0.803). For clustering coefficient and functional connection, our framework outperforms the state-of-the-art statistical methods, with, for example, P=0.0013 and statistical power=0.83 in the analysis of AD and NC. Conclusion: Our proposed kernel-based statistical framework can be used to investigate not only the topological differences of brain networks but also their static characteristics (e.g., clustering coefficient and functional connection).
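The kernel two-sample idea in the abstract above can be sketched as follows: given a precomputed kernel matrix over all subjects, compute a maximum mean discrepancy (MMD) statistic between two groups and estimate its significance by resampling. This is a rough sketch under stated assumptions, not the authors' procedure: it uses the biased MMD² estimate and a plain label-permutation test as a stand-in for the conditional Monte Carlo simulation, and the function names `mmd2` and `permutation_p_value` are hypothetical. In the paper's setting the kernel matrix would come from a graph kernel on brain networks.

```python
import numpy as np


def mmd2(K, idx_x, idx_y):
    """Biased squared-MMD estimate from a precomputed kernel matrix K."""
    kxx = K[np.ix_(idx_x, idx_x)].mean()  # within-group similarity, group X
    kyy = K[np.ix_(idx_y, idx_y)].mean()  # within-group similarity, group Y
    kxy = K[np.ix_(idx_x, idx_y)].mean()  # between-group similarity
    return float(kxx + kyy - 2.0 * kxy)


def permutation_p_value(K, n_x, n_perm=1000, seed=0):
    """Observed MMD^2 and a permutation p-value; first n_x rows form group X."""
    rng = np.random.default_rng(seed)
    n = K.shape[0]
    idx = np.arange(n)
    observed = mmd2(K, idx[:n_x], idx[n_x:])
    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(n)  # shuffle group membership
        if mmd2(K, perm[:n_x], perm[n_x:]) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)
```

Working from a precomputed kernel matrix keeps the test independent of how similarity is defined, so any positive semidefinite kernel, including a graph kernel, can be plugged in.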
Funding: Supported by the NSFC (Grant No. 61672281) and the Key Program of NSFC (No. 61732006).
Abstract: Multi-task learning (MTL) aims to improve the performance of a model by transferring and exploiting common knowledge among tasks. Existing MTL works mainly focus on the scenario where the label sets of the multiple tasks (MTs) are usually the same, so that they can be utilized for learning across the tasks. However, the real world has more general scenarios in which each task has only a small number of training samples and the label sets overlap only partially or not at all. Learning such MTs is more challenging because less correlation information is available among these tasks. To this end, we propose a framework that learns these tasks by jointly leveraging both the abundant information from a learnt auxiliary big task, which has sufficiently many classes to cover those of all these tasks, and the information shared among the partially overlapped tasks. In our implementation, which uses the same neural network architecture as the learnt auxiliary task to learn individual tasks, the key idea is to utilize the available label information to adaptively prune the hidden-layer neurons of the auxiliary network to construct a corresponding network for each task, accompanied by joint learning across individual tasks. Extensive experimental results demonstrate that the proposed method is highly competitive with state-of-the-art methods.
Funding: Supported by the National Science and Technology Major Project of China under Grant No. J2019-IV-0018-0086 and the National Natural Science Foundation of China under Grant No. 62076124.
Abstract: Multi-task learning (MTL) can boost the performance of individual tasks through mutual learning among multiple related tasks. However, when these tasks have diverse complexities, their corresponding losses in the MTL objective inevitably compete with each other and ultimately bias the learning towards simple tasks rather than complex ones. To address this imbalanced learning problem, we propose a novel MTL method that can equip multiple existing deep MTL model architectures with a sequential cooperative distillation (SCD) module. Specifically, we first introduce an efficient mechanism to measure the similarity between tasks and group similar tasks into the same block so that they can learn cooperatively from each other. Based on this, the grouped task blocks are sorted in a queue to determine the learning sequence of the tasks according to their complexities, estimated with the defined performance indicator. Finally, distillation between the individual task-specific models and the MTL model is performed block by block, from complex to simple, achieving a balance between competition and cooperation when learning multiple tasks. Extensive experiments demonstrate that our method is significantly more competitive than state-of-the-art methods, ranking No. 1 in average performance across multiple datasets with improvements of 12.95% and 3.72% over OMTL and MTLKD, respectively.