18 articles found
Boundary Data Augmentation for Offline Reinforcement Learning
1
Authors: SHEN Jiahao, JIANG Ke, TAN Xiaoyang. ZTE Communications, 2023, Issue 3, pp. 29-36 (8 pages).
Offline reinforcement learning (ORL) aims to learn a rational agent purely from behavior data without any online interaction. One of the major challenges encountered in ORL is the problem of distribution shift, i.e., the mismatch between the knowledge of the learned policy and the reality of the underlying environment. Recent works usually handle this in an overly pessimistic manner, avoiding out-of-distribution (OOD) queries as much as possible, but this can harm the robustness of the agents at unseen states. In this paper, we propose a simple but effective method to address this issue. The key idea of our method is to enhance the robustness of the new policy learned offline by weakening its confidence in highly uncertain regions. We propose to find those regions by simulating them with modified Generative Adversarial Nets (GAN), such that the generated data not only follow the same distribution as the old experience but are also very difficult to deal with, with regard to the behavior policy or some other reference policy. We then use this information to regularize the ORL algorithm to penalize overconfident behavior in these regions. Extensive experiments on several publicly available offline RL benchmarks demonstrate the feasibility and effectiveness of the proposed method.
Keywords: offline reinforcement learning; out-of-distribution state; robustness; uncertainty
Multiband decomposition and spectral discriminative analysis for motor imagery BCI via deep neural network (cited by: 1)
2
Authors: Pengpai WANG, Mingliang WANG, Yueying ZHOU, Ziming XU, Daoqiang ZHANG. Frontiers of Computer Science (SCIE, EI, CSCD), 2022, Issue 5, pp. 71-83 (13 pages).
Human limb movement imagery, which can be used in the rehabilitation of limb neural disorders and in brain-controlled external devices, has become a significant control paradigm in the domain of brain-computer interface (BCI). Although numerous pioneering studies have been devoted to motor imagery classification based on electroencephalography (EEG) signals, their performance is somewhat limited due to insufficient analysis of the key effective frequency bands of EEG signals. In this paper, we propose a model of multiband decomposition and spectral discriminative analysis for motor imagery classification, called the variational sample-long short term memory (VS-LSTM) network. Specifically, we first use a channel fusion operator to reduce the signal channels of the raw EEG signal. Then, we use the variational mode decomposition (VMD) model to decompose the EEG signal into six band-limited intrinsic mode functions (BIMFs) for further signal noise reduction. In order to select discriminative frequency bands, we calculate the sample entropy (SampEn) value of each frequency band and select the band with the maximum value. Finally, an LSTM model is used to predict the motor imagery class from the frequency band with the largest SampEn value. An open-access public dataset is used to evaluate the effectiveness of the proposed model. In the data, 15 subjects performed motor imagery tasks with elbow flexion/extension, forearm supination/pronation and hand open/close of the right upper limb. The experimental results show that the average classification accuracy over seven kinds of motor imagery is 76.2% and the average accuracy of binary motor imagery classification (imagery vs. rest) is 96.6%, which outperforms state-of-the-art deep learning-based models. This framework significantly improves the accuracy of motor imagery classification by selecting effective frequency bands. This research is meaningful for BCIs and may inspire end-to-end learning research.
Keywords: brain computer interface; EEG; long short-term memory; VMD; sample entropy; motor imagery
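The band-selection step described in this abstract (compute SampEn per band, keep the band with the largest value) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `sample_entropy` and `select_band`, the template length `m=2`, and the tolerance `r=0.2` times the standard deviation are assumptions (common textbook defaults), and the bands are assumed to be already available as plain lists, since the VMD decomposition itself is not shown.

```python
import math


def sample_entropy(signal, m=2, r=0.2):
    """Sample entropy of a 1-D sequence (assumed defaults: m=2, tolerance r * std)."""
    n = len(signal)
    mean = sum(signal) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in signal) / n)
    tol = r * std

    def count_matches(length):
        # Use the same n - m starting points for both template lengths.
        templates = [signal[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                # Chebyshev distance between two templates.
                dist = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if dist <= tol:
                    count += 1
        return count

    b = count_matches(m)      # matching pairs of length m
    a = count_matches(m + 1)  # matching pairs of length m + 1
    if a == 0 or b == 0:
        return float("inf")   # no matches: entropy is undefined, reported as infinite
    return -math.log(a / b)


def select_band(bands):
    """Return the index of the band with the largest SampEn, plus all scores."""
    scores = [sample_entropy(band) for band in bands]
    best = max(range(len(bands)), key=lambda i: scores[i])
    return best, scores
```

A perfectly periodic band scores zero under this definition, while less predictable bands score higher, which is why the maximum is taken as the most informative band in this sketch.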
On the learning dynamics of two-layer quadratic neural networks for understanding deep learning
3
Authors: Zhenghao TAN, Songcan CHEN. Frontiers of Computer Science (SCIE, EI, CSCD), 2022, Issue 3, pp. 77-82 (6 pages).
Deep learning serves as a powerful paradigm in many real-world applications; however, its mechanism remains much of a mystery. To gain insights into nonlinear hierarchical deep networks, we theoretically describe the coupled nonlinear learning dynamics of the two-layer neural network with quadratic activations, extending existing results from the linear case. The quadratic activation, although rarely used in practice, shares convexity with the widely used ReLU activation, thus producing similar dynamics. In this work, we focus on the case of a canonical regression problem under the standard normal distribution and use a coupled dynamical system to mimic the gradient descent method in the sense of a continuous-time limit, then use the high-order moment tensor of the normal distribution to simplify these ordinary differential equations. The simplified system yields unexpected fixed points. The existence of these non-global-optimal stable points leads to the existence of saddle points in the loss surface of the quadratic networks. Our analysis shows that there are conserved quantities during the training of the quadratic networks. Such quantities might result in a failed learning process if the network is initialized improperly. Finally, we compare the numerical learning curves with the theoretical one, which reveals the two alternately appearing stages of the learning process.
Keywords: learning dynamic; quadratic network; ordinary differential equations
fMRI-based Decoding of Visual Information from Human Brain Activity: A Brief Review (cited by: 3)
4
Authors: Shuo Huang, Wei Shao, Mei-Ling Wang, Dao-Qiang Zhang. International Journal of Automation and Computing (EI, CSCD), 2021, Issue 2, pp. 170-184 (15 pages).
One of the most significant challenges in the neuroscience community is to understand how the human brain works. Recent progress in neuroimaging techniques has validated that it is possible to decode a person's thoughts, memories, and emotions via functional magnetic resonance imaging (fMRI), since it can measure the neural activation of human brains with satisfactory spatiotemporal resolution. However, the unprecedented scale and complexity of fMRI data have presented critical computational bottlenecks requiring new scientific analytic tools. Given the increasingly important role of machine learning in neuroscience, a great many machine learning algorithms have been presented to analyze brain activities from fMRI data. In this paper, we provide a comprehensive and up-to-date review of machine learning methods for analyzing neural activities from the following three aspects: brain image functional alignment, brain activity pattern analysis, and visual stimuli reconstruction. In addition, online resources and open research problems on brain pattern analysis are also provided for the convenience of future research.
Keywords: functional magnetic resonance imaging (fMRI); functional alignment; brain activity; brain decoding; visual stimuli reconstruction
Multi-task regression learning for survival analysis via prior information guided transductive matrix completion (cited by: 1)
5
Authors: Lei Chen, Kai Shao, Xianzhong Long, Lingsheng Wang. Frontiers of Computer Science (SCIE, EI, CSCD), 2020, Issue 5, pp. 99-112 (14 pages).
Survival analysis aims to predict the occurrence time of a particular event of interest, which is crucial for the prognosis analysis of diseases. Currently, due to the limited study period and potential loss of follow-up, the observed data inevitably involve some censored instances, which brings a unique challenge that distinguishes survival analysis from general regression problems. In addition, survival analysis also suffers from other inherent challenges such as the high-dimension and small-sample-size problems. To address these challenges, we propose a novel multi-task regression learning model, i.e., the prior information guided transductive matrix completion (PigTMC) model, to predict the survival status of new instances. Specifically, we use the multi-label transductive matrix completion framework to leverage the censored instances together with the uncensored instances as training samples, and simultaneously employ the multi-task transductive feature selection scheme to alleviate the overfitting issue caused by high-dimension and small-sample-size data. In addition, we employ the prior temporal stability of the survival statuses at adjacent time intervals to guide survival analysis. Furthermore, we design an optimization algorithm with guaranteed convergence to solve the proposed PigTMC model. Finally, extensive experiments performed on real microarray gene expression datasets demonstrate that our proposed model outperforms widely used competing methods.
Keywords: survival analysis; matrix completion; multi-task regression; transductive learning; multi-task feature selection
A comprehensive perspective of contrastive self-supervised learning (cited by: 1)
6
Authors: Songcan CHEN, Chuanxing GENG. Frontiers of Computer Science (SCIE, EI, CSCD), 2021, Issue 4, pp. 1-3 (3 pages).
Self-supervised learning (SSL), as a new unsupervised representation learning paradigm in machine learning, has recently received extensive attention and is regarded as the future of machine learning by the Turing Award winner LeCun [1]. SSL learns representations from unlabeled data using "pretext" tasks that provide free supervision, with the aim of performing well on semantic-relation-agnostic downstream (supervised) tasks. It is usually divided into two stages: first learning representations/features that are as general/invariant as possible with the auto-annotation pretext tasks (the core), then transferring the learned knowledge to downstream tasks (the ultimate goal) [2].
Keywords: SSL; representation; Turing
AE-TPGG: a novel autoencoder-based approach for single-cell RNA-seq data imputation and dimensionality reduction
7
Authors: Shuchang ZHAO, Li ZHANG, Xuejun LIU. Frontiers of Computer Science (SCIE, EI, CSCD), 2023, Issue 3, pp. 217-234 (18 pages).
Single-cell RNA sequencing (scRNA-seq) technology has become an effective tool for high-throughput transcriptomic study. It circumvents the averaging artifacts of bulk RNA-seq technology, yielding new perspectives on the cellular diversity of superficially homogeneous populations. Although various sequencing techniques have decreased the amplification bias and improved the capture efficiency associated with the low amount of starting material, technical noise and biological variation are inevitably introduced into the experimental process, resulting in high dropout events, which greatly hinder the downstream analysis. Considering the bimodal expression pattern and the right-skewed characteristic present in normalized scRNA-seq data, we propose a customized autoencoder based on a two-part generalized gamma distribution (AE-TPGG) for scRNA-seq data analysis. It takes the mixed discrete-continuous random variables of scRNA-seq data into account using a two-part model and utilizes the generalized gamma (GG) distribution for fitting the positive and right-skewed continuous data. The adopted autoencoder enables AE-TPGG to capture the inherent relationship between genes. In addition to achieving a low-dimensional representation, the AE-TPGG model also provides a denoised imputation according to the statistical characteristics of gene expression. Results on real datasets demonstrate that our proposed model is competitive with current imputation methods and improves a diverse set of typical scRNA-seq data analyses.
Keywords: scRNA-seq; autoencoder; TPGG; data imputation; dimensionality reduction
Machine Learning for Brain Imaging Genomics Methods: A Review
8
Authors: Mei-Ling Wang, Wei Shao, Xiao-Ke Hao, Dao-Qiang Zhang. Machine Intelligence Research (EI, CSCD), 2023, Issue 1, pp. 57-78 (22 pages).
In the past decade, multimodal neuroimaging and genomic techniques have been increasingly developed. As an interdisciplinary topic, brain imaging genomics is devoted to evaluating and characterizing genetic variants in individuals that influence phenotypic measures derived from structural and functional brain imaging. This technique is capable of revealing the complex mechanisms by macroscopic intermediates from the genetic level to cognition and psychiatric disorders in humans. It is well known that machine learning is a powerful tool in data-driven association studies, which can fully utilize prior knowledge (intercorrelated structure information among imaging and genetic data) for association modelling. In addition, the association study is able to find the association between risk genes and brain structure or function, so that a better mechanistic understanding of behaviors or disordered brain functions can be explored. In this paper, the related background and fundamental work in imaging genomics are first reviewed. Then, we show the univariate learning approaches for association analysis, summarize the main ideas and modelling in genetic-imaging association studies based on multivariate machine learning, and present methods for joint association analysis and outcome prediction. Finally, this paper discusses some prospects for future work.
Keywords: brain imaging genomics; machine learning; multivariate analysis; association analysis; outcome prediction
Bayesian compressive principal component analysis
9
Authors: Di MA, Songcan CHEN. Frontiers of Computer Science (SCIE, EI, CSCD), 2020, Issue 4, pp. 29-38 (10 pages).
Principal component analysis (PCA) is a widely used method for multivariate data analysis that projects the original high-dimensional data onto a low-dimensional subspace with maximum variance. However, in practice, we would be more likely to obtain a few compressed sensing (CS) measurements than the complete high-dimensional data due to the high cost of data acquisition and storage. In this paper, we propose a novel Bayesian algorithm for learning the solutions of PCA for the original data just from these CS measurements. To this end, we utilize a generative latent variable model incorporating a structure prior to model both the sparsity of the original data and the effective dimensionality of the latent space. The proposed algorithm enjoys two important advantages: 1) the effective dimensionality of the latent space can be determined automatically with no need to be pre-specified; 2) the sparsity modeling makes it unnecessary to employ multiple measurement matrices to maintain the original data space, so a single one suffices, which is storage efficient. Experimental results on synthetic and real-world datasets show that the proposed algorithm can accurately learn the solutions of PCA for the original data, which can in turn be applied to the reconstruction task with favorable results.
Keywords: compressed sensing; principal component analysis; Bayesian learning; sparsity modeling
A Deep Model for Partial Multi-label Image Classification with Curriculum-based Disambiguation
10
Authors: Feng Sun, Ming-Kun Xie, Sheng-Jun Huang. Machine Intelligence Research (EI, CSCD), 2024, Issue 4, pp. 801-814 (14 pages).
In this paper, we study the partial multi-label (PML) image classification problem, where each image is annotated with a candidate label set consisting of multiple relevant labels and other noisy labels. Existing PML methods typically design a disambiguation strategy to filter out noisy labels by utilizing prior knowledge with extra assumptions, which unfortunately is unavailable in many real tasks. Furthermore, because the objective function for disambiguation is usually elaborately designed on the whole training set, it can hardly be optimized in a deep model with stochastic gradient descent (SGD) on mini-batches. In this paper, for the first time, we propose a deep model for PML to enhance the representation and discrimination ability. On the one hand, we propose a novel curriculum-based disambiguation strategy to progressively identify ground-truth labels by incorporating the varied difficulties of different classes. On the other hand, consistency regularization is introduced for model training to balance fitting identified easy labels and exploiting potential relevant labels. Extensive experimental results on commonly used benchmark datasets show that the proposed method significantly outperforms the SOTA methods.
Keywords: partial multi-label image classification; curriculum-based disambiguation; consistency regularization; label difficulty; candidate label set
Robust AUC maximization for classification with pairwise confidence comparisons
11
Authors: Haochen SHI, Mingkun XIE, Shengjun HUANG. Frontiers of Computer Science (SCIE, EI, CSCD), 2024, Issue 4, pp. 73-83 (11 pages).
Supervised learning often requires a large number of labeled examples, which becomes a critical bottleneck when manually annotating the class labels is costly. To mitigate this issue, a new framework called pairwise comparison (Pcomp) classification has been proposed, which allows training examples to be only weakly annotated with pairwise comparisons, i.e., which one of two examples is more likely to be positive. The previous study solves Pcomp problems by minimizing the classification error, which may lead to a less robust model due to its sensitivity to class distribution. In this paper, we propose a robust learning framework for Pcomp data along with a pairwise surrogate loss called Pcomp-AUC. It provides an unbiased estimator to equivalently maximize AUC without accessing the precise class labels. Theoretically, we prove the consistency with respect to AUC and further provide the estimation error bound for the proposed method. Empirical studies on multiple datasets validate the effectiveness of the proposed method.
Keywords: method; pairwise; weakly
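To illustrate why AUC is a natural target for pairwise supervision and is less sensitive to class distribution than classification error, the sketch below computes the empirical AUC as the fraction of (positive, negative) pairs that a scorer ranks correctly. It is not the Pcomp-AUC surrogate loss from the paper: it assumes the true binary labels are available, and the function name `empirical_auc` is hypothetical.

```python
def empirical_auc(scores, labels):
    """Empirical AUC: share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    Labels are assumed to be 1 (positive) or 0 (negative).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

Because the statistic is normalized by the number of positive-negative pairs, duplicating every negative example leaves it unchanged, whereas the error rate of a thresholded classifier would generally shift with the class ratio.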
Discrimination-Aware Domain Adversarial Neural Network (cited by: 5)
12
Authors: Yun-Yun Wang, Jian-Min Gu, Chao Wang, Song-Can Chen, Hui Xue. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2020, Issue 2, pp. 259-267 (9 pages).
The domain adversarial neural network (DANN) methods have been proposed and have attracted much attention recently. In DANNs, a discriminator is trained to discriminate the domain labels of features generated by a generator, whereas the generator attempts to confuse it such that the distributions between domains are aligned. As a result, DANN encourages the whole alignment or transfer between domains, while the inter-class discriminative information across domains is not considered. In this paper, we present a Discrimination-Aware Domain Adversarial Neural Network (DA2NN) method to introduce the discriminative information, or the discrepancy of inter-class instances across domains, into deep domain adaptation. DA2NN considers both the alignment within the same class and the separation among different classes across domains in knowledge transfer via multiple discriminators. Empirical results show that DA2NN can achieve better classification performance compared with the DANN methods.
Keywords: adversarial learning; inter-class separation; deep neural network; discrimination-aware; domain adaptation
A generative deep learning framework for airfoil flow field prediction with sparse data (cited by: 5)
13
Authors: Haizhou WU, Xuejun LIU, Wei AN, Hongqiang LYU. Chinese Journal of Aeronautics (SCIE, EI, CAS, CSCD), 2022, Issue 1, pp. 470-484 (15 pages).
Deep learning has been probed for airfoil performance prediction in recent years. Compared with expensive CFD simulations and wind tunnel experiments, deep learning models can be leveraged to somewhat mitigate such expenses with proper means. Nevertheless, effective training of the data-driven models in deep learning hinges heavily on the diversity and quantity of the data. In this paper, we present a novel data augmented Generative Adversarial Network (GAN), daGAN, for rapid and accurate flow field prediction, allowing adaptation to tasks with sparse data. The presented approach consists of two modules: a pre-training module and a fine-tuning module. The pre-training module utilizes a conditional GAN (cGAN) to preliminarily estimate the distribution of the training data. In the fine-tuning module, we propose a novel adversarial architecture with two generators, one of which fulfils a promising data augmentation operation, so that the complement data is adequately incorporated to boost the generalization of the model. We use numerical simulation data to verify the generalization of daGAN on airfoils and flow conditions with sparse training data. The results show that daGAN is a promising tool for rapid and accurate evaluation of detailed flow fields without the requirement for big training data.
Keywords: CFD; flow field; generative adversarial networks (GANs); sparse data; supercritical airfoil
Incremental Multi-Label Learning with Active Queries (cited by: 3)
14
Authors: Sheng-Jun Huang, Guo-Xiang Li, Wen-Yu Huang, Shao-Yuan Li. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2020, Issue 2, pp. 234-246 (13 pages).
In multi-label learning, it is rather expensive to label instances since they are simultaneously associated with multiple labels. Therefore, active learning, which reduces the labeling cost by actively querying the labels of the most valuable data, becomes particularly important for multi-label learning. A good multi-label active learning algorithm usually consists of two crucial elements: a reasonable criterion to evaluate the gain of querying the label for an instance, and an effective classification model, based on whose prediction the criterion can be accurately computed. In this paper, we first introduce an effective multi-label classification model by combining label ranking with threshold learning, which is incrementally trained to avoid retraining from scratch after every query. Based on this model, we then propose to exploit both uncertainty and diversity in the instance space as well as the label space, and actively query the instance-label pairs which can improve the classification model most. Extensive experiments on 20 datasets demonstrate the superiority of the proposed approach to state-of-the-art methods.
Keywords: active learning; multi-label learning; uncertainty; diversity
Detecting differential transcript usage across multiple conditions for RNA-seq data based on the smoothed LDA model (cited by: 1)
15
Authors: Jing LI, Xuejun LIU, Daoqiang ZHANG. Frontiers of Computer Science (SCIE, EI, CSCD), 2021, Issue 3, pp. 217-219 (3 pages).
1 Introduction and main contributions. Differential transcript usage (DTU) refers to the event that the relative transcript abundance within a gene changes between conditions. To detect DTU, various methods have been proposed, which can be classified into exon-based models and gene-based models. These approaches either cannot estimate the relative transcript abundance, or cannot deal properly with the multi-source mapping problems of reads. Besides, few methods currently consider sample-to-sample variability under multiple conditions [1].
Keywords: conditions; properly; differential
Kernel based statistic: identifying topological differences in brain networks (cited by: 1)
16
Authors: Kai Ma, Wei Shao, Qi Zhu, Daoqiang Zhang. Intelligent Medicine, 2022, Issue 1, pp. 30-40 (11 pages).
Background: Brain networks, which describe the interconnections between brain regions, contain abundant topological information. It is a challenge for existing statistical methods (e.g., t test) to investigate the topological differences of brain networks. Methods: We proposed a kernel based statistic framework for identifying topological differences in brain networks. In our framework, the topological similarities between paired brain networks were measured by graph kernels. Then, graph kernels were embedded into maximum mean discrepancy for calculating a kernel based test statistic. Based on this test statistic, we adopted conditional Monte Carlo simulation to compute the statistical significance (i.e., P value) and statistical power. We recruited 33 patients with Alzheimer's disease (AD), 33 patients with early mild cognitive impairment (EMCI), 33 patients with late mild cognitive impairment (LMCI) and 33 normal controls (NC) in our experiment. There are no statistical differences in demographic information between patients and NC. The compared state-of-the-art statistical methods include t test, t squared test, two-sample permutation test and non-normal test. Results: We applied the proposed shortest path matched kernel to our framework for investigating the statistical differences of shortest path topological structures in brain networks of AD and NC. We compared our method with the existing state-of-the-art statistical methods on brain network characteristics including clustering coefficient and functional connection among EMCI, LMCI, AD, and NC. The results indicate that our framework can capture the statistically discriminative shortest path topological structures, such as the shortest path from right rolandic operculum to right supplementary motor area (P=0.00314, statistical power=0.803). In clustering coefficient and functional connection, our framework outperforms the state-of-the-art statistical methods, such as P=0.0013 and statistical power=0.83 in the analysis of AD and NC. Conclusion: Our proposed kernel based statistic framework can be used to investigate not only the topological differences of brain networks, but also the static characteristics (e.g., clustering coefficient and functional connection) of brain networks.
Keywords: brain network; conditional Monte Carlo; graph kernel; statistical analysis; topological difference
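The idea of embedding a kernel into maximum mean discrepancy (MMD) and then calibrating the statistic by resampling can be sketched as below. This is a simplified stand-in under stated assumptions, not the paper's method: an RBF kernel on plain feature vectors replaces the graph kernels, and an ordinary label-permutation test replaces the conditional Monte Carlo simulation. The names `rbf_kernel`, `mmd2`, and `permutation_p_value` and the parameters `gamma`, `n_perm`, and `seed` are all hypothetical.

```python
import math
import random


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def mmd2(xs, ys, kernel=rbf_kernel):
    """Biased estimate of the squared MMD between two samples."""
    def mean_k(us, vs):
        return sum(kernel(u, v) for u in us for v in vs) / (len(us) * len(vs))
    return mean_k(xs, xs) + mean_k(ys, ys) - 2.0 * mean_k(xs, ys)


def permutation_p_value(xs, ys, n_perm=200, seed=0):
    """P value of the observed MMD under random reassignment of group labels."""
    rng = random.Random(seed)
    observed = mmd2(xs, ys)
    pooled = list(xs) + list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if mmd2(pooled[:len(xs)], pooled[len(xs):]) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)
```

Any positive semi-definite kernel can be passed to `mmd2`, so a graph kernel over brain networks would slot into the same place as `rbf_kernel` in this sketch.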
Learning multi-tasks with inconsistent labels by using auxiliary big task
17
Authors: Quan FENG, Songcan CHEN. Frontiers of Computer Science (SCIE, EI, CSCD), 2023, Issue 5, pp. 119-132 (14 pages).
Multi-task learning (MTL) aims to improve the performance of the model by transferring and exploiting common knowledge among tasks. Existing MTL works mainly focus on the scenario where the label sets among multiple tasks (MTs) are usually the same, so that they can be utilized for learning across the tasks. However, the real world has more general scenarios in which each task has only a small number of training samples and the label sets are only partially overlapping or even disjoint. Learning such MTs is more challenging because less correlation information is available among these tasks. For this, we propose a framework to learn these tasks by jointly leveraging both the abundant information from a learnt auxiliary big task, with sufficiently many classes to cover those of all these tasks, and the information shared among the partially overlapping tasks. In our implementation, which uses the same neural network architecture as the learnt auxiliary task to learn individual tasks, the key idea is to utilize available label information to adaptively prune the hidden layer neurons of the auxiliary network to construct the corresponding network for each task, accompanied by joint learning across individual tasks. Extensive experimental results demonstrate that our proposed method is highly competitive with state-of-the-art methods.
Keywords: multi-task learning; inconsistent labels; auxiliary task
Sequential Cooperative Distillation for Imbalanced Multi-Task Learning
18
Authors: Quan Feng, Jia-Yu Yao, Ming-Kun Xie, Sheng-Jun Huang, Song-Can Chen. Journal of Computer Science & Technology (SCIE, EI), 2024, Issue 5, pp. 1094-1106 (13 pages).
Multi-task learning (MTL) can boost the performance of individual tasks by mutual learning among multiple related tasks. However, when these tasks assume diverse complexities, their corresponding losses involved in the MTL objective inevitably compete with each other and ultimately make the learning biased towards simple tasks rather than complex ones. To address this imbalanced learning problem, we propose a novel MTL method that can equip multiple existing deep MTL model architectures with a sequential cooperative distillation (SCD) module. Specifically, we first introduce an efficient mechanism to measure the similarity between tasks, and group similar tasks into the same block to allow their cooperative learning from each other. Based on this, the grouped task blocks are sorted in a queue to determine the learning sequence of the tasks according to their complexities, estimated with the defined performance indicator. Finally, a distillation between the individual task-specific models and the MTL model is performed block by block in a complex-to-simple manner, achieving a balance between competition and cooperation among multiple learning tasks. Extensive experiments demonstrate that our method is significantly more competitive compared with state-of-the-art methods, ranking first in average performance across multiple datasets, with improvements of 12.95% and 3.72% over OMTL and MTLKD, respectively.
Keywords: multi-task learning (MTL); imbalanced learning; similarity estimation; knowledge distillation; distillation queue