Funding: National Youth Natural Science Foundation of China (No. 61806006); Innovation Program for Graduate of Jiangsu Province (No. KYLX160-781); Project Supported by Jiangsu University Superior Discipline Construction Project.
Abstract: Ambiguous expressions are common in facial expression recognition (FER), and their presence severely limits FER performance. The reason may be that a single label cannot effectively describe the complex emotional intentions that are vital in FER. Label distribution learning carries more information and is a possible way to solve this problem. To apply label distribution learning to FER, a label distribution expression recognition algorithm based on asymptotic truth value is proposed. Without incorporating extraneous quantitative information, the original information in the database is fully used to generate and utilize the label distribution. First, in the training stage, single-label learning is used to collect the mean value of the overall distribution of the data. Then, the true value of each data label is approached gradually at the granularity of a data batch. Finally, the whole network model is retrained using the generated label distribution data. Experimental results show that this method clearly improves the accuracy of the network model and is competitive with advanced algorithms.
Funding: Supported by the National Key R&D Program of China (2018AAA0100104, 2018AAA0100100), the National Natural Science Foundation of China (Grant No. 61702095), and the Natural Science Foundation of Jiangsu Province (BK20211164).
Abstract: Recently, segmentation-based scene text detection has drawn wide research interest due to its flexibility in describing scene text instances of arbitrary shapes, such as curved text. However, existing methods usually need complex post-processing stages to handle ambiguous labels, i.e., the labels of pixels near the text boundary, which may belong to either the text or the background. In this paper, we present a framework for segmentation-based scene text detection that learns from ambiguous labels. We use label distribution learning to handle the label ambiguity of text annotation, which achieves good performance without an additional post-processing stage. Experiments on benchmark datasets demonstrate that our method produces better results than state-of-the-art methods for segmentation-based scene text detection.
Funding: Partially supported by the National Natural Science Foundation of China (Grant Nos. 61922087, 61906201 and 62006238) and the Science and Technology Innovation Program of Hunan Province (2021RC3070).
Abstract: Label distribution learning (LDL) is a new learning paradigm for dealing with label ambiguity, and many studies have achieved strong performance with it. Compared with traditional supervised learning scenarios, annotation with label distributions is more expensive. Directly using existing active learning (AL) approaches, which aim to reduce the annotation cost in traditional learning, may degrade their performance. To deal with the high annotation cost in LDL, a crucial but rarely studied problem, we propose the active label distribution learning via kernel maximum mean discrepancy (ALDL-kMMD) method. ALDL-kMMD captures the structural information of both data and labels, and extracts the most representative instances from the unlabeled ones by incorporating a nonlinear model and marginal probability distribution matching. It is also able to markedly decrease the number of queried unlabeled instances. In addition, an effective solution to the original optimization problem of ALDL-kMMD is proposed by constructing auxiliary variables. The effectiveness of our method is validated by experiments on real-world datasets.
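For readers unfamiliar with the quantity named in ALDL-kMMD, the following is a minimal sketch of a kernel maximum mean discrepancy estimate between two sample sets. It is not the paper's objective or optimization procedure: the RBF kernel, the `gamma` parameter, and the function names are assumptions chosen for illustration, and only the generic (biased) MMD estimator is shown.

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    # Gaussian (RBF) kernel between two equal-length feature vectors.
    # The kernel choice and gamma are illustrative assumptions.
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

def mmd_squared(xs, ys, gamma=1.0):
    # Biased estimate of the squared MMD between sample sets xs and ys:
    # mean k(x, x') + mean k(y, y') - 2 * mean k(x, y).
    def mean_kernel(a, b):
        total = sum(rbf_kernel(p, q, gamma) for p in a for q in b)
        return total / (len(a) * len(b))
    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2 * mean_kernel(xs, ys)

# Toy data (hypothetical): a queried subset that resembles the pool
# yields a smaller discrepancy than one drawn from elsewhere.
pool = [(0.0, 0.0), (1.0, 0.0)]
similar_subset = [(0.1, 0.0), (1.1, 0.0)]
distant_subset = [(5.0, 5.0), (6.0, 5.0)]
print(mmd_squared(pool, similar_subset) < mmd_squared(pool, distant_subset))
```

A small MMD between the selected instances and the unlabeled pool is one way to formalize "representative" instances in the sense of distribution matching.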
Funding: Financial support of the China National Natural Science Foundation (61702095); Natural Science Foundation (njpj2018209) of Nanjing Tech University Pujiang Institute; Anhui Polytechnic University Scientific Research Foundation (S031702004); Natural Science Foundation of Fujian Province (2018J01806); Scientific Research Program of Outstanding Talents in Universities of Fujian.
Abstract: Age estimation plays an important role in human-computer interaction systems. The lack of a large number of facial images with definite age labels makes age estimation algorithms inefficient. Deep label distribution learning (DLDL), which employs convolutional neural networks (CNN) and label distribution learning to learn ambiguity from the ground-truth age and adjacent ages, has been proven to outperform current state-of-the-art frameworks. However, DLDL assumes a rough label distribution that covers all ages for any given age label. In this paper, a more practical label distribution paradigm is proposed: we limit the age label distribution so that it covers only a reasonable number of neighboring ages. In addition, we explore different label distributions to improve the performance of the proposed learning model. We employ a CNN and the improved label distribution learning to estimate age. Experimental results show that, compared to DLDL, our method is more effective for facial age recognition.
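The contrast between a label distribution spanning all ages and one limited to neighboring ages can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the Gaussian shape, the `sigma` value, the 0 to 100 age range, and the `radius` parameter are all hypothetical choices.

```python
import math

def age_label_distribution(true_age, min_age=0, max_age=100, sigma=2.0, radius=None):
    # Build a discrete distribution over ages centered on true_age.
    # radius=None: Gaussian weights over the full age range.
    # radius=k:    only ages within k years of true_age receive mass.
    # The Gaussian shape, sigma, and radius are illustrative assumptions.
    weights = []
    for age in range(min_age, max_age + 1):
        if radius is not None and abs(age - true_age) > radius:
            weights.append(0.0)
        else:
            weights.append(math.exp(-((age - true_age) ** 2) / (2 * sigma ** 2)))
    total = sum(weights)
    return [w / total for w in weights]

full = age_label_distribution(30)               # every age may receive some mass
limited = age_label_distribution(30, radius=3)  # only ages 27..33 receive mass
print(sum(1 for p in limited if p > 0))
```

Other shapes (for example triangular or uniform within the window) could be substituted by changing the weight expression, which is one way to read "explore different label distributions".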
Funding: Supported by the National Key Scientific Instrument and Equipment Development Project of China (No. 2013YQ49087903) and the National Natural Science Foundation of China (No. 61202160).
Abstract: Accurate head poses are useful for many face-related tasks such as face recognition, gaze estimation, and emotion analysis. Most existing methods estimate head poses that are included in the training data (i.e., previously seen head poses). To predict head poses that are not seen in the training data, some regression-based methods have been proposed. However, they focus on estimating continuous head pose angles, and thus do not systematically evaluate the performance on predicting unseen head poses. In this paper, we use a dense multivariate label distribution (MLD) to represent the pose angle of a face image. By incorporating both seen and unseen pose angles into the MLD, the head pose predictor can estimate unseen head poses with an accuracy comparable to that of estimating seen head poses. On the Pointing'04 database, the mean absolute errors for yaw and pitch are 4.01° and 2.13°, respectively. In addition, experiments on the CAS-PEAL and CMU Multi-PIE databases show that the proposed dense MLD-based head pose estimation method achieves state-of-the-art performance compared with some existing methods.
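One way to picture a dense multivariate label distribution over head poses is a two-dimensional distribution on a (yaw, pitch) grid that is finer than the set of angles present in the training data. The sketch below is only an illustration: the isotropic Gaussian shape, the 5-degree grid step, the ±90 degree limits, and `sigma` are assumptions, not values from the paper.

```python
import math

def dense_mld(yaw, pitch, step=5, limit=90, sigma=10.0):
    # Map a ground-truth (yaw, pitch) pair in degrees to a distribution
    # over a dense grid of pose angles. Grid step, limits, sigma, and the
    # Gaussian shape are illustrative assumptions.
    angles = list(range(-limit, limit + 1, step))
    weights = {}
    for y in angles:
        for p in angles:
            squared_distance = (y - yaw) ** 2 + (p - pitch) ** 2
            weights[(y, p)] = math.exp(-squared_distance / (2 * sigma ** 2))
    total = sum(weights.values())
    return {pose: w / total for pose, w in weights.items()}

def expected_pose(distribution):
    # Read a pose estimate back out as the expectation over the grid.
    yaw = sum(y * w for (y, _), w in distribution.items())
    pitch = sum(p * w for (_, p), w in distribution.items())
    return yaw, pitch

dist = dense_mld(15, -30)
print(max(dist, key=dist.get))
```

Because neighboring grid cells share probability mass, grid angles that never appear as training labels still receive non-zero description degrees, which is the intuition behind predicting unseen poses.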
Funding: This research was supported by the National Key Research and Development Plan of China (2018AAA0100104), the National Natural Science Foundation of China (Grant No. 62076063), and the Fundamental Research Funds for the Central Universities (2242021k30056).
Abstract: Multimodal machine learning (MML) aims to understand the world from multiple related modalities. It has attracted much attention as multimodal data has become increasingly available in real-world applications. It has been shown that MML can perform better than single-modal machine learning, since multiple modalities contain more information that can complement each other. However, fusing the modalities is a key challenge in MML. Different from previous work, we further consider side information, which reflects the situation and influences the fusion of the modalities. We recover the multimodal label distribution (MLD) by leveraging the side information; the MLD represents the degree to which each modality contributes to describing the instance. Accordingly, a novel framework named multimodal label distribution learning (MLDL) is proposed to recover the MLD and to fuse the modalities under its guidance in order to learn an in-depth understanding of the joint feature representation. Moreover, two versions of MLDL are proposed to deal with sequential data. Experiments on multimodal sentiment analysis and disease prediction show that the proposed approaches perform favorably against state-of-the-art methods.
Funding: This paper is partially supported by the British Heart Foundation Accelerator Award, UK (AA\18\3\34220); Royal Society International Exchanges Cost Share Award, UK (RP202G0230); Hope Foundation for Cancer Research, UK (RM60G0680); Medical Research Council Confidence in Concept Award, UK (MC_PC_17171); Sino-UK Industrial Fund, UK (RP202G0289); Global Challenges Research Fund (GCRF), UK (P202PF11); LIAS Pioneering Partnerships Award, UK (P202ED10); Data Science Enhancement Fund, UK (P202RE237); Fight for Sight, UK (24NN201); Sino-UK Education Fund, UK (OP202006); Biotechnology and Biological Sciences Research Council, UK (RM32G0178B8); LIAS Seed Corn, UK (P202RE969).
Abstract: The topological connectivity information derived from the brain functional network can bring new insights for diagnosing and analyzing dementia disorders. The brain functional network is well suited to linking abnormal connectivities with dementia disorders. However, it is challenging to access considerable amounts of brain functional network data, which hinders the widespread application of data-driven models in dementia diagnosis. In this study, a novel distribution-regularized adversarial graph auto-encoder (DAGAE) with a transformer is proposed to generate new synthetic brain functional networks that augment the brain functional network dataset, improving the dementia diagnosis accuracy of data-driven models. Specifically, the label distribution is estimated to regularize the latent space learned by the graph encoder, which can make the learning process stable and the learned representation robust. Also, the transformer generator is devised to map the node representations into node-to-node connections by exploring the long-term dependence of highly correlated distant brain regions. The typical topological properties and discriminative features can be preserved entirely. Furthermore, the generated brain functional networks improve the prediction performance of different classifiers, and the approach can be applied to analyze other cognitive diseases. Experiments on the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset demonstrate that the proposed model can generate good brain functional networks. The classification results show that adding generated data achieves a best accuracy of 85.33%, sensitivity of 84.00%, and specificity of 86.67%. The proposed model also achieves superior performance compared with other related augmentation models. Overall, the proposed model effectively improves cognitive disease diagnosis by generating diverse brain functional networks.
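The three reported metrics are standard functions of a binary confusion matrix. The sketch below shows how they are computed; the counts in the usage lines are hypothetical toy values and are not taken from the paper, whose test-set size is not stated here.

```python
def diagnosis_metrics(tp, fn, tn, fp):
    # tp/fn: patients correctly/incorrectly classified;
    # tn/fp: controls correctly/incorrectly classified.
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    return accuracy, sensitivity, specificity

# Hypothetical counts for illustration only.
acc, sens, spec = diagnosis_metrics(tp=8, fn=2, tn=9, fp=1)
print(f"accuracy={acc:.2%} sensitivity={sens:.2%} specificity={spec:.2%}")
```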
Funding: Supported by the National Natural Science Foundation of China (NSFC) (Grant Nos. 62162022, 62162024); the Key Research and Development Program of Hainan Province (Grant Nos. ZDYF2020040, ZDYF2021GXJS003); the Major Science and Technology Project of Hainan Province (Grant No. ZDKJ2020012); Hainan Provincial Natural Science Foundation of China (Grant Nos. 620MS021, 621QN211); Science and Technology Development Center of the Ministry of Education Industry-University-Research Innovation Fund (2021JQR017).
Abstract: In Multi-Label Text Classification (MLTC), the dual challenges of extracting rich semantic features from text and discerning inter-label relationships have spurred innovative approaches. Many studies on semantic feature extraction have turned to external knowledge to augment the model's grasp of textual content, often overlooking intrinsic textual cues such as label statistical features, even though such endogenous information naturally aligns with the classification task. In this paper, to make use of this intrinsic knowledge, we introduce a novel Gate-Attention mechanism. This mechanism integrates statistical features from the text itself into the semantic representation, enhancing the model's capacity to understand and represent the data. Additionally, to address the task of mining label correlations, we propose a Dual-end enhancement mechanism. This mechanism mitigates the information loss and erroneous transmission inherent in traditional long short-term memory propagation. We conducted extensive experiments on the AAPD and RCV1-2 datasets to confirm the efficacy of both the Gate-Attention mechanism and the Dual-end enhancement mechanism. Our final model outperforms the baseline model. These findings underscore the importance of taking into account not just external knowledge but also the inherent characteristics of textual data when building MLTC models.
Funding: Supported by the National Natural Science Foundation of China [grant number 41801313].
Abstract: Estimating the proportion of land-use types in different regions is essential to promote the organization of a compact city and reduce energy consumption. However, existing research in this area has a few limitations: (1) lack of consideration of land-use distribution-related factors other than points of interest (POIs); (2) inability to extract complex relations from heterogeneous information; and (3) overlooking the correlation between land-use types. To overcome these limitations, we propose a knowledge-based approach for estimating land-use distributions. We designed a knowledge graph to represent POIs and other related heterogeneous data, and then used a knowledge embedding model to directly obtain region embedding vectors by learning the complex and implicit relations present in the knowledge graph. The region embedding vectors were mapped to land-use distributions using a label distribution learning method that integrates the correlation between land-use types. To demonstrate the reliability and validity of our approach, we conducted a case study in Jinhua, China. The results indicated that the proposed model outperformed other algorithms on all evaluation indices, illustrating the potential of this method to achieve more accurate land-use distribution estimates.
Funding: National Natural Science Foundation of China (Grant Nos. 61802279, 6180021345, 61702281, and 61702366); Natural Science Foundation of Tianjin (Grant Nos. 18JCQNJC70300, 19JCTPJC49200, 19PTZWHZ00020, and 19JCYBJC15800); Fundamental Research Funds for the Tianjin Universities (Grant No. 2019KJ019); the Tianjin Science and Technology Program (Grant No. 19PTZWHZ00020); in part by the State Key Laboratory of ASIC and System (Grant No. 2021KF014); Tianjin Educational Commission Scientific Research Program Project (Grant Nos. 2020KJ112 and 2018KJ215).
Abstract: When deep learning models are used in real applications, the distribution of labels in the environment can be used to increase accuracy. Generally, computing this distribution requires a validation set labeled with ground truths. However, this dependency on ground truths limits the use of the distribution in various environments. In this paper, we developed a novel system for deep learning-based classification to solve this problem. First, our system uses only one validation set with ground truths to compute some hyperparameters, which we call one-shot guidance. Second, in a given environment, our system builds a validation set and labels it with the model's prediction results, which requires no guidance from ground truths. Third, the label distribution computed from this validation set is selectively combined with the label probabilities output by the models in order to increase the accuracy of the prediction results on testing samples. We selected six popular deep learning models and three real datasets for the evaluation. The experimental results show that our system can achieve higher accuracy than state-of-the-art methods while reducing the dependency on labeled validation sets.
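The second and third steps above can be illustrated with a simple prior-reweighting sketch. The abstract does not specify how the label distribution and model probabilities are combined, so the multiplicative rule, the additive smoothing, and the function names below are assumptions; the system's selective use of the distribution is not reproduced.

```python
from collections import Counter

def estimate_label_distribution(predicted_labels, num_classes, smoothing=1.0):
    # Estimate the environment's label distribution from model predictions
    # on an unlabeled validation set. Additive smoothing (an assumption)
    # keeps unseen classes from receiving zero probability.
    counts = Counter(predicted_labels)
    total = len(predicted_labels) + smoothing * num_classes
    return [(counts.get(c, 0) + smoothing) / total for c in range(num_classes)]

def reweight(probabilities, prior):
    # Combine model output probabilities with the estimated label
    # distribution by elementwise product, then renormalize.
    scores = [p * q for p, q in zip(probabilities, prior)]
    normalizer = sum(scores)
    return [s / normalizer for s in scores]

# Hypothetical environment in which class 0 dominates the predictions.
prior = estimate_label_distribution([0, 0, 1], num_classes=3)
print(reweight([0.4, 0.6, 0.0], prior))
```

In this toy case the prior pulls an otherwise uncertain prediction toward the class that is more frequent in the environment.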
Funding: National Natural Science Foundation of China (Grant Nos. 61702281, 61802279, 6180021345 and 61702366); Natural Science Foundation of Tianjin (Grant Nos. 18JCQNJC70300, 19PTZWHZ00020, 19JCTPJC49200 and 19JCYBJC15800); Fundamental Research Funds for the Tianjin Universities (Grant No. 2019KJ019); the Tianjin Science and Technology Program (Grant No. 19PTZWHZ00020); in part by the State Key Laboratory of ASIC and System (Grant No. 2021KF014); Tianjin Educational Commission Scientific Research Program Project (Grant Nos. 2018KJ215 and 2020KJ112).
Abstract: Generally, the performance of deep learning models is related to the features captured from the training samples. When the training samples belong to different domains, the diverse features may make it harder to train high-performance models. In this paper, we built a new framework that generates multiple models on organized samples to increase classification accuracy. First, our framework selects some existing models and trains each of them on organized training sets to obtain multiple trained models. Second, we select some of the trained models based on a validation set. Finally, we apply a fusion method to the outputs of the selected models to get more accurate results. The experimental results show that our framework achieved higher accuracy than existing methods. Our framework can be an option for deep learning systems that need to increase classification accuracy.
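The select-then-fuse steps can be sketched as follows. The abstract leaves the selection criterion and fusion method unspecified, so ranking by validation accuracy and averaging the output probabilities are assumptions made for illustration; models are represented here as plain functions that return class probabilities.

```python
def argmax(values):
    return max(range(len(values)), key=lambda i: values[i])

def select_models(models, val_inputs, val_labels, keep=2):
    # Keep the models with the highest validation accuracy
    # (the ranking criterion is an assumption).
    def accuracy(model):
        hits = sum(argmax(model(x)) == y for x, y in zip(val_inputs, val_labels))
        return hits / len(val_labels)
    return sorted(models, key=accuracy, reverse=True)[:keep]

def fuse(models, x):
    # Average the class probabilities of the selected models
    # (simple averaging stands in for the unspecified fusion method).
    outputs = [model(x) for model in models]
    return [sum(column) / len(outputs) for column in zip(*outputs)]

# Hypothetical toy models for a two-class problem.
def model_a(x):
    return [0.9, 0.1] if x == 0 else [0.1, 0.9]

def model_b(x):
    return [0.6, 0.4] if x == 0 else [0.4, 0.6]

def model_c(x):
    return [0.2, 0.8]  # always predicts class 1

selected = select_models([model_a, model_b, model_c], [0, 1, 0, 1], [0, 1, 0, 1])
print(argmax(fuse(selected, 0)))
```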