Multi-label learning is a generalization of supervised single-label learning based on the assumption that each sample in a dataset may belong to more than one class simultaneously. The main objective of this work is to create a novel framework for learning and classifying imbalanced multi-label data. The framework has two phases. In phase 1, the imbalanced distribution of the multi-label dataset is addressed through the proposed Borderline MLSMOTE resampling method. In phase 2, an adaptive weighted l21-norm regularized (Elastic-net) multi-label logistic regression is used to predict unseen samples. In contrast to conventional MLSMOTE, the proposed Borderline MLSMOTE resampling method focuses on samples with concurrent high labels. The minority labels in these samples are called difficult minority labels and are more prone to penalize classification performance. The concurrent measure is considered borderline, and labels associated with samples are regarded as borderline labels in the decision boundary. In phase 2, a novel adaptive l21-norm regularized weighted multi-label logistic regression is used to handle balanced data with different weighted synthetic samples. Experiments on various benchmark datasets show that the proposed method outperforms existing conventional state-of-the-art multi-label methods in predictive performance.
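The l21-norm regularizer named in the abstract above can be illustrated with a small sketch. It assumes the common convention that the coefficient matrix has one row per feature and one column per label, so the l21 norm is the sum of the Euclidean norms of the rows; the abstract does not state the layout, and the matrix values below are purely illustrative.

```python
import math


def l21_norm(weights):
    """Return the l2,1 norm of a matrix given as a list of rows.

    Assumption: each row holds the coefficients of one feature across all
    labels, so a row whose norm is zero means the feature is unused by
    every label.
    """
    return sum(math.sqrt(sum(w * w for w in row)) for row in weights)


# Hypothetical 3-feature, 2-label coefficient matrix (illustrative values).
W = [[3.0, 4.0],
     [0.0, 0.0],
     [5.0, 12.0]]
print(l21_norm(W))  # row norms are 5, 0 and 13, so this prints 18.0
```

Penalizing this quantity tends to zero out whole rows at once, which is why it is often used when the same features should be selected jointly for all labels.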
Phononic crystals, as artificial composite materials, have sparked significant interest due to their novel characteristics that emerge upon the introduction of nonlinearity. Among these properties, second-harmonic features exhibit potential applications in acoustic frequency conversion, non-reciprocal wave propagation, and non-destructive testing. Precisely manipulating the harmonic band structure presents a major challenge in the design of nonlinear phononic crystals. Traditional design approaches based on parameter adjustments to meet specific application requirements are inefficient and often yield suboptimal performance. Therefore, this paper develops a design methodology using Softmax logistic regression and multi-label classification learning to inversely design the material distribution of nonlinear phononic crystals by exploiting information from harmonic transmission spectra. The results demonstrate that the neural network-based inverse design method can effectively tailor nonlinear phononic crystals with desired functionalities. This work establishes a mapping relationship between the band structure and the material distribution within phononic crystals, providing valuable insights into the inverse design of metamaterials.
Images have become an essential medium for expressing meaning and disseminating information. Many images are uploaded to the Internet, some of which are pornographic and adversely affect public psychological health. To create a clean and positive Internet environment, network enforcement agencies need an automatic and efficient pornographic image recognition tool. Previous studies on pornographic images mainly rely on convolutional neural networks (CNNs). Because CNNs have many parameters, they depend on a large labeled training dataset, which is laborious to build. To reduce the effect of the database on recognition performance, many researchers treat pornographic image recognition as a binary classification task. In practice, when faced with pornographic images with varied features, the performance and recognition accuracy of the network model often decrease. In addition, the pornographic content in an image usually lies in several small local regions that make up only a small proportion of the image. As a strongly supervised learning method, a CNN usually cannot automatically focus on the pornographic area of the image, which reduces recognition accuracy. This paper establishes an image dataset with seven classes by crawling pornographic websites and the Baidu Image Library, and proposes a weakly supervised pornographic image recognition method based on multiple instance learning (MIL). The Squeeze-and-Excitation (SE) module is introduced in the feature extraction to strengthen critical information and weaken the influence of non-key and useless information on the recognition result. To meet the requirements of the pooling layer operation in multiple instance learning, an attention mechanism is introduced to weight and average instances. The experimental results show that the proposed method achieves better accuracy and F1 scores than other methods.
Multi-label learning is an active research area which plays an important role in machine learning. Traditional learning algorithms, however, have to depend on samples with complete labels. The existing learning algorithms with missing labels do not consider the relevance of labels, resulting in label estimation errors for new samples. A new multi-label learning algorithm with support vector machine (SVM) based association (SVMA) is proposed to estimate missing labels by constructing the association between different labels. SVMA establishes a mapping function that minimizes the number of samples in the margin while keeping the margin large enough and minimizing the misclassification probability. To evaluate the performance of SVMA under missing labels, four typical data sets are adopted, with the integrity of the labels being handled manually. Simulation results show the superiority of SVMA over other models in dealing with samples with missing labels in image classification.
In recent years, multi-label learning has received a lot of attention. However, most existing methods consider only global label correlation or only local label correlation. In fact, both global and local label correlations can appear in real-world situations at the same time. Moreover, we should not be limited to pairwise labels while ignoring high-order label correlation. In this paper, we propose a novel and effective method called GLLCBN for multi-label learning. First, we obtain the global label correlation by exploiting label semantic similarity. Then, we analyze the pairwise labels in the label space of the data set to acquire the local correlation. Next, we build the initial version of the label dependency model from the global and local label correlations. After that, we use graph theory, probability theory and Bayesian networks to eliminate redundant dependency structure in the initial model, so as to obtain the optimal label dependency model. Finally, we obtain the feature extraction model by adjusting the Inception V3 convolutional neural network model and combine it with the GLLCBN model to achieve multi-label learning. The experimental results show that the proposed model performs better than other multi-label learning methods in the performance evaluation.
The recent worldwide spread of pneumonia-causing viruses, such as Coronavirus, COVID-19, and H1N1, has been endangering the lives of human beings all around the world. To provide useful clues for developing antiviral drugs, information on anatomical therapeutic chemicals is vitally important. In view of this, a CNN-based predictor called “iATC_Deep-mISF” has been developed. The predictor is particularly useful in dealing with multi-label systems in which some chemicals may occur in two or more different classes. To maximize the convenience for most experimental scientists, a user-friendly web-server for the new predictor has been established at http://www.jci-bioinfo.cn/iATC_Deep-mISF/, which is intended to become a powerful tool for developing effective drugs to fight the pandemic coronavirus.
Purpose: Many science, technology and innovation (STI) resources are attached with several different labels. To automatically assign the resulting labels to an instance of interest, many approaches with good performance on benchmark datasets have been proposed for the multi-label classification task in the literature. Furthermore, several open-source tools implementing these approaches have also been developed. However, the characteristics of real-world multi-label patent and publication datasets are not completely in line with those of benchmark ones. Therefore, the main purpose of this paper is to comprehensively evaluate seven multi-label classification methods on real-world datasets. Research limitations: The three real-world datasets differ in the following aspects: statement, data quality, and purposes. Additionally, open-source tools designed for multi-label classification also have intrinsic differences in their approaches to data processing and feature selection, which in turn affect the performance of a multi-label classification approach. In the near future, we will enhance experimental precision and reinforce the validity of conclusions by employing more rigorous control over variables through expanded parameter settings. Practical implications: The observed Macro F1 and Micro F1 scores on real-world datasets typically fall short of those achieved on benchmark datasets, underscoring the complexity of real-world multi-label classification tasks. Approaches leveraging deep learning techniques offer promising solutions by accommodating the hierarchical relationships and interdependencies among labels. With ongoing enhancements in deep learning algorithms and large-scale models, it is expected that the efficacy of multi-label classification will be significantly improved, reaching a level of practical utility in the foreseeable future. Originality/value: (1) Seven multi-label classification methods are comprehensively compared on three real-world datasets. (2) The TextCNN and TextRCNN models perform better on small-scale datasets with a more complex hierarchical structure of labels and a more balanced document-label distribution. (3) The MLkNN method works better on the larger-scale dataset with a more unbalanced document-label distribution.
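The Macro F1 and Micro F1 scores referred to above differ in how they aggregate over labels, which matters on unbalanced document-label distributions. The following is a minimal sketch of the standard definitions, not the evaluation code of the paper; the toy label vectors are hypothetical.

```python
def f1(tp, fp, fn):
    """F1 from raw counts; taken as 0.0 when no positives exist at all."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_micro_f1(y_true, y_pred):
    """y_true and y_pred are lists of equal-length 0/1 label vectors."""
    n_labels = len(y_true[0])
    counts = []  # one (tp, fp, fn) triple per label
    for j in range(n_labels):
        pairs = [(t[j], p[j]) for t, p in zip(y_true, y_pred)]
        tp = sum(1 for t, p in pairs if t == 1 and p == 1)
        fp = sum(1 for t, p in pairs if t == 0 and p == 1)
        fn = sum(1 for t, p in pairs if t == 1 and p == 0)
        counts.append((tp, fp, fn))
    # Macro: average the per-label F1 scores, so rare labels weigh as much
    # as frequent ones.
    macro = sum(f1(*c) for c in counts) / n_labels
    # Micro: pool the counts over all labels first, so frequent labels dominate.
    micro = f1(sum(c[0] for c in counts),
               sum(c[1] for c in counts),
               sum(c[2] for c in counts))
    return macro, micro


# Hypothetical data: label 0 is frequent and always right, label 1 is rare
# and always wrong.
y_true = [[1, 0], [1, 0], [1, 1], [1, 0]]
y_pred = [[1, 0], [1, 0], [1, 0], [1, 1]]
macro, micro = macro_micro_f1(y_true, y_pred)
print(macro, micro)  # macro is 0.5, micro is 0.8
```

In this toy case the rare label drags Macro F1 down to 0.5 while Micro F1 stays at 0.8, which is the kind of gap to expect when the label distribution is unbalanced.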
Multi-label image classification is recognized as an important task within the field of computer vision, a discipline that has seen a significant increase in research activity in recent years. The widespread adoption of convolutional neural networks (CNNs) has driven the remarkable success of architectures such as ResNet-101 in image classification. However, in multi-label image classification tasks, it is crucial to consider the correlation between labels. To improve the accuracy and performance of multi-label classification and fully combine visual and semantic features, many existing studies use graph convolutional networks (GCNs) for modeling. Object detection and multi-label image classification exhibit a degree of conceptual overlap; however, the integration of these two tasks within a unified framework has been relatively underexplored in the existing literature. In this paper, we propose the Object-GCN framework, a model combining the object detection network YOLOv5 with a graph convolutional network, and we carry out a thorough experimental analysis using a range of well-established public datasets. The Object-GCN framework achieves significantly better performance than existing studies on the public datasets COCO2014, VOC2007, and VOC2012, with final results of 86.9%, 96.7%, and 96.3% mean Average Precision (mAP), respectively.
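The mean Average Precision (mAP) figures quoted above summarize, per label, how well relevant images are ranked ahead of irrelevant ones. The sketch below uses the plain non-interpolated definition of average precision; this is an assumption, since individual benchmarks (for example, interpolated variants used with VOC) may compute it differently, and the scores shown are hypothetical.

```python
def average_precision(scores, relevant):
    """AP for one label: mean of the precision at each relevant item's rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precisions = []
    for rank, i in enumerate(order, start=1):
        if relevant[i]:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(score_matrix, label_matrix):
    """Rows are images, columns are labels; AP is averaged over labels."""
    n_labels = len(score_matrix[0])
    aps = [
        average_precision([row[j] for row in score_matrix],
                          [row[j] for row in label_matrix])
        for j in range(n_labels)
    ]
    return sum(aps) / n_labels


# Hypothetical scores for 4 images and 2 labels.
scores = [[0.9, 0.1], [0.8, 0.9], [0.7, 0.2], [0.6, 0.3]]
labels = [[1, 0], [0, 1], [1, 0], [0, 0]]
print(mean_average_precision(scores, labels))  # (5/6 + 1) / 2, about 0.917
```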
Objective: To explore the feasibility of remotely obtaining complex information on traditional Chinese medicine (TCM) pulse conditions through voice signals. Methods: We used multi-label pulse conditions as the entry point and modeled and analyzed TCM pulse diagnosis by combining voice analysis and machine learning. Audio features were extracted from voice recordings in the TCM pulse condition dataset. The obtained features were combined with information from tongue and facial diagnoses. A multi-label pulse condition voice classification DNN model was built using 10-fold cross-validation, and the modeling methods were validated using publicly available datasets. Results: The analysis showed that the proposed method achieved an accuracy of 92.59% on the public dataset. The accuracies of the three single-label pulse manifestation models in the test set were 94.27%, 96.35%, and 95.39%. The absolute accuracy of the multi-label model was 92.74%. Conclusion: Voice data analysis may serve as a remote adjunct to the TCM diagnostic method for pulse condition assessment.
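The abstract above reports both single-label accuracies and an "absolute accuracy" for the multi-label model. Assuming that "absolute accuracy" means exact-match (subset) accuracy, where a sample counts as correct only if every label is predicted correctly (the abstract does not define the term), the relationship between the two kinds of figures can be sketched as follows with hypothetical data.

```python
def exact_match_accuracy(y_true, y_pred):
    """Fraction of samples whose whole predicted label vector is correct."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if list(t) == list(p))
    return matches / len(y_true)


def per_label_accuracy(y_true, y_pred):
    """Accuracy of each label column treated as its own binary task."""
    n = len(y_true)
    n_labels = len(y_true[0])
    return [
        sum(1 for t, p in zip(y_true, y_pred) if t[j] == p[j]) / n
        for j in range(n_labels)
    ]


# Hypothetical predictions for 4 samples and 3 labels.
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 1], [0, 1, 1], [1, 1, 0], [0, 0, 1]]
print(exact_match_accuracy(y_true, y_pred))  # 0.75
print(per_label_accuracy(y_true, y_pred))    # [1.0, 1.0, 0.75]
```

Exact-match accuracy can never exceed the lowest per-label accuracy, which is consistent with the multi-label figure in the abstract being below each of the three single-label figures.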
To reduce the discrepancy between the source and target domains, a new multi-label adaptation network (ML-ANet) based on multiple kernel variants with maximum mean discrepancies is proposed in this paper. The hidden representations of the task-specific layers in ML-ANet are embedded in the reproducing kernel Hilbert space (RKHS) so that the mean embeddings of specific features in different domains can be precisely matched. Multiple kernel functions are used to improve feature distribution efficiency for explicit mean embedding matching, which can further reduce domain discrepancy. Adverse weather and cross-camera adaptation experiments are conducted to verify the effectiveness of the proposed ML-ANet. The results show that ML-ANet achieves higher accuracies than the compared state-of-the-art methods for multi-label image classification in both the adverse weather adaptation and cross-camera adaptation experiments. These results indicate that ML-ANet can alleviate the reliance on fully labeled training data and improve the accuracy of multi-label image classification in various domain shift scenarios.
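The idea of matching mean embeddings with several kernels can be illustrated by a biased estimate of the squared maximum mean discrepancy (MMD). This is a sketch under stated assumptions, not the ML-ANet loss: it uses Gaussian kernels, weights them equally, and the bandwidths and feature vectors are hypothetical.

```python
import math


def gaussian_kernel(x, y, bandwidth):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def multi_kernel(x, y, bandwidths):
    """Equal-weight average of Gaussian kernels (an assumed combination)."""
    return sum(gaussian_kernel(x, y, b) for b in bandwidths) / len(bandwidths)


def mk_mmd2(source, target, bandwidths=(0.5, 1.0, 2.0)):
    """Biased estimate of squared MMD between two samples of feature vectors."""
    def mean_k(a, b):
        total = sum(multi_kernel(x, y, bandwidths) for x in a for y in b)
        return total / (len(a) * len(b))
    return (mean_k(source, source) + mean_k(target, target)
            - 2.0 * mean_k(source, target))


# Hypothetical hidden representations from a source and a target domain.
source = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
near = [[x + 0.5, y + 0.5] for x, y in source]  # small domain shift
far = [[x + 3.0, y + 3.0] for x, y in source]   # large domain shift
print(mk_mmd2(source, near) < mk_mmd2(source, far))  # True
```

Used as a training penalty on the task-specific layers, a quantity of this kind shrinks as the source and target feature distributions move closer together.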
Multi-label learning deals with objects associated with multiple class labels, and aims to induce a predictive model which can assign a set of relevant class labels to an unseen instance. Since each class might possess its own characteristics, the strategy of extracting label-specific features has been widely employed to improve the discrimination process in multi-label learning, where the predictive model is induced based on tailored features specific to each class label instead of the identical instance representations. As a representative approach, LIFT generates label-specific features by conducting clustering analysis. However, its performance may be degraded due to the inherent instability of the single clustering algorithm. To improve this, a novel multi-label learning approach named SENCE (stable label-Specific features gENeration for multi-label learning via mixture-based Clustering Ensemble) is proposed, which stabilizes the generation process of label-specific features via clustering ensemble techniques. Specifically, more stable clustering results are obtained by first augmenting the original instance representation with cluster assignments from base clusters and then fitting a mixture model via the expectation-maximization (EM) algorithm. Extensive experiments on eighteen benchmark data sets show that SENCE performs better than LIFT and other well-established multi-label learning algorithms.
In this paper, we utilize the framework of multi-label learning for face demographic classification. We also attempt to explore suitable classifiers and features for face demographic classification. Three of the most popular kinds of demographic information (gender, ethnicity, and age) are considered in the experiments. Based on the results of demographic classification, we use statistical analysis to explore the correlation among the various kinds of face demographic information. Through the analysis, we draw several conclusions on the correlation and interaction among these high-level face semantics, and the obtained results can be helpful in automatic face semantic annotation and other face analysis tasks.
This paper studies an urban security risk assessment model based on multi-label learning. Through a series of transformations, the model is reduced to the solution of linear equations, which is in turn transformed into an optimization problem. Finally, several classical optimization algorithms are used to solve these optimization problems, the convergence of the algorithm is proved, and the advantages and disadvantages of the optimization methods are compared.
Label correlations are an essential technique for data mining that solves the possible correlation problem between different labels in multi-label classification. Although this technique is widely used in multi-label classification problems, batch learning deals with most issues, which consumes a lot of time and space resources. Unlike traditional batch learning methods, online learning represents a promising family of efficient and scalable machine learning algorithms for large-scale datasets. However, existing online learning research has done little to consider correlations between labels. On the basis of existing research, this paper proposes a multi-label online learning algorithm based on label correlations by maximizing the interval between related labels and unrelated labels in multi-label samples. We evaluate the performance of the proposed algorithm on several public datasets. Experiments show the effectiveness of our algorithm.
Recently, the lives of human beings around the entire world have been endangered by the spread of pneumonia-causing viruses, such as Coronavirus, COVID-19, and H1N1. To develop effective drugs against Coronavirus, knowledge of protein subcellular localization is indispensable. In 2019, a predictor called “pLoc_bal-mHum” was developed for identifying the subcellular localization of human proteins. Its predicted results are significantly better than those of its counterparts, particularly for proteins that may simultaneously occur in, or move between, two or more subcellular location sites. However, more effort is needed to further improve its power, since pLoc_bal-mHum was not trained with “deep learning”, a very powerful technique developed recently. The present study was devoted to incorporating the “deep-learning” technique and developing a new predictor called “pLoc_Deep-mHum”. The global absolute true rate achieved by the new predictor is over 81% and its local accuracy is over 90%. Both are far superior to those of its counterparts. Moreover, a user-friendly web-server for the new predictor has been established at http://www.jci-bioinfo.cn/pLoc_Deep-mHum/, which is intended to become a very useful tool for fighting the pandemic coronavirus.
Recently, the lives of human beings worldwide have been endangered by the spread of pneumonia-causing viruses, such as Coronavirus, COVID-19, and H1N1. To develop effective drugs against Coronavirus, knowledge of protein subcellular localization is a prerequisite. In 2019, a predictor called “pLoc_bal-mEuk” was developed for identifying the subcellular localization of eukaryotic proteins. Its predicted results are significantly better than those of its counterparts, particularly for proteins that may simultaneously occur in, or move between, two or more subcellular location sites. However, more effort is needed to further improve its power, since pLoc_bal-mEuk was not trained with “deep learning”, a very powerful technique developed recently. The present study was devoted to incorporating the “deep-learning” technique and developing a new predictor called “pLoc_Deep-mEuk”. The global absolute true rate achieved by the new predictor is over 81% and its local accuracy is over 90%. Both are far superior to those of its counterparts. Moreover, a user-friendly web-server for the new predictor has been established at http://www.jci-bioinfo.cn/pLoc_Deep-mEuk/, by which the majority of experimental scientists can easily get their desired data.
The recent worldwide spread of pneumonia-causing viruses, such as Coronavirus, COVID-19, and H1N1, has been endangering the lives of human beings all around the world. To understand the biological process at the cell level and provide useful clues for developing antiviral drugs, information on virus protein subcellular localization is vitally important. In view of this, a CNN-based virus protein subcellular localization predictor called “pLoc_Deep-mVirus” was developed. The predictor is particularly useful in dealing with multi-site systems in which some proteins may simultaneously occur in two or more different organelles, which are the current focus of the pharmaceutical industry. The global absolute true rate achieved by the new predictor is over 97% and its local accuracy is over 98%. Both significantly exceed those of other existing state-of-the-art predictors. It has not escaped our notice that the deep-learning treatment can be used to deal with many other biological systems as well. To maximize the convenience for most experimental scientists, a user-friendly web-server for the new predictor has been established at http://www.jci-bioinfo.cn/pLoc_Deep-mVirus/.
The current coronavirus pandemic has endangered human life, and the number of reported cases is increasing exponentially. Information on plant protein subcellular localization can provide useful clues for developing antiviral drugs. To cope with such a catastrophe, a CNN-based plant protein subcellular localization predictor called “pLoc_Deep-mPlant” was developed. The predictor is particularly useful in dealing with multi-site systems in which some proteins may simultaneously occur in two or more different organelles, which are the current focus of the pharmaceutical industry. The global absolute true rate achieved by the new predictor is over 95% and its local accuracy is about 90%-100%. Both substantially exceed those of other existing state-of-the-art predictors. To maximize the convenience for most experimental scientists, a user-friendly web-server for the new predictor has been established at http://www.jci-bioinfo.cn/pLoc_Deep-mPlant/, by which the majority of experimental scientists can easily obtain their desired data without the need to go through the mathematical details.
The recent worldwide spread of pneumonia-causing viruses, such as Coronavirus, COVID-19, and H1N1, has been endangering the lives of human beings all around the world. To understand the biological process at the cell level and provide useful clues for developing antiviral drugs, information on Gram-negative bacterial protein subcellular localization is vitally important. In view of this, a CNN-based protein subcellular localization predictor called “pLoc_Deep-mGneg” was developed. The predictor is particularly useful in dealing with multi-site systems in which some proteins may simultaneously occur in two or more different organelles, which are the current focus of the pharmaceutical industry. The global absolute true rate achieved by the new predictor is over 98% and its local accuracy is around 94%-100%. Both significantly exceed those of other existing state-of-the-art predictors. To maximize the convenience for most experimental scientists, a user-friendly web-server for the new predictor has been established at http://www.jci-bioinfo.cn/pLoc_Deep-mGneg/, which is intended to become a very useful tool for fighting the pandemic coronavirus.
The recent worldwide spread of pneumonia-causing viruses, such as Coronavirus, COVID-19, and H1N1, has been endangering the lives of human beings all around the world. To understand the biological process at the cell level and provide useful clues for developing antiviral drugs, information on Gram-positive bacterial protein subcellular localization is vitally important. In view of this, a CNN-based protein subcellular localization predictor called “pLoc_Deep-mGpos” was developed. The predictor is particularly useful in dealing with multi-site systems in which some proteins may simultaneously occur in two or more different organelles, which are the current focus of the pharmaceutical industry. The global absolute true rate achieved by the new predictor is over 99% and its local accuracy is around 92%-99%. Both significantly exceed those of other existing state-of-the-art predictors. To maximize the convenience for most experimental scientists, a user-friendly web-server for the new predictor has been established at http://www.jci-bioinfo.cn/pLoc_Deep-mGpos/, which is intended to become a very powerful tool for developing effective drugs to fight the pandemic coronavirus.
Funding (Borderline MLSMOTE framework for imbalanced multi-label data): This work was partly supported by the Technology Development Program of MSS (No. S3033853) and by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2021R1A4A1031509).
Abstract: Multi-label learning generalizes supervised single-label learning by assuming that each sample in a dataset may belong to more than one class simultaneously. The main objective of this work is to create a novel framework for learning and classifying imbalanced multi-label data. The proposed framework has two phases. In phase 1, the imbalanced distribution of the multi-label dataset is addressed through the proposed Borderline MLSMOTE resampling method. In phase 2, an adaptive weighted l2,1-norm regularized (Elastic-net) multi-label logistic regression is used to predict unseen samples. In contrast to conventional MLSMOTE, the proposed Borderline MLSMOTE resampling method focuses on samples whose labels have high concurrence. The minority labels in these samples are called difficult minority labels and are more likely to degrade classification performance. The concurrence measure is treated as the borderline criterion, and the labels associated with such samples are regarded as borderline labels at the decision boundary. In phase 2, the novel adaptive l2,1-norm regularized weighted multi-label logistic regression handles the balanced data, in which synthetic samples carry different weights. Experiments on various benchmark datasets show that the proposed method outperforms existing conventional state-of-the-art multi-label methods in predictive performance.
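For readers unfamiliar with the l2,1 norm named in this abstract, the following is a minimal sketch of the quantity itself, not of the paper's adaptive weighted regularizer. It assumes a weight matrix whose rows index features and whose columns index labels; that layout and the function name `l21_norm` are illustrative assumptions.

```python
import math

def l21_norm(W):
    """l2,1 norm: the sum of the Euclidean norms of the rows of W."""
    return sum(math.sqrt(sum(v * v for v in row)) for row in W)

# Assumed layout: one row per feature, one column per label.
W = [
    [3.0, 4.0],  # feature used by both labels: row norm 5
    [0.0, 0.0],  # feature unused by every label: row norm 0
    [1.0, 0.0],  # feature used by one label: row norm 1
]
print(l21_norm(W))  # 6.0
```

Because the penalty sums whole-row norms, shrinking it tends to zero out entire rows, which is why it is commonly used to select features shared across labels.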
Funding: Supported by the National Key Research and Development Program of China (Grant No. 2020YFA0211400), the State Key Program of the National Natural Science of China (Grant No. 11834008), the National Natural Science Foundation of China (Grant Nos. 12174192, 12174188, and 11974176), the State Key Laboratory of Acoustics, Chinese Academy of Sciences (Grant No. SKLA202410), and the Fund from the Key Laboratory of Underwater Acoustic Environment, Chinese Academy of Sciences (Grant No. SSHJ-KFKT-1701).
Abstract: Phononic crystals, as artificial composite materials, have sparked significant interest due to the novel characteristics that emerge upon the introduction of nonlinearity. Among these properties, second-harmonic features have potential applications in acoustic frequency conversion, non-reciprocal wave propagation, and non-destructive testing. Precisely manipulating the harmonic band structure is a major challenge in the design of nonlinear phononic crystals. Traditional design approaches, which adjust parameters to meet specific application requirements, are inefficient and often yield suboptimal performance. Therefore, this paper develops a design methodology that uses Softmax logistic regression and multi-label classification learning to inversely design the material distribution of nonlinear phononic crystals by exploiting information from harmonic transmission spectra. The results demonstrate that the neural network-based inverse design method can effectively tailor nonlinear phononic crystals with desired functionalities. This work establishes a mapping between the band structure and the material distribution within phononic crystals, providing valuable insights into the inverse design of metamaterials.
Funding: This work is supported by the Academic Research Project of Henan Police College (Grants HNJY-2021-QN-14 and HNJY202220) and the Key Technology R&D Program of Henan Province (Grant 222102210041).
Abstract: Images have become an essential medium for expressing meaning and disseminating information. Many images are uploaded to the Internet, and some of them are pornographic, with adverse effects on public psychological health. To create a clean and positive Internet environment, network enforcement agencies need an automatic and efficient pornographic image recognition tool. Previous studies on pornographic images rely mainly on convolutional neural networks (CNNs). Because CNNs have many parameters, they depend on a large labeled training dataset, which is laborious to build. To reduce the effect of the dataset on recognition performance, many researchers treat pornographic image recognition as a binary classification task. In practice, when the network model faces pornographic images with varied features, its performance and recognition accuracy often decrease. In addition, the pornographic content in an image usually lies in several small local regions that make up only a small proportion of the image. A CNN, as a strongly supervised learning method, usually cannot focus automatically on the pornographic area of the image, which lowers recognition accuracy. This paper establishes an image dataset with seven classes by crawling pornographic websites and the Baidu Image Library, and proposes a weakly supervised pornographic image recognition method based on multiple instance learning (MIL). The Squeeze-and-Excitation (SE) module is introduced into feature extraction to strengthen critical information and weaken the influence of non-key and useless information on the recognition result. To meet the requirements of the pooling layer operation in MIL, an attention mechanism is introduced to compute a weighted average of the instances. The experimental results show that the proposed method achieves better accuracy and F1 scores than other methods.
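The attention-weighted averaging of instances mentioned in this abstract can be sketched as follows. This is a generic illustration of attention-based MIL pooling, not the paper's network: the function name `attention_pool` is hypothetical, and the attention scores are passed in directly, whereas a real model would produce them with a small learned scoring layer.

```python
import math

def attention_pool(instances, scores):
    """Pool a bag of instance feature vectors into one bag-level vector.

    instances: list of equal-length feature vectors, one per instance.
    scores: one unnormalized attention score per instance (assumed given).
    """
    # Softmax over the scores, shifted by the max for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Bag embedding is the attention-weighted average of the instances.
    dim = len(instances[0])
    bag = [sum(w * x[d] for w, x in zip(weights, instances)) for d in range(dim)]
    return weights, bag

# Three image regions (instances); the first receives a much higher score.
weights, bag = attention_pool(
    [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]],
    [5.0, 0.0, 0.0],
)
```

With these scores the first region dominates the bag embedding, which is the behavior that lets a weakly supervised model emphasize small local regions without region-level labels.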
Funding: Supported by the National High Technology Research and Development Program of China (No. 2012AA120802), the National Natural Science Foundation of China (No. 61771186), the Postdoctoral Research Project of Heilongjiang Province (No. LBH-Q15121), and the Undergraduate University Project of Young Scientist Creative Talent of Heilongjiang Province (No. UNPYSCT-2017125).
Abstract: Multi-label learning is an active research area that plays an important role in machine learning. Traditional learning algorithms, however, depend on samples with complete labels. Existing learning algorithms for missing labels do not consider the relevance between labels, which leads to label estimation errors on new samples. A new multi-label learning algorithm with support vector machine (SVM) based association (SVMA) is proposed to estimate missing labels by constructing the association between different labels. SVMA establishes a mapping function that minimizes the number of samples in the margin while keeping the margin large enough and minimizing the misclassification probability. To evaluate the performance of SVMA under missing labels, four typical data sets are adopted, with the integrity of the labels handled manually. Simulation results show that SVMA outperforms other models in image classification when dealing with samples with missing labels.
Abstract: In recent years, multi-label learning has received a lot of attention. However, most existing methods consider only global label correlation or only local label correlation. In fact, both global and local label correlations can appear at the same time in real-world situations. Moreover, methods should not be limited to pairwise labels while ignoring high-order label correlation. In this paper, we propose a novel and effective method called GLLCBN for multi-label learning. First, we obtain the global label correlation by exploiting label semantic similarity. Then, we analyze the pairwise labels in the label space of the data set to acquire the local correlation. Next, we build the initial version of the label dependency model from the global and local label correlations. After that, we use graph theory, probability theory, and Bayesian networks to eliminate redundant dependency structure in the initial model and obtain the optimal label dependency model. Finally, we obtain the feature extraction model by adjusting the Inception V3 convolutional neural network and combine it with the GLLCBN model to achieve multi-label learning. The experimental results show that the proposed model performs better than other multi-label learning methods on the evaluation metrics.
Abstract: The recent worldwide spread of pneumonia-causing viruses, such as Coronavirus, COVID-19, and H1N1, has been endangering human life all around the world. To provide useful clues for developing antiviral drugs, information on anatomical therapeutic chemicals is vitally important. In view of this, a CNN-based predictor called “iATC_Deep-mISF” has been developed. The predictor is particularly useful in dealing with multi-label systems in which some chemicals may occur in two or more different classes. To maximize convenience for most experimental scientists, a user-friendly web-server for the new predictor has been established at http://www.jci-bioinfo.cn/iATC_Deep-mISF/, which is expected to become a very powerful tool for developing effective drugs to fight the pandemic coronavirus.
Funding: Supported by the Natural Science Foundation of China (Grant Numbers 72074014 and 72004012).
Abstract: Purpose: Many science, technology and innovation (STI) resources carry several different labels. To assign the resulting labels automatically to an instance of interest, many approaches with good performance on benchmark datasets have been proposed in the literature for the multi-label classification task. Furthermore, several open-source tools implementing these approaches have been developed. However, the characteristics of real-world multi-label patent and publication datasets are not completely in line with those of benchmark datasets. Therefore, the main purpose of this paper is to evaluate seven multi-label classification methods comprehensively on real-world datasets. Research limitations: The three real-world datasets differ in the following aspects: statement, data quality, and purposes. Additionally, open-source tools designed for multi-label classification have intrinsic differences in their approaches to data processing and feature selection, which in turn affect the performance of a multi-label classification approach. In the near future, we will enhance experimental precision and reinforce the validity of the conclusions by controlling variables more rigorously through expanded parameter settings. Practical implications: The observed Macro F1 and Micro F1 scores on real-world datasets typically fall short of those achieved on benchmark datasets, underscoring the complexity of real-world multi-label classification tasks. Approaches leveraging deep learning techniques offer promising solutions by accommodating the hierarchical relationships and interdependencies among labels. With ongoing enhancements in deep learning algorithms and large-scale models, the efficacy of multi-label classification is expected to improve significantly and reach a level of practical utility in the foreseeable future. Originality/value: (1) Seven multi-label classification methods are comprehensively compared on three real-world datasets. (2) The TextCNN and TextRCNN models perform better on small-scale datasets with a more complex hierarchical structure of labels and a more balanced document-label distribution. (3) The MLkNN method works better on the larger-scale dataset with a more unbalanced document-label distribution.
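The Macro F1 and Micro F1 scores referred to in this abstract can be illustrated with the short sketch below. It uses the standard definitions on binary label-indicator matrices; the function names and the toy data are illustrative assumptions and are not taken from the paper.

```python
def f1(tp, fp, fn):
    """F1 from true-positive, false-positive and false-negative counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def macro_micro_f1(y_true, y_pred):
    """y_true, y_pred: lists of 0/1 rows, one row per document."""
    n_labels = len(y_true[0])
    counts = []
    for j in range(n_labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        counts.append((tp, fp, fn))
    # Macro: average the per-label F1 scores, so every label counts equally.
    macro = sum(f1(*c) for c in counts) / n_labels
    # Micro: pool the counts over all labels first, so frequent labels dominate.
    micro = f1(sum(c[0] for c in counts),
               sum(c[1] for c in counts),
               sum(c[2] for c in counts))
    return macro, micro

y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [1, 0, 0]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 0, 0], [1, 0, 1]]
macro, micro = macro_micro_f1(y_true, y_pred)  # macro = 5/9, micro = 8/11
```

In this toy case the rare third label is always predicted wrongly, which pulls Macro F1 well below Micro F1; this gap is one reason both scores are usually reported for unbalanced document-label distributions.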
Abstract: Multi-label image classification is an important task in computer vision, a field in which research activity has grown significantly in recent years. The widespread adoption of convolutional neural networks (CNNs) has driven the remarkable success of architectures such as ResNet-101 in image classification. However, in multi-label image classification tasks, it is crucial to consider the correlation between labels. To improve the accuracy of multi-label classification and fully combine visual and semantic features, many existing studies use graph convolutional networks (GCNs) for modeling. Object detection and multi-label image classification overlap conceptually, yet the integration of these two tasks within a unified framework has been relatively underexplored in the existing literature. In this paper, we propose the Object-GCN framework, a model that combines the object detection network YOLOv5 with a graph convolutional network, and we carry out a thorough experimental analysis on a range of well-established public datasets. The Object-GCN framework achieves significantly better performance than existing studies on the public datasets COCO2014, VOC2007, and VOC2012, with final results of 86.9%, 96.7%, and 96.3% mean Average Precision (mAP), respectively.
Funding: Supported by the Fundamental Research Funds from the Beijing University of Chinese Medicine (2023-JYB-KYPT-13) and the Developmental Fund of Beijing University of Chinese Medicine (2020-ZXFZJJ-083).
Abstract: Objective: To explore the feasibility of remotely obtaining complex information on traditional Chinese medicine (TCM) pulse conditions through voice signals. Methods: We used multi-label pulse conditions as the entry point and modeled and analyzed TCM pulse diagnosis by combining voice analysis and machine learning. Audio features were extracted from voice recordings in the TCM pulse condition dataset. The obtained features were combined with information from tongue and facial diagnoses. A multi-label pulse condition voice classification DNN model was built using 10-fold cross-validation, and the modeling methods were validated using publicly available datasets. Results: The analysis showed that the proposed method achieved an accuracy of 92.59% on the public dataset. The accuracies of the three single-label pulse manifestation models on the test set were 94.27%, 96.35%, and 95.39%. The absolute accuracy of the multi-label model was 92.74%. Conclusion: Voice data analysis may serve as a remote adjunct to the TCM diagnostic method for pulse condition assessment.
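The relationship between per-label accuracy and the "absolute accuracy" of a multi-label model can be sketched as below. The sketch assumes that absolute accuracy means the exact-match ratio (a sample counts as correct only if every label is predicted correctly); the abstract does not define the term, so this reading, the function names, and the toy data are all assumptions.

```python
def subset_accuracy(y_true, y_pred):
    """Fraction of samples whose full label vector is predicted exactly."""
    exact = sum(1 for t, p in zip(y_true, y_pred) if list(t) == list(p))
    return exact / len(y_true)

def per_label_accuracy(y_true, y_pred):
    """Accuracy of each label taken on its own."""
    n = len(y_true)
    return [
        sum(1 for t, p in zip(y_true, y_pred) if t[j] == p[j]) / n
        for j in range(len(y_true[0]))
    ]

y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 1], [0, 1, 1], [1, 1, 0], [0, 0, 1]]
print(subset_accuracy(y_true, y_pred))     # 0.75
print(per_label_accuracy(y_true, y_pred))  # [1.0, 1.0, 0.75]
```

Under this definition the exact-match score can never exceed the lowest per-label accuracy, since any single wrong label makes the whole sample count as wrong.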
Funding: Supported by the Shenzhen Fundamental Research Fund of China (Grant No. JCYJ20190808142613246), the National Natural Science Foundation of China (Grant No. 51805332), and the Young Elite Scientists Sponsorship Program funded by the China Society of Automotive Engineers.
Abstract: To reduce the discrepancy between the source and target domains, this paper proposes a new multi-label adaptation network (ML-ANet) based on multiple-kernel variants of maximum mean discrepancy. The hidden representations of the task-specific layers in ML-ANet are embedded in a reproducing kernel Hilbert space (RKHS) so that the mean embeddings of specific features in different domains can be precisely matched. Multiple kernel functions are used to improve feature distribution efficiency for explicit mean embedding matching, which further reduces domain discrepancy. Adverse weather and cross-camera adaptation experiments are conducted to verify the effectiveness of the proposed ML-ANet. The results show that ML-ANet achieves higher accuracies than the compared state-of-the-art methods for multi-label image classification in both the adverse weather adaptation and cross-camera adaptation experiments. These results indicate that ML-ANet can alleviate the reliance on fully labeled training data and improve the accuracy of multi-label image classification in various domain shift scenarios.
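The idea of matching mean embeddings under several kernels can be illustrated with a small sketch of a multi-kernel maximum mean discrepancy estimate. It is a generic textbook-style estimator, not ML-ANet's loss: the equal-weight mixture of Gaussian kernels, the bandwidth values, and the function names are assumptions made for illustration.

```python
import math

def gaussian_kernel(x, y, bandwidth):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def mk_mmd2(source, target, bandwidths=(0.5, 1.0, 2.0)):
    """Biased estimate of squared MMD under an equal-weight kernel mixture."""
    def k(x, y):
        return sum(gaussian_kernel(x, y, b) for b in bandwidths) / len(bandwidths)

    def mean_k(a, b):
        return sum(k(x, y) for x in a for y in b) / (len(a) * len(b))

    # ||mean embedding of source - mean embedding of target||^2 in the RKHS.
    return mean_k(source, source) + mean_k(target, target) - 2.0 * mean_k(source, target)

source = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
near = [[x + 0.1, y + 0.1] for x, y in source]  # small domain shift
far = [[x + 3.0, y + 3.0] for x, y in source]   # large domain shift
print(mk_mmd2(source, near) < mk_mmd2(source, far))  # True
```

A domain adaptation network would add a term like this, computed on hidden-layer features, to its training loss so that source and target features are pulled toward the same distribution.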
Funding: This work was supported by the National Science Foundation of China (62176055) and the China University S&T Innovation Plan Guided by the Ministry of Education.
Abstract: Multi-label learning deals with objects associated with multiple class labels and aims to induce a predictive model that can assign a set of relevant class labels to an unseen instance. Since each class might possess its own characteristics, the strategy of extracting label-specific features has been widely employed to improve the discrimination process in multi-label learning, where the predictive model is induced from tailored features specific to each class label instead of identical instance representations. As a representative approach, LIFT generates label-specific features by conducting clustering analysis. However, its performance may be degraded by the inherent instability of a single clustering algorithm. To improve on this, a novel multi-label learning approach named SENCE (stable label-Specific features gENeration for multi-label learning via mixture-based Clustering Ensemble) is proposed, which stabilizes the generation of label-specific features via clustering ensemble techniques. Specifically, more stable clustering results are obtained by first augmenting the original instance representation with cluster assignments from base clusters and then fitting a mixture model via the expectation-maximization (EM) algorithm. Extensive experiments on eighteen benchmark data sets show that SENCE performs better than LIFT and other well-established multi-label learning algorithms.
Funding: Project supported by the National Natural Science Foundation of China (Grant No. 60605012), the Natural Science Foundation of Shanghai (Grant No. 08ZR1408200), the Open Project Program of the National Laboratory of Pattern Recognition of China (Grant No. 08-2-16), and the Shanghai Leading Academic Discipline Project (Grant No. J50103).
Abstract: In this paper, we use the framework of multi-label learning for face demographic classification. We also explore suitable classifiers and features for this task. The three most popular kinds of demographic information, gender, ethnicity, and age, are considered in the experiments. Based on the classification results, we use statistical analysis to explore the correlation among the various kinds of face demographic information. From this analysis, we draw several conclusions on the correlation and interaction among these high-level face semantics. The results can be helpful in automatic face semantic annotation and other face analysis tasks.
Abstract: This paper studies an urban security risk assessment model based on multi-label learning. Through a series of transformations, the model is reduced to solving a system of linear equations, which is in turn recast as an optimization problem. Several classical optimization algorithms are then used to solve this optimization problem, the convergence of the algorithms is proved, and the advantages and disadvantages of the optimization methods are compared.
Funding: Supported by the State Grid Technology Item (52460D230002).
Abstract: Exploiting label correlations is an essential data mining technique that addresses the possible correlation between different labels in multi-label classification. Although this technique is widely used in multi-label classification problems, most approaches rely on batch learning, which consumes a lot of time and space. Unlike traditional batch learning methods, online learning represents a promising family of efficient and scalable machine learning algorithms for large-scale datasets. However, existing online learning research has done little to consider correlations between labels. Building on existing research, this paper proposes a multi-label online learning algorithm based on label correlations that maximizes the margin between related and unrelated labels in multi-label samples. We evaluate the performance of the proposed algorithm on several public datasets. Experiments show the effectiveness of the algorithm.
Abstract: Recently, human life around the entire world has been endangered by the spread of pneumonia-causing viruses, such as Coronavirus, COVID-19, and H1N1. To develop effective drugs against Coronavirus, knowledge of protein subcellular localization is indispensable. In 2019, a predictor called “pLoc_bal-mHum” was developed for identifying the subcellular localization of human proteins. Its predicted results are significantly better than those of its counterparts, particularly for proteins that may simultaneously occur in, or move between, two or more subcellular location sites. However, further effort is needed to improve its power, since pLoc_bal-mHum was not trained with deep learning, a very powerful technique developed recently. The present study incorporates the deep-learning technique to develop a new predictor called “pLoc_Deep-mHum”. The global absolute true rate achieved by the new predictor is over 81% and its local accuracy is over 90%. Both are far superior to those of its counterparts. Moreover, a user-friendly web-server for the new predictor has been established at http://www.jci-bioinfo.cn/pLoc_Deep-mHum/, which is expected to become a very useful tool for fighting the pandemic coronavirus.
Abstract: Recently, human life worldwide has been endangered by the spread of pneumonia-causing viruses, such as Coronavirus, COVID-19, and H1N1. To develop effective drugs against Coronavirus, knowledge of protein subcellular localization is a prerequisite. In 2019, a predictor called “pLoc_bal-mEuk” was developed for identifying the subcellular localization of eukaryotic proteins. Its predicted results are significantly better than those of its counterparts, particularly for proteins that may simultaneously occur in, or move between, two or more subcellular location sites. However, further effort is needed to improve its power, since pLoc_bal-mEuk was not trained with deep learning, a very powerful technique developed recently. The present study incorporated the deep-learning technique and developed a new predictor called “pLoc_Deep-mEuk”. The global absolute true rate achieved by the new predictor is over 81% and its local accuracy is over 90%. Both are far superior to those of its counterparts. Moreover, a user-friendly web-server for the new predictor has been established at http://www.jci-bioinfo.cn/pLoc_Deep-mEuk/, by which the majority of experimental scientists can easily get their desired data.
Abstract: The recent worldwide spread of pneumonia-causing viruses, such as Coronavirus, COVID-19, and H1N1, has been endangering human life all around the world. To understand the biological process at the cellular level and provide useful clues for developing antiviral drugs, information on virus protein subcellular localization is vitally important. In view of this, a CNN-based virus protein subcellular localization predictor called “pLoc_Deep-mVirus” was developed. The predictor is particularly useful in dealing with multi-site systems, in which some proteins may simultaneously occur in two or more different organelles and which are a current focus of the pharmaceutical industry. The global absolute true rate achieved by the new predictor is over 97% and its local accuracy is over 98%. Both significantly exceed those of other existing state-of-the-art predictors. It has not escaped our notice that the deep-learning treatment can be used to deal with many other biological systems as well. To maximize convenience for most experimental scientists, a user-friendly web-server for the new predictor has been established at http://www.jci-bioinfo.cn/pLoc_Deep-mVirus/.
Abstract: The current coronavirus pandemic has endangered human life, and the number of reported cases is increasing exponentially. Information on plant protein subcellular localization can provide useful clues for developing antiviral drugs. To cope with such a catastrophe, a CNN-based plant protein subcellular localization predictor called “pLoc_Deep-mPlant” was developed. The predictor is particularly useful in dealing with multi-site systems, in which some proteins may simultaneously occur in two or more different organelles and which are a current focus of the pharmaceutical industry. The global absolute true rate achieved by the new predictor is over 95% and its local accuracy is about 90% to 100%. Both substantially exceed those of other existing state-of-the-art predictors. To maximize convenience for most experimental scientists, a user-friendly web-server for the new predictor has been established at http://www.jci-bioinfo.cn/pLoc_Deep-mPlant/, by which the majority of experimental scientists can easily obtain their desired data without the need to go through the mathematical details.
Abstract: The recent worldwide spread of pneumonia-causing viruses, such as Coronavirus, COVID-19, and H1N1, has been endangering human life all around the world. To understand the biological process at the cellular level and provide useful clues for developing antiviral drugs, information on Gram-negative bacterial protein subcellular localization is vitally important. In view of this, a CNN-based protein subcellular localization predictor called “pLoc_Deep-mGnet” was developed. The predictor is particularly useful in dealing with multi-site systems, in which some proteins may simultaneously occur in two or more different organelles and which are a current focus of the pharmaceutical industry. The global absolute true rate achieved by the new predictor is over 98% and its local accuracy is around 94% to 100%. Both significantly exceed those of other existing state-of-the-art predictors. To maximize convenience for most experimental scientists, a user-friendly web-server for the new predictor has been established at http://www.jci-bioinfo.cn/pLoc_Deep-mGneg/, which is expected to become a very useful tool for fighting the pandemic coronavirus.
Abstract: The recent worldwide spread of pneumonia-causing viruses, such as Coronavirus, COVID-19, and H1N1, has been endangering human life all around the world. To understand the biological process at the cellular level and provide useful clues for developing antiviral drugs, information on Gram-positive bacterial protein subcellular localization is vitally important. In view of this, a CNN-based protein subcellular localization predictor called “pLoc_Deep-mGpos” was developed. The predictor is particularly useful in dealing with multi-site systems, in which some proteins may simultaneously occur in two or more different organelles and which are a current focus of the pharmaceutical industry. The global absolute true rate achieved by the new predictor is over 99% and its local accuracy is around 92% to 99%. Both significantly exceed those of other existing state-of-the-art predictors. To maximize convenience for most experimental scientists, a user-friendly web-server for the new predictor has been established at http://www.jci-bioinfo.cn/pLoc_Deep-mGpos/, which is expected to become a very powerful tool for developing effective drugs to fight the pandemic coronavirus.