Epilepsy is a central nervous system disorder in which brain activity becomes abnormal.Electroencephalogram(EEG)signals,as recordings of brain activity,have been widely used for epilepsy recognition.To study epilep-ti...Epilepsy is a central nervous system disorder in which brain activity becomes abnormal.Electroencephalogram(EEG)signals,as recordings of brain activity,have been widely used for epilepsy recognition.To study epilep-tic EEG signals and develop artificial intelligence(AI)-assist recognition,a multi-view transfer learning(MVTL-LSR)algorithm based on least squares regression is proposed in this study.Compared with most existing multi-view transfer learning algorithms,MVTL-LSR has two merits:(1)Since traditional transfer learning algorithms leverage knowledge from different sources,which poses a significant risk to data privacy.Therefore,we develop a knowledge transfer mechanism that can protect the security of source domain data while guaranteeing performance.(2)When utilizing multi-view data,we embed view weighting and manifold regularization into the transfer framework to measure the views’strengths and weaknesses and improve generalization ability.In the experimental studies,12 different simulated multi-view&transfer scenarios are constructed from epileptic EEG signals licensed and provided by the Uni-versity of Bonn,Germany.Extensive experimental results show that MVTL-LSR outperforms baselines.The source code will be available on https://github.com/didid5/MVTL-LSR.展开更多
Deep multi-view subspace clustering (DMVSC) based on self-expression has attracted increasing attention dueto its outstanding performance and nonlinear application. However, most existing methods neglect that viewpriv...Deep multi-view subspace clustering (DMVSC) based on self-expression has attracted increasing attention dueto its outstanding performance and nonlinear application. However, most existing methods neglect that viewprivatemeaningless information or noise may interfere with the learning of self-expression, which may lead to thedegeneration of clustering performance. In this paper, we propose a novel framework of Contrastive Consistencyand Attentive Complementarity (CCAC) for DMVsSC. CCAC aligns all the self-expressions of multiple viewsand fuses them based on their discrimination, so that it can effectively explore consistent and complementaryinformation for achieving precise clustering. Specifically, the view-specific self-expression is learned by a selfexpressionlayer embedded into the auto-encoder network for each view. To guarantee consistency across views andreduce the effect of view-private information or noise, we align all the view-specific self-expressions by contrastivelearning. The aligned self-expressions are assigned adaptive weights by channel attention mechanism according totheir discrimination. Then they are fused by convolution kernel to obtain consensus self-expression withmaximumcomplementarity ofmultiple views. Extensive experimental results on four benchmark datasets and one large-scaledataset of the CCAC method outperformother state-of-the-artmethods, demonstrating its clustering effectiveness.展开更多
Domain adaptation aims to transfer knowledge from the labeled source domain to an unlabeled target domain that follows a similar but different distribution.Recently,adversarial-based methods have achieved remarkable s...Domain adaptation aims to transfer knowledge from the labeled source domain to an unlabeled target domain that follows a similar but different distribution.Recently,adversarial-based methods have achieved remarkable success due to the excellent performance of domain-invariant feature presentation learning.However,the adversarial methods learn the transferability at the expense of the discriminability in feature representation,leading to low generalization to the target domain.To this end,we propose a Multi-view Feature Learning method for the Over-penalty in Adversarial Domain Adaptation.Specifically,multi-view representation learning is proposed to enrich the discriminative information contained in domain-invariant feature representation,which will counter the over-penalty for discriminability in adversarial training.Besides,the class distribution in the intra-domain is proposed to replace that in the inter-domain to capture more discriminative information in the learning of transferrable features.Extensive experiments show that our method can improve the discriminability while maintaining transferability and exceeds the most advanced methods in the domain adaptation benchmark datasets.展开更多
With the enhancement of data collection capabilities,massive streaming data have been accumulated in numerous application scenarios.Specifically,the issue of classifying data streams based on mobile sensors can be for...With the enhancement of data collection capabilities,massive streaming data have been accumulated in numerous application scenarios.Specifically,the issue of classifying data streams based on mobile sensors can be formalized as a multi-task multi-view learning problem with a specific task comprising multiple views with shared features collected from multiple sensors.Existing incremental learning methods are often single-task single-view,which cannot learn shared representations between relevant tasks and views.An adaptive multi-task multi-view incremental learning framework for data stream classification called MTMVIS is proposed to address the above challenges,utilizing the idea of multi-task multi-view learning.Specifically,the attention mechanism is first used to align different sensor data of different views.In addition,MTMVIS uses adaptive Fisher regularization from the perspective of multi-task multi-view learning to overcome catastrophic forgetting in incremental learning.Results reveal that the proposed framework outperforms state-of-the-art methods based on the experiments on two different datasets with other baselines.展开更多
To improve the accuracy and robustness of rolling bearing fault diagnosis under complex conditions, a novel method based on multi-view feature fusion is proposed. Firstly, multi-view features from perspectives of the ...To improve the accuracy and robustness of rolling bearing fault diagnosis under complex conditions, a novel method based on multi-view feature fusion is proposed. Firstly, multi-view features from perspectives of the time domain, frequency domain and time-frequency domain are extracted through the Fourier transform, Hilbert transform and empirical mode decomposition (EMD).Then, the random forest model (RF) is applied to select features which are highly correlated with the bearing operating state. Subsequently, the selected features are fused via the autoencoder (AE) to further reduce the redundancy. Finally, the effectiveness of the fused features is evaluated by the support vector machine (SVM). The experimental results indicate that the proposed method based on the multi-view feature fusion can effectively reflect the difference in the state of the rolling bearing, and improve the accuracy of fault diagnosis.展开更多
Hashing technology has the advantages of reducing data storage and improving the efficiency of the learning system,making it more and more widely used in image retrieval.Multi-view data describes image information mor...Hashing technology has the advantages of reducing data storage and improving the efficiency of the learning system,making it more and more widely used in image retrieval.Multi-view data describes image information more comprehensively than traditional methods using a single-view.How to use hashing to combine multi-view data for image retrieval is still a challenge.In this paper,a multi-view fusion hashing method based on RKCCA(Random Kernel Canonical Correlation Analysis)is proposed.In order to describe image content more accurately,we use deep learning dense convolutional network feature DenseNet to construct multi-view by combining GIST feature or BoW_SIFT(Bag-of-Words model+SIFT feature)feature.This algorithm uses RKCCA method to fuse multi-view features to construct association features and apply them to image retrieval.The algorithm generates binary hash code with minimal distortion error by designing quantization regularization terms.A large number of experiments on benchmark datasets show that this method is superior to other multi-view hashing methods.展开更多
Transfer learning aims at leveraging the knowledge in labeled source domains to predict the unlabeled data in a target domain, where the distributions are diiTerent in domains. Among various methods for transfer learn...Transfer learning aims at leveraging the knowledge in labeled source domains to predict the unlabeled data in a target domain, where the distributions are diiTerent in domains. Among various methods for transfer learning, one kind of Mgorithms focus on the correspondence between bridge features and all the other specific features from different domains, and later conduct transfer learning via the single-view correspondence. However, the single-view correspondence may prevent these algorithms from further improvement due to the problem of incorrect correlation discovery. To tackle this problem, we propose a new method for transfer learning in a multi-view correspondence perspective, which is called MultiView Principal Component Analysis (MVPCA) approach. MVPCA discovers the correspondence between bridge features representative across all domains and specific features from different domains respectively, and conducts the transfer learning by dimensionality reduction in a multi-view way, which can better depict the knowledge transfer. Experiments show that MVPCA can significantly reduce the cross domain prediction error of a baseline non-transfer method. With multi-view correspondence information incorporated to the single-view transfer learning method, MVPCA can further improve the performance of one state-of-the-art single-view method.展开更多
In this paper, we propose a multi-kernel multi-view canonical correlations(M2CCs) framework for subspace learning. In the proposed framework,the input data of each original view are mapped into multiple higher dimensi...In this paper, we propose a multi-kernel multi-view canonical correlations(M2CCs) framework for subspace learning. In the proposed framework,the input data of each original view are mapped into multiple higher dimensional feature spaces by multiple nonlinear mappings determined by different kernels. This makes M2 CC can discover multiple kinds of useful information of each original view in the feature spaces. With the framework, we further provide a specific multi-view feature learning method based on direct summation kernel strategy and its regularized version. The experimental results in visual recognition tasks demonstrate the effectiveness and robustness of the proposed method.展开更多
Label noise is often contained in the training data due to various human factors or measurement errors,which significantly causes a negative effect on classifiers.Despite many previous methods that have been proposed ...Label noise is often contained in the training data due to various human factors or measurement errors,which significantly causes a negative effect on classifiers.Despite many previous methods that have been proposed to learn robust classifiers,they are mainly based on the single-view feature.On the other hand,although existing multi-view classification methods benefit from the more comprehensive information,they rarely consider label noise.In this paper,we propose a novel label-noise robust classification model with multi-view learning to overcome these limitations.In the proposed model,not only the classifier learning but also the label-noise removal can benefit from the multi-view information.Specifically,we relax the label matrix of the basic multi-view least squares regression model,and develop a nonlinear transformation with a natural probabilistic approximation in the process of labels,which is conveniently optimized and beneficial to improve the discriminative ability of classifiers.Moreover,we preserve the intrinsic manifold structure of multi-view data on the relaxed label matrix,facilitating the process of label relaxation.For optimizing the proposed model with the nonlinear transformation,we derive a lemma about the partial derivation of the softmax related function,and develop an efficient alternating algorithm.Experimental evaluations on six real-world datasets confirm the advantages of the proposed method,compared to the related state-of-the-art methods.展开更多
Human action recognition is currently one of the most active research areas in computer vision. It has been widely used in many applications, such as intelligent surveillance, perceptual interface, and content-based v...Human action recognition is currently one of the most active research areas in computer vision. It has been widely used in many applications, such as intelligent surveillance, perceptual interface, and content-based video retrieval. However, some extrinsic factors are barriers for the development of action recognition; e.g., human actions may be observed from arbitrary camera viewpoints in realistic scene. Thus, view-invariant analysis becomes important for action recognition algorithms, and a number of researchers have paid much attention to this issue. In this paper, we present a multi-view learning approach to recognize human actions from different views. As most existing multi-view learning algorithms often suffer from the problem of lacking data adaptiveness in the nearest neighborhood graph construction procedure, a robust locally adaptive multi-view learning algorithm based on learning multiple local L 1-graphs is proposed. Moreover, an efficient iterative optimization method is proposed to solve the proposed objective function. Experiments on three public view-invariant action recognition datasets, i.e., ViHASi, IXMAS, and WVU, demonstrate data adaptiveness, effectiveness, and efficiency of our algorithm. More importantly, when the feature dimension is correctly selected (i.e., 〉60), the proposed algorithm stably outperforms state-of-the-art counterparts and obtains about 6% improvement in recognition accuracy on the three datasets.展开更多
The existing multi-view subspace clustering algorithms based on tensor singular value decomposition(t-SVD)predominantly utilize tensor nuclear norm to explore the intra view correlation between views of the same sampl...The existing multi-view subspace clustering algorithms based on tensor singular value decomposition(t-SVD)predominantly utilize tensor nuclear norm to explore the intra view correlation between views of the same samples,while neglecting the correlation among the samples within different views.Moreover,the tensor nuclear norm is not fully considered as a convex approximation of the tensor rank function.Treating different singular values equally may result in suboptimal tensor representation.A hypergraph regularized multi-view subspace clustering algorithm with dual tensor log-determinant(HRMSC-DTL)was proposed.The algorithm used subspace learning in each view to learn a specific set of affinity matrices,and introduced a non-convex tensor log-determinant function to replace the tensor nuclear norm to better improve global low-rankness.It also introduced hyper-Laplacian regularization to preserve the local geometric structure embedded in the high-dimensional space.Furthermore,it rotated the original tensor and incorporated a dual tensor mechanism to fully exploit the intra view correlation of the original tensor and the inter view correlation of the rotated tensor.At the same time,an alternating direction of multipliers method(ADMM)was also designed to solve non-convex optimization model.Experimental evaluations on seven widely used datasets,along with comparisons to several state-of-the-art algorithms,demonstrated the superiority and effectiveness of the HRMSC-DTL algorithm in terms of clustering performance.展开更多
3D human pose estimation is a major focus area in the field of computer vision,which plays an important role in practical applications.This article summarizes the framework and research progress related to the estimat...3D human pose estimation is a major focus area in the field of computer vision,which plays an important role in practical applications.This article summarizes the framework and research progress related to the estimation of monocular RGB images and videos.An overall perspective ofmethods integrated with deep learning is introduced.Novel image-based and video-based inputs are proposed as the analysis framework.From this viewpoint,common problems are discussed.The diversity of human postures usually leads to problems such as occlusion and ambiguity,and the lack of training datasets often results in poor generalization ability of the model.Regression methods are crucial for solving such problems.Considering image-based input,the multi-view method is commonly used to solve occlusion problems.Here,the multi-view method is analyzed comprehensively.By referring to video-based input,the human prior knowledge of restricted motion is used to predict human postures.In addition,structural constraints are widely used as prior knowledge.Furthermore,weakly supervised learningmethods are studied and discussed for these two types of inputs to improve the model generalization ability.The problem of insufficient training datasets must also be considered,especially because 3D datasets are usually biased and limited.Finally,emerging and popular datasets and evaluation indicators are discussed.The characteristics of the datasets and the relationships of the indicators are explained and highlighted.Thus,this article can be useful and instructive for researchers who are lacking in experience and find this field confusing.In addition,by providing an overview of 3D human pose estimation,this article sorts and refines recent studies on 3D human pose estimation.It describes kernel problems and common useful methods,and discusses the scope for further research.展开更多
We address the problem of metric learning for multi-view data. Many metric learning algorithms have been proposed, most of them focus just on single view circumstances, and only a few deal with multi-view data. In thi...We address the problem of metric learning for multi-view data. Many metric learning algorithms have been proposed, most of them focus just on single view circumstances, and only a few deal with multi-view data. In this paper, motivated by the co-training framework, we propose an algorithm-independent framework, named co-metric, to learn Mahalanobis metrics in multi-view settings. In its implementation, an off-the-shelf single-view metric learning algorithm is used to learn metrics in individual views of a few labeled examples. Then the most confidently-labeled examples chosen from the unlabeled set are used to guide the metric learning in the next loop. This procedure is repeated until some stop criteria are met. The framework can accommodate most existing metric learning algorithms whether types-of- side-information or example-labels are used. In addition it can naturally deal with semi-supervised circumstances under more than two views. Our comparative experiments demon- strate its competiveness and effectiveness.展开更多
Real-world data can often be represented in multiple forms and views,and analyzing data from different perspectives allows for more comprehensive learning of the data,resulting in better data clustering results.Non-ne...Real-world data can often be represented in multiple forms and views,and analyzing data from different perspectives allows for more comprehensive learning of the data,resulting in better data clustering results.Non-negative matrix factorization(NMF)is used to solve the clustering problem to extract uniform discriminative low-dimensional features from multi-view data.Many clustering methods based on graph regularization have been proposed and proven to be effective,but ordinary graphs only consider pairwise relationships between samples.In order to learn the higher-order relationships that exist in the sample manifold and feature manifold of multi-view data,we propose a new semi-supervised multi-view clustering method called dual hypergraph regularized partially shared non-negative matrix factorization(DHPS-NMF).The complex manifold structure of samples and features is learned by constructing samples and feature hypergraphs.To improve the discrimination power of the obtained lowdimensional features,semi-supervised regression terms are incorporated into the model to effectively use the label information when capturing the complex manifold structure of the data.Ultimately,we conduct experiments on six real data sets and the results show that our algorithm achieves encouraging results in comparison with some methods.展开更多
In the big data era, the data are generated from different sources or observed from different views. These data are referred to as multi-view data. Unleashing the power of knowledge in multi-view data is very importan...In the big data era, the data are generated from different sources or observed from different views. These data are referred to as multi-view data. Unleashing the power of knowledge in multi-view data is very important in big data mining and analysis. This calls for advanced techniques that consider the diversity of different views,while fusing these data. Multi-view Clustering(MvC) has attracted increasing attention in recent years by aiming to exploit complementary and consensus information across multiple views. This paper summarizes a large number of multi-view clustering algorithms, provides a taxonomy according to the mechanisms and principles involved, and classifies these algorithms into five categories, namely, co-training style algorithms, multi-kernel learning, multiview graph clustering, multi-view subspace clustering, and multi-task multi-view clustering. Therein, multi-view graph clustering is further categorized as graph-based, network-based, and spectral-based methods. Multi-view subspace clustering is further divided into subspace learning-based, and non-negative matrix factorization-based methods. This paper does not only introduce the mechanisms for each category of methods, but also gives a few examples for how these techniques are used. In addition, it lists some publically available multi-view datasets.Overall, this paper serves as an introductory text and survey for multi-view clustering.展开更多
This paper tackles the problem of video concept detection using the multi-modality fusion method. Motivated by multi-view learning algorithms, multi-modality features of videos can be represented by multiple graphs. A...This paper tackles the problem of video concept detection using the multi-modality fusion method. Motivated by multi-view learning algorithms, multi-modality features of videos can be represented by multiple graphs. And the graph-based semi-supervised learning methods can be extended to multiple graphs to predict the semantic labels for unlabeled video data. However, traditional graphs represent only homogeneous pairwise linking relations, and therefore the high-order correlations inherent in videos, such as high-order visual similarities, are ignored. In this paper we represent heterogeneous features by multiple hypergraphs and then the high-order correlated samples can be associated with hyperedges. Furthermore, the multi-hypergraph ranking (MHR) algorithm is proposed by defining Markov random walk on each hypergraph and then forming the mixture Markov chains so as to perform transductive learning in multiple hypergraphs. In experiments on the TRECVID dataset, a triple-hypergraph consisting of visual, textual features and multiple labeled tags is constructed to predict concept labels for unlabeled video shots by the MHR framework. Experimental results show that our approach is effective.展开更多
In order to reconstruct and render the weak and repetitive texture of the damaged functional surface of aviation,an improved neural radiance field,named TranSR-NeRF,is proposed.In this paper,a data acquisition system ...In order to reconstruct and render the weak and repetitive texture of the damaged functional surface of aviation,an improved neural radiance field,named TranSR-NeRF,is proposed.In this paper,a data acquisition system was designed and built.The acquired images generated initial point clouds through TransMVSNet.Meanwhile,after extracting features from the images through the improved SE-ConvNeXt network,the extracted features were aligned and fused with the initial point cloud to generate high-quality neural point cloud.After ray-tracing and sampling of the neural point cloud,the ResMLP neural network designed in this paper was used to regress the volume density and radiance under a given viewing angle,which introduced spatial coordinate and relative positional encoding.The reconstruction and rendering of arbitrary-scale super-resolution of damaged functional surface is realized.In this paper,the influence of illumination conditions and background environment on the model performance is also studied through experiments,and the comparison and ablation experiments for the improved methods proposed in this paper is conducted.The experimental results show that the improved model has good effect.Finally,the application experiment of object detection task is carried out,and the experimental results show that the model has good practicability.展开更多
基金supported in part by the National Natural Science Foundation of China(Grant No.82072019)the Shenzhen Basic Research Program(JCYJ20210324130209023)of Shenzhen Science and Technology Innovation Committee+6 种基金the Shenzhen-Hong Kong-Macao S&T Program(Category C)(SGDX20201103095002019)the Natural Science Foundation of Jiangsu Province(No.BK20201441)the Provincial and Ministry Co-constructed Project of Henan Province Medical Science and Technology Research(SBGJ202103038 and SBGJ202102056)the Henan Province Key R&D and Promotion Project(Science and Technology Research)(222102310015)the Natural Science Foundation of Henan Province(222300420575)the Henan Province Science and Technology Research(222102310322)The Jiangsu Students’Innovation and Entrepreneurship Training Program(202110304096Y).
文摘Epilepsy is a central nervous system disorder in which brain activity becomes abnormal.Electroencephalogram(EEG)signals,as recordings of brain activity,have been widely used for epilepsy recognition.To study epilep-tic EEG signals and develop artificial intelligence(AI)-assist recognition,a multi-view transfer learning(MVTL-LSR)algorithm based on least squares regression is proposed in this study.Compared with most existing multi-view transfer learning algorithms,MVTL-LSR has two merits:(1)Since traditional transfer learning algorithms leverage knowledge from different sources,which poses a significant risk to data privacy.Therefore,we develop a knowledge transfer mechanism that can protect the security of source domain data while guaranteeing performance.(2)When utilizing multi-view data,we embed view weighting and manifold regularization into the transfer framework to measure the views’strengths and weaknesses and improve generalization ability.In the experimental studies,12 different simulated multi-view&transfer scenarios are constructed from epileptic EEG signals licensed and provided by the Uni-versity of Bonn,Germany.Extensive experimental results show that MVTL-LSR outperforms baselines.The source code will be available on https://github.com/didid5/MVTL-LSR.
文摘Deep multi-view subspace clustering (DMVSC) based on self-expression has attracted increasing attention dueto its outstanding performance and nonlinear application. However, most existing methods neglect that viewprivatemeaningless information or noise may interfere with the learning of self-expression, which may lead to thedegeneration of clustering performance. In this paper, we propose a novel framework of Contrastive Consistencyand Attentive Complementarity (CCAC) for DMVsSC. CCAC aligns all the self-expressions of multiple viewsand fuses them based on their discrimination, so that it can effectively explore consistent and complementaryinformation for achieving precise clustering. Specifically, the view-specific self-expression is learned by a selfexpressionlayer embedded into the auto-encoder network for each view. To guarantee consistency across views andreduce the effect of view-private information or noise, we align all the view-specific self-expressions by contrastivelearning. The aligned self-expressions are assigned adaptive weights by channel attention mechanism according totheir discrimination. Then they are fused by convolution kernel to obtain consensus self-expression withmaximumcomplementarity ofmultiple views. Extensive experimental results on four benchmark datasets and one large-scaledataset of the CCAC method outperformother state-of-the-artmethods, demonstrating its clustering effectiveness.
基金This work is supported in part by the National Natural Science Foundation of China under grants(62076087,61976077)Anhui Provincial Natural Science Foundation under grants(2208085MF170).
文摘Domain adaptation aims to transfer knowledge from the labeled source domain to an unlabeled target domain that follows a similar but different distribution.Recently,adversarial-based methods have achieved remarkable success due to the excellent performance of domain-invariant feature presentation learning.However,the adversarial methods learn the transferability at the expense of the discriminability in feature representation,leading to low generalization to the target domain.To this end,we propose a Multi-view Feature Learning method for the Over-penalty in Adversarial Domain Adaptation.Specifically,multi-view representation learning is proposed to enrich the discriminative information contained in domain-invariant feature representation,which will counter the over-penalty for discriminability in adversarial training.Besides,the class distribution in the intra-domain is proposed to replace that in the inter-domain to capture more discriminative information in the learning of transferrable features.Extensive experiments show that our method can improve the discriminability while maintaining transferability and exceeds the most advanced methods in the domain adaptation benchmark datasets.
文摘With the enhancement of data collection capabilities,massive streaming data have been accumulated in numerous application scenarios.Specifically,the issue of classifying data streams based on mobile sensors can be formalized as a multi-task multi-view learning problem with a specific task comprising multiple views with shared features collected from multiple sensors.Existing incremental learning methods are often single-task single-view,which cannot learn shared representations between relevant tasks and views.An adaptive multi-task multi-view incremental learning framework for data stream classification called MTMVIS is proposed to address the above challenges,utilizing the idea of multi-task multi-view learning.Specifically,the attention mechanism is first used to align different sensor data of different views.In addition,MTMVIS uses adaptive Fisher regularization from the perspective of multi-task multi-view learning to overcome catastrophic forgetting in incremental learning.Results reveal that the proposed framework outperforms state-of-the-art methods based on the experiments on two different datasets with other baselines.
基金The National Natural Science Foundation of China(No.51875100)
文摘To improve the accuracy and robustness of rolling bearing fault diagnosis under complex conditions, a novel method based on multi-view feature fusion is proposed. Firstly, multi-view features from perspectives of the time domain, frequency domain and time-frequency domain are extracted through the Fourier transform, Hilbert transform and empirical mode decomposition (EMD).Then, the random forest model (RF) is applied to select features which are highly correlated with the bearing operating state. Subsequently, the selected features are fused via the autoencoder (AE) to further reduce the redundancy. Finally, the effectiveness of the fused features is evaluated by the support vector machine (SVM). The experimental results indicate that the proposed method based on the multi-view feature fusion can effectively reflect the difference in the state of the rolling bearing, and improve the accuracy of fault diagnosis.
基金This work is supported by the National Natural Science Foundation of China(No.61772561)the Key Research&Development Plan of Hunan Province(No.2018NK2012)+1 种基金the Science Research Projects of Hunan Provincial Education Department(Nos.18A174,18C0262)the Science&Technology Innovation Platform and Talent Plan of Hunan Province(2017TP1022).
文摘Hashing technology has the advantages of reducing data storage and improving the efficiency of the learning system,making it more and more widely used in image retrieval.Multi-view data describes image information more comprehensively than traditional methods using a single-view.How to use hashing to combine multi-view data for image retrieval is still a challenge.In this paper,a multi-view fusion hashing method based on RKCCA(Random Kernel Canonical Correlation Analysis)is proposed.In order to describe image content more accurately,we use deep learning dense convolutional network feature DenseNet to construct multi-view by combining GIST feature or BoW_SIFT(Bag-of-Words model+SIFT feature)feature.This algorithm uses RKCCA method to fuse multi-view features to construct association features and apply them to image retrieval.The algorithm generates binary hash code with minimal distortion error by designing quantization regularization terms.A large number of experiments on benchmark datasets show that this method is superior to other multi-view hashing methods.
基金supported by the National Natural Science Foundation of China under Grant No.61003112the National Basic Research 973 Program of China under Grant No.2010CB327903+1 种基金the National High-Tech Research and Development 863 Program of China under Grant No.2006AA010109the Natural Science Foundation of Jiangsu Province under Grant No.2009233
文摘Transfer learning aims at leveraging the knowledge in labeled source domains to predict the unlabeled data in a target domain, where the distributions are diiTerent in domains. Among various methods for transfer learning, one kind of Mgorithms focus on the correspondence between bridge features and all the other specific features from different domains, and later conduct transfer learning via the single-view correspondence. However, the single-view correspondence may prevent these algorithms from further improvement due to the problem of incorrect correlation discovery. To tackle this problem, we propose a new method for transfer learning in a multi-view correspondence perspective, which is called MultiView Principal Component Analysis (MVPCA) approach. MVPCA discovers the correspondence between bridge features representative across all domains and specific features from different domains respectively, and conducts the transfer learning by dimensionality reduction in a multi-view way, which can better depict the knowledge transfer. Experiments show that MVPCA can significantly reduce the cross domain prediction error of a baseline non-transfer method. With multi-view correspondence information incorporated to the single-view transfer learning method, MVPCA can further improve the performance of one state-of-the-art single-view method.
基金supported by the National Natural Science Foundation of China under Grant Nos. 61402203, 61273251, and 61170120the Fundamental Research Funds for the Central Universities under Grant No. JUSRP11458the Program for New Century Excellent Talents in University under Grant No. NCET-12-0881
文摘In this paper, we propose a multi-kernel multi-view canonical correlations(M2CCs) framework for subspace learning. In the proposed framework,the input data of each original view are mapped into multiple higher dimensional feature spaces by multiple nonlinear mappings determined by different kernels. This makes M2 CC can discover multiple kinds of useful information of each original view in the feature spaces. With the framework, we further provide a specific multi-view feature learning method based on direct summation kernel strategy and its regularized version. The experimental results in visual recognition tasks demonstrate the effectiveness and robustness of the proposed method.
基金supported by the Key-Area Research and Development Program of Guangdong Province(Grant No.2019B010154002)the Guangdong Natural Science Foundation(Grant No.2022A1515010688)the National Natural Science Foundation of China(Grant No.61722304)。
文摘Label noise is often contained in the training data due to various human factors or measurement errors,which significantly causes a negative effect on classifiers.Despite many previous methods that have been proposed to learn robust classifiers,they are mainly based on the single-view feature.On the other hand,although existing multi-view classification methods benefit from the more comprehensive information,they rarely consider label noise.In this paper,we propose a novel label-noise robust classification model with multi-view learning to overcome these limitations.In the proposed model,not only the classifier learning but also the label-noise removal can benefit from the multi-view information.Specifically,we relax the label matrix of the basic multi-view least squares regression model,and develop a nonlinear transformation with a natural probabilistic approximation in the process of labels,which is conveniently optimized and beneficial to improve the discriminative ability of classifiers.Moreover,we preserve the intrinsic manifold structure of multi-view data on the relaxed label matrix,facilitating the process of label relaxation.For optimizing the proposed model with the nonlinear transformation,we derive a lemma about the partial derivation of the softmax related function,and develop an efficient alternating algorithm.Experimental evaluations on six real-world datasets confirm the advantages of the proposed method,compared to the related state-of-the-art methods.
基金Project supported by the National Natural Science Foundation of China(No.61572431)the National Key Technology R&D Program(No.2013BAH59F00)+1 种基金the Zhejiang Provincial Natural Science Foundation of China(No.LY13F020001)the Zhejiang Province Public Technology Applied Research Projects,China(No.2014C33090)
文摘Human action recognition is currently one of the most active research areas in computer vision. It has been widely used in many applications, such as intelligent surveillance, perceptual interface, and content-based video retrieval. However, some extrinsic factors are barriers for the development of action recognition; e.g., human actions may be observed from arbitrary camera viewpoints in realistic scene. Thus, view-invariant analysis becomes important for action recognition algorithms, and a number of researchers have paid much attention to this issue. In this paper, we present a multi-view learning approach to recognize human actions from different views. As most existing multi-view learning algorithms often suffer from the problem of lacking data adaptiveness in the nearest neighborhood graph construction procedure, a robust locally adaptive multi-view learning algorithm based on learning multiple local L 1-graphs is proposed. Moreover, an efficient iterative optimization method is proposed to solve the proposed objective function. Experiments on three public view-invariant action recognition datasets, i.e., ViHASi, IXMAS, and WVU, demonstrate data adaptiveness, effectiveness, and efficiency of our algorithm. More importantly, when the feature dimension is correctly selected (i.e., 〉60), the proposed algorithm stably outperforms state-of-the-art counterparts and obtains about 6% improvement in recognition accuracy on the three datasets.
基金supported by National Natural Science Foundation of China(No.61806006)Priority Academic Program Development of Jiangsu Higher Education Institutions。
文摘The existing multi-view subspace clustering algorithms based on tensor singular value decomposition(t-SVD)predominantly utilize tensor nuclear norm to explore the intra view correlation between views of the same samples,while neglecting the correlation among the samples within different views.Moreover,the tensor nuclear norm is not fully considered as a convex approximation of the tensor rank function.Treating different singular values equally may result in suboptimal tensor representation.A hypergraph regularized multi-view subspace clustering algorithm with dual tensor log-determinant(HRMSC-DTL)was proposed.The algorithm used subspace learning in each view to learn a specific set of affinity matrices,and introduced a non-convex tensor log-determinant function to replace the tensor nuclear norm to better improve global low-rankness.It also introduced hyper-Laplacian regularization to preserve the local geometric structure embedded in the high-dimensional space.Furthermore,it rotated the original tensor and incorporated a dual tensor mechanism to fully exploit the intra view correlation of the original tensor and the inter view correlation of the rotated tensor.At the same time,an alternating direction of multipliers method(ADMM)was also designed to solve non-convex optimization model.Experimental evaluations on seven widely used datasets,along with comparisons to several state-of-the-art algorithms,demonstrated the superiority and effectiveness of the HRMSC-DTL algorithm in terms of clustering performance.
基金supported by the Program of Entrepreneurship and Innovation Ph.D.in Jiangsu Province(JSSCBS20211175)the School Ph.D.Talent Funding(Z301B2055)the Natural Science Foundation of the Jiangsu Higher Education Institutions of China(21KJB520002).
文摘3D human pose estimation is a major focus area in the field of computer vision,which plays an important role in practical applications.This article summarizes the framework and research progress related to the estimation of monocular RGB images and videos.An overall perspective ofmethods integrated with deep learning is introduced.Novel image-based and video-based inputs are proposed as the analysis framework.From this viewpoint,common problems are discussed.The diversity of human postures usually leads to problems such as occlusion and ambiguity,and the lack of training datasets often results in poor generalization ability of the model.Regression methods are crucial for solving such problems.Considering image-based input,the multi-view method is commonly used to solve occlusion problems.Here,the multi-view method is analyzed comprehensively.By referring to video-based input,the human prior knowledge of restricted motion is used to predict human postures.In addition,structural constraints are widely used as prior knowledge.Furthermore,weakly supervised learningmethods are studied and discussed for these two types of inputs to improve the model generalization ability.The problem of insufficient training datasets must also be considered,especially because 3D datasets are usually biased and limited.Finally,emerging and popular datasets and evaluation indicators are discussed.The characteristics of the datasets and the relationships of the indicators are explained and highlighted.Thus,this article can be useful and instructive for researchers who are lacking in experience and find this field confusing.In addition,by providing an overview of 3D human pose estimation,this article sorts and refines recent studies on 3D human pose estimation.It describes kernel problems and common useful methods,and discusses the scope for further research.
基金We would like to thank the National Natural Science Foundations of China (NSFC) (Grant Nos. 61035003 and 61170151) for support.
文摘We address the problem of metric learning for multi-view data. Many metric learning algorithms have been proposed, most of them focus just on single view circumstances, and only a few deal with multi-view data. In this paper, motivated by the co-training framework, we propose an algorithm-independent framework, named co-metric, to learn Mahalanobis metrics in multi-view settings. In its implementation, an off-the-shelf single-view metric learning algorithm is used to learn metrics in individual views of a few labeled examples. Then the most confidently-labeled examples chosen from the unlabeled set are used to guide the metric learning in the next loop. This procedure is repeated until some stop criteria are met. The framework can accommodate most existing metric learning algorithms whether types-of- side-information or example-labels are used. In addition it can naturally deal with semi-supervised circumstances under more than two views. Our comparative experiments demon- strate its competiveness and effectiveness.
基金supported by the National Natural Science Foundation of China (Grant Nos.62073087,U1911401,62071132,and 61973090)the Guangdong Key R&D Project of China (Grant No.2019B010121001)。
文摘Real-world data can often be represented in multiple forms and views,and analyzing data from different perspectives allows for more comprehensive learning of the data,resulting in better data clustering results.Non-negative matrix factorization(NMF)is used to solve the clustering problem to extract uniform discriminative low-dimensional features from multi-view data.Many clustering methods based on graph regularization have been proposed and proven to be effective,but ordinary graphs only consider pairwise relationships between samples.In order to learn the higher-order relationships that exist in the sample manifold and feature manifold of multi-view data,we propose a new semi-supervised multi-view clustering method called dual hypergraph regularized partially shared non-negative matrix factorization(DHPS-NMF).The complex manifold structure of samples and features is learned by constructing samples and feature hypergraphs.To improve the discrimination power of the obtained lowdimensional features,semi-supervised regression terms are incorporated into the model to effectively use the label information when capturing the complex manifold structure of the data.Ultimately,we conduct experiments on six real data sets and the results show that our algorithm achieves encouraging results in comparison with some methods.
基金supported in part by the National Natural Science Foundation of China (No. 61572407)
文摘In the big data era, the data are generated from different sources or observed from different views. These data are referred to as multi-view data. Unleashing the power of knowledge in multi-view data is very important in big data mining and analysis. This calls for advanced techniques that consider the diversity of different views,while fusing these data. Multi-view Clustering(MvC) has attracted increasing attention in recent years by aiming to exploit complementary and consensus information across multiple views. This paper summarizes a large number of multi-view clustering algorithms, provides a taxonomy according to the mechanisms and principles involved, and classifies these algorithms into five categories, namely, co-training style algorithms, multi-kernel learning, multiview graph clustering, multi-view subspace clustering, and multi-task multi-view clustering. Therein, multi-view graph clustering is further categorized as graph-based, network-based, and spectral-based methods. Multi-view subspace clustering is further divided into subspace learning-based, and non-negative matrix factorization-based methods. This paper does not only introduce the mechanisms for each category of methods, but also gives a few examples for how these techniques are used. In addition, it lists some publically available multi-view datasets.Overall, this paper serves as an introductory text and survey for multi-view clustering.
基金supported by the National Natural Science Foundation of China(Nos.60603096 and 60673088)the National High-Tech Re-search and Development Program(863)of China(No.2006AA010107)the Program for Changjiang Scholars and Innovative Research Team in University of China(No.IRT0652)
文摘This paper tackles the problem of video concept detection using the multi-modality fusion method. Motivated by multi-view learning algorithms, multi-modality features of videos can be represented by multiple graphs. And the graph-based semi-supervised learning methods can be extended to multiple graphs to predict the semantic labels for unlabeled video data. However, traditional graphs represent only homogeneous pairwise linking relations, and therefore the high-order correlations inherent in videos, such as high-order visual similarities, are ignored. In this paper we represent heterogeneous features by multiple hypergraphs and then the high-order correlated samples can be associated with hyperedges. Furthermore, the multi-hypergraph ranking (MHR) algorithm is proposed by defining Markov random walk on each hypergraph and then forming the mixture Markov chains so as to perform transductive learning in multiple hypergraphs. In experiments on the TRECVID dataset, a triple-hypergraph consisting of visual, textual features and multiple labeled tags is constructed to predict concept labels for unlabeled video shots by the MHR framework. Experimental results show that our approach is effective.
基金supported by the National Science and Technology Major Project,China(No.J2019-Ⅲ-0009-0053)the National Natural Science Foundation of China(No.12075319)。
文摘In order to reconstruct and render the weak and repetitive texture of the damaged functional surface of aviation,an improved neural radiance field,named TranSR-NeRF,is proposed.In this paper,a data acquisition system was designed and built.The acquired images generated initial point clouds through TransMVSNet.Meanwhile,after extracting features from the images through the improved SE-ConvNeXt network,the extracted features were aligned and fused with the initial point cloud to generate high-quality neural point cloud.After ray-tracing and sampling of the neural point cloud,the ResMLP neural network designed in this paper was used to regress the volume density and radiance under a given viewing angle,which introduced spatial coordinate and relative positional encoding.The reconstruction and rendering of arbitrary-scale super-resolution of damaged functional surface is realized.In this paper,the influence of illumination conditions and background environment on the model performance is also studied through experiments,and the comparison and ablation experiments for the improved methods proposed in this paper is conducted.The experimental results show that the improved model has good effect.Finally,the application experiment of object detection task is carried out,and the experimental results show that the model has good practicability.