Although many multi-view clustering (MVC) algorithms with acceptable performance have been presented, to the best of our knowledge nearly all of them must be given the correct number of clusters. In addition, these existing algorithms create only hard and fuzzy partitions for multi-view objects, which are often located in highly overlapping areas of the multi-view feature space. Adopting hard and fuzzy partitions ignores the ambiguity and uncertainty in the assignment of objects, likely leading to performance degradation. To address these issues, we propose a novel sparse reconstructive multi-view evidential clustering algorithm (SRMVEC). Based on a sparse reconstructive procedure, SRMVEC learns a shared affinity matrix across views and maps multi-view objects to a 2-dimensional human-readable chart by calculating two newly defined mathematical metrics for each object. From this chart, users can detect the number of clusters and select several objects existing in the dataset as cluster centers. Then, SRMVEC derives a credal partition under the framework of evidence theory, improving the fault tolerance of clustering. Ablation studies show the benefits of adopting the sparse reconstructive procedure and evidence theory. In addition, SRMVEC outperforms some state-of-the-art methods on benchmark datasets.
Deep multi-view subspace clustering (DMVSC) based on self-expression has attracted increasing attention due to its outstanding performance and its applicability to nonlinear data. However, most existing methods neglect that view-private meaningless information or noise may interfere with the learning of self-expression, which may degrade clustering performance. In this paper, we propose a novel framework of Contrastive Consistency and Attentive Complementarity (CCAC) for DMVSC. CCAC aligns the self-expressions of all views and fuses them based on their discrimination, so that it can effectively explore consistent and complementary information for precise clustering. Specifically, the view-specific self-expression is learned by a self-expression layer embedded into the auto-encoder network for each view. To guarantee consistency across views and reduce the effect of view-private information or noise, we align all the view-specific self-expressions by contrastive learning. The aligned self-expressions are assigned adaptive weights by a channel attention mechanism according to their discrimination. They are then fused by a convolution kernel to obtain a consensus self-expression with maximum complementarity of multiple views. Extensive experimental results on four benchmark datasets and one large-scale dataset show that CCAC outperforms other state-of-the-art methods, demonstrating its clustering effectiveness.
Multi-view Subspace Clustering (MVSC) is an advanced clustering method designed to integrate diverse views to uncover a common subspace, enhancing the accuracy and robustness of clustering results. The low-rank prior plays a significant role in MVSC because it captures the global data structure across views for improved performance. However, MVSC is sensitive to outliers because it relies on the Frobenius norm for error measurement. To address this, our paper proposes a Low-Rank Multi-view Subspace Clustering Based on Sparse Regularization (LMVSC-Sparse) approach. Sparse regularization helps select the most relevant features or views for clustering while ignoring irrelevant or noisy ones. This leads to a more efficient and effective representation of the data, improving clustering accuracy and robustness, especially in the presence of outliers or noisy data. By incorporating sparse regularization, LMVSC-Sparse can effectively handle outlier sensitivity, which is a common challenge in traditional MVSC methods that rely solely on low-rank priors. The Alternating Direction Method of Multipliers (ADMM) is then employed to solve the proposed optimization problem. Our comprehensive experiments demonstrate the efficiency and effectiveness of LMVSC-Sparse, offering a robust alternative to traditional MVSC methods.
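To illustrate why a sparse penalty is less sensitive to outliers than a Frobenius-norm error term, the sketch below shows element-wise soft-thresholding, the proximal operator of the L1 norm that typically appears as one sub-step when ADMM is applied to a sparsity-regularized objective. The abstract does not give the LMVSC-Sparse objective or its update rules, so the function name `soft_threshold`, the threshold `tau`, and the toy residual are assumptions for illustration only.

```python
def soft_threshold(values, tau):
    """Proximal operator of tau * ||x||_1, applied element-wise.

    Entries with magnitude <= tau are set to zero; larger entries are
    shrunk toward zero by tau. Hypothetical helper, not the paper's code.
    """
    shrunk = []
    for v in values:
        magnitude = abs(v) - tau
        if magnitude <= 0:
            shrunk.append(0.0)
        else:
            shrunk.append(magnitude if v > 0 else -magnitude)
    return shrunk


# Toy residual: small dense noise plus two large outlier entries.
residual = [0.05, -0.02, 3.0, -0.01, -2.5]
sparse_error = soft_threshold(residual, tau=0.1)
print(sparse_error)
```

The small entries are zeroed while the two large entries survive almost unchanged, so a sparse error term can absorb a few gross outliers instead of spreading them over the whole representation.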
Epilepsy is a central nervous system disorder in which brain activity becomes abnormal. Electroencephalogram (EEG) signals, as recordings of brain activity, have been widely used for epilepsy recognition. To study epileptic EEG signals and develop artificial intelligence (AI)-assisted recognition, a multi-view transfer learning algorithm based on least squares regression (MVTL-LSR) is proposed in this study. Compared with most existing multi-view transfer learning algorithms, MVTL-LSR has two merits: (1) Traditional transfer learning algorithms leverage knowledge from different sources, which poses a significant risk to data privacy; we therefore develop a knowledge transfer mechanism that can protect the security of source domain data while guaranteeing performance. (2) When utilizing multi-view data, we embed view weighting and manifold regularization into the transfer framework to measure the views' strengths and weaknesses and improve generalization ability. In the experimental studies, 12 different simulated multi-view and transfer scenarios are constructed from epileptic EEG signals licensed and provided by the University of Bonn, Germany. Extensive experimental results show that MVTL-LSR outperforms baselines. The source code will be available on https://github.com/didid5/MVTL-LSR.
Multi-view multi-person 3D human pose estimation is a hot topic in the field of human pose estimation due to its wide range of application scenarios. With the introduction of end-to-end direct regression methods, the field has entered a new stage of development. However, the regression results for joints that are more heavily influenced by external factors are not accurate enough, even for the best existing method. In this paper, we propose an effective feature recalibration module based on the channel attention mechanism and a relative optimal calibration strategy, which is applied to the multi-view multi-person 3D human pose estimation task to improve detection accuracy for joints that are more severely affected by external factors. Specifically, it achieves relative optimal weight adjustment of joint feature information through the recalibration module and strategy, which enables the model to learn the dependencies between joints and the dependencies between people and their corresponding joints. We call this method the Efficient Recalibration Network (ER-Net). Finally, experiments were conducted on two benchmark datasets for this task, Campus and Shelf, on which the PCP reached 97.3% and 98.3%, respectively.
In the multi-view image localization task, the features of images captured from different views should be fused properly. This paper considers the classification-based image localization problem. We propose the relational graph location network (RGLN) to perform this task. In this network, we propose a heterogeneous graph construction approach for graph classification tasks, which aims to describe the location in a more appropriate way, thereby improving the expression ability of the location representation module. Experiments show that the expression ability of the proposed graph construction approach outperforms the compared methods by a large margin. In addition, the proposed localization method outperforms the compared localization methods by around 1.7% in terms of meter-level accuracy.
Deep matrix factorization (DMF) has been demonstrated to be a powerful tool for capturing the complex hierarchical information of multi-view data (MDR). However, existing multi-view DMF methods mainly explore the consistency of multi-view data while neglecting the diversity among different views as well as the high-order relationships of data, resulting in the loss of valuable complementary information. In this paper, we design a hypergraph regularized diverse deep matrix factorization (HDDMF) model for multi-view data representation, to jointly utilize multi-view diversity and a high-order manifold in a multilayer factorization framework. A novel diversity enhancement term is designed to exploit the structural complementarity between different views of data. Hypergraph regularization is utilized to preserve the high-order geometric structure of data in each view. An efficient iterative optimization algorithm is developed to solve the proposed model, with theoretical convergence analysis. Experimental results on five real-world data sets demonstrate that the proposed method significantly outperforms state-of-the-art multi-view learning approaches.
The authors propose a novel method for transporting multi-view videos that aims to keep the bandwidth requirements on both end-users and servers as low as possible. The method is based on application layer multicast, where each end point receives only a selected number of views required for rendering video from its current viewpoint at any given time. The set of selected videos changes in real time as the user's viewpoint changes because of head or eye movements. Techniques for reducing the black-outs during fast viewpoint changes were investigated. The performance of the approach was studied through network experiments.
Hashing technology reduces data storage and improves the efficiency of the learning system, which has made it more and more widely used in image retrieval. Multi-view data describes image information more comprehensively than traditional methods that use a single view. How to use hashing to combine multi-view data for image retrieval is still a challenge. In this paper, a multi-view fusion hashing method based on RKCCA (Random Kernel Canonical Correlation Analysis) is proposed. To describe image content more accurately, we construct multiple views by combining the deep learning dense convolutional network (DenseNet) feature with the GIST feature or the BoW_SIFT (Bag-of-Words model + SIFT feature) feature. The algorithm uses the RKCCA method to fuse multi-view features into association features and applies them to image retrieval. It generates binary hash codes with minimal distortion error by designing quantization regularization terms. A large number of experiments on benchmark datasets show that this method is superior to other multi-view hashing methods.
Systems using numerous cameras are emerging in many fields due to their ease of production and reduced cost, and one of the fields where they are expected to be used more actively in the near future is image-based rendering (IBR). Color correction between views is necessary when using multi-view systems in IBR so that audiences feel comfortable when views are switched or when a free viewpoint video is displayed. Color correction usually involves two steps: the first is to adjust camera parameters such as gain, brightness, and aperture before capture, and the second is to modify captured videos through image processing. This paper deals with the latter, which does not need a color pattern board. The proposed method uses the scale invariant feature transform (SIFT) to detect correspondences, treats RGB channels independently, calculates lookup tables with an energy-minimization approach, and corrects captured video with these tables. The experimental results reveal that this approach works well.
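The idea of correcting each color channel independently with a lookup table can be sketched as follows. Note the substitution: the abstract derives its tables from SIFT correspondences with an energy-minimization approach, whose details are not given here, so this sketch instead builds the table by plain cumulative-histogram matching. The function names `cdf`, `build_lut`, and `apply_lut` and the toy pixel values are assumptions for illustration.

```python
def cdf(channel, levels=256):
    """Cumulative distribution of integer intensities in [0, levels)."""
    hist = [0] * levels
    for v in channel:
        hist[v] += 1
    total = len(channel)
    running, out = 0, []
    for count in hist:
        running += count
        out.append(running / total)
    return out


def build_lut(source, reference, levels=256):
    """Monotone lookup table mapping source intensities to the reference.

    Stand-in technique: histogram (CDF) matching, not the paper's
    energy-minimization formulation.
    """
    src_cdf, ref_cdf = cdf(source, levels), cdf(reference, levels)
    lut, j = [], 0
    for i in range(levels):
        # Advance to the first reference level whose CDF reaches src_cdf[i].
        while j < levels - 1 and ref_cdf[j] < src_cdf[i]:
            j += 1
        lut.append(j)
    return lut


def apply_lut(channel, lut):
    """Correct one channel by table lookup."""
    return [lut[v] for v in channel]


# One channel of a darker source view and a brighter reference view.
source_red = [10, 10, 20, 30]
reference_red = [50, 50, 60, 70]
lut_red = build_lut(source_red, reference_red)
corrected_red = apply_lut(source_red, lut_red)
print(corrected_red)
```

For an RGB frame, the same three functions would be run once per channel, which matches the abstract's choice of treating the channels independently. Once built, a table can be reused for every frame from the same camera, so the per-frame cost is a single lookup per pixel.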
Color inconsistency between views is an important problem to be solved in multi-view video systems. A multi-view video color correction method using dynamic programming is proposed. Three-dimensional histograms are constructed with sequential conditional probability in HSI color space. Then, dynamic programming is used to seek the best color mapping relation, given by the minimum-cost path between the target image histogram and the source image histogram. Finally, a video tracking technique is applied to correct the multi-view video. Experimental results show that the proposed method obtains better subjective and objective performance in color correction.
In many existing multi-view gait recognition methods based on images or video sequences, gait sequences are usually superimposed and synthesized into images to construct an energy-like template. However, information may be lost during the process of compositing images and capturing EMG signals. Factors such as period detection may also introduce errors and affect the recognition accuracy. To better solve these problems, a multi-view gait recognition method using a deep convolutional neural network and a channel attention mechanism is proposed. First, the sliding time window method is used to capture EMG signals. Then, the back-propagation learning algorithm is used to train each convolution layer, which improves the learning ability of the convolutional neural network. Finally, the channel attention mechanism is integrated into the neural network, which improves its ability to express gait features, and a classifier is used to classify gait. Experimental results on two public datasets, OULP and CASIA-B, show that the recognition rate of the proposed method reaches 88.44% and 97.25%, respectively. Comparative experiments show that the proposed method has a better recognition effect than several other recent convolutional neural network methods. Therefore, the combination of a convolutional neural network and a channel attention mechanism is of great value for gait recognition.
The lung is an important organ of the human body. More and more people are suffering from lung diseases due to air pollution. These diseases, such as lung tuberculosis and the novel coronavirus disease COVID-19, are usually highly infectious. A lung nodule is a kind of high-density globular lesion in the lung. Physicians need to spend a lot of time and energy observing computed tomography image sequences to make a diagnosis, which is inefficient. For this reason, computer-assisted diagnosis of lung nodules has become the current main trend. In computer-aided diagnosis, how to reduce the false positive rate while ensuring a low missed detection rate is a difficulty and a focus of current research. To solve this problem, we propose a three-dimensional optimization model to extract suspected regions, improve the traditional deep belief network, and modify the between-class dispersion matrix. We construct a multi-view model that fuses local three-dimensional information into two-dimensional images, thereby reducing the complexity of the algorithm and alleviating the problem of unbalanced training caused by having only a small number of positive samples. Experiments show that the false positive rate of the proposed algorithm is as low as 12%, which is in line with clinical application standards.
We present three-dimensional (3D) isotropic imaging of the mouse brain using light-sheet fluorescent microscopy (LSFM) in conjunction with multi-view imaging computation. Unlike the single-view LSFM commonly used for mouse brain imaging, the brain tissue in our study is 3D imaged under eight views by a home-built selective plane illumination microscope (SPIM). An output image containing complete structural information as well as significantly improved resolution (~4 times) is then computed from these eight views of data, using bead-guided multi-view registration and deconvolution. With superior imaging quality, the astrocytes and pyramidal neurons, together with their subcellular nerve fibers, can be clearly visualized and segmented. By further including other computational methods, this study can potentially be scaled up to map the connectome of the whole mouse brain with a simple light-sheet microscope.
The current multi-view video coding (MVC) reference model of the Joint Video Team (JVT) does not provide efficient rate control schemes. This paper presents a rate control algorithm for MVC that improves the quadratic rate-distortion (R-D) model. We allocate the bit rate among views based on correlation analysis. The proposed algorithm consists of three levels to control the rate bits more accurately, of which the frame layer allocates bits according to the frame complexity and the temporal activity. Extensive experiments show that the proposed algorithm can control the bit rate efficiently.
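For readers unfamiliar with the quadratic R-D model, the sketch below uses its classical form, R = a·MAD/Q + b·MAD/Q², where R is the texture bit budget, MAD is the predicted mean absolute difference of the frame, and Q is the quantization step. The abstract says the paper improves this model but does not state the improved form, so only the classical version is shown; the coefficient values and the function names `bits_for_qstep` and `qstep_for_bits` are assumptions for illustration.

```python
import math


def bits_for_qstep(qstep, mad, a, b):
    """Classical quadratic R-D model: R = a*MAD/Q + b*MAD/Q^2."""
    return a * mad / qstep + b * mad / (qstep * qstep)


def qstep_for_bits(target_bits, mad, a, b):
    """Invert the model: positive root of R*Q^2 - a*MAD*Q - b*MAD = 0.

    Assumes a >= 0 and b >= 0 (not both zero), so the root is positive.
    """
    if target_bits <= 0 or mad <= 0:
        raise ValueError("target_bits and mad must be positive")
    linear = a * mad
    discriminant = linear * linear + 4.0 * target_bits * b * mad
    return (linear + math.sqrt(discriminant)) / (2.0 * target_bits)


# Hypothetical model coefficients and frame statistics.
a, b, mad = 100.0, 2000.0, 5.0
qstep = qstep_for_bits(target_bits=1000.0, mad=mad, a=a, b=b)
predicted_bits = bits_for_qstep(qstep, mad, a, b)
print(round(qstep, 3), round(predicted_bits, 3))
```

In a frame-layer controller of this kind, the bit budget assigned to a frame is converted into a quantization step in this way, and the coefficients a and b are re-estimated from the bits actually produced by previously coded frames.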
Traditional three-dimensional (3D) image reconstruction methods depend heavily on the environment and have a poor reconstruction effect, which easily leads to mismatches and poor real-time performance. The accuracy of feature extraction from multiple images affects the reliability and real-time performance of 3D reconstruction technology. To solve this problem, a multi-view image 3D reconstruction algorithm based on a self-encoding convolutional neural network is proposed in this paper. The algorithm first extracts the feature information of multiple two-dimensional (2D) images based on the scale and rotation invariance parameters of the scale-invariant feature transform (SIFT) operator. Second, a self-encoding learning neural network is introduced into the feature refinement process to take full advantage of its feature extraction ability. Then, Fish-Net is used to replace the U-Net structure inside the self-encoding network to improve gradient propagation between U-Net structures, and a Generative Adversarial Network (GAN) loss function is used in place of the mean square error (MSE) to better express image features, discarding useless features to obtain effective image features. Finally, an incremental structure from motion (SFM) algorithm is run to calculate the rotation matrix and translation vector of the camera, the feature points are triangulated to obtain a sparse spatial point cloud, and MeshLab software is used to display the results. Simulation experiments show that, compared with the traditional method, the proposed image feature extraction method significantly improves the rendering effect of the 3D point cloud, with an accuracy rate of 92.5% and a reconstruction completeness rate of 83.6%.
In recent years, with the massive growth of image data, how to match the images required by users quickly and efficiently has become a challenge. Compared with single-view features, multi-view features describe image information more accurately. The advantages of hashing in reducing data storage and improving efficiency also motivate us to study how to apply it effectively to large-scale image retrieval. In this paper, a hash algorithm for multi-index image retrieval based on multi-view feature coding is proposed. By learning the data correlation between different views, the algorithm uses multi-view data with deeper image semantics to achieve better retrieval results. It uses a quantization hash method to generate binary sequences, and uses the hash codes generated from the association features to construct inverted index files for the database, so as to reduce the memory burden and enable efficient matching. To reduce the matching error of hash codes and ensure retrieval accuracy, the algorithm uses an inverted multi-index structure instead of a single-index structure. Compared with other advanced image retrieval methods, this method has better retrieval performance.
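The combination of binary hash codes with an inverted multi-index can be sketched in a few lines. This is a deliberately simplified stand-in, not the paper's method: codes are produced by sign binarization rather than a learned quantization hash, and the multi-index simply splits each code into equal-width chunks with one inverted table per chunk. The names `binarize`, `hamming`, and `MultiIndex`, along with the toy codes, are hypothetical.

```python
from collections import defaultdict


def binarize(features):
    """Sign-binarize a real-valued feature vector into an integer code."""
    code = 0
    for x in features:
        code = (code << 1) | (1 if x >= 0 else 0)
    return code


def hamming(code_a, code_b):
    """Number of differing bits between two codes."""
    return bin(code_a ^ code_b).count("1")


class MultiIndex:
    """One inverted table per code chunk; candidates match any chunk."""

    def __init__(self, n_bits, n_tables=2):
        if n_bits % n_tables != 0:
            raise ValueError("n_bits must be divisible by n_tables")
        self.chunk_bits = n_bits // n_tables
        self.tables = [defaultdict(list) for _ in range(n_tables)]
        self.codes = {}

    def _chunks(self, code):
        mask = (1 << self.chunk_bits) - 1
        return [(code >> (t * self.chunk_bits)) & mask
                for t in range(len(self.tables))]

    def add(self, item_id, code):
        self.codes[item_id] = code
        for table, key in zip(self.tables, self._chunks(code)):
            table[key].append(item_id)

    def query(self, code, top_k=3):
        candidates = set()
        for table, key in zip(self.tables, self._chunks(code)):
            candidates.update(table.get(key, []))
        # Re-rank the shortlist by full-code Hamming distance.
        ranked = sorted(candidates,
                        key=lambda i: (hamming(code, self.codes[i]), i))
        return ranked[:top_k]


index = MultiIndex(n_bits=8)
index.add("img_a", 0b10110010)
index.add("img_b", 0b10110011)
index.add("img_c", 0b01001101)
hits = index.query(0b10110010)
print(hits)
```

Because a candidate only needs to share one chunk with the query, an item whose code differs from the query in a single bit (here `img_b`) is still retrieved, which a single exact-match index over the full code would miss. This is the general motivation for using several sub-indexes instead of one.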
Multi-view laser radar (ladar) data registration in obscured environments is an important research field for air-to-ground detection of obscured targets. Because of the occluder, the observational data from different views share few overlapping regions, so multi-view data registration is rather difficult. An in-depth analysis of the typical methods and their problems shows that sequence registration is more appropriate, but its registration accuracy needs to be improved. On this basis, a multi-view data registration algorithm based on aggregating adjacent frames that are already registered is proposed. Aggregation increases the overlapping region between the frames pending registration and further improves the registration accuracy. The experimental results show that the proposed algorithm can effectively register multi-view ladar data in an obscured environment, and that it has greater robustness and higher registration accuracy than sequence registration at equivalent operating efficiency.
Subpixel localization of image centers is one of the key technologies of vision measurement. For accurate calibration and measurement across multiple fields, existing sub-pixel positioning methods are complex, their positioning accuracy is strongly affected by the quality of the initial edge extraction, and the positioning accuracy is low. Because remote sensing multi-view images are usually not stationary random signals, random analysis is used to better express their non-stationary characteristics and to segment sub-pixel objects in the center of remote sensing images. The accuracy of mark positioning affects the accuracy of the whole measurement. Control point marks with different characteristics correspond to different recognition methods, so the selection of control point marks should be based on the requirements at hand. The method describes the target view from different viewpoints and uses geometric features to retrieve the model library. The matching process uses global and local, statistical and structural target recognition features hierarchically, and is divided into two steps: retrieval and exact matching. Experiments were carried out to verify the effectiveness of the method.
To realize intelligent mechanization of the last process in the fruit industry chain, the identification of fruit packing boxes is studied. A multi-view database is established to describe the omnidirectional attitudes of the fruit packing boxes. To reduce the data redundancy caused by multi-view acquisition, a new binary multi-view kernel principal component analysis network (BMKPCANet) is built, and a multi-view recognition method for fruit packing boxes is proposed based on BMKPCANet and a support vector machine (SVM). The experimental results show that the recognition accuracy of the proposed BMKPCANet is on average 12.82% higher than that of PCANet and 3.51% higher than that of KPCANet. The time consumption of the proposed BMKPCANet is on average 7.74% lower than that of PCANet and 29.01% lower than that of KPCANet. This work lays a theoretical foundation for multi-view recognition of 3D objects and has good practical application value.
Funding (SRMVEC paper): Supported in part by an NUS startup grant and the National Natural Science Foundation of China (52076037).
Funding (MVTL-LSR paper): Supported in part by the National Natural Science Foundation of China (Grant No. 82072019); the Shenzhen Basic Research Program (JCYJ20210324130209023) of the Shenzhen Science and Technology Innovation Committee; the Shenzhen-Hong Kong-Macao S&T Program (Category C) (SGDX20201103095002019); the Natural Science Foundation of Jiangsu Province (No. BK20201441); the Provincial and Ministry Co-constructed Project of Henan Province Medical Science and Technology Research (SBGJ202103038 and SBGJ202102056); the Henan Province Key R&D and Promotion Project (Science and Technology Research) (222102310015); the Natural Science Foundation of Henan Province (222300420575); the Henan Province Science and Technology Research (222102310322); and the Jiangsu Students' Innovation and Entrepreneurship Training Program (202110304096Y).
Funding: Supported in part by the Key Program of NSFC (Grant No. U1908214); the Special Project of Central Government Guiding Local Science and Technology Development (Grant No. 2021JH6/10500140); the Program for the Liaoning Distinguished Professor; the Program for Innovative Research Team in University of Liaoning Province (LT2020015), Dalian (2021RT06), and Dalian University (XLJ202010); the Science and Technology Innovation Fund of Dalian (Grant No. 2020JJ25CY001); and the Dalian University Scientific Research Platform Project (No. 202101YB03).
Abstract: Multi-view multi-person 3D human pose estimation is a hot topic in the field of human pose estimation because of its wide range of application scenarios. With the introduction of end-to-end direct regression methods, the field has entered a new stage of development. However, even for the best existing method, the regression results for joints that are more heavily influenced by external factors are not accurate enough. In this paper, we propose an effective feature recalibration module based on the channel attention mechanism, together with a relative optimal calibration strategy, and apply them to the multi-view multi-person 3D human pose estimation task to improve detection accuracy for joints that are more severely affected by external factors. Specifically, the recalibration module and strategy achieve a relatively optimal weight adjustment of joint feature information, which enables the model to learn the dependencies between joints and the dependencies between people and their corresponding joints. We call this method the Efficient Recalibration Network (ER-Net). Finally, experiments were conducted on two benchmark datasets for this task, Campus and Shelf, on which the PCP reached 97.3% and 98.3%, respectively.
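The channel-attention recalibration mentioned in this abstract can be illustrated with a generic squeeze-and-excitation style sketch. The abstract does not give ER-Net's actual layers, so the function name `channel_attention`, the weight matrices `w1` and `w2`, and the list-of-lists feature layout are all assumptions made for illustration only.

```python
import math


def channel_attention(features, w1, w2):
    """Recalibrate the channels of `features` (a list of C activation lists).

    w1 (hidden x C) and w2 (C x hidden) are hypothetical, untrained weights.
    Returns (gates, recalibrated_features).
    """
    # Squeeze: global average of each channel.
    squeezed = [sum(ch) / len(ch) for ch in features]
    # Excitation: bottleneck layer with ReLU, then one sigmoid gate per channel.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Scale: multiply every activation in a channel by that channel's gate.
    recalibrated = [[g * x for x in ch] for g, ch in zip(gates, features)]
    return gates, recalibrated
```

A gate near 1 leaves a channel almost unchanged, while a gate near 0 suppresses it; in a trained network the weights would be learned rather than supplied by hand.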
Abstract: In the multi-view image localization task, the features of images captured from different views should be fused properly. This paper considers the classification-based image localization problem. We propose the relational graph location network (RGLN) to perform this task. In this network, we propose a heterogeneous graph construction approach for graph classification tasks, which aims to describe the location in a more appropriate way, thereby improving the expression ability of the location representation module. Experiments show that the expression ability of the proposed graph construction approach outperforms the compared methods by a large margin. In addition, the proposed localization method outperforms the compared localization methods by around 1.7% in terms of meter-level accuracy.
Funding: This work was supported by the National Natural Science Foundation of China (62073087, 62071132, 61973090).
Abstract: Deep matrix factorization (DMF) has been demonstrated to be a powerful tool for capturing the complex hierarchical information of multi-view data (MDR). However, existing multi-view DMF methods mainly explore the consistency of multi-view data while neglecting the diversity among different views as well as the high-order relationships of data, resulting in the loss of valuable complementary information. In this paper, we design a hypergraph regularized diverse deep matrix factorization (HDDMF) model for multi-view data representation, to jointly utilize multi-view diversity and a high-order manifold in a multilayer factorization framework. A novel diversity enhancement term is designed to exploit the structural complementarity between different views of data. Hypergraph regularization is utilized to preserve the high-order geometric structure of data in each view. An efficient iterative optimization algorithm is developed to solve the proposed model, with a theoretical convergence analysis. Experimental results on five real-world data sets demonstrate that the proposed method significantly outperforms state-of-the-art multi-view learning approaches.
Funding: Project (No. 511568), with the acronym 3DTV, supported by the European Commission within Framework Program 6.
Abstract: The authors propose a novel method for transporting multi-view videos that aims to keep the bandwidth requirements on both end-users and servers as low as possible. The method is based on application layer multicast, where each end point receives only a selected number of views required for rendering video from its current viewpoint at any given time. The set of selected videos changes in real time as the user's viewpoint changes because of head or eye movements. Techniques for reducing the black-outs during fast viewpoint changes were investigated. The performance of the approach was studied through network experiments.
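The viewpoint-driven view selection described in this abstract can be sketched as follows. The abstract does not specify how views are chosen, so the nearest-angle rule, the function names `select_views` and `membership_changes`, and the camera identifiers are assumptions for illustration, not the authors' protocol.

```python
def select_views(camera_angles, viewpoint_angle, k=2):
    """Return the ids of the k cameras whose angle is closest to the viewer's."""
    ranked = sorted(camera_angles,
                    key=lambda cam: abs(camera_angles[cam] - viewpoint_angle))
    return set(ranked[:k])


def membership_changes(current, needed):
    """Return (groups to join, groups to leave) when the needed view set changes."""
    return needed - current, current - needed
```

When the viewer moves, only the difference between the old and new view sets results in multicast join or leave requests, which is what keeps the bandwidth at the end point bounded by the number of selected views.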
Funding: This work is supported by the National Natural Science Foundation of China (No. 61772561); the Key Research & Development Plan of Hunan Province (No. 2018NK2012); the Science Research Projects of Hunan Provincial Education Department (Nos. 18A174, 18C0262); and the Science & Technology Innovation Platform and Talent Plan of Hunan Province (2017TP1022).
Abstract: Hashing technology reduces data storage and improves the efficiency of learning systems, which has made it increasingly widely used in image retrieval. Multi-view data describes image information more comprehensively than traditional methods that use a single view. How to use hashing to combine multi-view data for image retrieval is still a challenge. In this paper, a multi-view fusion hashing method based on RKCCA (Random Kernel Canonical Correlation Analysis) is proposed. To describe image content more accurately, we use deep-learning features from the dense convolutional network DenseNet and construct multi-view data by combining them with GIST features or BoW_SIFT (Bag-of-Words model + SIFT feature) features. The algorithm uses the RKCCA method to fuse the multi-view features into association features and applies them to image retrieval. It generates binary hash codes with minimal distortion error by designing quantization regularization terms. A large number of experiments on benchmark datasets show that this method is superior to other multi-view hashing methods.
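The idea of producing binary hash codes with a small quantization (distortion) error can be illustrated with a minimal sketch. The abstract does not state the exact regularization term, so sign-based quantization, the squared-error measure, and the function names below are assumptions chosen as the simplest common instance.

```python
def binarize(vector):
    """Quantize a real-valued feature vector to a +1/-1 hash code by sign."""
    return [1 if x >= 0 else -1 for x in vector]


def quantization_error(vector, code):
    """Squared distance between the real-valued vector and its binary code."""
    return sum((x - b) ** 2 for x, b in zip(vector, code))


def hamming_distance(code_a, code_b):
    """Number of positions in which two hash codes differ."""
    return sum(a != b for a, b in zip(code_a, code_b))
```

A quantization regularizer penalizes `quantization_error` during training so that the fused features already lie close to +1/-1 before binarization; retrieval then ranks database items by `hamming_distance` to the query code.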
Abstract: Systems using numerous cameras are emerging in many fields due to their ease of production and reduced cost, and one of the fields where they are expected to be used more actively in the near future is image-based rendering (IBR). Color correction between views is necessary to use multi-view systems in IBR to make audiences feel comfortable when views are switched or when a free viewpoint video is displayed. Color correction usually involves two steps: the first is to adjust camera parameters such as gain, brightness, and aperture before capture, and the second is to modify captured videos through image processing. This paper deals with the latter, which does not need a color pattern board. The proposed method uses scale invariant feature transform (SIFT) to detect correspondences, treats RGB channels independently, calculates lookup tables with an energy-minimization approach, and corrects captured video with these tables. The experimental results reveal that this approach works well.
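The per-channel lookup-table correction described in this abstract can be sketched roughly as below. The energy-minimization step is not detailed in the abstract, so this sketch swaps in a much simpler rule (average the reference value for each observed source level, then interpolate linearly between observed levels); `build_lut` and `apply_lut` are hypothetical names, and `pairs` stands for the (source, reference) intensities of matched points in one channel.

```python
def build_lut(pairs, levels=256):
    """Build a lookup table for one color channel from (source, reference) pairs."""
    sums, counts = {}, {}
    for src, ref in pairs:
        sums[src] = sums.get(src, 0) + ref
        counts[src] = counts.get(src, 0) + 1
    # Mean reference value for every source level that was observed.
    known = sorted((s, sums[s] / counts[s]) for s in sums)
    lut = []
    for level in range(levels):
        if level <= known[0][0]:
            value = known[0][1]          # clamp below the first observed level
        elif level >= known[-1][0]:
            value = known[-1][1]         # clamp above the last observed level
        else:
            # Linear interpolation between the two surrounding observed levels.
            for (s0, r0), (s1, r1) in zip(known, known[1:]):
                if s0 <= level <= s1:
                    value = r0 + (r1 - r0) * (level - s0) / (s1 - s0)
                    break
        lut.append(min(levels - 1, max(0, round(value))))
    return lut


def apply_lut(channel, lut):
    """Correct one channel (a flat list of integer intensities) with the table."""
    return [lut[v] for v in channel]
```

One table is built per RGB channel, matching the independent channel treatment in the abstract; at least one correspondence is required, and an energy-minimization formulation would additionally enforce properties such as smoothness that this simplified version does not.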
Funding: Supported by the National Natural Science Foundation of China (60672073); the Program for New Century Excellent Talents in University (NCET-06-0537); the Natural Science Foundation of Ningbo (2008A610016); and the K.C. Wong Magna Fund in Ningbo University.
Abstract: Color inconsistency between views is an important problem to be solved in multi-view video systems. A multi-view video color correction method using dynamic programming is proposed. Three-dimensional histograms are constructed with sequential conditional probability in HSI color space. Then, dynamic programming is used to seek the best color mapping relation, given by the minimum-cost path between the target image histogram and the source image histogram. Finally, a video tracking technique is applied to correct the multi-view video. Experimental results show that the proposed method obtains better subjective and objective performance in color correction.
Funding: This work was supported by the Natural Science Foundation of China (No. 61902133); the Fujian natural science foundation project (No. 2018J05106); and the Xiamen Collaborative Innovation projects of Produces study grinds (3502Z20173046).
Abstract: In many existing multi-view gait recognition methods based on images or video sequences, gait sequences are superimposed to synthesize images and construct energy-like templates. However, information may be lost during image compositing and EMG signal capture, and factors such as period detection may introduce errors and reduce recognition accuracy. To address these problems, a multi-view gait recognition method using a deep convolutional neural network and a channel attention mechanism is proposed. First, the sliding time window method is used to capture EMG signals. Then, the back-propagation learning algorithm is used to train each convolutional layer, which improves the learning ability of the convolutional neural network. Finally, the channel attention mechanism is integrated into the neural network to improve its ability to express gait features, and a classifier is used to classify gait. Experimental results on two public datasets, OULP and CASIA-B, show that the proposed method achieves recognition rates of 88.44% and 97.25%, respectively. Comparative experiments show that the proposed method recognizes gait better than several other recent convolutional neural network methods. Therefore, the combination of a convolutional neural network and a channel attention mechanism is of great value for gait recognition.
Funding: This work was supported by the Science and Technology Rising Star of Shaanxi Youth (No. 2021KJXX-61); the Open Project Program of the State Key Lab of CAD&CG, Zhejiang University (No. A2206); the China Postdoctoral Science Foundation (No. 2020M683696XB); the Natural Science Basic Research Plan in Shaanxi Province of China (No. 2021JQ-455); the Natural Science Foundation of China (No. 62062003); the Key Research and Development Project of Ningxia (Special projects for talents) (No. 2020BEB04022); and the North Minzu University Research Project of Talent Introduction (No. 2020KYQD08).
Abstract: The lung is an important organ of the human body. More and more people are suffering from lung diseases due to air pollution. These diseases, such as lung tuberculosis and the novel coronavirus disease COVID-19, are usually highly infectious. A lung nodule is a kind of high-density globular lesion in the lung. Physicians need to spend a lot of time and energy observing computed tomography image sequences to make a diagnosis, which is inefficient. For this reason, computer-assisted diagnosis of lung nodules has become the current main trend. In computer-aided diagnosis, how to reduce the false positive rate while ensuring a low missed detection rate is a difficulty and a focus of current research. To solve this problem, we propose a three-dimensional optimization model to extract suspected regions, improve the traditional deep belief network, and modify the dispersion matrix between classes. We construct a multi-view model that fuses local three-dimensional information into two-dimensional images, thereby reducing the complexity of the algorithm and alleviating the problem of unbalanced training caused by having only a small number of positive samples. Experiments show that the false positive rate of the proposed algorithm is as low as 12%, which is in line with clinical application standards.
Funding: Funding support from the 1000 Youth Talents Plan of China (P.F.); the Fundamental Research Program of Shenzhen (P.F., JCYJ20160429182424047); the National Science Foundation of China (NSFC31571002, D.Z); and the Graduates' Innovation Fund of Huazhong University of Science and Technology (5003182004).
Abstract: We present three-dimensional (3D) isotropic imaging of the mouse brain using light-sheet fluorescent microscopy (LSFM) in conjunction with multi-view imaging computation. Unlike the common single-view LSFM used for mouse brain imaging, in our study the brain tissue is 3D imaged under eight views by a home-built selective plane illumination microscope (SPIM). An output image containing complete structural information as well as significantly improved resolution (~4 times) is then computed from these eight views of data, using bead-guided multi-view registration and deconvolution. With superior imaging quality, the astrocytes and pyramidal neurons, together with their subcellular nerve fibers, can be clearly visualized and segmented. By further including other computational methods, this study can potentially be scaled up to map the connectome of the whole mouse brain with a simple light-sheet microscope.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 60832003, 60672052, 60902085, 60972137); the Key Project of Shanghai Municipal Education Commission (Grant No. 09ZZ90); the Natural Science Foundation of Shanghai (Grant No. 09ZR1412500); the Innovation Foundation of Shanghai University (Grant Nos. 10YZ09, SHUCX091061); and the Shuguang Plan of Shanghai Education Development Foundation (Grant No. 06SG43).
Abstract: The current multi-view video coding (MVC) reference model of the joint video team (JVT) does not provide efficient rate control schemes. This paper presents a rate control algorithm for MVC that improves the quadratic rate-distortion (R-D) model. Bit rate is allocated among views based on correlation analysis. The proposed algorithm consists of three levels to control the rate bits more accurately, of which the frame layer allocates bits according to frame complexity and temporal activity. Extensive experiments show that the proposed algorithm can control the bit rate efficiently.
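For readers unfamiliar with the quadratic R-D model that this abstract builds on, the sketch below shows its commonly used baseline form, in which the texture bits are modeled as T = X1·MAD/Q + X2·MAD/Q², with MAD the predicted mean absolute difference of the frame and Q the quantization step. The paper's improved model is not given in the abstract, so this is only the baseline; the function and parameter names are hypothetical.

```python
import math


def predicted_bits(q_step, mad, x1, x2):
    """Bits predicted by the quadratic model for quantization step q_step."""
    return x1 * mad / q_step + x2 * mad / q_step ** 2


def q_step_for_target(target_bits, mad, x1, x2):
    """Solve target_bits = x1*mad/Q + x2*mad/Q**2 for the positive root Q."""
    disc = (x1 * mad) ** 2 + 4.0 * target_bits * x2 * mad
    if disc < 0:
        raise ValueError("no real quantization step for these parameters")
    return (x1 * mad + math.sqrt(disc)) / (2.0 * target_bits)
```

In a frame-layer rate controller, the bit budget allocated to a frame is passed as `target_bits`, and the resulting step is mapped to the nearest allowed quantization parameter; the model parameters `x1` and `x2` are typically re-estimated from previously coded frames.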
Funding: This work is funded by the Key Scientific Research Projects of Colleges and Universities in Henan Province under Grant 22A460022 and the Training Plan for Young Backbone Teachers in Colleges and Universities in Henan Province under Grant 2021GGJS077.
Abstract: Traditional three-dimensional (3D) image reconstruction methods depend heavily on the environment and reconstruct poorly, which easily leads to mismatches and poor real-time performance. The accuracy of feature extraction from multiple images affects the reliability and real-time performance of 3D reconstruction technology. To solve this problem, a multi-view image 3D reconstruction algorithm based on a self-encoding convolutional neural network is proposed in this paper. The algorithm first extracts the feature information of multiple two-dimensional (2D) images based on the scale and rotation invariance parameters of the scale-invariant feature transform (SIFT) operator. Second, a self-encoding learning neural network is introduced into the feature refinement process to take full advantage of its feature extraction ability. Then, Fish-Net is used to replace the U-Net structure inside the self-encoding network to improve gradient propagation between U-Net structures, and a Generative Adversarial Network (GAN) loss function is used in place of mean square error (MSE) to better express image features, discarding useless features to obtain effective image features. Finally, an incremental structure from motion (SFM) algorithm is used to calculate the rotation matrix and translation vector of the camera, the feature points are triangulated to obtain a sparse spatial point cloud, and MeshLab software is used to display the results. Simulation experiments show that, compared with the traditional method, the proposed image feature extraction method significantly improves the rendering of the 3D point cloud, with an accuracy rate of 92.5% and a reconstruction completeness rate of 83.6%.
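The triangulation step mentioned in this abstract can be illustrated with the midpoint method, one of the simplest ways to recover a 3D point from two viewing rays. The abstract does not say which triangulation method is used, so this is a stand-in; `triangulate_midpoint` is a hypothetical name, and each ray is assumed to be given by a camera center and a direction in world coordinates.

```python
def triangulate_midpoint(o1, d1, o2, d2):
    """Return the midpoint of the shortest segment joining two 3D rays.

    o1, o2 are camera centers; d1, d2 are ray directions (need not be unit).
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    w = [p - q for p, q in zip(o1, o2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; the point cannot be triangulated")
    # Parameters of the closest points on each ray (least-squares solution).
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = [o + s * x for o, x in zip(o1, d1)]
    p2 = [o + t * x for o, x in zip(o2, d2)]
    return [(u + v) / 2.0 for u, v in zip(p1, p2)]
```

With noisy feature matches the two rays rarely intersect exactly, which is why the midpoint of their closest approach is returned; applying this to every matched feature pair yields a sparse point cloud.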
Funding: Supported in part by the National Natural Science Foundation of China under Grant 61772561 (author J.Q, http://www.nsfc.gov.cn/); in part by the Key Research and Development Plan of Hunan Province under Grant 2018NK2012 (author J.Q, http://kjt.hunan.gov.cn/); in part by the Key Research and Development Plan of Hunan Province under Grant 2019SK2022 (author Y.T, http://kjt.hunan.gov.cn/); in part by the Science Research Projects of Hunan Provincial Education Department under Grant 18A174 (author X.X, http://kxjsc.gov.hnedu.cn/); in part by the Science Research Projects of Hunan Provincial Education Department under Grant 19B584 (author Y.T, http://kxjsc.gov.hnedu.cn/); in part by the Degree & Postgraduate Education Reform Project of Hunan Province under Grant 2019JGYB154 (author J.Q, http://xwb.gov.hnedu.cn/); in part by the Postgraduate Excellent teaching team Project of Hunan Province under Grant [2019]370-133 (author J.Q, http://xwb.gov.hnedu.cn/); in part by the Postgraduate Education and Teaching Reform Project of Central South University of Forestry & Technology under Grant 2019JG013 (author X.X, http://jwc.csuft.edu.cn/); in part by the Natural Science Foundation of Hunan Province (No. 2020JJ4140) (author Y.T, http://kjt.hunan.gov.cn/); and in part by the Natural Science Foundation of Hunan Province (No. 2020JJ4141) (author X.X, http://kjt.hunan.gov.cn/).
Abstract: In recent years, with the massive growth of image data, quickly and efficiently matching the images required by users has become a challenge. Compared with single-view features, multi-view features describe image information more accurately. The advantages of hashing in reducing data storage and improving efficiency also motivate the study of how to apply it effectively to large-scale image retrieval. In this paper, a hash algorithm for multi-index image retrieval based on multi-view feature coding is proposed. By learning the data correlation between different views, the algorithm uses multi-view data with deeper image semantics to achieve better retrieval results. The algorithm uses a quantitative hash method to generate binary sequences, and uses the hash codes generated from the association features to construct inverted index files for the database, so as to reduce the memory burden and promote efficient matching. To reduce the matching error of hash codes and ensure retrieval accuracy, the algorithm uses an inverted multi-index structure instead of a single-index structure. Compared with other advanced image retrieval methods, this method has better retrieval performance.
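Why several index tables can reduce hash-code matching errors compared with a single table can be seen in the simplified sketch below. It is not the paper's inverted multi-index structure, whose details the abstract does not give; it is a two-table substring index over binary codes written as strings, and the class name `TwoTableIndex` and its methods are hypothetical.

```python
from collections import defaultdict


class TwoTableIndex:
    """Index binary codes (strings of '0'/'1') by their first and second halves."""

    def __init__(self, code_length):
        self.half = code_length // 2
        self.tables = (defaultdict(set), defaultdict(set))
        self.codes = {}

    def _keys(self, code):
        return code[:self.half], code[self.half:]

    def add(self, item_id, code):
        self.codes[item_id] = code
        for table, key in zip(self.tables, self._keys(code)):
            table[key].add(item_id)

    def search(self, query, max_distance=1):
        # Candidates: items sharing at least one half exactly with the query.
        candidates = set()
        for table, key in zip(self.tables, self._keys(query)):
            candidates |= table.get(key, set())
        hits = []
        for item_id in candidates:
            dist = sum(a != b for a, b in zip(self.codes[item_id], query))
            if dist <= max_distance:
                hits.append((dist, item_id))
        return sorted(hits)
```

A single table keyed on the full code would miss any item whose code differs from the query by even one bit. With two tables, a code at Hamming distance 1 must still match the query exactly in one of its halves, so it is always retrieved; for larger distances this two-table version gives no such guarantee.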
Abstract: Multi-view laser radar (ladar) data registration in obscure environments is an important research field in air-to-ground detection of obscured targets. Because of the occluder, there are few overlap regions between the observational data of different views, so multi-view data registration is rather difficult. An in-depth analysis of the typical methods and their problems shows that sequence registration is the more appropriate approach, but its registration accuracy needs to be improved. On this basis, a multi-view data registration algorithm based on aggregating the adjacent frames that are already registered is proposed. Aggregation increases the overlap region between the frames pending registration and thereby improves the registration accuracy. The experimental results show that the proposed algorithm can effectively register multi-view ladar data in the obscure environment, and that it is more robust and more accurate than sequence registration at equivalent operating efficiency.
Abstract: Subpixel localization of the image center is one of the key technologies of vision measurement. Existing sub-pixel positioning methods, intended to meet the requirements of accurate calibration and measurement in multiple fields, are complex; their positioning accuracy is low and is strongly affected by the quality of the initial edge extraction. Because remote sensing multi-view images are usually not stationary random signals, random analysis is incorporated to better express the non-stationary characteristics of the images when segmenting sub-pixel objects in the center of remote sensing images. The accuracy of mark positioning affects the accuracy of the whole measurement. Control point marks with different characteristics correspond to different recognition methods, so the selection of control point marks should be based on the requirements at hand. The method describes the target view from different viewpoints and uses geometric features to retrieve the model library. The matching process uses global and local, statistical and structural target recognition features hierarchically, and is divided into two steps: retrieval and exact matching. An experiment was carried out to verify the effectiveness of the method.
Funding: Supported by the National Natural Science Foundation of China (No. 52075306).
Abstract: To realize intelligent mechanization of the last process of the fruit industry chain, the identification of fruit packing boxes is studied. A multi-view database is established to describe the omnidirectional attitudes of the fruit packing boxes. To reduce the data redundancy caused by multi-view acquisition, a new binary multi-view kernel principal component analysis network (BMKPCANet) is built, and a multi-view recognition method for fruit packing boxes is proposed based on BMKPCANet and a support vector machine (SVM). The experimental results show that the recognition accuracy of the proposed BMKPCANet is, on average, 12.82% higher than that of PCANet and 3.51% higher than that of KPCANet. The time consumption of the proposed BMKPCANet is, on average, 7.74% lower than that of PCANet and 29.01% lower than that of KPCANet. This work lays a theoretical foundation for multi-view recognition of 3D objects and has good practical application value.