Multi-label image classification is recognized as an important task within the field of computer vision,a discipline that has experienced a significant escalation in research endeavors in recent years.The widespread a...Multi-label image classification is recognized as an important task within the field of computer vision,a discipline that has experienced a significant escalation in research endeavors in recent years.The widespread adoption of convolutional neural networks(CNNs)has catalyzed the remarkable success of architectures such as ResNet-101 within the domain of image classification.However,inmulti-label image classification tasks,it is crucial to consider the correlation between labels.In order to improve the accuracy and performance of multi-label classification and fully combine visual and semantic features,many existing studies use graph convolutional networks(GCN)for modeling.Object detection and multi-label image classification exhibit a degree of conceptual overlap;however,the integration of these two tasks within a unified framework has been relatively underexplored in the existing literature.In this paper,we come up with Object-GCN framework,a model combining object detection network YOLOv5 and graph convolutional network,and we carry out a thorough experimental analysis using a range of well-established public datasets.The designed framework Object-GCN achieves significantly better performance than existing studies in public datasets COCO2014,VOC2007,VOC2012.The final results achieved are 86.9%,96.7%,and 96.3%mean Average Precision(mAP)across the three datasets.展开更多
Feature matching plays a key role in computer vision. However, due to the limitations of the descriptors, the putative matches are inevitably contaminated by massive outliers.This paper attempts to tackle the outlier ...Feature matching plays a key role in computer vision. However, due to the limitations of the descriptors, the putative matches are inevitably contaminated by massive outliers.This paper attempts to tackle the outlier filtering problem from two aspects. First, a robust and efficient graph interaction model,is proposed, with the assumption that matches are correlated with each other rather than independently distributed. To this end, we construct a graph based on the local relationships of matches and formulate the outlier filtering task as a binary labeling energy minimization problem, where the pairwise term encodes the interaction between matches. We further show that this formulation can be solved globally by graph cut algorithm. Our new formulation always improves the performance of previous localitybased method without noticeable deterioration in processing time,adding a few milliseconds. Second, to construct a better graph structure, a robust and geometrically meaningful topology-aware relationship is developed to capture the topology relationship between matches. The two components in sum lead to topology interaction matching(TIM), an effective and efficient method for outlier filtering. Extensive experiments on several large and diverse datasets for multiple vision tasks including general feature matching, as well as relative pose estimation, homography and fundamental matrix estimation, loop-closure detection, and multi-modal image matching, demonstrate that our TIM is more competitive than current state-of-the-art methods, in terms of generality, efficiency, and effectiveness. The source code is publicly available at http://github.com/YifanLu2000/TIM.展开更多
文摘Multi-label image classification is recognized as an important task within the field of computer vision,a discipline that has experienced a significant escalation in research endeavors in recent years.The widespread adoption of convolutional neural networks(CNNs)has catalyzed the remarkable success of architectures such as ResNet-101 within the domain of image classification.However,inmulti-label image classification tasks,it is crucial to consider the correlation between labels.In order to improve the accuracy and performance of multi-label classification and fully combine visual and semantic features,many existing studies use graph convolutional networks(GCN)for modeling.Object detection and multi-label image classification exhibit a degree of conceptual overlap;however,the integration of these two tasks within a unified framework has been relatively underexplored in the existing literature.In this paper,we come up with Object-GCN framework,a model combining object detection network YOLOv5 and graph convolutional network,and we carry out a thorough experimental analysis using a range of well-established public datasets.The designed framework Object-GCN achieves significantly better performance than existing studies in public datasets COCO2014,VOC2007,VOC2012.The final results achieved are 86.9%,96.7%,and 96.3%mean Average Precision(mAP)across the three datasets.
基金supported by the National Natural Science Foundation of China (62276192)。
文摘Feature matching plays a key role in computer vision. However, due to the limitations of the descriptors, the putative matches are inevitably contaminated by massive outliers.This paper attempts to tackle the outlier filtering problem from two aspects. First, a robust and efficient graph interaction model,is proposed, with the assumption that matches are correlated with each other rather than independently distributed. To this end, we construct a graph based on the local relationships of matches and formulate the outlier filtering task as a binary labeling energy minimization problem, where the pairwise term encodes the interaction between matches. We further show that this formulation can be solved globally by graph cut algorithm. Our new formulation always improves the performance of previous localitybased method without noticeable deterioration in processing time,adding a few milliseconds. Second, to construct a better graph structure, a robust and geometrically meaningful topology-aware relationship is developed to capture the topology relationship between matches. The two components in sum lead to topology interaction matching(TIM), an effective and efficient method for outlier filtering. Extensive experiments on several large and diverse datasets for multiple vision tasks including general feature matching, as well as relative pose estimation, homography and fundamental matrix estimation, loop-closure detection, and multi-modal image matching, demonstrate that our TIM is more competitive than current state-of-the-art methods, in terms of generality, efficiency, and effectiveness. The source code is publicly available at http://github.com/YifanLu2000/TIM.