Applying machine learning to lemon defect recognition can improve the efficiency of lemon quality detection. This paper proposes a deep learning-based classification method with visual feature extraction and transfer learning to recognize defective lemons (i.e., green and mold defects). First, data augmentation and brightness compensation techniques are used for data preprocessing. Visual feature extraction is then used to quantify the defects and determine the feature variables that serve as the basis for classification. We then construct a convolutional neural network with an embedded Visual Geometry Group 16 based (VGG16-based) network using transfer learning. The proposed model is compared with several benchmark models such as K-nearest Neighbor (KNN) and Support Vector Machine (SVM). Results show that the proposed model achieves the highest accuracy (95.44%) on the testing data set. The research provides a new solution for lemon defect recognition.
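A minimal sketch of the kind of VGG16-based transfer-learning classifier described above, using Keras. The classifier head, optimizer, class count, and the augmentation settings are illustrative assumptions, not the authors' exact configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

def build_lemon_classifier(num_classes: int = 3, input_shape=(224, 224, 3)):
    # Embedded VGG16 convolutional base, pre-trained on ImageNet (transfer learning).
    base = VGG16(weights="imagenet", include_top=False, input_shape=input_shape)
    base.trainable = False  # freeze the pre-trained feature extractor

    model = models.Sequential([
        base,
        layers.GlobalAveragePooling2D(),
        layers.Dense(256, activation="relu"),
        layers.Dropout(0.5),
        # assumed classes: good / green defect / mold defect
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Data augmentation with brightness variation, as a stand-in for the paper's
# data enhancement and brightness compensation preprocessing.
augmenter = tf.keras.preprocessing.image.ImageDataGenerator(
    rotation_range=20, horizontal_flip=True, brightness_range=(0.8, 1.2))
```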
In the robotic welding process with thick steel plates, laser vision sensors are widely used to profile the weld seam and implement automatic seam tracking. The weld seam profile extraction (WSPE) result is a crucial step for identifying the feature points of the extracted profile to guide the welding torch in real time. The visual information processing system may collapse when interference data points in the image survive the feature point identification phase, which results in low tracking accuracy and poor welding quality. This paper presents a visual attention feature-based method to extract the weld seam profile (WSP) from the strong arc background using clustering results. First, a binary image is obtained through the preprocessing stage. Second, all data points with a gray value of 255 are clustered with the nearest-neighborhood clustering algorithm. Third, a strategy is developed to discern the cluster belonging to the WSP from the appointed candidate clusters in each loop, and a scheme is proposed to extract the entire WSP using visual continuity. Compared with previous methods, the proposed method extracts more useful details of the WSP and has better stability in removing interference data. Extensive WSPE tests with butt joints and T-joints show the anti-interference ability of the proposed method, which contributes to smoothing the welding process and demonstrates its practical value in robotic automated welding with thick steel plates.
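A rough sketch of the preprocessing and clustering stages outlined above: binarize the laser-stripe image and group all pixels with a gray value of 255 into spatial clusters. DBSCAN is used here as a simple stand-in for the nearest-neighborhood clustering described in the paper, and the distance threshold and minimum cluster size are illustrative assumptions.

```python
import numpy as np
import cv2
from sklearn.cluster import DBSCAN

def cluster_white_pixels(gray_image: np.ndarray, eps: float = 3.0, min_samples: int = 5):
    # Preprocessing: threshold to a binary image (stripe pixels become 255).
    _, binary = cv2.threshold(gray_image, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # Collect (row, col) coordinates of all pixels whose gray value is 255.
    points = np.column_stack(np.nonzero(binary == 255))
    if len(points) == 0:
        return binary, []

    # Spatial clustering of the bright pixels; noise points get label -1.
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(points)

    # Return the candidate clusters; a later step would select the ones belonging
    # to the WSP and discard the arc-light interference clusters.
    clusters = [points[labels == k] for k in set(labels) if k != -1]
    return binary, clusters
```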
The rapid growth of multimedia content necessitates powerful technologies to filter, classify, index and retrieve video documents more efficiently. However, the essential bottleneck of image and video analysis is the semantic gap: low-level features extracted by computers often fail to coincide with the high-level concepts interpreted by humans. In this paper, we present a generic scheme for detecting video semantic concepts based on multiple visual features and machine learning. Various global and local low-level visual features are systematically investigated, and a kernel-based learning method equips the concept detection system to explore the potential of these features. We then combine the different features and sub-systems with both classifier-level and kernel-level fusion, which contributes to a more robust system. Our proposed system is tested on the TRECVID dataset. The resulting Mean Average Precision (MAP) score is much better than the benchmark performance, which shows that our concept detection engine develops a generic model and performs well on both object and scene type concepts.
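An illustrative sketch of the two fusion strategies mentioned above, using scikit-learn. The choice of per-feature kernels, the equal fusion weights, and the score-averaging rule are assumptions for demonstration, not the paper's exact detectors.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import chi2_kernel, rbf_kernel

def kernel_level_fusion(X_color, X_texture, y):
    # Kernel-level fusion: combine per-feature kernel matrices into one fused
    # kernel, then train a single SVM on it. X_color must be non-negative
    # (e.g. color histograms) for the chi-squared kernel.
    K = 0.5 * chi2_kernel(X_color) + 0.5 * rbf_kernel(X_texture)
    clf = SVC(kernel="precomputed", probability=True).fit(K, y)
    # Prediction on new data requires the fused test-vs-training kernel matrix.
    return clf

def classifier_level_fusion(scores_per_feature):
    # Classifier-level (late) fusion: average the confidence scores produced by
    # the per-feature classifiers for each video shot and concept.
    return np.mean(np.stack(scores_per_feature, axis=0), axis=0)
```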
Unconstrained face images are affected by many factors such as illumination, posture, expression, occlusion, age and accessories, resulting in random noise pollution in the original samples. To improve sample quality, a weighted block cooperative sparse representation algorithm based on a visual saliency dictionary is proposed. First, the algorithm uses the biological visual attention mechanism to quickly and accurately obtain the salient face target and constructs the visual saliency dictionary. Then, a block cooperation framework is presented to perform sparse coding for the different local structures of the human face, and a weighted regularization term is introduced into the sparse representation process to enhance the discriminative information hidden in the coding coefficients. Finally, by synthesising the sparse representation results of all visual saliency block dictionaries, the global coding residual is obtained and the class label is assigned. Experimental results on four databases, namely AR, extended Yale B, LFW and PubFig, indicate that the combination of the visual saliency dictionary, block cooperative sparse representation and weighted constraint coding can effectively enhance the accuracy of sparse representation of the test samples and improve the performance of unconstrained face recognition.
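A simplified sketch of the residual-based decision rule that sparse-representation classification relies on. The block-wise weighting and the visual saliency dictionary construction are omitted here; the dictionary layout, the l1 solver choice and the regularization value are assumptions.

```python
import numpy as np
from sklearn.linear_model import Lasso

def src_classify(dictionary: np.ndarray, labels: np.ndarray, test_sample: np.ndarray,
                 alpha: float = 0.01):
    """dictionary: (d, n) matrix whose columns are training samples,
    labels: (n,) class label of each column, test_sample: (d,) probe vector."""
    # Sparse coding: solve min ||y - D x||^2 + alpha * ||x||_1 over the coefficients x.
    coder = Lasso(alpha=alpha, fit_intercept=False, max_iter=5000)
    coder.fit(dictionary, test_sample)
    x = coder.coef_

    # Class-wise reconstruction residuals: keep only the coefficients of one class
    # at a time and measure how well that class alone reconstructs the probe.
    residuals = {}
    for c in np.unique(labels):
        x_c = np.where(labels == c, x, 0.0)
        residuals[c] = np.linalg.norm(test_sample - dictionary @ x_c)

    # Assign the label of the class with the smallest coding residual.
    return min(residuals, key=residuals.get)
```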
Visual question answering (VQA) has attracted increasing attention in computer vision and natural language processing. Scholars are committed to studying how to better integrate image features and text features to achieve better results in VQA tasks. Analyzing all features may cause information redundancy and a heavy computational burden, and an attention mechanism is an effective way to address this problem. However, a single attention mechanism may attend to the features incompletely. This paper improves on the attention mechanism approach and proposes a hybrid attention mechanism that combines the spatial attention mechanism with the channel attention mechanism. Because the attention mechanism can cause a loss of the original features, a small portion of the image features is added back as compensation. For the attention mechanism on text features, a self-attention mechanism is introduced, and the internal structural features of sentences are strengthened to improve the overall model. The results show that the attention mechanism and feature compensation add 6.1% accuracy to the multimodal low-rank bilinear pooling network.
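A minimal PyTorch sketch of the hybrid attention idea described above: channel attention and spatial attention applied to the image feature map, with a small fraction of the original features added back as compensation. The layer sizes, kernel size and compensation ratio are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class HybridAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 16, compensation: float = 0.1):
        super().__init__()
        self.compensation = compensation
        # Channel attention: squeeze spatial dimensions, then excite per-channel weights.
        self.channel_fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Spatial attention: a convolution over the feature map produces a spatial mask.
        self.spatial_conv = nn.Sequential(
            nn.Conv2d(channels, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        attended = x * self.channel_fc(x)                    # channel attention
        attended = attended * self.spatial_conv(attended)    # spatial attention
        # Feature compensation: add back a small portion of the original features.
        return attended + self.compensation * x
```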
To address the loss of UAV navigation capability when the Global Navigation Satellite System (GNSS) is denied, a UAV visual navigation method based on improved Oriented FAST and Rotated BRIEF (ORB) image feature matching is proposed. First, to achieve absolute positioning of the UAV, a method for constructing a reference database of feature images is proposed. Second, to extract feature points from the image data set, a scale-space-optimized ORB feature extraction algorithm that incorporates the Scale Invariant Feature Transform (SIFT) is adopted. Finally, to match image features against the reference database quickly and with improved accuracy, an improved ORB feature matching algorithm, ORB+GMS+PROSAC, is proposed. A reference database was built by segmenting images in ArcGIS, and experimental analysis shows that the ORB+GMS+PROSAC feature matching algorithm achieves significantly better performance, with matching accuracy increasing by 5.05% and matching time decreasing by 41.61%, clearly outperforming other traditional feature matching algorithms.
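A rough OpenCV sketch of an ORB + GMS + robust-verification matching pipeline of the kind outlined above. cv2.xfeatures2d.matchGMS requires the opencv-contrib package, cv2.RHO is used here as a PROSAC-style estimator, and the feature count and reprojection threshold are illustrative assumptions.

```python
import numpy as np
import cv2

def match_to_reference(query_img, reference_img):
    # ORB feature extraction on the onboard image and the reference database image.
    orb = cv2.ORB_create(nfeatures=2000)
    kp1, des1 = orb.detectAndCompute(query_img, None)
    kp2, des2 = orb.detectAndCompute(reference_img, None)

    # Brute-force Hamming matching of the binary ORB descriptors.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    raw_matches = matcher.match(des1, des2)

    # GMS (grid-based motion statistics) filtering rejects motion-inconsistent matches.
    size1 = (query_img.shape[1], query_img.shape[0])
    size2 = (reference_img.shape[1], reference_img.shape[0])
    gms_matches = cv2.xfeatures2d.matchGMS(size1, size2, kp1, kp2, raw_matches,
                                           withRotation=True, withScale=True)
    if len(gms_matches) < 4:
        return None, gms_matches, None

    # Robust geometric verification with a PROSAC-style estimator (cv2.RHO).
    src = np.float32([kp1[m.queryIdx].pt for m in gms_matches]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in gms_matches]).reshape(-1, 1, 2)
    H, mask = cv2.findHomography(src, dst, cv2.RHO, 3.0)
    return H, gms_matches, mask
```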
Funding (weld seam profile extraction): Supported by the National Natural Science Foundation of China (Grant Nos. 51575349, 51665037, 51575348) and the State Key Laboratory of Smart Manufacturing for Special Vehicles and Transmission System (Grant No. GZ2016KF002).
Funding (video semantic concept detection): This work was supported by the collaborative research project SEV (Grant No. 01100474) between Beijing University of Posts and Telecommunications and France Telecom R&D Beijing, the National Natural Science Foundation of China (Grant No. 90920001), and the Graduate Innovation Fund of SICE, BUPT, 2011.
Funding (unconstrained face recognition): Natural Science Foundation of Jiangsu Province (Grant No. BK20170765), National Natural Science Foundation of China (Grant No. 61703201), Future Network Scientific Research Fund Project (Grant No. FNSRFP2021YB26), and Science Foundation of Nanjing Institute of Technology (Grant Nos. ZKJ202002, ZKJ202003, and YKJ202019).
Funding (visual question answering): This work was supported by the Sichuan Science and Technology Program (2021YFQ0003).