Journal Articles
3 articles found
1. Leveraging Vision-Language Pre-Trained Model and Contrastive Learning for Enhanced Multimodal Sentiment Analysis
Authors: Jieyu An, Wan Mohd Nazmee Wan Zainon, Binfen Ding. Intelligent Automation & Soft Computing (SCIE), 2023, Issue 8, pp. 1673-1689 (17 pages).
Multimodal sentiment analysis is an essential area of research in artificial intelligence that combines multiple modes, such as text and image, to accurately assess sentiment. However, conventional approaches that rely on unimodal pre-trained models for feature extraction from each modality often overlook the intrinsic connections of semantic information between modalities. This limitation is attributed to their training on unimodal data, and it necessitates the use of complex fusion mechanisms for sentiment analysis. In this study, we present a novel approach that combines a vision-language pre-trained model with a proposed multimodal contrastive learning method. Our approach harnesses the power of transfer learning by utilizing a vision-language pre-trained model to extract both visual and textual representations in a unified framework. We employ a Transformer architecture to integrate these representations, thereby enabling the capture of rich semantic information in image-text pairs. To further enhance the representation learning of these pairs, we introduce our proposed multimodal contrastive learning method, which leads to improved performance in sentiment analysis tasks. Our approach is evaluated through extensive experiments on two publicly accessible datasets, where we demonstrate its effectiveness. We achieve a significant improvement in sentiment analysis accuracy, indicating the superiority of our approach over existing techniques. These results highlight the potential of multimodal sentiment analysis and underscore the importance of considering the intrinsic semantic connections between modalities for accurate sentiment assessment.
Keywords: multimodal sentiment analysis; vision-language pre-trained model; contrastive learning; sentiment classification
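Since the abstract describes contrastive learning over matched image-text pairs, a minimal sketch of such an objective is given below, assuming a symmetric InfoNCE-style loss in PyTorch; the embedding dimension, batch size, and temperature are illustrative and not taken from the paper.

```python
# Hypothetical sketch of image-text contrastive learning (InfoNCE-style);
# the paper's actual model and loss formulation may differ.
import torch
import torch.nn.functional as F

def contrastive_loss(image_emb: torch.Tensor,
                     text_emb: torch.Tensor,
                     temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE loss over a batch of matched image-text pairs."""
    # L2-normalize so the dot product is cosine similarity.
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    # Pairwise similarity matrix: logits[i, j] = sim(image_i, text_j).
    logits = image_emb @ text_emb.t() / temperature
    # Matched pairs lie on the diagonal.
    targets = torch.arange(logits.size(0), device=logits.device)
    loss_i2t = F.cross_entropy(logits, targets)        # image -> text
    loss_t2i = F.cross_entropy(logits.t(), targets)    # text -> image
    return (loss_i2t + loss_t2i) / 2

# Example with random embeddings standing in for encoder outputs:
if __name__ == "__main__":
    imgs, txts = torch.randn(8, 256), torch.randn(8, 256)
    print(contrastive_loss(imgs, txts).item())
```

Such a loss pulls each image embedding toward its paired text embedding while pushing it away from the other captions in the batch, which is one common way to align representations from the two modalities.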
2. Binocular Vision Object Positioning Method for Robots Based on Coarse-fine Stereo Matching (cited 5 times)
Authors: Wei-Ping Ma, Wen-Xin Li, Peng-Xia Cao. International Journal of Automation and Computing (EI, CSCD), 2020, Issue 4, pp. 562-571 (10 pages).
In order to improve the low positioning accuracy and execution efficiency of robot binocular vision, a binocular vision positioning method based on coarse-fine stereo matching is proposed to achieve object positioning. A random fern is used in the coarse matching to identify objects in the left and right images, and the pixel coordinates of the object center points in the two images are calculated to complete the center matching. In the fine matching, the right center point is treated as an estimate to set the search range in the right image, in which region matching is performed to find the best match for the left center point. Then, the similar-triangle principle of the binocular vision model is used to calculate the 3D coordinates of the center point, achieving fast and accurate object positioning. Finally, the proposed method is applied to object scene images and a robotic-arm grasping platform. The experimental results show that the average absolute positioning error and average relative positioning error of the proposed method are 8.22 mm and 1.96%, respectively, when the object's depth is within 600 mm, and the time consumption is less than 1.029 s. The method meets the needs of a robot grasping system and shows good accuracy and robustness.
Keywords: object positioning; stereo matching; random fern; normalized cross-correlation; binocular vision model
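For a parallel stereo rig, the similar-triangle principle mentioned in the abstract reduces to depth Z = f·B/d, where d is the disparity between the matched center points, f the focal length, and B the baseline. Below is a minimal sketch under that assumption; the function name and the calibration values in the example are made up, and the paper's coarse-fine matching pipeline is not reproduced.

```python
# Hypothetical illustration of the similar-triangle principle in a
# rectified, parallel binocular setup; not the paper's full pipeline.
def triangulate_point(xl, yl, xr, f, baseline, cx, cy):
    """Recover 3D coordinates (X, Y, Z) of a matched center point.

    xl, yl : pixel coordinates of the point in the left image
    xr     : column of the best-matched point in the right image
    f      : focal length in pixels; baseline in mm; (cx, cy) principal point
    """
    disparity = xl - xr
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the rig")
    Z = f * baseline / disparity          # depth from similar triangles
    X = (xl - cx) * Z / f                 # lateral offset
    Y = (yl - cy) * Z / f                 # vertical offset
    return X, Y, Z

# Example with made-up calibration values (focal length 800 px, 60 mm baseline):
print(triangulate_point(xl=412.0, yl=300.0, xr=380.0,
                        f=800.0, baseline=60.0, cx=320.0, cy=240.0))
```

The formula also makes the accuracy behavior intuitive: depth error grows with distance because a fixed one-pixel matching error corresponds to a larger change in Z as disparity shrinks, which is consistent with the paper reporting errors for objects within 600 mm.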
3. Topography of Visual Features in the Human Ventral Visual Pathway
Authors: Shijia Fan, Xiaosha Wang, Xiaoying Wang, Tao Wei, Yanchao Bi. Neuroscience Bulletin (SCIE, CAS, CSCD), 2021, Issue 10, pp. 1454-1468 (15 pages).
Visual object recognition in humans and nonhuman primates is achieved by the ventral visual pathway (ventral occipital-temporal cortex, VOTC), which shows a well-documented object-domain structure. An ongoing question is what type of information processed in the higher-order VOTC underlies such observations, with recent evidence suggesting effects of certain visual features. Combining computational vision models, an fMRI experiment using a parametric-modulation approach, and natural image statistics of common objects, we mapped the neural distribution of a comprehensive set of visual features in the VOTC, identifying voxel sensitivities to specific feature sets across geometry/shape, Fourier power, and color. The visual-feature combination pattern in the VOTC is significantly explained by its relationship to different types of response-action computation (fight-or-flight, navigation, and manipulation), as derived from behavioral ratings and natural image statistics. These results offer a comprehensive visual-feature map of the VOTC and a plausible theoretical explanation as a mapping onto different types of downstream response-action systems.
Keywords: ventral occipital-temporal cortex; computational vision model; domain organization; response mapping
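For readers unfamiliar with the parametric-modulation approach, the sketch below shows the core idea in highly simplified form: regress each voxel's trial-wise responses on a feature modulator and keep the slope as that voxel's sensitivity to the feature. The data shapes, preprocessing, and function name are hypothetical and much simpler than the paper's actual GLM analysis.

```python
# Hypothetical sketch of a parametric-modulation analysis: per-voxel
# sensitivity to one visual feature, estimated by least squares.
import numpy as np

def feature_sensitivity(voxel_responses: np.ndarray,
                        feature_values: np.ndarray) -> np.ndarray:
    """Return per-voxel slopes for one parametric feature regressor.

    voxel_responses : (n_trials, n_voxels) response amplitudes
    feature_values  : (n_trials,) e.g. a shape, Fourier-power, or color score
    """
    # Design matrix: intercept + z-scored feature modulator.
    z = (feature_values - feature_values.mean()) / feature_values.std()
    X = np.column_stack([np.ones_like(z), z])
    betas, *_ = np.linalg.lstsq(X, voxel_responses, rcond=None)
    return betas[1]  # slope row: each voxel's sensitivity to the feature

# Example with simulated data: 40 trials, 100 voxels.
rng = np.random.default_rng(0)
feat = rng.uniform(0, 1, 40)
resp = np.outer(feat, rng.normal(size=100)) + rng.normal(scale=0.5, size=(40, 100))
print(feature_sensitivity(resp, feat).shape)  # (100,)
```

Repeating such a fit for each feature in a comprehensive set, and inspecting where in cortex the slopes cluster, is one simplified way to think about how a feature map like the one the paper reports could be constructed.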