Journal Articles
6 articles found
1. Exploration of French-Chinese Translation Methods of Electrical Engineering Terminology Using Online Image-Text Retrieval Mode
Authors: Tian Li. Journal of Contemporary Educational Research, 2023, Issue 6, pp. 47-52 (6 pages)
With the continued advancement of the Open Door Policy, which has consolidated international collaborative partnerships, an increasing number of Chinese companies are moving into partner countries to participate in infrastructure construction, pursuing a win-win strategy that benefits the people and governments of both countries. Among these areas of cooperation, China's electrical companies have achieved a series of remarkable results in the international Engineering, Procurement, and Construction (EPC) project market thanks to their outstanding business capabilities and technical advantages. Nevertheless, some shortcomings cannot be overlooked, the most notable being the difficulty of engineering translation, which has long been a persistent challenge for translators at Chinese companies. Taking a transmission line project in the Republic of Madagascar as an example, this paper analyzes French-Chinese translation methods for electrical engineering terminology in the field of transmission lines.
Keywords: Engineering translation; Translation methods; Electrical engineering terminology; Interdisciplinary communication; Online image-text retrieval mode
2. Multi-Task Visual Semantic Embedding Network for Image-Text Retrieval
Authors: Xue-Yang Qin, Li-Shuang Li, Jing-Yao Tang, Fei Hao, Mei-Ling Ge, Guang-Yao Pang. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2024, Issue 4, pp. 811-826 (16 pages)
Image-text retrieval aims to capture the semantic correspondence between images and texts, and serves as a foundational component in multi-modal recommendation, search systems, and online shopping. Existing mainstream methods primarily focus on modeling the association of image-text pairs while neglecting the advantageous impact of multi-task learning on image-text retrieval. To this end, a multi-task visual semantic embedding network (MVSEN) is proposed for image-text retrieval. Specifically, we design two auxiliary tasks, text-text matching and multi-label classification, as semantic constraints that improve the generalization and robustness of the visual semantic embedding from a training perspective. Besides, we present an intra- and inter-modality interaction scheme that learns discriminative visual and textual feature representations by facilitating information flow within and between modalities. Subsequently, we utilize multi-layer graph convolutional networks in a cascading manner to infer the correlation of image-text pairs. Experimental results show that MVSEN outperforms state-of-the-art methods on two publicly available datasets, Flickr30K and MSCOCO, with rSum improvements of 8.2% and 3.0%, respectively.
Keywords: image-text retrieval; cross-modal retrieval; multi-task learning; graph convolutional network
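The multi-task objective described in the abstract lends itself to a compact illustration. Below is a minimal, hypothetical PyTorch sketch of how a main image-text matching loss could be combined with the two auxiliary tasks (text-text matching and multi-label classification); the hinge-based loss form, the weights w_ttm and w_mlc, and the input shapes are assumptions for illustration, not MVSEN's actual implementation.

```python
import torch
import torch.nn.functional as F

def triplet_ranking_loss(a_emb, b_emb, margin=0.2):
    # Hinge-based ranking loss with in-batch hardest negatives, a common
    # choice for visual semantic embedding; embeddings are L2-normalized.
    a = F.normalize(a_emb, dim=-1)
    b = F.normalize(b_emb, dim=-1)
    scores = a @ b.t()                                   # (B, B) similarities
    pos = scores.diag().view(-1, 1)                      # matched-pair scores
    mask = torch.eye(scores.size(0), dtype=torch.bool, device=scores.device)
    cost_a = (margin + scores - pos).clamp(min=0).masked_fill(mask, 0)
    cost_b = (margin + scores - pos.t()).clamp(min=0).masked_fill(mask, 0)
    return cost_a.max(1)[0].mean() + cost_b.max(0)[0].mean()

def multi_task_loss(img_emb, txt_emb, txt_emb_aug, label_logits, labels,
                    w_ttm=0.5, w_mlc=0.5):
    # Main image-text matching loss plus the two auxiliary constraints:
    # text-text matching (e.g., against paraphrased captions) and
    # multi-label classification (labels is a multi-hot float tensor).
    l_itm = triplet_ranking_loss(img_emb, txt_emb)
    l_ttm = triplet_ranking_loss(txt_emb, txt_emb_aug)
    l_mlc = F.binary_cross_entropy_with_logits(label_logits, labels)
    return l_itm + w_ttm * l_ttm + w_mlc * l_mlc
```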
3. Cross-modal Contrastive Learning for Generalizable and Efficient Image-text Retrieval
Authors: Haoyu Lu, Yuqi Huo, Mingyu Ding, Nanyi Fei, Zhiwu Lu. Machine Intelligence Research (EI, CSCD), 2023, Issue 4, pp. 569-582 (14 pages)
Cross-modal image-text retrieval is a fundamental task in bridging vision and language. It faces two main challenges that are typically not well addressed in previous works. 1) Generalizability: Existing methods often assume a strong semantic correlation between each text-image pair, and are thus difficult to generalize to real-world scenarios where weak correlation dominates. 2) Efficiency: Many recent works adopt a single-tower architecture with heavy detectors, which is inefficient during inference because the costly computation must be repeated for each text-image pair. To overcome these two challenges, we propose a two-tower cross-modal contrastive learning (CMCL) framework. Specifically, we first devise a two-tower architecture that places the text and image modalities in a unified feature space where they can be compared directly, alleviating the heavy computation during inference. We further introduce a simple yet effective module named multi-grid split (MGS) to learn fine-grained image features without using detectors. Last but not least, we deploy a cross-modal contrastive loss on the global image/text features to learn their weak correlation and thus achieve high generalizability. To validate that CMCL readily generalizes to real-world scenarios, we construct a large multi-source image-text dataset called the weak semantic correlation dataset (WSCD). Extensive experiments show that our CMCL outperforms the state of the art while being much more efficient.
Keywords: image-text retrieval; multimodal modeling; contrastive learning; weak correlation; computer vision
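The cross-modal contrastive loss on global features that the abstract mentions is typically a symmetric InfoNCE objective. The sketch below is a minimal, assumed rendering of that loss for a two-tower model; the temperature value and the omission of the MGS module are simplifications, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(img_feats, txt_feats, temperature=0.07):
    # Symmetric InfoNCE over global image/text features: each image's own
    # caption is the positive; every other in-batch text is a negative.
    img = F.normalize(img_feats, dim=-1)
    txt = F.normalize(txt_feats, dim=-1)
    logits = img @ txt.t() / temperature             # (B, B) similarity matrix
    targets = torch.arange(logits.size(0), device=logits.device)
    loss_i2t = F.cross_entropy(logits, targets)      # image -> text direction
    loss_t2i = F.cross_entropy(logits.t(), targets)  # text -> image direction
    return (loss_i2t + loss_t2i) / 2
```

Because the two towers never attend to each other, all image and text embeddings can be pre-computed once, and retrieval reduces to a single matrix multiplication; this is the efficiency gain over single-tower designs that the abstract claims.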
4. Multimodal Social Media Fake News Detection Based on Similarity Inference and Adversarial Networks (Cited by 1)
Authors: Fangfang Shan, Huifang Sun, Mengyi Wang. Computers, Materials & Continua (SCIE, EI), 2024, Issue 4, pp. 581-605 (25 pages)
As social networks become increasingly complex, contemporary fake news often includes textual descriptions of events accompanied by corresponding images or videos. Fake news in multiple modalities is more likely to create a misleading perception among users. While early research primarily focused on text-based features for fake news detection, there has been relatively limited exploration of learning shared representations in multimodal (text and visual) contexts. To address these limitations, this paper introduces a multimodal model for detecting fake news that relies on similarity reasoning and adversarial networks. The model employs Bidirectional Encoder Representations from Transformers (BERT) and a Text Convolutional Neural Network (Text-CNN) to extract textual features, while utilizing the pre-trained Visual Geometry Group 19-layer network (VGG-19) to extract visual features. Subsequently, the model establishes similarity representations between the textual features extracted by Text-CNN and the visual features through similarity learning and reasoning. Finally, these features are fused to enhance the accuracy of fake news detection, and adversarial networks are employed to investigate the relationship between fake news and events. This paper validates the proposed model on publicly available multimodal datasets from Weibo and Twitter. Experimental results demonstrate that the proposed approach achieves superior performance on Twitter, with an accuracy of 86%, surpassing traditional unimodal models and existing multimodal models, while its overall performance on the Weibo dataset surpasses the benchmark models across multiple metrics. The application of similarity reasoning and adversarial networks significantly enhances detection effectiveness. However, the current work is limited to fusing the text and image modalities; future research should integrate features from additional modalities to comprehensively represent the multifaceted information of fake news.
Keywords: Fake news detection; attention mechanism; image-text similarity; multimodal feature fusion
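To make the similarity-and-fusion stage concrete, here is a minimal, hypothetical PyTorch sketch: pre-extracted textual features (e.g., from BERT/Text-CNN) and visual features (e.g., from VGG-19) are projected into a shared space, an element-wise similarity representation is formed, and the three vectors are fused for binary classification. The dimensions, the similarity operator, and the omission of the adversarial event discriminator are all assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class SimilarityFusionDetector(nn.Module):
    def __init__(self, txt_dim=768, img_dim=4096, hid=256):
        super().__init__()
        # Project both modalities into a shared hidden space.
        self.txt_proj = nn.Linear(txt_dim, hid)
        self.img_proj = nn.Linear(img_dim, hid)
        self.classifier = nn.Sequential(
            nn.Linear(hid * 3, hid), nn.ReLU(), nn.Linear(hid, 2))

    def forward(self, txt_feat, img_feat):
        t = torch.tanh(self.txt_proj(txt_feat))
        v = torch.tanh(self.img_proj(img_feat))
        sim = t * v                             # element-wise similarity representation
        fused = torch.cat([t, v, sim], dim=-1)  # fuse text, image, and similarity
        return self.classifier(fused)           # logits: real vs. fake
```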
5. Dixit Player with Open CLIP
Authors: Ryan Wei. Journal of Data Analysis and Information Processing, 2023, Issue 4, pp. 536-547 (12 pages)
A computer vision approach based on OpenAI's CLIP, a model capable of predicting text-image pairs, is used to create an AI agent for Dixit, a game that requires creative linking between images and text. This paper establishes baseline accuracies for both the ability to match the correct image to a hint and the ability to match human preferences. A dataset created by previous work on Dixit is used for testing. CLIP is applied by comparing a hint to multiple images and to previous hints, achieving a final accuracy of 0.5011, which surpasses previous results.
Keywords: Computer Vision; AI; CLIP; Dixit; OpenAI; Creative Gameplay; Open CLIP; Natural Language Processing; Visual Models; Game AI; Image-text Pairing
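The paper's core operation, scoring a hint against a set of candidate cards, can be sketched with the open_clip library. The model name, pretrained checkpoint, and softmax scoring below are illustrative assumptions, not necessarily the paper's exact setup.

```python
import torch
import open_clip
from PIL import Image

# Load a pre-trained Open CLIP model (the checkpoint choice is an assumption).
model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k")
tokenizer = open_clip.get_tokenizer("ViT-B-32")
model.eval()

def rank_images(hint, image_paths):
    # Embed the hint and each candidate card, then return one probability
    # per image; the agent guesses the highest-scoring card.
    images = torch.stack([preprocess(Image.open(p)) for p in image_paths])
    text = tokenizer([hint])
    with torch.no_grad():
        img_feats = model.encode_image(images)
        txt_feats = model.encode_text(text)
        img_feats = img_feats / img_feats.norm(dim=-1, keepdim=True)
        txt_feats = txt_feats / txt_feats.norm(dim=-1, keepdim=True)
        probs = (100.0 * txt_feats @ img_feats.t()).softmax(dim=-1)
    return probs.squeeze(0).tolist()
```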
6. What Contributes to a Crowdfunding Campaign's Success? Evidence and Analyses from GoFundMe Data
Authors: Xupin Zhang, Hanjia Lyu, Jiebo Luo. Journal of Social Computing, 2021, Issue 2, pp. 183-192 (10 pages)
Researchers have attempted to measure the success of crowdfunding campaigns using a variety of determinants, such as the descriptions of the campaigns, the amount of the funding goals, and project characteristics. Although many determinants of success have been reported in the literature, it remains unclear whether the cover photo and the text in the title and description can be combined in a fusion classifier to better predict a campaign's success. In this work, we focus on the performance of crowdfunding campaigns on GoFundMe across a wide variety of funding categories. We analyze the attributes available at the launch of a campaign and identify the attributes that are important for each category of campaigns. Furthermore, we develop a fusion classifier based on the random forest that significantly improves the prediction result, thus suggesting effective ways to make a campaign successful.
Keywords: Crowdfunding; image-text fusion; GoFundMe
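The random-forest fusion classifier can be sketched with scikit-learn. The arrays below are random placeholders standing in for the paper's cover-photo and title/description features, and concatenation-based fusion with these hyperparameters is an assumption about, not a reproduction of, the authors' classifier.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
img_X = rng.normal(size=(1000, 128))  # placeholder cover-photo features
txt_X = rng.normal(size=(1000, 300))  # placeholder title/description features
y = rng.integers(0, 2, size=1000)     # placeholder success labels (0/1)

X = np.hstack([img_X, txt_X])         # early fusion: concatenate modalities
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

clf = RandomForestClassifier(n_estimators=300, random_state=0)
clf.fit(X_tr, y_tr)
print(f"Held-out accuracy: {clf.score(X_te, y_te):.3f}")

# Per-feature importances indicate which attributes matter most, mirroring
# the paper's per-category analysis of attributes available at launch.
print(clf.feature_importances_[:5])
```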