9 articles found
Efficient User Identity Linkage Based on Aligned Multimodal Features and Temporal Correlation
1
Authors: Jiaqi Gao, Kangfeng Zheng, Xiujuan Wang, Chunhua Wu, Bin Wu. Journal: Computers, Materials & Continua (SCIE, EI). 2024, No. 10, pp. 251-270 (20 pages).
User identity linkage (UIL) refers to identifying user accounts belonging to the same identity across different social media platforms. Most current research is based on text analysis, which fails to fully explore the rich image resources generated by users. Existing attempts touch on the multimodal domain but still face the challenge of semantic differences between text and images. Given this, we investigate the UIL task across different social media platforms based on multimodal user-generated contents (UGCs). We introduce the efficient user identity linkage via aligned multi-modal features and temporal correlation (EUIL) approach. The method first generates captions for user-posted images with the BLIP model, alleviating the problem of missing textual information. Subsequently, we extract aligned text and image features with the CLIP model, which closely aligns the two modalities and significantly reduces the semantic gap. Accordingly, we construct a set of adapter modules to integrate the multimodal features. Furthermore, we design a temporal weight assignment mechanism to incorporate the temporal dimension of user behavior. We evaluate the proposed scheme on the real-world social dataset TWIN, and the results show that our method reaches 86.39% accuracy, which demonstrates its strength in handling multimodal data and provides strong algorithmic support for UIL.
Keywords: User identity linkage; multimodal models; attention mechanism; temporal correlation
Evolution and Prospects of Foundation Models: From Large Language Models to Large Multimodal Models
2
Authors: Zheyi Chen, Liuchang Xu, Hongting Zheng, Luyao Chen, Amr Tolba, Liang Zhao, Keping Yu, Hailin Feng. Journal: Computers, Materials & Continua (SCIE, EI). 2024, No. 8, pp. 1753-1808 (56 pages).
Since the 1950s, when the Turing Test was introduced, there has been notable progress in machine language intelligence. Language modeling, crucial for AI development, has evolved from statistical to neural models over the last two decades. Recently, transformer-based Pre-trained Language Models (PLM) have excelled in Natural Language Processing (NLP) tasks by leveraging large-scale training corpora. Increasing the scale of these models enhances performance significantly, introducing abilities like context learning that smaller models lack. The advancement in Large Language Models, exemplified by the development of ChatGPT, has made significant impacts both academically and industrially, capturing widespread societal interest. This survey provides an overview of the development and prospects from Large Language Models (LLM) to Large Multimodal Models (LMM). It first discusses the contributions and technological advancements of LLMs in the field of natural language processing, especially in text generation and language understanding. Then, it turns to the discussion of LMMs, which integrate various data modalities such as text, images, and sound, demonstrating advanced capabilities in understanding and generating cross-modal content and paving new pathways for the adaptability and flexibility of AI systems. Finally, the survey highlights the prospects of LMMs in terms of technological development and application potential, while also pointing out challenges in data integration and cross-modal understanding accuracy, providing a comprehensive perspective on the latest developments in this field.
Keywords: Artificial intelligence; large language models; large multimodal models; foundation models
Large multimodal models assist in psychiatry disorders prevention and diagnosis of students
3
Authors: Xin-Qiao Liu, Xin Wang, Hui-Rui Zhang. Journal: World Journal of Psychiatry (SCIE). 2024, No. 10, pp. 1415-1421 (7 pages).
Students are considered one of the groups most affected by psychological problems. Given the highly dangerous nature of mental illnesses and the increasingly serious state of global mental health, it is imperative for us to explore new methods and approaches concerning the prevention and treatment of mental illnesses. Large multimodal models (LMMs), as the most advanced artificial intelligence models (i.e., ChatGPT-4), have brought new hope to the accurate prevention, diagnosis, and treatment of psychiatric disorders. The assistance of these models in the promotion of mental health is critical, as the latter necessitates a strong foundation of medical knowledge and professional skills, emotional support, stigma mitigation, the encouragement of more honest patient self-disclosure, reduced health care costs, improved medical efficiency, and greater mental health service coverage. However, these models must address challenges related to health, safety, hallucinations, and ethics simultaneously. In the future, we should address these challenges by developing relevant usage manuals, accountability rules, and legal regulations; implementing a human-centered approach; and intelligently upgrading LMMs through the deep optimization of such models, their algorithms, and other means. This effort will thus substantially contribute not only to the maintenance of students' health but also to the achievement of global sustainable development goals.
Keywords: Large multimodal models; ChatGPT; Psychiatric disorders; Mental health; Student
Research status and application of artificial intelligence large models in the oil and gas industry
4
Authors: LIU He, REN Yili, LI Xin, DENG Yue, WANG Yongtao, CAO Qianwen, DU Jinyang, LIN Zhiwei, WANG Wenjie. Journal: Petroleum Exploration and Development (SCIE). 2024, No. 4, pp. 1049-1065 (17 pages).
This article elucidates the concept of large model technology, summarizes the research status of large model technology both domestically and internationally, provides an overview of the application status of large models in vertical industries, outlines the challenges and issues confronted in applying large models in the oil and gas sector, and offers prospects for the application of large models in the oil and gas industry. The existing large models can be briefly divided into three categories: large language models, visual large models, and multimodal large models. The application of large models in the oil and gas industry is still in its infancy. Based on open-source large language models, some oil and gas enterprises have released large language model products using methods like fine-tuning and retrieval augmented generation. Scholars have attempted to develop scenario-specific models for oil and gas operations by using visual/multimodal foundation models. A few researchers have constructed pre-trained foundation models for seismic data processing and interpretation, as well as core analysis. The application of large models in the oil and gas industry faces challenges such as data quantity and quality that currently cannot support the training of large models, high research and development costs, and poor algorithm autonomy and control. The application of large models should be guided by the needs of oil and gas business, taking the application of large models as an opportunity to improve data lifecycle management, enhance data governance capabilities, promote the construction of computing power, strengthen the construction of "artificial intelligence + energy" composite teams, and boost the autonomy and control of large model technology.
Keywords: foundation model; large language model; visual large model; multimodal large model; large model of oil and gas industry; pre-training; fine-tuning
Power allocation and mode selection methods for cooperative communication in the rectangular tunnel (cited by: 2)
5
Authors: Zhai Wenyan, Sun Yanjing, Xu Zhao, Li Song. Journal: International Journal of Mining Science and Technology (SCIE, EI, CSCD). 2015, No. 2, pp. 253-260 (8 pages).
To address the multipath fading of electromagnetic waves in wireless communication in confined areas, a rectangular tunnel cooperative communication system was established based on the multimode channel model, and the channel capacity formula was derived. Using channel capacity as the optimality criterion, power allocation methods for both amplifying and forwarding (AF) and decoding and forwarding (DF) cooperative communication systems were proposed to maximize the channel capacity under a total power constraint. Mode selection methods for single input single output (SISO) and single input multiple output (SIMO) models in the rectangular tunnel, through which a higher channel capacity can be obtained, were put forward as well. The theoretical analysis and simulation comparison show that the channel capacity of the wireless communication system in the rectangular tunnel can be effectively enhanced through cooperative technology; the channel capacity of the rectangular tunnel under complicated conditions is maximized through the proposed power allocation methods; and the cooperative mode with the optimal channel capacity can be chosen according to the cooperative mode selection methods given in the paper.
Keywords: Rectangular tunnel; Multimode channel model; Channel capacity; Cooperative communication; Power allocation; Mode selection
An aligned mixture probabilistic principal component analysis for fault detection of multimode chemical processes (cited by: 4)
6
Authors: 杨雅伟, 马玉鑫, 宋冰, 侍洪波. Journal: Chinese Journal of Chemical Engineering (SCIE, EI, CAS, CSCD). 2015, No. 8, pp. 1357-1363 (7 pages).
A novel approach named aligned mixture probabilistic principal component analysis (AMPPCA) is proposed in this study for fault detection of multimode chemical processes. In order to exploit within-mode correlations, the AMPPCA algorithm first estimates a statistical description for each operating mode by applying mixture probabilistic principal component analysis (MPPCA). As a comparison, the combined MPPCA is employed, where monitoring results are softly integrated according to posterior probabilities of the test sample in each local model. To exploit the cross-mode correlations, which may be useful but are inadvertently neglected when the modes are monitored separately, a global monitoring model is constructed by aligning all local models together. In this way, both within-mode and cross-mode correlations are preserved in this integrated space. Finally, the utility and feasibility of AMPPCA are demonstrated through a non-isothermal continuous stirred tank reactor and the TE benchmark process.
Keywords: Multimode process monitoring; Mixture probabilistic principal component analysis; Model alignment; Fault detection
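The soft integration mentioned in this abstract can be illustrated with a minimal sketch. It assumes the simplest possible reading, a posterior-weighted sum of per-mode monitoring statistics; the abstract does not give the actual formula, and the function name `combine_monitoring_statistics` and its inputs are hypothetical.

```python
def combine_monitoring_statistics(local_stats, posteriors):
    """Posterior-weighted combination of per-mode monitoring statistics.

    local_stats: one monitoring statistic per local model (assumed input).
    posteriors:  posterior probability of the test sample under each local
                 model; renormalized here in case they do not sum to 1.
    This is an illustrative assumption, not the formula from the paper.
    """
    if len(local_stats) != len(posteriors):
        raise ValueError("need one posterior per local statistic")
    total = sum(posteriors)
    if total <= 0:
        raise ValueError("posteriors must have a positive sum")
    # Each mode contributes in proportion to how likely it is to own the sample.
    return sum(s * p / total for s, p in zip(local_stats, posteriors))


# A sample that most likely belongs to mode 0 is judged mainly by mode 0.
combined = combine_monitoring_statistics([2.0, 10.0], [0.75, 0.25])
```

Under this reading, a sample lying clearly inside one operating mode is judged almost entirely by that mode's local model, while a sample between modes receives a blended statistic.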
A novel multimode process monitoring method integrating LCGMM with modified LFDA (cited by: 4)
7
Authors: 任世锦, 宋执环, 杨茂云, 任建国. Journal: Chinese Journal of Chemical Engineering (SCIE, EI, CAS, CSCD). 2015, No. 12, pp. 1970-1980 (11 pages).
Complex processes often work with multiple operation regions, so it is critical to develop effective monitoring approaches to ensure the safety of chemical processes. In this work, a discriminant local consistency Gaussian mixture model (DLCGMM) is proposed for multimode process monitoring by integrating LCGMM with modified local Fisher discriminant analysis (MLFDA). Different from Fisher discriminant analysis (FDA), which aims to discover the global optimal discriminant directions, MLFDA is capable of uncovering multimodality and local structure of the data by exploiting the posterior probabilities of observations within clusters calculated from the results of LCGMM. This may enable MLFDA to capture more meaningful discriminant information hidden in the high-dimensional multimode observations compared with FDA. Contrary to most existing multimode process monitoring approaches, DLCGMM performs LCGMM and MLFDA iteratively, and the optimal subspaces with multi-Gaussianity and the optimal discriminant projection vectors are simultaneously achieved in the framework of supervised and unsupervised learning. Furthermore, monitoring statistics are established on each cluster that represents a specific operation condition, and two global Bayesian inference-based fault monitoring indexes are established by combining the monitoring results of all clusters. The efficiency and effectiveness of the proposed method are evaluated through UCI datasets, a simulated multimode model and the Tennessee Eastman benchmark process.
Keywords: Multimode process monitoring; Discriminant local consistency Gaussian mixture model; Modified local Fisher discriminant analysis; Global fault detection index; Tennessee Eastman process
Cross-modal Contrastive Learning for Generalizable and Efficient Image-text Retrieval
8
Authors: Haoyu Lu, Yuqi Huo, Mingyu Ding, Nanyi Fei, Zhiwu Lu. Journal: Machine Intelligence Research (EI, CSCD). 2023, No. 4, pp. 569-582 (14 pages).
Cross-modal image-text retrieval is a fundamental task in bridging vision and language. It faces two main challenges that are typically not well addressed in previous works. 1) Generalizability: Existing methods often assume a strong semantic correlation between each text-image pair and are thus difficult to generalize to real-world scenarios where weak correlation dominates. 2) Efficiency: Many recent works adopt a single-tower architecture with heavy detectors, which is inefficient during the inference stage because the costly computation needs to be repeated for each text-image pair. In this work, to overcome these two challenges, we propose a two-tower cross-modal contrastive learning (CMCL) framework. Specifically, we first devise a two-tower architecture, which enables a unified feature space for the text and image modalities to be directly compared with each other, alleviating the heavy computation during inference. We further introduce a simple yet effective module named multi-grid split (MGS) to learn fine-grained image features without using detectors. Last but not least, we deploy a cross-modal contrastive loss on the global image/text features to learn their weak correlation and thus achieve high generalizability. To validate that our CMCL can be readily generalized to real-world scenarios, we construct a large multi-source image-text dataset called weak semantic correlation dataset (WSCD). Extensive experiments show that our CMCL outperforms state-of-the-art methods while being much more efficient.
Keywords: Image-text retrieval; multimodal modeling; contrastive learning; weak correlation; computer vision
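To make the idea of a cross-modal contrastive loss on global image/text features concrete, here is a minimal sketch of a generic symmetric InfoNCE-style loss. It is an assumption for illustration only: the abstract does not specify the exact loss used in CMCL, and the function names and the `temperature` default are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cross_entropy(logits, target):
    """Negative log-softmax of the target entry, computed stably."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[target]


def contrastive_loss(image_feats, text_feats, temperature=0.07):
    """Symmetric image-to-text and text-to-image contrastive loss.

    Row k of image_feats and row k of text_feats are assumed to form a
    matching pair; every other pairing in the batch acts as a negative.
    """
    images = [l2_normalize(v) for v in image_feats]
    texts = [l2_normalize(v) for v in text_feats]
    n = len(images)
    # Temperature-scaled cosine similarity between every image and every text.
    sim = [[sum(a * b for a, b in zip(img, txt)) / temperature for txt in texts]
           for img in images]
    image_to_text = sum(cross_entropy(sim[k], k) for k in range(n)) / n
    text_to_image = sum(
        cross_entropy([sim[r][k] for r in range(n)], k) for k in range(n)
    ) / n
    return 0.5 * (image_to_text + text_to_image)
```

Because each tower produces its features independently, image and text embeddings can be computed once and compared with a dot product at retrieval time, which is the efficiency argument the abstract makes for the two-tower design.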
Numerical Investigation of Tip Clearance Effects in an Axial Transonic Compressor (cited by: 9)
9
Authors: R. Ciorciari, A. Lesser, F. Blaim, R. Niehuis. Journal: Journal of Thermal Science (SCIE, EI, CAS, CSCD). 2012, No. 2, pp. 109-119 (11 pages).
Numerical investigations of the Darmstadt transonic single stage compressor (DTC), in the Rotor1-Stator1 configuration, aimed at advancing the understanding of the effect of different rotor tip gaps and transition modelling on the blade surfaces are presented. Steady three-dimensional Reynolds-averaged Navier-Stokes (RANS) simulations were performed to obtain the flow fields for the different configurations at different operating conditions using the RANS solver TRACE. The stage geometry and the multi-block structured grid were generated by G3DMESH, and a grid sensitivity analysis was conducted. For the clearance gap region, a fully gridded special H-grid was chosen. Comparisons were made between the flow characteristics at design speed, representative of a transonic flow regime, and at 65% speed, representative of a subsonic flow regime. The computations were used to analyse the flow phenomena through the tip clearance region for the different configurations and their impact on the performance of the compressor stage.
Keywords: Flow in axial compressor; surge and stall; tip clearance flow; multimode transition model