Journal Articles
14 articles found
1. Multimodality Prediction of Chaotic Time Series with Sparse Hard-Cut EM Learning of the Gaussian Process Mixture Model (Cited: 1)
Authors: 周亚同, 樊煜, 陈子一, 孙建成. Chinese Physics Letters (SCIE, CAS, CSCD), 2017, Issue 5, pp. 22-26 (5 pages).
The contribution of this work is twofold: (1) A multimodality prediction method for chaotic time series based on the Gaussian process mixture (GPM) model is proposed, which employs a divide-and-conquer strategy. It automatically divides the chaotic time series into multiple modalities with different extrinsic patterns and intrinsic characteristics, and thus can fit the chaotic time series more precisely. (2) An effective sparse hard-cut expectation maximization (SHC-EM) learning algorithm for the GPM model is proposed to improve prediction performance. SHC-EM replaces a large learning sample set with fewer pseudo inputs, accelerating model learning based on these pseudo inputs. Experiments on Lorenz and Chua time series demonstrate that the proposed method yields not only accurate multimodality prediction but also the prediction confidence interval. SHC-EM outperforms traditional variational learning in terms of both prediction accuracy and speed. In addition, SHC-EM is more robust and less susceptible to noise than variational learning.
Keywords: Gaussian process mixture (GPM) model, chaotic time series, multimodality prediction, sparse hard-cut EM (SHC-EM) learning
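The hard-cut E-step at the heart of SHC-EM replaces soft responsibilities with a winner-take-all assignment. The sketch below illustrates only that hard-assignment idea on a toy one-dimensional Gaussian mixture; the paper's actual algorithm additionally uses Gaussian-process experts and sparse pseudo-inputs, which are omitted here, and all data values are made up.

```python
import math

def gauss_logpdf(x, mean, var):
    """Log-density of a 1-D Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def hard_cut_em(data, means, var=1.0, iters=10):
    """Toy hard-cut EM for a 1-D mixture: the E-step assigns each
    point wholly to its most likely component (no soft weights),
    and the M-step refits each component on its own points only."""
    for _ in range(iters):
        groups = [[] for _ in means]
        for x in data:
            k = max(range(len(means)),
                    key=lambda j: gauss_logpdf(x, means[j], var))
            groups[k].append(x)
        # Keep the old mean if a component received no points.
        means = [sum(g) / len(g) if g else m for g, m in zip(groups, means)]
    return means, groups

data = [0.1, -0.2, 0.3, 9.8, 10.1, 10.3]
means, groups = hard_cut_em(data, means=[0.0, 5.0])
```

With equal variances the hard cut reduces to nearest-mean assignment, which is why the two clusters separate after the first iteration.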
2. Multimodal Machine Learning Guides Low Carbon Aeration Strategies in Urban Wastewater Treatment
Authors: Hong-Cheng Wang, Yu-Qi Wang, Xu Wang, Wan-Xin Yin, Ting-Chao Yu, Chen-Hao Xue, Ai-Jie Wang. Engineering (SCIE, EI, CAS, CSCD), 2024, Issue 5, pp. 51-62 (12 pages).
The potential for reducing greenhouse gas (GHG) emissions and energy consumption in wastewater treatment can be realized through intelligent control, with machine learning (ML) and multimodality emerging as a promising solution. Here, we introduce an ML technique based on multimodal strategies, focusing specifically on intelligent aeration control in wastewater treatment plants (WWTPs). The generalization of the multimodal strategy is demonstrated on eight ML models. The results show that this multimodal strategy significantly enhances model indicators for ML in environmental science and the efficiency of aeration control, exhibiting exceptional performance and interpretability. Integrating random forest with visual models achieves the highest accuracy in forecasting aeration quantity among the multimodal models, with a mean absolute percentage error of 4.4% and a coefficient of determination of 0.948. Practical testing in a full-scale plant reveals that the multimodal model can reduce operation costs by 19.8% compared with traditional fuzzy control methods. The potential application of these strategies in critical water science domains is discussed. To foster accessibility and promote widespread adoption, the multimodal ML models are freely available on GitHub, thereby eliminating technical barriers and encouraging the application of artificial intelligence in urban wastewater treatment.
Keywords: wastewater treatment, multimodal machine learning, deep learning, aeration control, interpretable machine learning
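The study reports a mean absolute percentage error (MAPE) of 4.4% and a coefficient of determination (R²) of 0.948 for its best model. For reference, these two standard regression metrics can be computed as below; this is a generic sketch with made-up numbers, not the paper's data.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((t - p) / t)
                       for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical aeration quantities (true vs. forecast)
y_true = [100.0, 200.0, 300.0]
y_pred = [90.0, 210.0, 310.0]
err = mape(y_true, y_pred)
fit = r2(y_true, y_pred)
```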
3. Solving Geometry Problems via Feature Learning and Contrastive Learning of Multimodal Data (Cited: 1)
Authors: Pengpeng Jian, Fucheng Guo, Yanli Wang, Yang Li. Computer Modeling in Engineering & Sciences (SCIE, EI), 2023, Issue 8, pp. 1707-1728 (22 pages).
This paper presents an end-to-end deep learning method for solving geometry problems via feature learning and contrastive learning of multimodal data. A key challenge in solving geometry problems using deep learning is to automatically adapt to the task of understanding single-modal and multimodal problems. Existing methods focus on either single-modal or multimodal problems and cannot handle both. A general geometry problem solver should obviously be able to process problems of various modalities at the same time. In this paper, a shared feature-learning model of multimodal data is adopted to learn a unified feature representation of text and image, which resolves the heterogeneity issue between multimodal geometry problems. A contrastive learning model of multimodal data enhances the semantic relevance between multimodal features and maps them into a unified semantic space, which can effectively adapt to both single-modal and multimodal downstream tasks. Based on the feature extraction and fusion of multimodal data, the proposed geometry problem solver uses relation extraction, theorem reasoning, and problem solving to present solutions in a readable way. Experimental results show the effectiveness of the method.
Keywords: geometry problems, multimodal feature learning, multimodal contrastive learning, automatic solver
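Contrastive learning of multimodal data, as described above, pulls matching text-image pairs together and pushes mismatched pairs apart in a shared embedding space. A minimal InfoNCE-style sketch of that idea follows; the vectors and temperature are illustrative, not the paper's actual loss or embeddings.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def nce_loss(anchor, positive, negatives, temp=0.1):
    """InfoNCE-style loss: low when the anchor is most similar
    to its matching pair, high when a negative wins instead."""
    sims = [cosine(anchor, positive)] + [cosine(anchor, n) for n in negatives]
    exps = [math.exp(s / temp) for s in sims]
    return -math.log(exps[0] / sum(exps))

# Toy embeddings: text anchor, matching image, mismatched image.
loss_match = nce_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
loss_mismatch = nce_loss([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
```

Minimizing this loss over many pairs is what maps both modalities into the unified semantic space the abstract describes.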
4. Learning Strategies, Motivation and Learners' Perspectives on Online Multimodal Chinese Learning
Author: 張鵬. 《汉语教学方法与技术》, 2021, Issue 1, pp. 1-26, I0002 (27 pages).
This mixed-method empirical study investigated the role of learning strategies and motivation in predicting L2 Chinese learning outcomes in an online multimodal learning environment. Quantitative and qualitative approaches were also used to examine learners' perspectives on online multimodal Chinese learning. The participants were fifteen pre-intermediate adult Chinese learners aged 18-26. They were originally from different countries (Spain, Italy, Argentina, Colombia, and Mexico) and lived in Barcelona. They were multilingual, speaking more than two European languages, with no exposure to any Asian language other than Chinese. The instruments comprised the Strategy Inventory for Language Learning (SILL), a motivation questionnaire, a learner perception questionnaire, and a focus group interview. The trial period lasted three months; after the experiment, the data were analyzed via the Spearman correlation coefficient. The statistical analysis showed that strategy use was highly correlated with online multimodal Chinese learning outcomes, indicating that strategy use played a vital role in online multimodal Chinese learning. Motivation was also found to have a significant effect. The perception questionnaire revealed that the students were overall satisfied with, and in favor of, the online multimodal learning experience design. Detailed insights from the participants are presented in the transcribed analysis of the focus group interviews.
Keywords: Chinese learning, online multimodal learning, individual differences, motivation, strategy
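The Spearman coefficient used in this analysis is the Pearson correlation of ranks; with no ties it reduces to the classic d² formula. A small sketch with hypothetical strategy-use scores and outcomes (not the study's data):

```python
def ranks(xs):
    """1-based rank positions; ties are not handled in this sketch."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0] * len(xs)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def spearman(x, y):
    """Spearman rho via the d^2 formula (assumes no ties)."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Hypothetical SILL strategy-use scores and test outcomes
strategy_use = [3.1, 4.5, 2.2, 4.9, 3.8]
outcome = [62, 85, 55, 90, 74]
rho = spearman(strategy_use, outcome)
```

Here the two variables are perfectly monotonically related, so rho comes out at its maximum of 1.0.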
5. Intelligent Recognition Using Ultralight Multifunctional Nano-Layered Carbon Aerogel Sensors with Human-Like Tactile Perception (Cited: 3)
Authors: Huiqi Zhao, Yizheng Zhang, Lei Han, Weiqi Qian, Jiabin Wang, Heting Wu, Jingchen Li, Yuan Dai, Zhengyou Zhang, Chris R. Bowen, Ya Yang. Nano-Micro Letters (SCIE, EI, CAS, CSCD), 2024, Issue 1, pp. 172-186 (15 pages).
Humans can perceive our complex world through multi-sensory fusion. Under limited visual conditions, people can sense a variety of tactile signals to identify objects accurately and rapidly. However, replicating this unique capability in robots remains a significant challenge. Here, we present a new form of ultralight multifunctional tactile nano-layered carbon aerogel sensor that provides pressure, temperature, material recognition and 3D location capabilities, combined with multimodal supervised learning algorithms for object recognition. The sensor exhibits human-like pressure (0.04-100 kPa) and temperature (21.5-66.2 °C) detection, millisecond response times (11 ms), a pressure sensitivity of 92.22 kPa^(-1), and triboelectric durability of over 6000 cycles. The devised algorithm is universal and can accommodate a range of application scenarios. The tactile system can identify common foods in a kitchen scene with 94.63% accuracy and explore the topographic and geomorphic features of a Mars scene with 100% accuracy. This sensing approach empowers robots with versatile tactile perception to advance future society toward heightened sensing, recognition and intelligence.
Keywords: multifunctional sensor, tactile perception, multimodal machine learning algorithms, universal tactile system, intelligent object recognition
6. Enhancing Cross-Lingual Image Description: A Multimodal Approach for Semantic Relevance and Stylistic Alignment
Authors: Emran Al-Buraihy, Dan Wang. Computers, Materials & Continua (SCIE, EI), 2024, Issue 6, pp. 3913-3938 (26 pages).
Cross-lingual image description, the task of generating image captions in a target language from images and descriptions in a source language, is addressed in this study through a novel approach that combines neural network models and semantic matching techniques. Experiments conducted on the Flickr8k and AraImg2k benchmark datasets, featuring images and descriptions in English and Arabic, showcase remarkable performance improvements over state-of-the-art methods. Our model, equipped with the Image & Cross-Language Semantic Matching module and the Target Language Domain Evaluation module, significantly enhances the semantic relevance of generated image descriptions. For English-to-Arabic and Arabic-to-English cross-language image description, our approach achieves CIDEr scores of 87.9% for English and 81.7% for Arabic, emphasizing the substantial contributions of our methodology. Comparative analyses with previous works further affirm the superior performance of our approach, and visual results underscore that our model generates image captions that are both semantically accurate and stylistically consistent with the target language. In summary, this study advances the field of cross-lingual image description, offering an effective solution for generating image captions across languages, with the potential to impact multilingual communication and accessibility. Future research directions include expanding to more languages and incorporating diverse visual and textual data sources.
Keywords: cross-language image description, multimodal deep learning, semantic matching, reward mechanisms
7. Classifying Chinese Medicine Constitution Using Multimodal Deep-Learning Model (Cited: 1)
Authors: GU Tian-yu, YAN Zhuang-zhi, JIANG Jie-hui. Chinese Journal of Integrative Medicine (SCIE, CAS, CSCD), 2024, Issue 2, pp. 163-170 (8 pages).
Objective: To develop a multimodal deep-learning model for classifying Chinese medicine constitution, i.e., the balanced and unbalanced constitutions, based on inspection of tongue and face images, pulse waves from palpation, and health information. Methods: The study data consisted of tongue and face images, pulse waves obtained by palpation, and health information, including personal information, life habits, medical history, and current symptoms, from 540 subjects (202 males and 338 females). Convolutional neural networks, recurrent neural networks, and fully connected neural networks were used to extract deep features from the data. Feature fusion and decision fusion models were constructed for the multimodal data. Results: The optimal models for tongue and face images, pulse waves, and health information were ResNet18, the gated recurrent unit (GRU), and entity embedding, respectively. Feature fusion was superior to decision fusion. The multimodal analysis revealed that multimodal data compensated for the loss of information from a single mode, resulting in improved classification performance. Conclusions: Multimodal data fusion can supplement single-mode information and improve classification performance. Our research underscores the effectiveness of multimodal deep-learning technology for identifying body constitution, supporting the modernization and intelligent application of Chinese medicine.
Keywords: Chinese medicine, constitution classification, multimodal deep learning, tongue image, face image, pulse wave, health information
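The feature-fusion versus decision-fusion distinction reported here is generic: feature fusion concatenates per-modality features before a single classifier, while decision fusion combines per-modality predictions afterwards. A minimal sketch of the two strategies; the toy vectors below are made up and stand in for the study's ResNet18/GRU/embedding outputs.

```python
def feature_fusion(feat_a, feat_b):
    """Feature-level fusion: concatenate per-modality feature
    vectors so one downstream classifier sees the joint vector."""
    return feat_a + feat_b

def decision_fusion(probs_a, probs_b):
    """Decision-level fusion: average per-modality class
    probabilities produced by separate classifiers."""
    return [(p + q) / 2 for p, q in zip(probs_a, probs_b)]

# Hypothetical extracted features from two modalities
tongue_feat = [0.2, 0.7]
pulse_feat = [0.9, 0.1, 0.4]
fused = feature_fusion(tongue_feat, pulse_feat)  # 5-dim joint vector

# Hypothetical per-modality class probabilities
# [P(balanced), P(unbalanced)]
tongue_probs = [0.6, 0.4]
pulse_probs = [0.8, 0.2]
decision = decision_fusion(tongue_probs, pulse_probs)
```

Feature fusion lets the classifier learn cross-modal interactions directly, which is one common explanation for the superiority the study reports.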
8. Deep Learning for Drug-Drug Interaction Prediction: A Comprehensive Review
Authors: Xinyue Li, Zhankun Xiong, Wen Zhang, Shichao Liu. Quantitative Biology (CAS, CSCD), 2024, Issue 1, pp. 30-52 (23 pages).
The prediction of drug-drug interactions (DDIs) is a crucial task in drug safety research, and identifying potential DDIs helps us to explore the mechanisms behind combinatorial therapy. Traditional wet chemical experiments for DDI detection are cumbersome, time-consuming, and too small in scale, limiting the efficiency of DDI prediction. It is therefore particularly important to develop improved computational methods for detecting drug interactions. With the development of deep learning, several computational models based on deep learning have been proposed for DDI prediction. In this review, we summarize the high-quality deep-learning-based DDI prediction methods of recent years and divide them into four categories: neural network-based methods, graph neural network-based methods, knowledge graph-based methods, and multimodal-based methods. Furthermore, we discuss the challenges of existing methods and potential future perspectives. This review reveals that deep learning can significantly improve DDI prediction performance compared with traditional machine learning. Deep learning models can scale to large datasets and accept multiple data types as input, making DDI prediction more efficient and accurate.
Keywords: deep learning, drug-drug interactions, graph neural network, knowledge graph, multimodal deep learning, neural network
9. Enhancing 3D Reconstruction Accuracy of FIB Tomography Data Using Multi-Voltage Images and Multimodal Machine Learning
Authors: Trushal Sardhara, Alexander Shkurmanov, Yong Li, Lukas Riedel, Shan Shi, Christian J. Cyron, Roland C. Aydin, Martin Ritter. Nanomanufacturing and Metrology (EI), 2024, Issue 1, pp. 48-60 (13 pages).
FIB-SEM tomography is a powerful technique that integrates a focused ion beam (FIB) and a scanning electron microscope (SEM) to capture high-resolution imaging data of nanostructures. The approach involves collecting in-plane SEM images and using the FIB to remove material layers for imaging subsequent planes, thereby producing image stacks. However, these image stacks in FIB-SEM tomography are subject to the shine-through effect, which makes structures from regions posterior to the current plane visible. This artifact introduces an ambiguity between image intensity and structures in the current plane, making conventional segmentation methods such as thresholding or the k-means algorithm insufficient. In this study, we propose a multimodal machine learning approach that combines intensity information obtained at different electron beam accelerating voltages to improve the three-dimensional (3D) reconstruction of nanostructures. By treating the increased shine-through effect at higher accelerating voltages as a form of additional information, the proposed method significantly improves segmentation accuracy and leads to more precise 3D reconstructions for real FIB tomography data.
Keywords: multimodal machine learning, multi-voltage images, FIB-SEM, overdetermined systems, 3D reconstruction, FIB tomography
10. Deep Multimodal Learning for Municipal Solid Waste Sorting (Cited: 2)
Authors: LU Gang, WANG YuanBin, XU HuXiu, YANG HuaYong, ZOU Jun. Science China (Technological Sciences) (SCIE, EI, CAS, CSCD), 2022, Issue 2, pp. 324-335 (12 pages).
Automated waste sorting can dramatically increase waste-sorting efficiency and reduce its regulation cost. Most current methods use only a single modality, such as image data or acoustic data, for waste classification, which makes it difficult to classify mixed and confusable wastes. In these complex situations, using multiple modalities becomes necessary to achieve high classification accuracy. Traditionally, the fusion of multiple modalities has been limited by fixed handcrafted features. In this study, a deep-learning approach was applied to multimodal fusion at the feature level for municipal solid-waste sorting. More specifically, a pre-trained VGG16 and one-dimensional convolutional neural networks (1D CNNs) were utilized to extract features from visual data and acoustic data, respectively. These deeply learned features were then fused in the fully connected layers for classification. The results of comparative experiments proved that the proposed method was superior to the single-modality methods. Additionally, the feature-based fusion strategy performed better than the decision-based strategy with deeply learned features.
Keywords: deep multimodal learning, municipal waste sorting, multimodal fusion, convolutional neural networks
11. Label Distribution for Multimodal Machine Learning (Cited: 1)
Authors: Yi REN, Ning XU, Miaogen LING, Xin GENG. Frontiers of Computer Science (SCIE, EI, CSCD), 2022, Issue 1, pp. 33-43 (11 pages).
Multimodal machine learning (MML) aims to understand the world from multiple related modalities. It has attracted much attention as multimodal data has become increasingly available in real-world applications. It has been shown that MML can perform better than single-modal machine learning, since multiple modalities contain more information and can complement each other. However, fusing the modalities in MML is a key challenge. Different from previous work, we further consider side-information, which reflects the situation and influences the fusion of the modalities. We recover the multimodal label distribution (MLD) by leveraging the side-information; the MLD represents the degree to which each modality contributes to describing the instance. Accordingly, a novel framework named multimodal label distribution learning (MLDL) is proposed to recover the MLD and fuse the modalities under its guidance to learn an in-depth understanding of the joint feature representation. Moreover, two versions of MLDL are proposed to deal with sequential data. Experiments on multimodal sentiment analysis and disease prediction show that the proposed approaches perform favorably against state-of-the-art methods.
Keywords: multimodal machine learning, label distribution learning, sentiment analysis, disease prediction
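The label-distribution idea can be illustrated as weighting each modality's prediction by its recovered degree of contribution. A toy sketch follows; the weights and predictions are made up, whereas the paper learns the distribution from side-information rather than fixing it by hand.

```python
def normalize(ws):
    """Scale weights so they sum to 1, as a label distribution must."""
    s = sum(ws)
    return [w / s for w in ws]

def mld_fuse(modal_preds, label_dist):
    """Fuse per-modality predictions under a multimodal label
    distribution: each modality contributes in proportion to the
    degree to which it describes the instance."""
    weights = normalize(label_dist)
    n = len(modal_preds[0])
    return [sum(w * pred[i] for w, pred in zip(weights, modal_preds))
            for i in range(n)]

# Hypothetical per-modality sentiment predictions [P(pos), P(neg)]
text_pred = [0.9, 0.1]
audio_pred = [0.4, 0.6]
# Hypothetical recovered contributions: text describes this
# instance three times as strongly as audio.
fused = mld_fuse([text_pred, audio_pred], label_dist=[3.0, 1.0])
```

Because the weights form a distribution, the fused output remains a valid probability vector.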
12. Brain-Inspired Multimodal Learning Based on Neural Networks (Cited: 1)
Authors: Chang Liu, Fuchun Sun, Bo Zhang. Translational Neuroscience and Clinics, 2018, Issue 1, pp. 61-72 (12 pages).
Modern computational models have leveraged biological advances in human brain research. This study addresses the problem of multimodal learning with the help of brain-inspired models. Specifically, a unified multimodal learning architecture is proposed based on deep neural networks, which are inspired by the biology of the visual cortex of the human brain. This unified framework is validated on two practical multimodal learning tasks: image captioning, involving visual and natural-language signals, and visual-haptic fusion, involving haptic and visual signals. Extensive experiments are conducted under the framework, and competitive results are achieved.
Keywords: multimodal learning, brain-inspired learning, deep learning, neural networks
13. Federated Learning on Multimodal Data: A Comprehensive Survey
Authors: Yi-Ming Lin, Yuan Gao, Mao-Guo Gong, Si-Jia Zhang, Yuan-Qiao Zhang, Zhi-Yuan Li. Machine Intelligence Research (EI, CSCD), 2023, Issue 4, pp. 539-553 (15 pages).
With the growing awareness of data privacy, federated learning (FL) has gained increasing attention in recent years as a major paradigm for training models with privacy protection in mind, allowing models to be built collaboratively but privately, without exchanging data. However, most FL clients are currently unimodal. With the rise of edge computing, various types of sensors and wearable devices generate a large amount of data from different modalities, which has inspired research efforts in multimodal federated learning (MMFL). In this survey, we explore the area of MMFL to address the fundamental challenges of FL on multimodal data. First, we analyse the key motivations for MMFL. Second, the currently proposed MMFL methods are technically classified according to the modality distributions and modality annotations in MMFL. Then, we discuss the datasets and application scenarios of MMFL. Finally, we highlight the limitations and challenges of MMFL and provide insights and methods for future research.
Keywords: federated learning, multimodal learning, heterogeneous data, edge computing, collaborative learning
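The FL paradigm this survey builds on can be illustrated with the standard FedAvg update: a server averages client model parameters weighted by local dataset size, so raw (possibly multimodal) data never leaves the clients. This is a generic sketch of FedAvg, not a method from the survey, and the client weights below are made up.

```python
def fed_avg(client_weights, client_sizes):
    """FedAvg server step: weighted average of client parameter
    vectors, with weights proportional to local dataset sizes.
    Only parameters are exchanged, never the clients' data."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(dim)]

# Two hypothetical clients, e.g. one holding image data and one
# holding audio data, each reporting locally trained parameters.
global_w = fed_avg([[1.0, 2.0], [3.0, 6.0]], client_sizes=[10, 30])
```

The larger client (30 samples) pulls the global model three times as hard as the smaller one, which is exactly the size weighting FedAvg prescribes.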
14. VLP: A Survey on Vision-Language Pre-training (Cited: 5)
Authors: Fei-Long Chen, Du-Zhen Zhang, Ming-Lun Han, Xiu-Yi Chen, Jing Shi, Shuang Xu, Bo Xu. Machine Intelligence Research (EI, CSCD), 2023, Issue 1, pp. 38-56 (19 pages).
In the past few years, the emergence of pre-training models has brought uni-modal fields such as computer vision (CV) and natural language processing (NLP) into a new era. Substantial works have shown that such models are beneficial for downstream uni-modal tasks and avoid training a new model from scratch. So can such pre-trained models be applied to multi-modal tasks? Researchers have explored this problem and made significant progress. This paper surveys recent advances and new frontiers in vision-language pre-training (VLP), including image-text and video-text pre-training. To give readers a better overall grasp of VLP, we first review its recent advances in five aspects: feature extraction, model architecture, pre-training objectives, pre-training datasets, and downstream tasks. Then, we summarize the specific VLP models in detail. Finally, we discuss the new frontiers in VLP. To the best of our knowledge, this is the first survey focused on VLP. We hope that this survey can shed light on future research in the VLP field.
Keywords: vision and language pre-training, transformers, multimodal learning, representation learning