Journal Articles

Machine Intelligence Research (CSCD)

Publications: 120 | Citations: 56 | H-index: 4
Machine Intelligence Research (formerly International Journal of Automation and Computing) is a publication of the Institute of Automation, Chinese Academy of Sciences.
  • Former title: International Journal of Automation and Computing
  • Sponsor: Institute of Automation, Chinese Academy of Sciences
  • ISSN: 2731-538X
  • CN: 10-1799/TP
  • Publication frequency: Bimonthly
120 articles found
1. Federated Learning with Privacy-preserving and Model IP-right-protection (Cited: 1)
Authors: Qiang Yang, Anbu Huang, Lixin Fan, Chee Seng Chan, Jian Han Lim, Kam Woh Ng, Ding Sheng Ong, Bowen Li. Machine Intelligence Research, EI CSCD, 2023, No. 1, pp. 19-37 (19 pages)
In the past decades, artificial intelligence (AI) has achieved unprecedented success, with statistical models becoming the central entity in AI. However, the centralized training and inference paradigm for building and using these models faces more and more privacy and legal challenges. To bridge the gap between data privacy and the need for data fusion, an emerging AI paradigm, federated learning (FL), has arisen as an approach to solving data silo and data privacy problems. Based on secure distributed AI, federated learning emphasizes data security throughout the lifecycle, which includes the following steps: data preprocessing, training, evaluation, and deployment. FL keeps data secure by using methods such as secure multi-party computation (MPC), differential privacy, and hardware solutions to build and use distributed multi-party machine-learning systems and statistical models over different data sources. Besides data privacy concerns, we argue that the concept of "model" matters: when federated models are developed and deployed, they are easily exposed to various kinds of risks, including plagiarism, illegal copying, and misuse. To address these issues, we introduce FedIPR, a novel ownership verification scheme that embeds watermarks into FL models to verify their ownership and protect model intellectual property rights (IPR, or IP-right for short). While security is at the core of FL, many articles still refer to distributed machine learning with no security guarantee as "federated learning", which does not satisfy the intended definition of FL. To this end, in this paper we reiterate the concept of federated learning and propose secure federated learning (SFL), whose ultimate goal is to build trustworthy and safe AI with strong privacy preservation and IP-right protection. We provide a comprehensive overview of existing works, including threats, attacks, and defenses in each phase of SFL from the lifecycle perspective.
Keywords: Federated learning; privacy-preserving machine learning; security; decentralized learning; intellectual property protection
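As a hedged illustration of the watermark-embedding idea behind FedIPR, the sketch below adds a sign-based watermark regularizer to a model's training loss. The projection matrix `E`, signature `b`, and weight `lambda_wm` are illustrative assumptions, not the paper's exact formulation.

```python
import torch

def watermark_loss(weights: torch.Tensor, embed_matrix: torch.Tensor,
                   target_bits: torch.Tensor, margin: float = 0.1) -> torch.Tensor:
    """Hinge-style regularizer pushing projected weights toward the watermark bits.

    weights:      flattened parameter vector chosen by the model owner, shape (n_params,)
    embed_matrix: secret random projection (the embedding key), shape (n_bits, n_params)
    target_bits:  owner's signature with entries in {-1, +1}, shape (n_bits,)
    """
    projection = embed_matrix @ weights                      # one real value per bit
    # penalize bits whose sign disagrees with the signature (or agrees by too small a margin)
    return torch.clamp(margin - target_bits * projection, min=0).mean()

# During local training a client would add this to its task loss, e.g. (names illustrative):
#   total_loss = task_loss + lambda_wm * watermark_loss(layer.weight.flatten(), E, b)
# Ownership is later claimed by checking sign(E @ weights) against the signature b.
```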
2. Multimodal Fusion of Brain Imaging Data: Methods and Applications
Authors: Na Luo, Weiyang Shi, Zhengyi Yang, Ming Song, Tianzi Jiang. Machine Intelligence Research, EI CSCD, 2024, No. 1, pp. 136-152 (17 pages)
Neuroimaging data typically include multiple modalities, such as structural or functional magnetic resonance imaging, diffusion tensor imaging, and positron emission tomography, which provide multiple views for observing and analyzing the brain. To leverage the complementary representations of different modalities, multimodal fusion is consequently needed to dig out both inter-modality and intra-modality information. With this rich exploited information, it is becoming popular to combine multiple modalities of data to explore the structural and functional characteristics of the brain in both health and disease states. In this paper, we first review a wide spectrum of advanced machine learning methodologies for fusing multimodal brain imaging data, broadly categorized into unsupervised and supervised learning strategies. Following this, some representative applications are discussed, including how they help to understand brain arealization, how they improve the prediction of behavioral phenotypes and brain aging, and how they accelerate biomarker exploration for brain diseases. Finally, we discuss some exciting emerging trends and important future directions. Collectively, we intend to offer a comprehensive overview of brain imaging fusion methods and their successful applications, along with the challenges imposed by multi-scale and big data, which raise an urgent demand for developing new models and platforms.
Keywords: Multimodal fusion; supervised learning; unsupervised learning; brain atlas; cognition; brain disorders
3. OTB-morph: One-time Biometrics via Morphing
Authors: Mahdi Ghafourian, Julian Fierrez, Ruben Vera-Rodriguez, Aythami Morales, Ignacio Serna. Machine Intelligence Research, EI CSCD, 2023, No. 6, pp. 855-871 (17 pages)
Cancelable biometrics are a group of techniques that intentionally transform the input biometric into an irreversible feature, using a transformation function and usually a key, in order to provide security and privacy in biometric recognition systems. This transformation is repeatable, enabling subsequent biometric comparisons. This paper introduces a new idea to be exploited as a transformation function for cancelable biometrics, aimed at protecting templates against iterative optimization attacks. Our proposed scheme is based on time-varying keys (random biometrics in our case) and morphing transformations. An experimental implementation of the proposed scheme is given for face biometrics. The results confirm that the proposed approach is able to withstand leakage attacks while improving recognition performance.
Keywords: Biometrics; face recognition; template protection; morphing; security
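A minimal sketch of the morphing idea in feature space follows. The paper works on face images; this vector-space analogue, the helper name `morph_template`, and the parameter `alpha` are assumptions for illustration only.

```python
import numpy as np

def morph_template(face_embedding: np.ndarray, key_embedding: np.ndarray,
                   alpha: float = 0.5) -> np.ndarray:
    """Protected template as a convex combination (morph) of the user's embedding
    and a random "key" embedding acting as a time-varying key.

    face_embedding: enrolled user's feature vector
    key_embedding:  random biometric drawn afresh per session, so leaked templates expire
    alpha:          morphing weight (illustrative choice)
    """
    protected = alpha * face_embedding + (1.0 - alpha) * key_embedding
    return protected / np.linalg.norm(protected)   # keep the template on the unit sphere

# Matching compares a probe morphed with the same session key against the stored
# morphed reference, so the raw face embedding never needs to be stored.
```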
4. A Simple yet Effective Framework for Active Learning to Rank
Authors: Qingzhong Wang, Haifang Li, Haoyi Xiong, Wen Wang, Jiang Bian, Yu Lu, Shuaiqiang Wang, Zhicong Cheng, Dejing Dou, Dawei Yin. Machine Intelligence Research, EI CSCD, 2024, No. 1, pp. 169-183 (15 pages)
While China has become the largest online market in the world, with approximately 1 billion internet users, Baidu runs the world's largest Chinese search engine, serving hundreds of millions of daily active users and responding to billions of queries per day. To handle the diverse query requests from users at web scale, Baidu has made tremendous efforts in understanding users' queries, retrieving relevant content from a pool of trillions of webpages, and ranking the most relevant webpages at the top of the results. Among the components used in Baidu search, learning to rank (LTR) plays a critical role, and an extremely large number of queries together with relevant webpages must be labelled in a timely manner to train and update the online LTR models. To reduce the costs and time consumption of query/webpage labelling, in this work we study the problem of active learning to rank (active LTR), which selects unlabeled queries for annotation and training. Specifically, we first investigate the criterion of ranking entropy (RE), which characterizes the entropy of relevant webpages under a query produced by a sequence of online LTR models updated at different checkpoints, using a query-by-committee (QBC) method. Then, we explore a new criterion, namely prediction variance (PV), which measures the variance of prediction results for all relevant webpages under a query. Our empirical studies find that RE may favor low-frequency queries from the pool for labelling, while PV prioritizes high-frequency queries more. Finally, we combine these two complementary criteria as the sample selection strategy for active learning. Extensive experiments with comparisons to baseline algorithms show that the proposed approach can train LTR models to achieve higher discounted cumulative gain (i.e., a relative improvement of DCG4 = 1.38%) with the same budgeted labelling effort.
Keywords: Search; information retrieval; learning to rank; active learning; query by committee
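The two acquisition criteria can be sketched roughly as follows. These are simplified readings of ranking entropy and prediction variance over committee checkpoints, not the paper's exact definitions; the function names and the weighted-sum selection rule are assumptions.

```python
import numpy as np

def prediction_variance(scores: np.ndarray) -> float:
    """PV-style acquisition score for one query.

    scores: array of shape (n_checkpoints, n_docs); scores[c, d] is the relevance
            score that model checkpoint c assigns to candidate document d.
    Returns the per-document score variance across checkpoints, averaged over the
    query's candidates (higher = more committee disagreement = more informative).
    """
    return float(np.var(scores, axis=0).mean())

def ranking_entropy(scores: np.ndarray) -> float:
    """QBC-style score: entropy of how often each document is ranked first
    by the different checkpoints (a simplified reading of RE)."""
    top1 = np.argmax(scores, axis=1)                       # winner per checkpoint
    counts = np.bincount(top1, minlength=scores.shape[1])
    p = counts / counts.sum()
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

# Active selection: score every unlabeled query with a weighted sum of the two
# criteria and send the top-k queries to annotators (weights are illustrative).
```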
5. Sharing Weights in Shallow Layers via Rotation Group Equivariant Convolutions (Cited: 1)
Authors: Zhiqiang Chen, Ting-Bing Xu, Jinpeng Li, Huiguang He. Machine Intelligence Research, EI CSCD, 2022, No. 2, pp. 115-126 (12 pages)
The convolution operation possesses the characteristic of translation group equivariance. To achieve more group equivariances, rotation group equivariant convolutions (RGEC) have been proposed to acquire both translation and rotation group equivariances. However, previous work paid more attention to the number of parameters and usually ignored other resource costs. In this paper, we construct our networks without introducing extra resource costs. Specifically, a convolution kernel is rotated to different orientations for the feature extraction of multiple channels. Meanwhile, much fewer kernels than in previous works are used to ensure that the number of output channels does not increase. To further enhance the orthogonality of kernels in different orientations, we construct a non-maximum-suppression loss on the rotation dimension to suppress the directions other than the most activated one. Considering that low-level features benefit more from rotational symmetry, we only share weights in the shallow layers (SWSL) via RGEC. Extensive experiments on multiple datasets (i.e., ImageNet, CIFAR, and MNIST) demonstrate that SWSL can effectively benefit from the higher-degree weight sharing and improve the performance of various networks, including plain and ResNet architectures. Meanwhile, the convolutional kernels and parameters are much fewer (e.g., 75%, 87.5% fewer) in the shallow layers, and no extra computation costs are introduced.
Keywords: Convolutional neural networks (CNNs); group equivariance; higher-degree weight sharing; parameter efficiency
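The weight-sharing idea, rotating one base kernel into several orientations so that extra output channels cost no extra parameters, can be sketched as below. The sketch restricts itself to 90-degree rotations for simplicity, and the paper's non-maximum-suppression loss is not shown; the function name is an assumption.

```python
import torch
import torch.nn.functional as F

def rotation_shared_conv(x: torch.Tensor, base_kernel: torch.Tensor) -> torch.Tensor:
    """Apply one base kernel in four orientations (0/90/180/270 degrees) so that
    four groups of output channels share a single set of weights.

    x:           input of shape (batch, in_ch, H, W)
    base_kernel: weights of shape (out_ch, in_ch, k, k)
    Returns a tensor with 4 * out_ch channels.
    """
    rotated = [torch.rot90(base_kernel, r, dims=(2, 3)) for r in range(4)]
    weight = torch.cat(rotated, dim=0)          # stack orientations along output channels
    return F.conv2d(x, weight, padding=base_kernel.shape[-1] // 2)

# x = torch.randn(1, 3, 32, 32); w = torch.randn(8, 3, 3, 3)
# y = rotation_shared_conv(x, w)   # y.shape == (1, 32, 32, 32), with only 8 kernels stored
```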
6. Editorial for Special Issue on Commonsense Knowledge and Reasoning: Representation, Acquisition and Applications
Authors: Kang Liu, Yangqiu Song, Jeff Z. Pan. Machine Intelligence Research, EI CSCD, 2024, No. 2, pp. 215-216 (2 pages)
Commonsense knowledge is an important resource for humans to understand the meanings or semantics of data. The ability to learn and possess commonsense knowledge is one of the major gaps between humans and machines. Although recent research progress in deep learning, such as Transformers and pre-trained models, has made amazing breakthroughs in many fields, including computer vision and natural language learning, enabling machines to have rich commonsense knowledge and reasoning ability remains difficult and unresolved.
Keywords: own; let; breakthrough
7. Corporate Credit Ratings Based on Hierarchical Heterogeneous Graph Neural Networks
Authors: Bo-Jing Feng, Xi Cheng, Hao-Nan Xu, Wen-Fang Xue. Machine Intelligence Research, EI CSCD, 2024, No. 2, pp. 257-271 (15 pages)
In order to help investors understand the credit status of target corporations and reduce investment risks, the corporate credit rating model has become an important evaluation tool in the financial market. These models are based on statistical learning, machine learning and deep learning, especially graph neural networks (GNNs). However, we found that only a few models take hierarchy, heterogeneity or unlabeled data into account in the actual corporate credit rating process. Therefore, we propose a novel framework named hierarchical heterogeneous graph neural networks (HHGNN), which can fully model the hierarchy of corporate features and the heterogeneity of relationships between corporations. In addition, we design an adversarial learning block to make full use of the rich unlabeled samples in the financial data. Extensive experiments conducted on a public listed-corporate rating dataset show that HHGNN achieves state-of-the-art results compared to the baseline methods.
Keywords: Corporate credit rating; hierarchical relation; heterogeneous graph neural networks; adversarial learning
8. Image De-occlusion via Event-enhanced Multi-modal Fusion Hybrid Network
Authors: Si-Qi Li, Yue Gao, Qiong-Hai Dai. Machine Intelligence Research, EI CSCD, 2022, No. 4, pp. 307-318 (12 pages)
Seeing through dense occlusions and reconstructing scene images is an important but challenging task. Traditional frame-based image de-occlusion methods may lead to fatal errors when facing extremely dense occlusions, due to the lack of valid information available from the limited input occluded frames. Event cameras are bio-inspired vision sensors that record the brightness changes at each pixel asynchronously with high temporal resolution. However, synthesizing images solely from event streams is ill-posed, since only the brightness changes are recorded in the event stream and the initial brightness is unknown. In this paper, we propose an event-enhanced multi-modal fusion hybrid network for image de-occlusion, which uses event streams to provide complete scene information and frames to provide color and texture information. An event stream encoder based on the spiking neural network (SNN) is proposed to encode and denoise the event stream efficiently. A comparison loss is proposed to generate clearer results. Experimental results on a large-scale event-based and frame-based image de-occlusion dataset demonstrate that our proposed method achieves state-of-the-art performance.
Keywords: Event camera; multi-modal fusion; image de-occlusion; spiking neural network (SNN); image reconstruction
9. Swarm Intelligence Research: From Bio-inspired Single-population Swarm Intelligence to Human-machine Hybrid Swarm Intelligence (Cited: 1)
Authors: Guo-Yin Wang, Dong-Dong Cheng, De-You Xia, Hai-Huan Jiang. Machine Intelligence Research, EI CSCD, 2023, No. 1, pp. 121-144 (24 pages)
Swarm intelligence has become a hot research field of artificial intelligence. Considering the importance of swarm intelligence for the future development of artificial intelligence, we discuss and analyze swarm intelligence from a broader and deeper perspective. In a broader sense, we consider not only bio-inspired swarm intelligence but also human-machine hybrid swarm intelligence. In a deeper sense, we discuss the research using a three-layer hierarchy: in the first layer, we divide the research of swarm intelligence into bio-inspired swarm intelligence and human-machine hybrid swarm intelligence; in the second layer, bio-inspired swarm intelligence is divided into single-population swarm intelligence and multi-population swarm intelligence; and in the third layer, we review single-population, multi-population and human-machine hybrid models from different perspectives. Single-population swarm intelligence is inspired by biological intelligence. To further solve complex optimization problems, researchers have made preliminary explorations in multi-population swarm intelligence. However, it is difficult for bio-inspired swarm intelligence to realize the dynamic cognitive intelligent behavior that meets the needs of human cognition. Researchers have therefore introduced human intelligence into computing systems and proposed human-machine hybrid swarm intelligence. In addition to single-population swarm intelligence, we thoroughly review multi-population and human-machine hybrid swarm intelligence in this paper. We also discuss the applications of swarm intelligence in optimization, big data analysis, unmanned systems and other fields. Finally, we discuss future research directions and key issues to be studied in swarm intelligence.
Keywords: Swarm intelligence; single-population; multi-population; human-machine hybrid; multi-granularity
10. A Framework for Distributed Semi-supervised Learning Using Single-layer Feedforward Networks (Cited: 1)
Authors: Jin Xie, San-Yang Liu, Jia-Xi Chen. Machine Intelligence Research, EI CSCD, 2022, No. 1, pp. 63-74 (12 pages)
This paper proposes a framework for manifold regularization (MR) based distributed semi-supervised learning (DSSL) using single-layer feed-forward neural networks (SLFNN). The proposed framework, denoted DSSL-SLFNN, is based on the SLFNN, the MR framework, and a distributed optimization strategy. A series of algorithms is then derived to solve DSSL problems. In DSSL problems, data consisting of labeled and unlabeled samples are distributed over a communication network, where each node has access only to its own data and can only communicate with its neighbors. In some scenarios, DSSL problems cannot be solved by centralized algorithms. In the DSSL-SLFNN framework, each node over the communication network exchanges the initial parameters of the SLFNN with the same basis functions for semi-supervised learning (SSL). All nodes calculate the globally optimal coefficients of the SLFNN by using distributed datasets and local updates. During the learning process, each node only exchanges local coefficients with its neighbors rather than raw data. This means that DSSL-SLFNN based algorithms work in a fully distributed fashion and are privacy-preserving methods. Finally, several simulations are presented to show the efficiency of the proposed framework and the derived algorithms.
Keywords: Distributed learning (DL); semi-supervised learning (SSL); manifold regularization (MR); single layer feed-forward neural network (SLFNN); privacy preserving
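A rough sketch of the neighbor-only exchange pattern the abstract describes, in which each node mixes its local SLFNN output-layer coefficients with those of its neighbors instead of sharing raw data. The mixing rule, the `step` value, and the function name are illustrative assumptions, not the paper's exact update.

```python
import numpy as np

def distributed_consensus_step(coeffs: dict, neighbors: dict, step: float = 0.5) -> dict:
    """One round of neighbor-only coefficient exchange on a communication graph.

    coeffs:    node_id -> current SLFNN coefficient vector at that node
    neighbors: node_id -> list of neighboring node_ids
    step:      mixing weight toward the neighborhood average (illustrative)

    Each node moves its coefficients toward the average of its neighbors'
    coefficients; raw training samples never leave a node.
    """
    updated = {}
    for node, w in coeffs.items():
        if neighbors.get(node):
            neigh_avg = np.mean([coeffs[j] for j in neighbors[node]], axis=0)
            updated[node] = (1 - step) * w + step * neigh_avg
        else:
            updated[node] = w.copy()
    return updated

# Interleaving such consensus steps with local semi-supervised updates is the
# general pattern; the paper's derived algorithms may use a different rule.
```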
11. Text Difficulty Study: Do Machines Behave the Same as Humans Regarding Text Difficulty?
Authors: Bowen Chen, Xiao Ding, Yi Zhao, Bo Fu, Tingmao Lin, Bing Qin, Ting Liu. Machine Intelligence Research, EI CSCD, 2024, No. 2, pp. 283-293 (11 pages)
With the emergence of pre-trained models, current neural networks are able to give task performance that is comparable to humans. However, we know little about the fundamental working mechanisms of pre-trained models: we do not know how they approach such performance or how the task is solved by the model. For example, given a task, a human learns from easy to hard, whereas a model learns randomly. Undeniably, difficulty-insensitive learning has led to great success in natural language processing (NLP), but little attention has been paid to the effect of text difficulty in NLP. We propose a human learning matching index (HLM Index) to investigate the effect of text difficulty. Experimental results show: 1) LSTM exhibits more human-like learning behavior than BERT. Additionally, UID-SuperLinear gives the best evaluation of text difficulty among four text difficulty criteria. Among nine tasks, performance on some tasks is related to text difficulty, whereas on others it is not. 2) A model trained on easy data performs best on both easy and medium test data, whereas a model trained on hard data only performs well on hard test data. 3) Training a model from easy to hard leads to quicker convergence.
Keywords: Cognition-inspired natural language processing; psycholinguistics; explainability; text difficulty; curriculum learning
12. Vision Enhanced Generative Pre-trained Language Model for Multimodal Sentence Summarization
Authors: Liqiang Jing, Yiren Li, Junhao Xu, Yongcan Yu, Pei Shen, Xuemeng Song. Machine Intelligence Research, EI CSCD, 2023, No. 2, pp. 289-298 (10 pages)
Multimodal sentence summarization (MMSS) is a new yet challenging task that aims to generate a concise summary of a long sentence and its corresponding image. Although existing methods have achieved promising success in MMSS, they overlook the powerful generation ability of generative pre-trained language models (GPLMs), which have been shown to be effective in many text generation tasks. To fill this research gap, we propose using GPLMs to promote the performance of MMSS. Notably, adopting GPLMs to solve MMSS inevitably faces two challenges: 1) What fusion strategy should we use to inject visual information into GPLMs properly? 2) How do we keep the GPLM's generation ability intact to the utmost extent when the visual feature is injected into the GPLM? To address these two challenges, we propose a vision-enhanced generative pre-trained language model for MMSS, dubbed Vision-GPLM. In Vision-GPLM, we obtain features of the visual and textual modalities with two separate encoders and utilize a text decoder to produce a summary. In particular, we utilize multi-head attention to fuse the features extracted from the visual and textual modalities, injecting the visual feature into the GPLM. Meanwhile, we train Vision-GPLM in two stages: a vision-oriented pre-training stage and a fine-tuning stage. In the vision-oriented pre-training stage, we train only the visual encoder with the masked language model task while the other components are frozen, aiming to obtain homogeneous representations of text and image. In the fine-tuning stage, we train all the components of Vision-GPLM on the MMSS task. Extensive experiments on a public MMSS dataset verify the superiority of our model over existing baselines.
Keywords: Multimodal sentence summarization (MMSS); generative pre-trained language model (GPLM); natural language generation; deep learning; artificial intelligence
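A minimal sketch of the multi-head attention fusion described above, letting text tokens attend to visual features before decoding. The dimensions, the residual and normalization choices, and the class name are assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    """Multi-head attention in which text tokens query visual region features."""

    def __init__(self, d_model: int = 768, n_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, text_feats: torch.Tensor, vis_feats: torch.Tensor) -> torch.Tensor:
        # text_feats: (batch, n_tokens, d_model); vis_feats: (batch, n_regions, d_model)
        fused, _ = self.attn(query=text_feats, key=vis_feats, value=vis_feats)
        return self.norm(text_feats + fused)   # residual keeps the text pathway intact

# The fused token sequence would then be fed to the GPLM's text decoder to produce
# the summary; freezing the decoder during vision-oriented pre-training follows the
# two-stage recipe sketched in the abstract.
```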
13. Rethinking Global Context in Crowd Counting
Authors: Guolei Sun, Yun Liu, Thomas Probst, Danda Pani Paudel, Nikola Popovic, Luc Van Gool. Machine Intelligence Research, EI CSCD, 2024, No. 4, pp. 640-651 (12 pages)
This paper investigates the role of global context in crowd counting. Specifically, a pure transformer is used to extract features with global information from overlapping image patches. Inspired by classification, we add a context token to the input sequence to facilitate information exchange with the tokens corresponding to image patches throughout the transformer layers. Because transformers do not explicitly model the tried-and-true channel-wise interactions, we propose a token-attention module (TAM) to recalibrate encoded features through channel-wise attention informed by the context token. Beyond that, the context token is used to predict the total person count of the image through a regression-token module (RTM). Extensive experiments on various datasets, including ShanghaiTech, UCF-QNRF, JHU-CROWD++ and NWPU, demonstrate that the proposed context extraction techniques can significantly improve performance over the baselines.
Keywords: Crowd counting; vision transformer; global context; attention; density map
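A rough sketch of the token-attention idea, using the context token to produce channel-wise weights that recalibrate the patch features. The layer sizes, the sigmoid gate, and the class name are assumptions for illustration, not the paper's exact module.

```python
import torch
import torch.nn as nn

class TokenAttentionModule(nn.Module):
    """Channel-wise recalibration of patch features driven by the context token."""

    def __init__(self, d_model: int = 384):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(d_model, d_model), nn.Sigmoid())

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, 1 + n_patches, d_model); index 0 is the context token
        context, patches = tokens[:, :1], tokens[:, 1:]
        weights = self.gate(context)          # (batch, 1, d_model) channel weights
        return patches * weights              # recalibrated patch features

# A regression head on the context token (the RTM in the abstract) can predict the
# total count, while the recalibrated patch features feed the density-map head.
```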
14. The Life Cycle of Knowledge in Big Language Models: A Survey (Cited: 1)
Authors: Boxi Cao, Hongyu Lin, Xianpei Han, Le Sun. Machine Intelligence Research, EI CSCD, 2024, No. 2, pp. 217-238 (22 pages)
Knowledge plays a critical role in artificial intelligence. Recently, the extensive success of pre-trained language models (PLMs) has raised significant attention about how knowledge can be acquired, maintained, updated and used by language models. Despite the enormous amount of related studies, there is still a lack of a unified view of how knowledge circulates within language models throughout the learning, tuning, and application processes, which may prevent us from further understanding the connections between current progress or realizing existing limitations. In this survey, we revisit PLMs as knowledge-based systems by dividing the life cycle of knowledge in PLMs into five critical periods and investigating how knowledge circulates when it is built, maintained and used. To this end, we systematically review existing studies of each period of the knowledge life cycle, summarize the main challenges and current limitations, and discuss future directions.
Keywords: Pre-trained language model; knowledge acquisition; knowledge representation; knowledge probing; knowledge editing; knowledge application
15. Interpretability of Neural Networks Based on Game-theoretic Interactions
Authors: Huilin Zhou, Jie Ren, Huiqi Deng, Xu Cheng, Jinpeng Zhang, Quanshi Zhang. Machine Intelligence Research, EI CSCD, 2024, No. 4, pp. 718-739 (22 pages)
This paper introduces the system of game-theoretic interactions, which connects the explanation of knowledge encoded in a deep neural network (DNN) with the explanation of the representation power of a DNN. In this system, we define two game-theoretic interaction indexes, namely the multi-order interaction and the multivariate interaction. More crucially, we use these interaction indexes to explain feature representations encoded in a DNN from the following four aspects: 1) quantifying knowledge concepts encoded by a DNN; 2) exploring how a DNN encodes visual concepts, and extracting prototypical concepts encoded in the DNN; 3) learning optimal baseline values for the Shapley value, and providing a unified perspective to compare fourteen different attribution methods; 4) theoretically explaining the representation bottleneck of DNNs. Furthermore, we prove the relationship between the interactions encoded in a DNN and the representation power of a DNN (e.g., generalization power, adversarial transferability, and adversarial robustness). In this way, game-theoretic interactions successfully bridge the gap between "the explanation of knowledge concepts encoded in a DNN" and "the explanation of the representation capacity of a DNN" as a unified explanation.
Keywords: Model interpretability and transparency; explainable AI; game theory; interaction; deep learning
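For reference, the pairwise interaction between input variables i and j is commonly formalized in this line of work via a marginal cooperation benefit, and a multi-order interaction averages this benefit over contexts of a fixed size. The notation below is a hedged restatement from general game theory, not quoted from the paper.

```latex
% Marginal benefit of variables i and j cooperating, given a context S of other variables:
\[
  \Delta f(i,j,S) \;=\; f(S \cup \{i,j\}) - f(S \cup \{i\}) - f(S \cup \{j\}) + f(S),
\]
% and an order-m interaction averages this benefit over contexts of size m:
\[
  I^{(m)}(i,j) \;=\; \mathbb{E}_{S \subseteq N \setminus \{i,j\},\, |S| = m}\bigl[\Delta f(i,j,S)\bigr].
\]
```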
16. GraphFlow+: Exploiting Conversation Flow in Conversational Machine Comprehension with Graph Neural Networks
Authors: Jing Hu, Lingfei Wu, Yu Chen, Po Hu, Mohammed J. Zaki. Machine Intelligence Research, EI CSCD, 2024, No. 2, pp. 272-282 (11 pages)
The conversational machine comprehension (MC) task aims to answer questions in a multi-turn conversation about a single passage. However, recent approaches do not exploit information from historical conversations effectively, so some references and ellipses in the current question cannot be resolved. In addition, these methods do not consider the rich semantic relationships between words when reasoning about the passage text. In this paper, we propose a novel model, GraphFlow+, which constructs a context graph for each conversation turn and uses a recurrent graph neural network (GNN) to model the temporal dependencies between the context graphs of each turn. Specifically, we exploit three different ways to construct text graphs: a dynamic graph, a static graph, and a hybrid graph that combines the two. Our experiments on CoQA, QuAC and DoQA show that the GraphFlow+ model outperforms state-of-the-art approaches.
Keywords: Conversational machine comprehension (MC); reading comprehension; question answering; graph neural networks (GNNs); natural language processing (NLP)
17. Federated Learning on Multimodal Data: A Comprehensive Survey
Authors: Yi-Ming Lin, Yuan Gao, Mao-Guo Gong, Si-Jia Zhang, Yuan-Qiao Zhang, Zhi-Yuan Li. Machine Intelligence Research, EI CSCD, 2023, No. 4, pp. 539-553 (15 pages)
With the growing awareness of data privacy, federated learning (FL) has gained increasing attention in recent years as a major paradigm for training models with privacy protection in mind, allowing models to be built in a collaborative but private way without exchanging data. However, most FL clients are currently unimodal. With the rise of edge computing, various types of sensors and wearable devices generate a large amount of data from different modalities, which has inspired research efforts in multimodal federated learning (MMFL). In this survey, we explore the area of MMFL to address the fundamental challenges of FL on multimodal data. First, we analyse the key motivations for MMFL. Second, the currently proposed MMFL methods are technically classified according to the modality distributions and modality annotations in MMFL. Then, we discuss the datasets and application scenarios of MMFL. Finally, we highlight the limitations and challenges of MMFL and provide insights and methods for future research.
Keywords: Federated learning; multimodal learning; heterogeneous data; edge computing; collaborative learning
18. Paradigm Shift in Natural Language Processing (Cited: 9)
Authors: Tian-Xiang Sun, Xiang-Yang Liu, Xi-Peng Qiu, Xuan-Jing Huang. Machine Intelligence Research, EI CSCD, 2022, No. 3, pp. 169-183 (15 pages)
In the era of deep learning, modeling for most natural language processing (NLP) tasks has converged into several mainstream paradigms. For example, we usually adopt the sequence labeling paradigm to solve a bundle of tasks such as POS-tagging, named entity recognition (NER), and chunking, and adopt the classification paradigm to solve tasks like sentiment analysis. With the rapid progress of pre-trained language models, recent years have witnessed a rising trend of paradigm shift, which means solving one NLP task in a new paradigm by reformulating the task. The paradigm shift has achieved great success on many tasks and is becoming a promising way to improve model performance. Moreover, some of these paradigms have shown great potential to unify a large number of NLP tasks, making it possible to build a single model to handle diverse tasks. In this paper, we review this phenomenon of paradigm shifts in recent years, highlighting several paradigms that have the potential to solve different NLP tasks.
Keywords: Natural language processing; pre-trained language models; deep learning; sequence-to-sequence; paradigm shift
19. Visual Superordinate Abstraction for Robust Concept Learning
Authors: Qi Zheng, Chao-Yue Wang, Dadong Wang, Da-Cheng Tao. Machine Intelligence Research, EI CSCD, 2023, No. 1, pp. 79-91 (13 pages)
Concept learning constructs visual representations that are connected to linguistic semantics, which is fundamental to vision-language tasks. Although promising progress has been made, existing concept learners are still vulnerable to attribute perturbations and out-of-distribution compositions during inference. We ascribe this bottleneck to a failure to explore the intrinsic semantic hierarchy of visual concepts, e.g., {red, blue, ...} belong to the "color" subspace yet cube belongs to "shape". In this paper, we propose a visual superordinate abstraction framework for explicitly modeling semantic-aware visual subspaces (i.e., visual superordinates). With only natural visual question answering data, our model first acquires the semantic hierarchy from a linguistic view and then explores mutually exclusive visual superordinates under the guidance of the linguistic hierarchy. In addition, quasi-center visual concept clustering and superordinate shortcut learning schemes are proposed to enhance the discrimination and independence of concepts within each visual superordinate. Experiments demonstrate the superiority of the proposed framework under diverse settings, increasing the overall answering accuracy relatively by 7.5% for reasoning with perturbations and 15.6% for compositional generalization tests.
Keywords: Concept learning; visual question answering; weakly-supervised learning; multi-modal learning; curriculum learning
20. Editorial for Special Issue on Large-scale Pre-training: Data, Models, and Fine-tuning
Authors: Ji-Rong Wen, Zi Huang, Hanwang Zhang. Machine Intelligence Research, EI CSCD, 2023, No. 2, pp. 145-146 (2 pages)
In recent years, there has been a surge of interest in and rapid development of large-scale pre-training due to the explosive growth of both data and model parameters. Large-scale training has achieved impressive performance milestones across a wide range of practical problems, including natural language processing, computer vision, recommendation systems, robotics, and basic research areas such as bioinformatics.
Keywords: computer; scale; parameters