Cells are the fundamental units of biological systems and exhibit unique developmental trajectories and molecular features. The question of how genomes orchestrate the formation and maintenance of each cell, and control the cellular phenotypes of various organisms, is both captivating and intricate. Since the inception of the first single-cell RNA technology, technologies related to single-cell sequencing have advanced rapidly. These technologies have expanded horizontally to include the single-cell genome, epigenome, proteome, and metabolome, while vertically they have progressed to integrate multiple omics data and incorporate additional information such as spatial scRNA-seq and CRISPR screening. Single-cell omics represents a groundbreaking advancement in the biomedical field, offering profound insights into complex diseases, including cancers. Here, we comprehensively summarize recent advances in single-cell omics technologies, with a specific focus on methodology. This overview aims to guide researchers in selecting appropriate methods for single-cell sequencing and related data analysis.
Multimodal sensor fusion can make full use of the advantages of various sensors, compensate for the shortcomings of a single sensor, achieve information verification or information security through information redundancy, and improve the reliability and safety of the system. Artificial intelligence (AI), referring to the simulation of human intelligence in machines that are programmed to think and learn like humans, represents a pivotal frontier in modern scientific research. With the continuous development and promotion of AI technology in the Sensor 4.0 age, multimodal sensor fusion is becoming increasingly intelligent and automated, and is expected to go further in the future. In this context, this review article takes a comprehensive look at recent progress on AI-enhanced multimodal sensors and their integrated devices and systems. Based on the concepts and principles of sensor technologies and AI algorithms, the theoretical underpinnings, technological breakthroughs, and practical applications of AI-enhanced multimodal sensors in fields such as robotics, healthcare, and environmental monitoring are highlighted. A comparative study of dual/tri-modal sensors with and without AI technologies (especially machine learning and deep learning) highlights the potential of AI to improve sensor performance, data processing, and decision-making capabilities. Furthermore, the review analyzes the challenges and opportunities afforded by AI-enhanced multimodal sensors and offers a prospective outlook on forthcoming advancements.
BACKGROUND Phelan-McDermid syndrome (PMS) is a rare genetic disorder characterized by intellectual disability, delayed language development, autism spectrum disorders, motor tone abnormalities, and a high risk of psychiatric symptoms, including bipolar disorder. CASE SUMMARY This report presents an 18-year clinical history of a 36-year-old woman with PMS, marked by intellectual disabilities, social withdrawal, and stereotyped behaviors. Diagnosed with bipolar disorder at the age of 18 years, she encountered significant treatment challenges, including severe adverse reactions to antipsychotic medications in 2022, which led to speech and functional regression. Through rehabilitation and comprehensive therapy, her condition gradually improved. In 2024, after further treatment, her symptoms stabilized, highlighting the complexities and successes of long-term management. CONCLUSION Effective management of PMS requires a thorough clinical history, genetic testing, and long-term supportive care.
BACKGROUND Gastric cancer (GC) is a prevalent tumor of the digestive system, with around one million new cases reported annually, ranking it as the third most common malignancy. Reducing pain is a key research focus. This study evaluates the effect of nalbuphine on analgesia and on the expression of pain factors in patients after radical resection. AIM To provide a reference for postoperative analgesia methods. METHODS One hundred eight patients with GC, admitted between January 2022 and June 2024, underwent radical gastrectomy. All received a patient-controlled analgesia pump and a transversus abdominis plane block and were divided into two groups of 54 patients each. The control group received sufentanil, while the observation group received nalbuphine as the analgesic. Postoperative analgesic effects, pain factor expression, and adverse effects were compared. RESULTS The resting pain and activity pain scores in the observation group at 6, 12, 24, and 48 hours were significantly lower than those in the control group. Additionally, the number of pump presses and the analgesic consumption of the observation group at 48 hours were lower than those of the control group, and the response rate of the observation group was higher than that of the control group (P<0.05). The prostaglandin E2, substance P, and serotonin levels 24 hours after surgery were lower in the observation group than in the control group, and the incidence of adverse reactions was 5.56%, lower than the 22.22% in the control group (P<0.05). CONCLUSION The findings suggest that nalbuphine enhances postoperative multimodal analgesia in patients undergoing radical resection for GC, effectively improving the postoperative analgesic effect, relieving postoperative resting and activity pain, and reducing postoperative pain factor expression, demonstrating its potential for clinical application.
BACKGROUND Pancreatic cancer involving the pancreatic neck and body often invades the retroperitoneal vessels, making radical resection challenging. Multimodal treatment strategies, including neoadjuvant therapy, surgery, and postoperative adjuvant therapy, are contributing to a paradigm shift in the treatment of pancreatic cancer. This strategy is also promising in the treatment of pancreatic neck-body cancer. AIM To evaluate the feasibility and effectiveness of a multimodal strategy for the treatment of borderline/locally advanced pancreatic neck-body cancer. METHODS Using a prospectively collected database of our hospital, we reviewed the demographic characteristics, neoadjuvant and adjuvant treatment data, intraoperative and postoperative variables, and follow-up outcomes of patients who underwent multimodal treatment for pancreatic neck-body cancer from January 2019 to December 2021. This investigation was reported in line with the Preferred Reporting of Case Series in Surgery criteria. RESULTS A total of 11 patients with pancreatic neck-body cancer were included in this study, of whom 6 were borderline resectable and 5 were locally advanced. Following multidisciplinary team discussion, all patients received neoadjuvant therapy, of whom 8 (73%) achieved a partial response and 3 maintained stable disease. After multidisciplinary team reassessment, all patients underwent laparoscopic subtotal distal pancreatectomy and portal vein reconstruction and achieved R0 resection. Postoperatively, two patients (18%) developed ascites, and two patients (18%) developed pancreatic fistulae. The median length of stay was 11 days (range: 10-15 days). All patients received postoperative adjuvant therapy. During follow-up, three patients experienced tumor recurrence, with a median disease-free survival time of 13.3 months and a median overall survival time of 20.5 months. CONCLUSION A multimodal treatment strategy combining neoadjuvant therapy, laparoscopic subtotal distal pancreatectomy, and adjuvant therapy is safe and feasible in patients with pancreatic neck-body cancer.
BACKGROUND Recent advancements in artificial intelligence (AI) have significantly enhanced the capabilities of endoscopic-assisted diagnosis for gastrointestinal diseases. AI has shown great promise in clinical practice, particularly for diagnostic support, offering real-time insights into complex conditions such as esophageal squamous cell carcinoma. CASE SUMMARY In this report, we introduce a multimodal AI system that successfully identified and delineated a small, flat carcinoma during esophagogastroduodenoscopy, highlighting its potential for early detection of malignancies. The lesion was confirmed as high-grade squamous intraepithelial neoplasia, with pathology results supporting the AI system's accuracy. The multimodal AI system offers an integrated solution that provides real-time, accurate diagnostic information directly within the endoscopic device interface, allowing single-monitor use without disrupting the endoscopist's workflow. CONCLUSION This work underscores the transformative potential of AI to enhance endoscopic diagnosis by enabling earlier, more accurate interventions.
Thunderstorm wind gusts are small in scale, typically occurring within a range of a few kilometers. It is extremely challenging to monitor and forecast thunderstorm wind gusts using only automatic weather stations. Therefore, it is necessary to establish thunderstorm wind gust identification techniques based on multisource high-resolution observations. This paper introduces a new algorithm, called the thunderstorm wind gust identification network (TGNet). It leverages multimodal feature fusion to combine the temporal and spatial features of thunderstorm wind gust events. The shapelet transform is first used to extract the temporal features of wind speeds from automatic weather stations, with the aim of distinguishing thunderstorm wind gusts from those caused by synoptic-scale systems or typhoons. Then, an encoder structured upon the U-shaped network (U-Net) and incorporating recurrent residual convolutional blocks (R2U-Net) is employed to extract the corresponding spatial convective characteristics of satellite, radar, and lightning observations. Finally, by using a multimodal deep fusion module based on multi-head cross-attention, the temporal features of wind speed at each automatic weather station are incorporated into the spatial features to obtain a classification of thunderstorm wind gusts every 10 minutes. TGNet products have high accuracy, with a critical success index reaching 0.77. Compared with those of U-Net and R2U-Net, the false alarm rate of TGNet products decreases by 31.28% and 24.15%, respectively. The new algorithm provides gridded products of thunderstorm wind gusts with a spatial resolution of 0.01°, updated every 10 minutes. The results are finer and more accurate, thereby helping to improve the accuracy of operational warnings for thunderstorm wind gusts.
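For readers unfamiliar with the verification scores quoted in the TGNet abstract, the following is a minimal sketch of how a critical success index is conventionally computed from binary predictions and observations. The function names and the toy labels are hypothetical and are not taken from the paper. The abstract does not define its "false alarm rate", so the sketch assumes the common false alarm ratio (false alarms divided by all predicted events).

```python
def contingency_counts(predicted, observed):
    """Count hits, misses, and false alarms for paired binary labels."""
    pairs = list(zip(predicted, observed))
    hits = sum(1 for p, o in pairs if p and o)
    misses = sum(1 for p, o in pairs if not p and o)
    false_alarms = sum(1 for p, o in pairs if p and not o)
    return hits, misses, false_alarms


def critical_success_index(hits, misses, false_alarms):
    """CSI = hits / (hits + misses + false alarms)."""
    denom = hits + misses + false_alarms
    return hits / denom if denom else float("nan")


def false_alarm_ratio(hits, false_alarms):
    """Assumed definition: false alarms / (hits + false alarms)."""
    denom = hits + false_alarms
    return false_alarms / denom if denom else float("nan")


# Hypothetical toy labels (1 = thunderstorm wind gust), for illustration only.
predicted = [1, 1, 0, 1, 0, 1, 0, 0]
observed = [1, 0, 0, 1, 1, 1, 0, 0]

h, m, f = contingency_counts(predicted, observed)
csi = critical_success_index(h, m, f)
far = false_alarm_ratio(h, f)
print(h, m, f, csi, far)
```

Correct negatives do not enter either score, which is why the critical success index is often preferred for rare events such as localized wind gusts.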
Humans can perceive our complex world through multi-sensory fusion. Under limited visual conditions, people can sense a variety of tactile signals to identify objects accurately and rapidly. However, replicating this unique capability in robots remains a significant challenge. Here, we present a new form of ultralight multifunctional tactile nano-layered carbon aerogel sensor that provides pressure, temperature, material recognition, and 3D location capabilities, which is combined with multimodal supervised learning algorithms for object recognition. The sensor exhibits human-like pressure (0.04–100 kPa) and temperature (21.5–66.2 °C) detection, millisecond response times (11 ms), a pressure sensitivity of 92.22 kPa⁻¹, and triboelectric durability of over 6000 cycles. The devised algorithm has universality and can accommodate a range of application scenarios. The tactile system can identify common foods in a kitchen scene with 94.63% accuracy and explore the topographic and geomorphic features of a Mars scene with 100% accuracy. This sensing approach empowers robots with versatile tactile perception to advance future society toward heightened sensing, recognition, and intelligence.
Modern medicine relies on various medical imaging technologies for non-invasively observing patients' anatomy. However, the interpretation of medical images can be highly subjective and dependent on the expertise of clinicians. Moreover, some potentially useful quantitative information in medical images, especially that which is not visible to the naked eye, is often ignored in clinical practice. In contrast, radiomics performs high-throughput feature extraction from medical images, which enables quantitative analysis of medical images and prediction of various clinical endpoints. Studies have reported that radiomics exhibits promising performance in diagnosis and in predicting treatment responses and prognosis, demonstrating its potential as a non-invasive auxiliary tool for personalized medicine. However, radiomics remains in a developmental phase, as numerous technical challenges have yet to be solved, especially in feature engineering and statistical modeling. In this review, we introduce the current utility of radiomics by summarizing research on its application in the diagnosis, prognosis, and prediction of treatment responses in patients with cancer. We focus on machine learning approaches for feature extraction and selection during feature engineering, and for imbalanced datasets and multi-modality fusion during statistical modeling. Furthermore, we discuss the stability, reproducibility, and interpretability of features, and the generalizability and interpretability of models. Finally, we offer possible solutions to current challenges in radiomics research.
As social networks become increasingly complex, contemporary fake news often includes textual descriptions of events accompanied by corresponding images or videos. Fake news in multiple modalities is more likely to create a misleading perception among users. While early research primarily focused on text-based features for fake news detection, there has been relatively limited exploration of learning shared representations in multimodal (text and visual) contexts. To address these limitations, this paper introduces a multimodal model for detecting fake news, which relies on similarity reasoning and adversarial networks. The model employs Bidirectional Encoder Representations from Transformers (BERT) and a Text Convolutional Neural Network (Text-CNN) for extracting textual features, while utilizing the pre-trained Visual Geometry Group 19-layer network (VGG-19) to extract visual features. Subsequently, the model establishes similarity representations between the textual features extracted by Text-CNN and the visual features through similarity learning and reasoning. Finally, these features are fused to enhance the accuracy of fake news detection, and adversarial networks are employed to investigate the relationship between fake news and events. This paper validates the proposed model using publicly available multimodal datasets from Weibo and Twitter. Experimental results demonstrate that the proposed approach achieves superior performance on Twitter, with an accuracy of 86%, surpassing traditional unimodal models and existing multimodal models. On the Weibo dataset, the model likewise surpasses the benchmark models across multiple metrics. The application of similarity reasoning and adversarial networks in multimodal fake news detection significantly enhances detection effectiveness. However, current research is limited to the fusion of only text and image modalities. Future research should aim to further integrate features from additional modalities to comprehensively represent the multifaceted information of fake news.
Since the 1950s, when the Turing Test was introduced, there has been notable progress in machine language intelligence. Language modeling, crucial for AI development, has evolved from statistical to neural models over the last two decades. Recently, transformer-based Pre-trained Language Models (PLMs) have excelled in Natural Language Processing (NLP) tasks by leveraging large-scale training corpora. Increasing the scale of these models enhances performance significantly, introducing abilities such as in-context learning that smaller models lack. The advancement in Large Language Models, exemplified by the development of ChatGPT, has made significant impacts both academically and industrially, capturing widespread societal interest. This survey provides an overview of the development and prospects from Large Language Models (LLMs) to Large Multimodal Models (LMMs). It first discusses the contributions and technological advancements of LLMs in the field of natural language processing, especially in text generation and language understanding. Then, it turns to LMMs, which integrate various data modalities such as text, images, and sound, demonstrating advanced capabilities in understanding and generating cross-modal content and paving new pathways for the adaptability and flexibility of AI systems. Finally, the survey highlights the prospects of LMMs in terms of technological development and application potential, while also pointing out challenges in data integration and cross-modal understanding accuracy, providing a comprehensive perspective on the latest developments in this field.
BACKGROUND Tuberous sclerosis complex (TSC) and primary lymphedema (PLE) are both rare diseases, and it is even rarer for both to occur in the same patient. In this work, we provide a detailed description of a patient's clinical presentation, imaging findings, and treatment. A retrospective analysis was also conducted on 14 published relevant case reports. CASE SUMMARY A 16-year-old male came to our hospital for treatment of right lower limb swelling. The swelling had been present since birth. The patient's memory had been progressively declining. Seizures had occurred 1 year prior at an unknown frequency. The patient was diagnosed with TSC combined with PLE through multimodal imaging examination: computed tomography, magnetic resonance imaging, and lymphoscintigraphy. The patient underwent liposuction. The swelling of the patient's right lower limb improved significantly after surgery. No seizures occurred after the patient began taking antiepileptic drugs and sirolimus. CONCLUSION TSC with PLE is a rare, systemic disease. Imaging can detect the lesions of this disease, which is important for diagnosis and treatment.
Optical endoscopy has become an essential diagnostic and therapeutic approach in modern biomedicine for directly observing organs and tissues deep inside the human body, enabling non-invasive, rapid diagnosis and treatment. Optical fiber endoscopy is highly competitive among various endoscopic imaging techniques due to its high flexibility, compact structure, excellent resolution, and resistance to electromagnetic interference. Over the past decade, endoscopes based on a single multimode optical fiber (MMF) have attracted widespread research interest due to their potential to significantly reduce the footprint of optical fiber endoscopes and enhance imaging capabilities. In comparison with other imaging principles of MMF endoscopes, the scanning imaging method based on the wavefront shaping technique is highly developed and provides benefits including excellent imaging contrast, broad applicability to complex imaging scenarios, and good compatibility with various well-established scanning imaging modalities. In this review, various technical routes to achieve light focusing through MMF and procedures to conduct the scanning imaging of MMF endoscopes are introduced. The advancements in imaging performance enhancements, integrations of various imaging modalities with MMF scanning endoscopes, and applications are summarized. Challenges specific to this endoscopic imaging technology are analyzed, and potential remedies and avenues for future developments are discussed.
To improve the integration of locomotion and operation, this paper presents an integrated leg-arm quadruped robot (ILQR) that has a reconfigurable joint. First, the reconfigurable joint is designed and assembled at the end of the leg-arm chain. When the robot performs a task, reconfiguration and mode switching can be achieved using this joint. In contrast to traditional quadruped robots, this robot can stack in a designated area to minimize the occupied volume in a nonworking state. Kinematic and dynamic models are established to evaluate the mechanical properties in multiple modes. The working modes of the robot are classified into deployable mode, locomotion mode, and operation mode. Based on the stability margin and mechanical modeling, switching between each pair of modes is analyzed and evaluated. Finally, the prototype experimental results verify the functions and the multimode switching stability, and provide a design method for integrating and performing multiple modes in quadruped robots with deployable characteristics.
BACKGROUND The incidence of early-onset pancreatic cancer (EOPC; age ≤50 years at diagnosis) is on the rise, placing a heavy burden on individuals, families, and society. The role of combination therapy including surgery, radiotherapy, and chemotherapy in non-metastatic EOPC is not well defined. AIM To investigate the treatment patterns and survival outcomes in patients with non-metastatic EOPC. METHODS A total of 277 patients with non-metastatic EOPC who were treated at our institution between 2017 and 2021 were investigated retrospectively. Overall survival (OS), disease-free survival, and progression-free survival were estimated using the Kaplan-Meier method. Univariate and multivariate analyses with the Cox proportional hazards model were used to identify prognostic factors. RESULTS With a median follow-up time of 34.6 months, the 1-year, 2-year, and 3-year OS rates for the entire cohort were 84.3%, 51.5%, and 27.6%, respectively. The median OS of patients with localized disease who received surgery alone and adjuvant therapy (AT) was 21.2 months and 28.8 months, respectively (P=0.007). The median OS of patients with locally advanced disease who received radiotherapy-based combination therapy (RCT), surgery after neoadjuvant therapy (NAT), and chemotherapy was 28.5 months, 25.6 months, and 14.0 months, respectively (P=0.002). The median OS after regional recurrence was 16.0 months, 13.4 months, and 8.9 months in the RCT, chemotherapy, and supportive therapy groups, respectively (P=0.035). Multivariate analysis demonstrated that carbohydrate antigen 19-9 level, pathological grade, T-stage, N-stage, and resection were independent prognostic factors for non-metastatic EOPC. CONCLUSION AT improves postoperative survival in patients with localized disease. Surgery after NAT and RCT are the preferred therapeutic options for patients with locally advanced EOPC.
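As background to the Kaplan-Meier method named in the EOPC abstract, the following is a minimal product-limit sketch in plain Python. The follow-up times and event flags are hypothetical toy values, not data from the study, and the function names are illustrative only. A real analysis would normally use an established survival analysis package.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times: follow-up durations (e.g., months)
    events: 1 if death was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored subjects both leave the risk set.
        at_risk -= removed
    return curve


def median_survival(curve):
    """First time at which the estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached


# Hypothetical toy cohort (months of follow-up, event indicator).
times = [6, 10, 10, 14, 20, 25, 30, 36]
events = [1, 1, 0, 1, 1, 0, 1, 0]

curve = kaplan_meier(times, events)
median = median_survival(curve)
print(curve, median)
```

Censored patients contribute to the risk set until their last follow-up but never trigger a drop in the curve, which is what distinguishes this estimator from a simple proportion of survivors.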
The fracture toughness of extruded Mg-1Zn-2Y (at.%) alloys, featuring a multimodal microstructure containing fine dynamically recrystallized (DRXed) grains with random crystallographic orientation and coarse worked grains with a strong fiber texture, was investigated. The DRXed grains comprised randomly oriented equiaxed α-Mg grains. In contrast, the worked grains included α-Mg and long-period stacking ordered (LPSO) phases that extended in the extrusion direction (ED). Both phases displayed a strong texture, with the ⟨10-10⟩ direction aligned parallel to the ED. The volume fractions of the DRXed and worked grains were controlled by adjusting the extrusion temperature. In the longitudinal-transverse (L-T) orientation, where the loading direction was aligned parallel to the ED, the conditional fracture toughness, K_Q, tended to increase as the volume fraction of the worked grains increased. However, the K_Q values in the T-L orientation, where the loading direction was perpendicular to the ED, decreased with an increase in the volume fraction of the worked grains. This suggests strong anisotropy, relative to the test direction, in the fracture toughness of specimens with a high volume fraction of worked grains. The worked grains, which included the LPSO phase and were elongated perpendicular to the initial crack plane, suppressed straight crack extension, causing crack deflection and generating secondary cracks. Thus, these worked grains contributed significantly to the fracture toughness of the extruded Mg-1Zn-2Y alloys in the L-T orientation.
The digital twin is a concept that transcends reality, providing reverse feedback from the real physical space to the virtual digital space. Expectations for this emerging technology are high. To upgrade the digital twin industrial chain, it is urgent to introduce more modalities, such as vision, haptics, hearing, and smell, into the virtual digital space, which helps physical entities and virtual objects form a closer connection. Therefore, perceptual understanding and object recognition have become pressing topics in digital twin research. Existing surface material classification schemes often achieve recognition through machine learning or deep learning in a single modality, ignoring the complementarity between multiple modalities. To overcome this dilemma, we propose a multimodal fusion network that combines two modalities, visual and haptic, for surface material recognition. On the one hand, the network makes full use of the potential correlations between multiple modalities to deeply mine the modal semantics and complete the data mapping. On the other hand, the network is extensible and can be used as a universal architecture to include more modalities. Experiments show that the constructed multimodal fusion network can achieve 99.42% classification accuracy while reducing complexity.
Visual Question Answering (VQA) is an interdisciplinary artificial intelligence (AI) task that integrates computer vision and natural language processing. Its purpose is to empower machines to respond to questions by utilizing visual information. A VQA system typically takes an image and a natural language query as input and produces a textual answer as output. One major obstacle in VQA is identifying a successful method to extract and merge textual and visual data. We examine "Fusion" models that use information from both the text encoder and the image encoder to efficiently perform the visual question-answering task. For the text encoder, we utilize the transformer models BERT and RoBERTa, which analyze textual data. The image encoder, designed for processing image data, utilizes ViT (Vision Transformer), DeiT (Data-efficient Image Transformer), and BEiT (Image Transformers). The reasoning module of VQA was updated and layer normalization was incorporated to improve performance. In comparison with the results of previous research, our proposed method shows a substantial improvement. Our experiments obtained 60.4% accuracy on the PathVQA dataset and 69.2% accuracy on the VizWiz dataset.
The attention mechanism has been a successful method for multimodal affective analysis in recent years. Despite the advances, several significant challenges remain in fusing language with its nonverbal context information. One is to generate sparse attention coefficients associated with the acoustic and visual modalities, which helps locate critical emotional semantics. The other is to fuse complementary cross-modal representations to construct optimal salient feature combinations of multiple modalities. A Conditional Transformer Fusion Network is proposed to handle these problems. Firstly, the authors equip the transformer module with CNN layers to enhance the detection of subtle signal patterns in nonverbal sequences. Secondly, sentiment words are utilised as context conditions to guide the computation of cross-modal attention. As a result, the located nonverbal features are not only salient but also directly complementary to the sentiment words. Experimental results show that the authors' method achieves state-of-the-art performance on several multimodal affective analysis datasets.
In multimodal multiobjective optimization problems (MMOPs), there are several Pareto optimal solutions corresponding to the identical objective vector. This paper proposes a new differential evolution algorithm to solve MMOPs with higher-dimensional decision variables. Due to the increase in the dimensions of decision variables in real-world MMOPs, it is difficult for current multimodal multiobjective optimization evolutionary algorithms (MMOEAs) to find multiple Pareto optimal solutions. The proposed algorithm adopts a dual-population framework and an improved environmental selection method. It utilizes a convergence archive to help the first population improve the quality of solutions. The improved environmental selection method enables the other population to search the remaining decision space and preserve more Pareto optimal solutions using the information of the first population. The combination of these two strategies helps to effectively balance and enhance convergence and diversity performance. In addition, to study the performance of the proposed algorithm, a novel set of multimodal multiobjective optimization test functions with extensible decision variables is designed. The proposed MMOEA is shown to be effective through comparison with six state-of-the-art MMOEAs on the test functions.
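The defining property of an MMOP stated in the abstract above, several Pareto optimal decision vectors sharing one objective vector, can be illustrated with a small sketch. The toy objective function and candidate set are hypothetical and chosen only so that x and -x map to the same objective vector; the brute-force filter shown here is not the paper's differential evolution algorithm.

```python
from collections import defaultdict


def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimization)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_sets(solutions, objective):
    """Group non-dominated decision vectors by their objective vector."""
    evaluated = [(tuple(x), tuple(objective(x))) for x in solutions]
    groups = defaultdict(list)
    for x, f in evaluated:
        if not any(dominates(g, f) for _, g in evaluated):
            groups[f].append(x)
    return dict(groups)


def toy_objective(x):
    """Hypothetical bi-objective function, symmetric in the sign of x[0]."""
    r = abs(x[0])
    return (r, (r - 2) ** 2)


candidates = [(-2,), (-1,), (0,), (1,), (2,), (3,)]
groups = pareto_sets(candidates, toy_objective)
for objective_vector, decision_vectors in sorted(groups.items()):
    print(objective_vector, decision_vectors)
```

An algorithm that keeps only one decision vector per objective vector would still cover this Pareto front but would lose half of the equivalent solutions, which is the diversity in decision space that MMOEAs aim to preserve.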
Funding: Supported by the National Natural Science Foundation of China (32130020, 32025009, 82030099, 30700397, 31970638, 61572361, 81973701, U23A20513, 32222026, 82373446); the National Key Research and Development Program of China (2021YFF1201200, 2021YFF1200900, 2022YFA1106000); the Shanghai Municipal Science and Technology Major Project (2017SHZDZX01); the Fundamental Research Funds for the Central Universities (20002150110, 22120230292); the Beihang University & Capital Medical University Plan (BHME-201904); the Special Fund of the Pediatric Medical Coordinated Development Center of Beijing Hospitals Authority (XTCX201809); the Cooperative Research Fund of the Affiliated Wuhu Hospital of East China Normal University (40500-20104-222400); the Fundamental Research Funds for the Central Universities (226-2024-00001); the Shanghai Municipal Science and Technology Commission "Science and Technology Innovation Action Plan" technical standard project (21DZ2201700); the Shanghai Municipal Science and Technology Commission "Science and Technology Innovation Action Plan" natural science foundation project (23ZR1435800); the Shanghai Natural Science Foundation Program (17ZR1449400); the Shanghai Artificial Intelligence Technology Standard Project (19DZ2200900); and the Open Research Fund of Key Laboratory of Advanced Theory and Application in Statistics and Data Science-MOE, ECNU, Key Laboratory of MEA, Ministry of Education, ECNU.
Abstract: Cells are the fundamental units of biological systems and exhibit unique development trajectories and molecular features. Our exploration of how genomes orchestrate the formation and maintenance of each cell, and control the cellular phenotypes of various organisms, is both captivating and intricate. Since the inception of the first single-cell RNA technology, technologies related to single-cell sequencing have advanced rapidly. These technologies have expanded horizontally to include the single-cell genome, epigenome, proteome, and metabolome, while vertically they have progressed to integrate multiple omics data and incorporate additional information such as spatial scRNA-seq and CRISPR screening. Single-cell omics represent a groundbreaking advancement in the biomedical field, offering profound insights into the understanding of complex diseases, including cancers. Here, we comprehensively summarize recent advances in single-cell omics technologies, with a specific focus on methodology. This overview aims to guide researchers in selecting appropriate methods for single-cell sequencing and related data analysis.
Funding: Supported by the National Natural Science Foundation of China (No. 62404111); the Natural Science Foundation of Jiangsu Province (No. BK20240635); the Natural Science Foundation of the Jiangsu Higher Education Institutions of China (No. 24KJB510025); the Natural Science Research Start-up Foundation of Recruiting Talents of Nanjing University of Posts and Telecommunications (No. NY223157 and NY223156); and the Opening Project of Advanced Integrated Circuit Package and Testing Research Center of Jiangsu Province (No. NTIKFJJ202303).
Abstract: Multimodal sensor fusion can make full use of the advantages of various sensors, make up for the shortcomings of a single sensor, achieve information verification or information security through information redundancy, and improve the reliability and safety of the system. Artificial intelligence (AI), referring to the simulation of human intelligence in machines that are programmed to think and learn like humans, represents a pivotal frontier in modern scientific research. With the continuous development and promotion of AI technology in the Sensor 4.0 age, multimodal sensor fusion is becoming more intelligent and automated, and is expected to go further in the future. In this context, this review article takes a comprehensive look at recent progress on AI-enhanced multimodal sensors and their integrated devices and systems. Based on the concepts and principles of sensor technologies and AI algorithms, the theoretical underpinnings, technological breakthroughs, and pragmatic applications of AI-enhanced multimodal sensors in fields such as robotics, healthcare, and environmental monitoring are highlighted. A comparative study of dual/tri-modal sensors with and without AI technologies (especially machine learning and deep learning) highlights the potential of AI to improve sensor performance, data processing, and decision-making capabilities. Furthermore, the review analyzes the challenges and opportunities afforded by AI-enhanced multimodal sensors, and offers a prospective outlook on forthcoming advancements.
Funding: Supported by the Zhejiang Province Medicine and Health Science and Technology Program, No. 2023KY980; and the Hangzhou Municipal Health and Family Planning Commission, No. A20220133.
Abstract: BACKGROUND Phelan-McDermid syndrome (PMS) is a rare genetic disorder characterized by intellectual disability, delayed language development, autism spectrum disorders, motor tone abnormalities, and a high risk of psychiatric symptoms, including bipolar disorder. CASE SUMMARY This report presents the 18-year clinical history of a 36-year-old woman with PMS, marked by intellectual disabilities, social withdrawal, and stereotyped behaviors. Diagnosed with bipolar disorder at the age of 18 years, she encountered significant treatment challenges, including severe adverse reactions to antipsychotic medications in 2022, which led to speech and functional regression. Through rehabilitation and comprehensive therapy, her condition gradually improved. In 2024, after further treatment, her symptoms stabilized, highlighting the complexities and successes of long-term management. CONCLUSION Effective management of PMS requires a thorough clinical history, genetic testing, and long-term supportive care.
Abstract: BACKGROUND Gastric cancer (GC) is a prevalent tumor of the digestive system, with around one million new cases reported annually, ranking it as the third most common malignancy. Reducing pain is a key research focus. This study evaluates the effect of nalbuphine on analgesia and on the expression of pain factors in patients after radical resection. AIM To provide a reference for postoperative analgesia methods. METHODS One hundred eight patients with GC, admitted between January 2022 and June 2024, underwent radical gastrectomy. They received a controlled analgesia pump and a transversus abdominis plane block, and were divided into two groups of 54 patients each. The control group received sufentanil, while the observation group received nalbuphine as an analgesic. Postoperative analgesic effects, pain factor expression, and adverse effects were compared. RESULTS The resting pain and activity pain scores in the observation group at 6, 12, 24, and 48 hours were significantly lower than those in the control group. Additionally, the number of pump presses and the analgesic consumption in the observation group at 48 hours were lower than those of the control group, and the response rate of the observation group was higher than that of the control group (P<0.05). The prostaglandin E2, substance P, and serotonin levels at 24 hours in the observation group were lower than those in the control group, and the incidence of adverse reactions was 5.56%, lower than the 22.22% in the control group (P<0.05). CONCLUSION The findings suggest that nalbuphine enhances postoperative multimodal analgesia in patients undergoing radical resection for GC, effectively improving the postoperative analgesic effect, relieving postoperative resting and activity pain, and reducing postoperative pain factor expression, demonstrating its potential for clinical application.
Funding: Supported by the Hunan Province Clinical Medical Technology Innovation Guidance Project, No. 2020SK50912; the Annual Scientific Research Plan Project of Hunan Provincial Health Commission, No. C2019057; and the Hunan Provincial Natural Science Foundation of China, No. 2023JJ40381.
Abstract: BACKGROUND Pancreatic cancer involving the pancreas neck and body often invades the retroperitoneal vessels, making its radical resection challenging. Multimodal treatment strategies, including neoadjuvant therapy, surgery, and postoperative adjuvant therapy, are contributing to a paradigm shift in the treatment of pancreatic cancer. This strategy is also promising in the treatment of pancreatic neck-body cancer. AIM To evaluate the feasibility and effectiveness of a multimodal strategy for the treatment of borderline/locally advanced pancreatic neck-body cancer. METHODS From January 2019 to December 2021, we reviewed the demographic characteristics, neoadjuvant and adjuvant treatment data, intraoperative and postoperative variables, and follow-up outcomes of patients who underwent multimodal treatment for pancreatic neck-body cancer in a prospectively collected database of our hospital. This investigation was reported in line with the Preferred Reporting of Case Series in Surgery criteria. RESULTS A total of 11 patients with pancreatic neck-body cancer were included in this study, of whom 6 were borderline resectable and 5 were locally advanced. Through multidisciplinary team discussion, all patients received neoadjuvant therapy, of whom 8 (73%) achieved a partial response and 3 maintained stable disease. After multidisciplinary team reassessment, all patients underwent laparoscopic subtotal distal pancreatectomy and portal vein reconstruction and achieved R0 resection. Postoperatively, two patients (18%) developed ascites, and two patients (18%) developed pancreatic fistulae. The median length of stay was 11 days (range: 10-15 days). All patients received postoperative adjuvant therapy. During follow-up, three patients experienced tumor recurrence, with a median disease-free survival time of 13.3 months and a median overall survival time of 20.5 months. CONCLUSION A multimodal treatment strategy combining neoadjuvant therapy, laparoscopic subtotal distal pancreatectomy, and adjuvant therapy is safe and feasible in patients with pancreatic neck-body cancer.
Funding: Supported by the 135 High-end Talent Project of West China Hospital, Sichuan University, No. ZYDG23029.
Abstract: BACKGROUND Recent advancements in artificial intelligence (AI) have significantly enhanced the capabilities of endoscopic-assisted diagnosis for gastrointestinal diseases. AI has shown great promise in clinical practice, particularly for diagnostic support, offering real-time insights into complex conditions such as esophageal squamous cell carcinoma. CASE SUMMARY In this study, we introduce a multimodal AI system that successfully identified and delineated a small, flat carcinoma during esophagogastroduodenoscopy, highlighting its potential for early detection of malignancies. The lesion was confirmed as high-grade squamous intraepithelial neoplasia, with pathology results supporting the AI system's accuracy. The multimodal AI system offers an integrated solution that provides real-time, accurate diagnostic information directly within the endoscopic device interface, allowing single-monitor use without disrupting the endoscopist's workflow. CONCLUSION This work underscores the transformative potential of AI to enhance endoscopic diagnosis by enabling earlier, more accurate interventions.
Funding: Supported by the National Key Research and Development Program of China (Grant No. 2022YFC3004104); the National Natural Science Foundation of China (Grant No. U2342204); the Innovation and Development Program of the China Meteorological Administration (Grant No. CXFZ2024J001); the Open Research Project of the Key Open Laboratory of Hydrology and Meteorology of the China Meteorological Administration (Grant No. 23SWQXZ010); the Science and Technology Plan Project of Zhejiang Province (Grant No. 2022C03150); the Open Research Fund Project of Anyang National Climate Observatory (Grant No. AYNCOF202401); and the Open Bidding for Selecting the Best Candidates Program (Grant No. CMAJBGS202318).
Abstract: Thunderstorm wind gusts are small in scale, typically occurring within a range of a few kilometers. It is extremely challenging to monitor and forecast thunderstorm wind gusts using only automatic weather stations. Therefore, it is necessary to establish thunderstorm wind gust identification techniques based on multisource high-resolution observations. This paper introduces a new algorithm, called the thunderstorm wind gust identification network (TGNet). It leverages multimodal feature fusion to fuse the temporal and spatial features of thunderstorm wind gust events. The shapelet transform is first used to extract the temporal features of wind speeds from automatic weather stations, with the aim of distinguishing thunderstorm wind gusts from those caused by synoptic-scale systems or typhoons. Then, the encoder, structured upon the U-shaped network (U-Net) and incorporating recurrent residual convolutional blocks (R2U-Net), is employed to extract the corresponding spatial convective characteristics of satellite, radar, and lightning observations. Finally, by using the multimodal deep fusion module based on multi-head cross-attention, the temporal features of wind speed at each automatic weather station are incorporated into the spatial features to obtain a classification of thunderstorm wind gusts every 10 minutes. TGNet products have high accuracy, with a critical success index reaching 0.77. Compared with those of U-Net and R2U-Net, the false alarm rate of TGNet products decreases by 31.28% and 24.15%, respectively. The new algorithm provides grid products of thunderstorm wind gusts with a spatial resolution of 0.01°, updated every 10 minutes. The results are finer and more accurate, thereby helping to improve the accuracy of operational warnings for thunderstorm wind gusts.
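The critical success index quoted above is a standard forecast-verification score: hits divided by the sum of hits, misses, and false alarms. The following is a minimal sketch of how it could be computed from paired binary labels. The function names and the sample labels are illustrative assumptions and are not taken from the TGNet paper.

```python
def contingency_counts(observed, predicted):
    """Count hits, misses, and false alarms from paired binary labels.

    `observed` and `predicted` are equal-length sequences of truthy/falsy
    values (hypothetical per-grid-cell gust / no-gust labels).
    """
    hits = sum(1 for o, p in zip(observed, predicted) if o and p)
    misses = sum(1 for o, p in zip(observed, predicted) if o and not p)
    false_alarms = sum(1 for o, p in zip(observed, predicted) if not o and p)
    return hits, misses, false_alarms


def critical_success_index(hits, misses, false_alarms):
    """CSI = hits / (hits + misses + false alarms)."""
    total = hits + misses + false_alarms
    if total == 0:
        raise ValueError("CSI is undefined when there are no events or alarms")
    return hits / total


# Illustrative labels only: 1 = gust observed / predicted, 0 = no gust.
observed = [1, 1, 0, 0, 1]
predicted = [1, 0, 1, 0, 1]
csi = critical_success_index(*contingency_counts(observed, predicted))
```

Correct negatives do not enter the score, which is why CSI is often preferred for rare events such as thunderstorm gusts.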
Funding: The National Natural Science Foundation of China (Grant No. 52072041); the Beijing Natural Science Foundation (Grant No. JQ21007); the University of Chinese Academy of Sciences (Grant No. Y8540XX2D2); the Robotics Rhino-Bird Focused Research Project (No. 2020-01-002); and the Tencent Robotics X Laboratory.
Abstract: Humans can perceive our complex world through multi-sensory fusion. Under limited visual conditions, people can sense a variety of tactile signals to identify objects accurately and rapidly. However, replicating this unique capability in robots remains a significant challenge. Here, we present a new form of ultralight multifunctional tactile nano-layered carbon aerogel sensor that provides pressure, temperature, material recognition, and 3D location capabilities, and is combined with multimodal supervised learning algorithms for object recognition. The sensor exhibits human-like pressure (0.04–100 kPa) and temperature (21.5–66.2 °C) detection, millisecond response times (11 ms), a pressure sensitivity of 92.22 kPa⁻¹, and triboelectric durability of over 6000 cycles. The devised algorithm has universality and can accommodate a range of application scenarios. The tactile system can identify common foods in a kitchen scene with 94.63% accuracy and explore the topographic and geomorphic features of a Mars scene with 100% accuracy. This sensing approach empowers robots with versatile tactile perception to advance future society toward heightened sensing, recognition, and intelligence.
Funding: Supported in part by the National Natural Science Foundation of China (82072019); the Shenzhen Basic Research Program (JCYJ20210324130209023); the Shenzhen-Hong Kong-Macao S&T Program (Category C) (SGDX20201103095002019); the Mainland-Hong Kong Joint Funding Scheme (MHKJFS) (MHP/005/20), the Project of Strategic Importance Fund (P0035421), and the Projects of RISA (P0043001) from the Hong Kong Polytechnic University; the Natural Science Foundation of Jiangsu Province (BK20201441); the Provincial and Ministry Co-constructed Project of Henan Province Medical Science and Technology Research (SBGJ202103038, SBGJ202102056); the Henan Province Key R&D and Promotion Project (Science and Technology Research) (222102310015); the Natural Science Foundation of Henan Province (222300420575); and the Henan Province Science and Technology Research (222102310322).
Abstract: Modern medicine is reliant on various medical imaging technologies for non-invasively observing patients' anatomy. However, the interpretation of medical images can be highly subjective and dependent on the expertise of clinicians. Moreover, some potentially useful quantitative information in medical images, especially that which is not visible to the naked eye, is often ignored during clinical practice. In contrast, radiomics performs high-throughput feature extraction from medical images, which enables quantitative analysis of medical images and prediction of various clinical endpoints. Studies have reported that radiomics exhibits promising performance in diagnosis and in predicting treatment responses and prognosis, demonstrating its potential to be a non-invasive auxiliary tool for personalized medicine. However, radiomics remains in a developmental phase, as numerous technical challenges have yet to be solved, especially in feature engineering and statistical modeling. In this review, we introduce the current utility of radiomics by summarizing research on its application in the diagnosis, prognosis, and prediction of treatment responses in patients with cancer. We focus on machine learning approaches for feature extraction and selection during feature engineering, and for imbalanced datasets and multi-modality fusion during statistical modeling. Furthermore, we introduce the stability, reproducibility, and interpretability of features, and the generalizability and interpretability of models. Finally, we offer possible solutions to current challenges in radiomics research.
Funding: The National Natural Science Foundation of China (No. 62302540), with author F.F.S.; for more information, see https://www.nsfc.gov.cn/. The work is also funded by the Open Foundation of Henan Key Laboratory of Cyberspace Situation Awareness (No. HNTS2022020), where F.F.S. is an author; further details can be found at http://xt.hnkjt.gov.cn/data/pingtai/. The research is also supported by the Natural Science Foundation of Henan Province Youth Science Fund Project (No. 232300420422); for more information, see https://kjt.henan.gov.cn/2022/09-02/2599082.html. Lastly, it receives funding from the Natural Science Foundation of Zhongyuan University of Technology (No. K2023QN018), where F.F.S. is an author; more information is available at https://www.zut.edu.cn/.
Abstract: As social networks become increasingly complex, contemporary fake news often includes textual descriptions of events accompanied by corresponding images or videos. Fake news in multiple modalities is more likely to create a misleading perception among users. While early research primarily focused on text-based features for fake news detection, there has been relatively limited exploration of learning shared representations in multimodal (text and visual) contexts. To address these limitations, this paper introduces a multimodal model for detecting fake news, which relies on similarity reasoning and adversarial networks. The model employs Bidirectional Encoder Representation from Transformers (BERT) and Text Convolutional Neural Network (Text-CNN) for extracting textual features while utilizing the pre-trained Visual Geometry Group 19-layer (VGG-19) network to extract visual features. Subsequently, the model establishes similarity representations between the textual features extracted by Text-CNN and the visual features through similarity learning and reasoning. Finally, these features are fused to enhance the accuracy of fake news detection, and adversarial networks are employed to investigate the relationship between fake news and events. This paper validates the proposed model using publicly available multimodal datasets from Weibo and Twitter. Experimental results demonstrate that the proposed approach achieves superior performance on Twitter, with an accuracy of 86%, surpassing traditional unimodal models and existing multimodal models. On the Weibo dataset, the model also surpasses the benchmark models across multiple metrics. The application of similarity reasoning and adversarial networks significantly enhances the effectiveness of multimodal fake news detection. However, current research is limited to the fusion of only text and image modalities. Future research should aim to further integrate features from additional modalities to comprehensively represent the multifaceted information of fake news.
Funding: We acknowledge funding from NSFC Grant 62306283.
Abstract: Since the 1950s, when the Turing Test was introduced, there has been notable progress in machine language intelligence. Language modeling, crucial for AI development, has evolved from statistical to neural models over the last two decades. Recently, transformer-based Pre-trained Language Models (PLM) have excelled in Natural Language Processing (NLP) tasks by leveraging large-scale training corpora. Increasing the scale of these models enhances performance significantly, introducing abilities like context learning that smaller models lack. The advancement in Large Language Models, exemplified by the development of ChatGPT, has made significant impacts both academically and industrially, capturing widespread societal interest. This survey provides an overview of the development and prospects from Large Language Models (LLM) to Large Multimodal Models (LMM). It first discusses the contributions and technological advancements of LLMs in the field of natural language processing, especially in text generation and language understanding. Then, it turns to LMMs, which integrate various data modalities such as text, images, and sound, demonstrating advanced capabilities in understanding and generating cross-modal content and paving new pathways for the adaptability and flexibility of AI systems. Finally, the survey highlights the prospects of LMMs in terms of technological development and application potential, while also pointing out challenges in data integration and cross-modal understanding accuracy, providing a comprehensive perspective on the latest developments in this field.
Funding: Supported by the National Natural Science Foundation of China, No. 61876216.
Abstract: BACKGROUND Tuberous sclerosis complex (TSC) and primary lymphedema (PLE) are both rare diseases, and it is even rarer for both to occur in the same patient. In this work, we provide a detailed description of a patient's clinical presentation, imaging findings, and treatment. A retrospective analysis was also conducted on 14 published relevant case reports. CASE SUMMARY A 16-year-old male came to our hospital for treatment due to right lower limb swelling. The swelling had been present since birth. The patient's memory had been progressively declining. Seizures had occurred 1 year prior at an unknown frequency. The patient was diagnosed with TSC combined with PLE through multimodal imaging examination: computed tomography, magnetic resonance imaging, and lymphoscintigraphy. The patient underwent liposuction. The swelling of the patient's right lower limb significantly improved after surgery. No seizures occurred after the patient began taking antiepileptic drugs and sirolimus. CONCLUSION TSC with PLE is a rare and systemic disease. Imaging can detect lesions of this disease, which is important for diagnosis and treatment.
Funding: Supported by the National Natural Science Foundation of China (62135007 and 61925502).
Abstract: Optical endoscopy has become an essential diagnostic and therapeutic approach in modern biomedicine for directly observing organs and tissues deep inside the human body, enabling non-invasive, rapid diagnosis and treatment. Optical fiber endoscopy is highly competitive among various endoscopic imaging techniques due to its high flexibility, compact structure, excellent resolution, and resistance to electromagnetic interference. Over the past decade, endoscopes based on a single multimode optical fiber (MMF) have attracted widespread research interest due to their potential to significantly reduce the footprint of optical fiber endoscopes and enhance imaging capabilities. In comparison with other imaging principles of MMF endoscopes, the scanning imaging method based on the wavefront shaping technique is highly developed and provides benefits including excellent imaging contrast, broad applicability to complex imaging scenarios, and good compatibility with various well-established scanning imaging modalities. In this review, various technical routes to achieve light focusing through MMF and procedures to conduct the scanning imaging of MMF endoscopes are introduced. The advancements in imaging performance enhancements, integrations of various imaging modalities with MMF scanning endoscopes, and applications are summarized. Challenges specific to this endoscopic imaging technology are analyzed, and potential remedies and avenues for future developments are discussed.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 52375003, 52205006) and the National Key R&D Program of China (Grant No. 2019YFB1309600).
Abstract: To improve locomotion and operation integration, this paper presents an integrated leg-arm quadruped robot (ILQR) that has a reconfigurable joint. First, the reconfigurable joint is designed and assembled at the end of the leg-arm chain. When the robot performs a task, reconfiguration and mode switching can be achieved using this joint. In contrast to traditional quadruped robots, this robot can stack in a designated area to optimize the occupied volume in a nonworking state. Kinematics and dynamics models are established to evaluate the mechanical properties in multiple modes. All working modes of the robot are classified as deployable mode, locomotion mode, and operation mode. Based on the stability margin and mechanical modeling, switching between each pair of modes is analyzed and evaluated. Finally, the prototype experimental results verify the function realization and switching stability of the multiple modes and provide a design method to integrate and perform multiple modes for quadruped robots with deployable characteristics.
Abstract: BACKGROUND The incidence of early-onset pancreatic cancer (EOPC; age ≤ 50 years at diagnosis) is on the rise, placing a heavy burden on individuals, families, and society. The role of combination therapy including surgery, radiotherapy, and chemotherapy in non-metastatic EOPC is not well defined. AIM To investigate the treatment patterns and survival outcomes in patients with non-metastatic EOPC. METHODS A total of 277 patients with non-metastatic EOPC who were treated at our institution between 2017 and 2021 were investigated retrospectively. Overall survival (OS), disease-free survival, and progression-free survival were estimated using the Kaplan-Meier method. Univariate and multivariate analyses with the Cox proportional hazards model were used to identify prognostic factors. RESULTS With a median follow-up time of 34.6 months, the 1-year, 2-year, and 3-year OS rates for the entire cohort were 84.3%, 51.5%, and 27.6%, respectively. The median OS of patients with localized disease who received surgery alone and adjuvant therapy (AT) were 21.2 months and 28.8 months, respectively (P=0.007). The median OS of patients with locally advanced disease who received radiotherapy-based combination therapy (RCT), surgery after neoadjuvant therapy (NAT), and chemotherapy were 28.5 months, 25.6 months, and 14.0 months, respectively (P=0.002). The median OS after regional recurrence were 16.0 months, 13.4 months, and 8.9 months in the RCT, chemotherapy, and supportive therapy groups, respectively (P=0.035). Multivariate analysis demonstrated that carbohydrate antigen 19-9 level, pathological grade, T-stage, N-stage, and resection were independent prognostic factors for non-metastatic EOPC. CONCLUSION AT improves postoperative survival in patients with localized disease. Surgery after NAT and RCT are the preferred therapeutic options for patients with locally advanced EOPC.
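The Kaplan-Meier method named in the abstract above estimates survival as a running product over event times: at each time with at least one death, the current survival is multiplied by (1 - deaths / number at risk), and censored patients simply leave the risk set. The sketch below illustrates that product-limit rule in plain Python. The function name and the follow-up data are hypothetical and are not taken from the study.

```python
def kaplan_meier(durations, events):
    """Product-limit survival estimate.

    durations: follow-up time for each patient.
    events: 1 if the event (death) was observed at that time, 0 if censored.
    Returns a list of (time, survival_probability) at each distinct event time.
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        at_risk -= leaving
    return curve


# Hypothetical follow-up times in months; 0 marks a censored observation.
curve = kaplan_meier([5, 8, 8, 12, 20], [1, 1, 0, 1, 0])
```

In practice an established survival-analysis package would normally be used, since it also provides confidence intervals and log-rank tests; the sketch only shows the estimator itself.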
Funding: Supported by the JST CREST for Research Area "Nanomechanics" [JPMJCR2094]; the JSPS KAKENHI for Scientific Research B [JP21H01673]; and the AMADA Foundation [AF-2023044-C2].
Abstract: The fracture toughness of extruded Mg-1Zn-2Y (at.%) alloys, featuring a multimodal microstructure containing fine dynamically recrystallized (DRXed) grains with random crystallographic orientation and coarse worked grains with a strong fiber texture, was investigated. The DRXed grains comprised randomly oriented equiaxed α-Mg grains. In contrast, the worked grains included α-Mg and long-period stacking ordered (LPSO) phases that extended in the extrusion direction (ED). Both types displayed a strong texture, aligning the (10.10) direction parallel to the ED. The volume fractions of the DRXed and worked grains were controlled by adjusting the extrusion temperature. In the longitudinal-transverse (L-T) orientation, where the loading direction was aligned parallel to the ED, the conditional fracture toughness, KQ, tended to increase as the volume fraction of the worked grains increased. However, the KQ values in the T-L orientation, where the loading direction was perpendicular to the ED, decreased with an increase in the volume fraction of the worked grains. This suggests strong anisotropy, relative to the test direction, in the fracture toughness of specimens with a high volume fraction of worked grains. The worked grains, which included the LPSO phase and were elongated perpendicular to the initial crack plane, suppressed straight crack extension, causing crack deflection and generating secondary cracks. Thus, these worked grains significantly contributed to the fracture toughness of the extruded Mg-1Zn-2Y alloys in the L-T orientation.
Funding: The National Natural Science Foundation of China (62001246, 62001248, 62171232); the Key R&D Program of Jiangsu Province Key project and topics under Grant BE2021095; the Natural Science Foundation of Jiangsu Province Higher Education Institutions (20KJB510020); the Future Network Scientific Research Fund Project (FNSRFP-2021-YB-16); the open research fund of Key Lab of Broadband Wireless Communication and Sensor Network Technology (JZNY202110); and the NUPTSF under Grant (NY220070).
Abstract: The digital twin is a concept that transcends reality, providing reverse feedback from the real physical space to the virtual digital space. People hold great expectations for this emerging technology. To upgrade the digital twin industrial chain, it is urgent to introduce more modalities, such as vision, haptics, hearing, and smell, into the virtual digital space, which helps physical entities and virtual objects form a closer connection. Therefore, perceptual understanding and object recognition have become a pressing topic in the digital twin. Existing surface material classification schemes often achieve recognition through machine learning or deep learning in a single modality, ignoring the complementarity between multiple modalities. To overcome this dilemma, we propose a multimodal fusion network that combines two modalities, visual and haptic, for surface material recognition. On the one hand, the network makes full use of the potential correlations between multiple modalities to deeply mine the modal semantics and complete the data mapping. On the other hand, the network is extensible and can be used as a universal architecture to include more modalities. Experiments show that the constructed multimodal fusion network can achieve 99.42% classification accuracy while reducing complexity.
Funding: Supported by the National Science and Technology Council, Taiwan (Grant number: NSTC 111-2637-H-324-001-).
Abstract: Visual Question Answering (VQA) is an interdisciplinary artificial intelligence (AI) activity that integrates computer vision and natural language processing. Its purpose is to empower machines to respond to questions by utilizing visual information. A VQA system typically takes an image and a natural language query as input and produces a textual answer as output. One major obstacle in VQA is identifying a successful method to extract and merge textual and visual data. We examine "Fusion" models that use information from both the text encoder and the image encoder to efficiently perform the visual question-answering task. For the text encoder, we utilize the transformer models BERT and RoBERTa, which analyze textual data. The image encoder designed for processing image data utilizes ViT (Vision Transformer), DeiT (Data-efficient Image Transformer), and BEiT (Image Transformers). The reasoning module of VQA was updated and layer normalization was incorporated to enhance the performance of our approach. In comparison to the results of previous research, our proposed method shows a substantial enhancement in efficacy. Our experiment obtained 60.4% accuracy on the PathVQA dataset and 69.2% accuracy on the VizWiz dataset.
Funding: National Key Research and Development Plan of China, Grant/Award Number: 2021YFB3600503; National Natural Science Foundation of China, Grant/Award Numbers: 62276065, U21A20472.
Abstract: The attention mechanism has been a successful method for multimodal affective analysis in recent years. Despite the advances, several significant challenges remain in fusing language with its nonverbal context information. One is to generate sparse attention coefficients associated with the acoustic and visual modalities, which helps locate critical emotional semantics. The other is to fuse complementary cross-modal representations to construct optimal salient feature combinations of multiple modalities. A Conditional Transformer Fusion Network is proposed to handle these problems. Firstly, the authors equip the transformer module with CNN layers to enhance the detection of subtle signal patterns in nonverbal sequences. Secondly, sentiment words are utilised as context conditions to guide the computation of cross-modal attention. As a result, the located nonverbal features are not only salient but also directly complementary to the sentiment words. Experimental results show that the authors' method achieves state-of-the-art performance on several multimodal affective analysis datasets.
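The idea of letting a sentiment word guide attention over nonverbal frames can be sketched as follows. This is not the authors' Conditional Transformer Fusion Network: it is plain scaled dot-product attention with a dense softmax, used here as a stand-in because the abstract gives no formulas, and it does not produce the sparse coefficients the abstract mentions. The function name `cross_modal_attention` and all input values are hypothetical.

```python
import math


def cross_modal_attention(query, keys, values):
    """Scaled dot-product attention: one query (a sentiment-word embedding)
    attends over a sequence of nonverbal frames (keys and values)."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]

    # Numerically stable softmax over the frame scores.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of the value vectors gives the attended context.
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, context


# Toy inputs: one sentiment-word query and three acoustic/visual frames.
sentiment_word = [1.0, 0.0]
frame_keys = [[0.1, 0.9], [2.0, 0.0], [0.0, 0.2]]
frame_values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

weights, context = cross_modal_attention(sentiment_word, frame_keys, frame_values)
print(weights.index(max(weights)))  # index of the frame that best matches the word
```

The frame whose key aligns most with the sentiment-word query receives the largest weight, which is the sense in which the word acts as a condition on where attention lands.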
Funding: Supported in part by the National Natural Science Foundation of China (62106230, U23A20340, 62376253, 62176238); the China Postdoctoral Science Foundation (2023M743185); and the Key Laboratory of Big Data Intelligent Computing, Chongqing University of Posts and Telecommunications Open Foundation (BDIC-2023-A-007).
Abstract: In multimodal multiobjective optimization problems (MMOPs), several Pareto optimal solutions correspond to the identical objective vector. This paper proposes a new differential evolution algorithm to solve MMOPs with higher-dimensional decision variables. As the dimension of the decision variables in real-world MMOPs increases, it becomes difficult for current multimodal multiobjective optimization evolutionary algorithms (MMOEAs) to find multiple Pareto optimal solutions. The proposed algorithm adopts a dual-population framework and an improved environmental selection method. It uses a convergence archive to help the first population improve the quality of its solutions. The improved environmental selection method enables the other population to search the remaining decision space and retain more Pareto optimal solutions by using information from the first population. Combining these two strategies helps to balance and enhance convergence and diversity performance effectively. In addition, to study the performance of the proposed algorithm, a novel set of multimodal multiobjective optimization test functions with extensible decision variables is designed. The proposed MMOEA is shown to be effective through comparison with six state-of-the-art MMOEAs on the test functions.
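The defining property of an MMOP, that distinct decision vectors map to the same objective vector, can be seen in a toy one-dimensional problem. The problem below is an assumption made purely for illustration: it is not one of the test functions designed in the paper, and the helper names are hypothetical. It minimizes f1(x) = |x| and f2(x) = (|x| - 1)^2, so x and -x always share one objective vector.

```python
from collections import defaultdict


def objectives(x):
    """Toy bi-objective function (both minimized); x and -x coincide."""
    magnitude = abs(x)
    return (magnitude, (magnitude - 1.0) ** 2)


def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimization)."""
    return all(p <= q for p, q in zip(a, b)) and any(p < q for p, q in zip(a, b))


def pareto_optimal(candidates):
    """Keep every candidate whose objective vector no other candidate dominates."""
    evaluated = [(x, objectives(x)) for x in candidates]
    return [x for x, fx in evaluated
            if not any(dominates(fy, fx) for _, fy in evaluated)]


def group_by_objective(solutions):
    """Group decision values that share an identical objective vector."""
    groups = defaultdict(list)
    for x in solutions:
        groups[objectives(x)].append(x)
    return dict(groups)


candidates = [i / 4 for i in range(-4, 5)]  # -1.0, -0.75, ..., 1.0
pareto_set = pareto_optimal(candidates)
groups = group_by_objective(pareto_set)

for objective_vector, decision_values in groups.items():
    print(objective_vector, decision_values)
```

An algorithm that compares solutions only in objective space sees x and -x as duplicates and may discard one of them. Retaining both requires diversity maintenance in decision space, which is the difficulty the abstract attributes to MMOPs.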