Funding: Supported in part by the National Natural Science Foundation of China (82072019), the Shenzhen Basic Research Program (JCYJ20210324130209023), the Shenzhen-Hong Kong-Macao S&T Program (Category C) (SGDX20201103095002019), the Mainland-Hong Kong Joint Funding Scheme (MHKJFS) (MHP/005/20), the Project of Strategic Importance Fund (P0035421), the Projects of RISA (P0043001) from the Hong Kong Polytechnic University, the Natural Science Foundation of Jiangsu Province (BK20201441), the Provincial and Ministry Co-constructed Project of Henan Province Medical Science and Technology Research (SBGJ202103038, SBGJ202102056), the Henan Province Key R&D and Promotion Project (Science and Technology Research) (222102310015), the Natural Science Foundation of Henan Province (222300420575), and the Henan Province Science and Technology Research (222102310322).
Abstract: Modern medicine relies on various medical imaging technologies for non-invasively observing patients’ anatomy. However, the interpretation of medical images can be highly subjective and dependent on the expertise of clinicians. Moreover, some potentially useful quantitative information in medical images, especially that which is not visible to the naked eye, is often ignored in clinical practice. In contrast, radiomics performs high-throughput feature extraction from medical images, which enables quantitative analysis of the images and prediction of various clinical endpoints. Studies have reported that radiomics performs promisingly in diagnosis and in predicting treatment responses and prognosis, demonstrating its potential as a non-invasive auxiliary tool for personalized medicine. However, radiomics remains in a developmental phase because numerous technical challenges have yet to be solved, especially in feature engineering and statistical modeling. In this review, we introduce the current utility of radiomics by summarizing research on its application in the diagnosis, prognosis, and prediction of treatment responses in patients with cancer. We focus on machine learning approaches for feature extraction and selection during feature engineering, and for imbalanced datasets and multi-modality fusion during statistical modeling. Furthermore, we introduce the stability, reproducibility, and interpretability of features, and the generalizability and interpretability of models. Finally, we offer possible solutions to current challenges in radiomics research.
Funding: We acknowledge funding from NSFC Grant 62306283.
Abstract: Since the 1950s, when the Turing Test was introduced, there has been notable progress in machine language intelligence. Language modeling, crucial for AI development, has evolved from statistical to neural models over the last two decades. Recently, transformer-based Pre-trained Language Models (PLMs) have excelled in Natural Language Processing (NLP) tasks by leveraging large-scale training corpora. Increasing the scale of these models enhances performance significantly, introducing abilities like context learning that smaller models lack. The advancement in Large Language Models, exemplified by the development of ChatGPT, has made significant impacts both academically and industrially, capturing widespread societal interest. This survey provides an overview of the development and prospects from Large Language Models (LLMs) to Large Multimodal Models (LMMs). It first discusses the contributions and technological advancements of LLMs in the field of natural language processing, especially in text generation and language understanding. It then turns to LMMs, which integrate various data modalities such as text, images, and sound, demonstrating advanced capabilities in understanding and generating cross-modal content and paving new pathways for the adaptability and flexibility of AI systems. Finally, the survey highlights the prospects of LMMs in terms of technological development and application potential, while also pointing out challenges in data integration and cross-modal understanding accuracy, providing a comprehensive perspective on the latest developments in this field.
Funding: Supported by the National Natural Science Foundation of China (61801525), the independent fund of the State Key Laboratory of Optoelectronic Materials and Technologies (Sun Yat-sen University) under grant No. OEMT-2022-ZRC-05, the Opening Project of State Key Laboratory of Polymer Materials Engineering (Sichuan University) (Grant No. sklpme2023-3-5), the Foundation of the State Key Laboratory of Transducer Technology (No. SKT2301), Shenzhen Science and Technology Program (JCYJ20220530161809020 & JCYJ20220818100415033), the Young Top Talent of Fujian Young Eagle Program of Fujian Province and Natural Science Foundation of Fujian Province (2023J02013), and National Key R&D Program of China (2022YFB2802051).
Abstract: Post-earthquake rescue missions are full of challenges due to the unstable structure of ruins and successive aftershocks. Most current rescue robots lack the ability to interact with their environments, leading to low rescue efficiency. The proposed multimodal electronic skin (e-skin) not only reproduces the pressure, temperature, and humidity sensing capabilities of natural skin but also provides sensing functions beyond it, namely perceiving object proximity and NO2 gas. Its multilayer stacked structure based on Ecoflex and organohydrogel endows the e-skin with mechanical properties similar to those of natural skin. Rescue robots integrated with multimodal e-skin and artificial intelligence (AI) algorithms show strong environmental perception capabilities and can accurately distinguish objects and identify human limbs through grasping, laying the foundation for automated post-earthquake rescue. In addition, the combination of e-skin and NO2 wireless alarm circuits allows robots to sense toxic gases in the environment in real time, so that they can adopt appropriate measures to protect trapped people from the toxic environment. Multimodal e-skin powered by AI algorithms and hardware circuits exhibits powerful environmental perception and information processing capabilities and, as an interface for interaction with the physical world, dramatically expands intelligent robots’ application scenarios.
Funding: The National Natural Science Foundation of China, Nos. 92159305, 92259303, 62027901, 81930053, and 82272029; Beijing Science Fund for Distinguished Young Scholars, No. JQ22013; and Excellent Member Project of the Youth Innovation Promotion Association CAS, No. 2016124.
Abstract: Artificial intelligence (AI)-based radiomics has attracted considerable research attention in the field of medical imaging, including ultrasound diagnosis. Ultrasound imaging has unique advantages such as high temporal resolution, low cost, and no radiation exposure, which make it a preferred imaging modality for several clinical scenarios. This review includes a detailed introduction to imaging modalities, including brightness-mode ultrasound, color Doppler flow imaging, ultrasound elastography, contrast-enhanced ultrasound, and multi-modal fusion analysis. It provides an overview of the current status and prospects of AI-based radiomics in ultrasound diagnosis, highlighting the application of AI-based radiomics to static ultrasound images, dynamic ultrasound videos, and multi-modal ultrasound fusion analysis.
Funding: Supported by National Natural Science Foundation of China (62102036).
Abstract: Gland cancer is a high-incidence disease that endangers human health, and its early detection and treatment require efficient, accurate, and objective intelligent diagnosis methods. In recent years, machine learning techniques have yielded satisfactory results in intelligent gland cancer diagnosis based on clinical images, significantly improving the accuracy and efficiency of medical image interpretation while reducing the workload of doctors. This study reviews, classifies, and analyzes intelligent diagnosis methods for imaging gland cancer based on machine learning and deep learning. The paper briefly introduces some basic imaging principles of multimodal medical images, such as the commonly used computed tomography (CT), magnetic resonance imaging (MRI), ultrasound (US), positron emission tomography (PET), and pathology. In addition, the intelligent diagnosis methods for imaging gland cancer are further classified into supervised learning and weakly supervised learning. Supervised learning consists of traditional machine learning methods, such as the K-nearest neighbor algorithm (KNN), support vector machine (SVM), and multilayer perceptron, and deep learning methods evolving from the convolutional neural network (CNN). By contrast, weakly supervised learning can be further categorized into active learning, semisupervised learning, and transfer learning. State-of-the-art methods are illustrated with implementation details, including image segmentation, feature extraction, and optimization of classifiers. Their performance is evaluated through indicators such as accuracy, precision, and sensitivity. In conclusion, the challenges and development trends of intelligent diagnosis methods for imaging gland cancer are discussed.
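The evaluation indicators named in the abstract above (accuracy, precision, and sensitivity) can be illustrated with a minimal Python sketch. This is not taken from the reviewed paper: the function name `classification_metrics`, the binary-label setup, and the toy labels are all hypothetical and only show the standard confusion-matrix definitions.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, and sensitivity (recall) for binary labels.

    Hypothetical helper for illustration; not an API from the reviewed work.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts relative to the chosen positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    total = tp + tn + fp + fn
    return {
        # Fraction of all cases classified correctly.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Fraction of predicted positives that are truly positive.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Fraction of true positives that the classifier detects.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy labels (made up): 1 = malignant, 0 = benign.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In a diagnostic setting, sensitivity is often read alongside precision because a missed cancer (false negative) and a false alarm (false positive) carry different clinical costs, which accuracy alone does not separate.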
Abstract: As the field of artificial intelligence continues to evolve, so too does the application of multimodal learning analysis and intelligent adaptive learning systems. This trend has the potential to promote the equalization of educational resources, the intellectualization of educational methods, and the modernization of educational reform, among other benefits. This study proposes a construction framework for an intelligent adaptive learning system that is supported by multimodal data. It provides a detailed explanation of the system’s working principles and patterns, which aim to enhance learners’ online engagement in behavior, emotion, and cognition. The study seeks to address the issue of intelligent adaptive learning systems diagnosing learners’ learning behavior based solely on learning achievement, in order to improve learners’ online engagement, enable them to master more of the required knowledge, and ultimately achieve better learning outcomes.