Astronomical imaging technologies are basic tools for the exploration of the universe, providing basic data for research in astronomy and space physics. The Soft X-ray Imager (SXI) carried by the Solar wind Magnetosphere Ionosphere Link Explorer (SMILE) aims to capture two-dimensional (2-D) images of the Earth’s magnetosheath by using soft X-ray imaging. However, the observed 2-D images are affected by many noise factors that destroy the information they contain, which hinders the subsequent reconstruction of the three-dimensional (3-D) structure of the magnetopause. Analysis of SXI-simulated observation images shows that such damage cannot be evaluated with traditional restoration models. This makes it difficult to establish the mapping relationship between SXI-simulated observation images and target images by using mathematical models. We propose an image restoration algorithm for SXI-simulated observation images that can recover large-scale structure information on the magnetosphere. The idea is to train a patch estimator by selecting noise–clean patch pairs with the same distribution through the Classification–Expectation Maximization algorithm; the patch estimator establishes the mapping between the SXI-simulated observation image and the target image and thereby yields the restoration estimate. The Classification–Expectation Maximization algorithm is used to select multiple patch clusters with the same distribution and then train different patch estimators, so as to improve the accuracy of the estimator. Experimental results showed that our image restoration algorithm is superior to other classical image restoration algorithms in the SXI-simulated observation image restoration task, according to the peak signal-to-noise ratio and structural similarity. The restoration results of SXI-simulated observation images are used in the tangent fitting approach and the computed tomography approach toward magnetospheric reconstruction techniques, significantly improving the reconstruction results. Hence, the proposed technology may be feasible for processing SXI-simulated observation images.
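The peak signal-to-noise ratio named above as an evaluation metric can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function name `psnr`, the list-of-rows image representation, and the default peak value of 255 (8-bit images) are assumptions made for illustration, and structural similarity is not covered here.

```python
import math

def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio (dB) between two equally sized 2-D images.

    Images are lists of rows of numbers; max_value is the assumed peak
    intensity (255 for 8-bit data, an assumption of this sketch).
    """
    ref = [p for row in reference for p in row]
    res = [p for row in restored for p in row]
    if len(ref) != len(res):
        raise ValueError("images must contain the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref, res)) / len(ref)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher value means the restored image is closer to the reference; comparing the PSNR of a noisy input and of its restoration against the same target gives a rough measure of what a restoration step gained.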
Global images of auroras obtained by cameras on spacecraft are a key tool for studying the near-Earth environment. However, the cameras are sensitive not only to auroral emissions produced by precipitating particles, but also to dayglow emissions produced by photoelectrons induced by sunlight. Nightglow emissions and scattered sunlight can contribute to the background signal. To fully utilize such images in space science, background contamination must be removed to isolate the auroral signal. Here we outline a data-driven approach to modeling the background intensity in multiple images by formulating linear inverse problems based on B-splines and spherical harmonics. The approach is robust, flexible, and iteratively deselects outliers, such as auroral emissions. The final model is smooth across the terminator and accounts for slow temporal variations and large-scale asymmetries in the dayglow. We demonstrate the model by using the three far ultraviolet cameras on the Imager for Magnetopause-to-Aurora Global Exploration (IMAGE) mission. The method can be applied to historical missions and is relevant for upcoming missions, such as the Solar wind Magnetosphere Ionosphere Link Explorer (SMILE) mission.
Limited by the dynamic range of the detector, saturation artifacts usually occur in optical coherence tomography (OCT) imaging of highly scattering media. Available methods struggle to remove saturation artifacts and restore texture completely in OCT images. In this paper, we propose a deep learning-based inpainting method for saturation artifacts. The generation mechanism of saturation artifacts was analyzed, and experimental and simulated datasets were built based on this mechanism. Enhanced super-resolution generative adversarial networks were trained on the clear–saturated phantom image pairs. The reconstruction results for experimental zebrafish and thyroid OCT images demonstrated the method's feasibility, strong generalization, and robustness.
Transformer-based models have facilitated significant advances in object detection. However, their extensive computational consumption and suboptimal detection of dense small objects curtail their applicability in unmanned aerial vehicle (UAV) imagery. Addressing these limitations, we propose a hybrid transformer-based detector, H-DETR, and enhance it for dense small objects, leading to an accurate and efficient model. Firstly, we introduce a hybrid transformer encoder, which integrates a convolutional neural network-based cross-scale fusion module with the original encoder to handle multi-scale feature sequences more efficiently. Furthermore, we propose two novel strategies to enhance detection performance without incurring additional inference computation. Query filter is designed to cope with the dense clustering inherent in drone-captured images by counteracting similar queries with a training-aware non-maximum suppression. Adversarial denoising learning is a novel enhancement method inspired by adversarial learning, which improves the detection of numerous small targets by counteracting the effects of artificial spatial and semantic noise. Extensive experiments on the VisDrone and UAVDT datasets substantiate the effectiveness of our approach, achieving a significant improvement in accuracy with a reduction in computational complexity. Our method achieves 31.9% and 21.1% AP on the VisDrone and UAVDT datasets, respectively, and has a faster inference speed, making it a competitive model in UAV image object detection.
Artificial Intelligence (AI) is being increasingly used for diagnosing Vision-Threatening Diabetic Retinopathy (VTDR), which is a leading cause of visual impairment and blindness worldwide. However, previous automated VTDR detection methods have mainly relied on manual feature extraction and classification, leading to errors. This paper proposes a novel VTDR detection and classification model that combines different models through majority voting. Our proposed methodology involves preprocessing, data augmentation, feature extraction, and classification stages. We use a hybrid convolutional neural network-singular value decomposition (CNN-SVD) model for feature extraction and selection, and an improved SVM-RBF with a Decision Tree (DT) and K-Nearest Neighbor (KNN) for classification. We tested our model on the IDRiD dataset and achieved an accuracy of 98.06%, a sensitivity of 83.67%, and a specificity of 100% for DR detection and evaluation tests. Our proposed approach outperforms baseline techniques and provides a more robust and accurate method for VTDR detection.
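The majority-voting step that combines the three classifiers can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `majority_vote`, the label encoding, and the tie rule (the earliest-listed classifier's label wins) are all assumptions of the sketch.

```python
def majority_vote(*classifier_outputs):
    """Combine per-sample labels from several classifiers by majority vote.

    Each argument is one classifier's list of predicted labels, and all lists
    must have the same length. On a tie, the label of the earliest-listed
    classifier among the tied labels is kept (a choice made for this sketch).
    """
    lengths = {len(output) for output in classifier_outputs}
    if len(lengths) != 1:
        raise ValueError("all classifiers must label the same samples")
    voted = []
    for labels in zip(*classifier_outputs):
        # max() returns the first label that reaches the highest count.
        voted.append(max(labels, key=labels.count))
    return voted
```

For instance, with hypothetical outputs `svm = [1, 0, 1, 1]`, `dt = [1, 1, 0, 1]`, and `knn = [0, 0, 1, 1]`, calling `majority_vote(svm, dt, knn)` keeps, for each sample, the label that at least two of the three classifiers agree on.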
Road traffic monitoring is an imperative topic widely discussed among researchers. Systems used to monitor traffic frequently rely on cameras mounted on bridges or roadsides. However, aerial images provide the flexibility to use mobile platforms to detect the location and motion of vehicles over a larger area. To this end, different models have shown the ability to recognize and track vehicles. However, these methods are not mature enough to produce accurate results in complex road scenes. Therefore, this paper presents an algorithm that combines state-of-the-art techniques for identifying and tracking vehicles in conjunction with image bursts. The extracted frames were converted to grayscale, followed by the application of a georeferencing algorithm to embed coordinate information into the images. A masking technique eliminated irrelevant data and reduced the computational cost of the overall monitoring system. Next, Sobel edge detection combined with Canny edge detection and the Hough line transform was applied for noise reduction. After preprocessing, a blob detection algorithm was used to detect the vehicles. Vehicles of varying sizes were detected by implementing a dynamic thresholding scheme. Detection was done on the first image of every burst. Then, to track vehicles, a model of each vehicle was matched against the succeeding images using a template matching algorithm. To further improve tracking accuracy by incorporating motion information, Scale Invariant Feature Transform (SIFT) features were used to find the best possible match among multiple matches. A detection accuracy of 87% and a tracking accuracy of 80% were achieved on the A1 Motorway Netherland dataset. For the Vehicle Aerial Imaging from Drone (VAID) dataset, a detection accuracy of 86% and a tracking accuracy of 78% were achieved.
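Of the preprocessing steps listed above, the Sobel operator is the simplest to show in isolation. The sketch below computes a Sobel gradient magnitude and a thresholded edge mask on a small grayscale image held as a list of rows. It is not the paper's pipeline: the function names, the zeroed border, and the threshold parameter are assumptions for illustration, and the Canny and Hough stages are omitted.

```python
import math

# Standard 3x3 Sobel kernels for horizontal and vertical intensity gradients.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))

def sobel_magnitude(image):
    """Return the gradient magnitude of a 2-D grayscale image (list of rows).

    Border pixels are left at 0.0 because the 3x3 window does not fit there.
    """
    height, width = len(image), len(image[0])
    result = [[0.0] * width for _ in range(height)]
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            gx = gy = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    pixel = image[r + dr][c + dc]
                    gx += SOBEL_X[dr + 1][dc + 1] * pixel
                    gy += SOBEL_Y[dr + 1][dc + 1] * pixel
            result[r][c] = math.hypot(gx, gy)
    return result

def edge_mask(image, threshold):
    """Mark pixels whose gradient magnitude reaches the threshold with 1."""
    return [[1 if value >= threshold else 0 for value in row]
            for row in sobel_magnitude(image)]
```

On an image with a single vertical intensity step, only the interior pixels next to the step receive a non-zero magnitude, which is the behavior an edge-based preprocessing stage relies on.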
This paper emphasizes faster digital processing time while presenting an accurate method for identifying spine fractures in X-ray pictures. The study focuses on efficiency by utilizing several methods, including picture segmentation, feature reduction, and image classification. Two important elements are investigated to reduce the classification time: using feature reduction software and leveraging the capabilities of sophisticated digital processing hardware. The researchers use different algorithms for picture enhancement, including the Wiener and Kalman filters, and they look into two background correction techniques. The article presents a technique for extracting textural features and evaluates three picture segmentation algorithms and three fractured spine detection algorithms using the transform domain, Power Density Spectrum (PDS), and Higher-Order Statistics (HOS) for feature extraction. With an emphasis on reducing digital processing time, this all-encompassing method helps to create a simplified system for classifying spine fractures. A feature reduction program has been built to improve the processing speed of picture classification. Overall, the proposed approach shows great potential for significantly reducing classification time in clinical settings where time is critical. In comparison to other transform domains, the discrete cosine transform (DCT) of the texture features yielded an exceptional classification rate, and the process of extracting features from the transform domain took less time. More capable hardware can also result in quicker execution times for the feature extraction algorithms.
Transmitting photos via the Internet has become a routine and significant activity. Security measures that safeguard these images from counterfeiting and modification remain a critical area with room for improvement. This study presents a system that employs a range of approaches and algorithms to ensure the security of transmitted venous images. The main goal of this work is to create a very effective system for compressing individual biometrics in order to improve the overall accuracy and security of digital photographs by means of image compression. This paper introduces a content-based image authentication mechanism that is suitable for use across an untrusted network and resistant to data loss during transmission. By employing scale attributes and a key-dependent parametric Long Short-Term Memory (LSTM), it is feasible to improve the resilience of digital signatures against image deterioration and strengthen their security against malicious actions. Furthermore, biometric data has been successfully transmitted in a compressed format over a wireless network. For applications involving the transmission and sharing of images across a network, the suggested technique utilizes the scalability of a structural digital signature to attain a satisfactory equilibrium between security and picture transfer. An effective adaptive compression strategy was created to lengthen the overall lifetime of the network by sharing processing responsibilities. This scheme ensures a large reduction in computational and energy requirements while minimizing image quality loss. This approach employs multi-scale characteristics to improve the resistance of signatures against image deterioration. The proposed system attained a Gaussian noise value of 98% and a rotation accuracy surpassing 99%.
Leukemia is a form of cancer of the blood or bone marrow. A person with leukemia has an expansion of white blood cells (WBCs). It primarily affects children and rarely affects adults. Treatment depends on the type of leukemia and the extent to which the cancer has become established throughout the body. Identifying leukemia at the initial stage is vital to providing timely patient care. Approaches based on medical image analysis offer safer, quicker, and less costly solutions while avoiding the difficulties of invasive procedures. Computer vision (CV)-based and image-processing techniques can be generalized easily and can eliminate human error. Many researchers have implemented computer-aided diagnostic methods and machine learning (ML) for laboratory image analysis, hoping to overcome the limitations of late leukemia detection and to determine its subgroups. This study establishes a Marine Predators Algorithm with Deep Learning Leukemia Cancer Classification (MPADL-LCC) algorithm on medical images. The proposed MPADL-LCC system uses a bilateral filtering (BF) technique to pre-process medical images. The MPADL-LCC system uses Faster SqueezeNet with the Marine Predators Algorithm (MPA) as a hyperparameter optimizer for feature extraction. Lastly, the denoising autoencoder (DAE) methodology is applied to accurately detect and classify leukemia cancer. The hyperparameter tuning process using MPA helps enhance leukemia cancer classification performance. Simulation results are compared with other recent approaches across various measures, and the MPADL-LCC algorithm exhibits the best results.
Obtaining high precision is an important consideration for astrometric studies using images from the Narrow Angle Camera (NAC) of the Cassini Imaging Science Subsystem (ISS). Selecting the best centering algorithm is key to enhancing astrometric accuracy. In this study, we compared the accuracy of five centering algorithms: Gaussian fitting, the modified moments method, and three point-spread function (PSF) fitting methods (effective PSF (ePSF), PSFEx, and extended PSF (xPSF) from the Cassini Imaging Central Laboratory for Operations (CICLOPS)). We assessed these algorithms using 70 ISS NAC star field images taken with CL1 and CL2 filters across different stellar magnitudes. The ePSF method consistently demonstrated the highest accuracy, achieving precision below 0.03 pixels for stars of magnitude 8-9. Compared to the previously considered best, the modified moments method, the ePSF method improved overall accuracy by about 10% and 21% in the sample and line directions, respectively. Surprisingly, the xPSF model provided by CICLOPS had lower precision than the ePSF. Conversely, the ePSF exhibits an improvement in measurement precision of 23% and 17% in the sample and line directions, respectively, over the xPSF. This discrepancy might be attributed to the xPSF focusing on photometry rather than astrometry. These findings highlight the necessity of constructing PSF models specifically tailored for astrometric purposes in NAC images and provide guidance for enhancing astrometric measurements using these ISS NAC images.
Deep Convolutional Neural Networks (CNNs) have achieved high accuracy in image classification tasks; however, most existing models are trained on high-quality images that are not subject to image degradation. In practice, images are often affected by various types of degradation, which can significantly impact the performance of CNNs. In this work, we investigate the influence of image degradation on three typical image classification CNNs and propose a Degradation Type Adaptive Image Classification Model (DTA-ICM) to improve the existing CNNs’ classification accuracy on degraded images. The proposed DTA-ICM comprises two key components: a Degradation Type Predictor (DTP) and a Degradation Type Specified Image Classifier (DTS-IC) set, which is trained on existing CNNs for specified types of degradation. The DTP predicts the degradation type of a test image, and the corresponding DTS-IC is then selected to classify the image. We evaluate the performance of both the proposed DTP and the DTA-ICM on the Caltech 101 database. The experimental results demonstrate that the proposed DTP achieves an average accuracy of 99.70%. Moreover, the proposed DTA-ICM, based on AlexNet, VGG19, and ResNet152, exhibits an average accuracy improvement of 20.63%, 18.22%, and 12.9%, respectively, compared with the original CNNs in classifying degraded images. This suggests that the proposed DTA-ICM can effectively improve the classification performance of existing CNNs on degraded images, which has important practical implications.
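The predict-then-dispatch structure described above (a DTP selects which DTS-IC classifies the image) can be sketched with toy stand-ins. Everything in the block is hypothetical and chosen only to make the control flow runnable: the variance threshold, the two degradation labels, and the brightness "classifiers" are placeholders, not the trained CNNs from the paper.

```python
from statistics import pvariance

def predict_degradation(pixels):
    """Toy stand-in for the degradation type predictor (hypothetical rule)."""
    return "noise" if pvariance(pixels) > 100.0 else "clean"

# Toy stand-ins for the degradation-specific classifiers, keyed by type.
CLASSIFIERS = {
    # Clean images: classify by mean brightness.
    "clean": lambda pixels: "bright" if sum(pixels) / len(pixels) > 128 else "dark",
    # Noisy images: the median is less sensitive to outlier pixels.
    "noise": lambda pixels: "bright" if sorted(pixels)[len(pixels) // 2] > 128 else "dark",
}

def classify_adaptive(pixels):
    """Predict the degradation type, then route to the matching classifier."""
    kind = predict_degradation(pixels)
    return kind, CLASSIFIERS[kind](pixels)
```

The design point is that each classifier only ever sees inputs of the degradation type it was built for, so the accuracy of the predictor bounds how often an image is routed to the right specialist.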
We redesign the parameterized quantum circuit in the quantum deep neural network, construct a three-layer structure as the hidden layer, and then use classical optimization algorithms to train the parameterized quantum circuit, thereby proposing a novel hybrid quantum deep neural network (HQDNN) for image classification. After bilinear interpolation reduces the original image to a suitable size, an improved novel enhanced quantum representation (INEQR) is used to encode it into quantum states as the input of the HQDNN. Multi-layer parameterized quantum circuits are used as the main structure to implement feature extraction and classification. The output results of the parameterized quantum circuits are converted into classical data through quantum measurements and then optimized on a classical computer. To verify the performance of the HQDNN, we conduct binary classification and three-class classification experiments on the MNIST (Modified National Institute of Standards and Technology) data set. In the first binary classification, the accuracy for the digits 0 and 4 exceeds 98%. We then compare the three-class classification performance with other algorithms; the results on two datasets show that the classification accuracy is higher than that of the quantum deep neural network and the general quantum convolutional neural network.
The intuitive fuzzy set has found important application in decision-making and machine learning. To enrich and utilize the intuitive fuzzy set, this study designed and developed a deep neural network-based glaucoma eye detection using fuzzy difference equations in the domain where the retinal images converge. Retinal image detections are categorized as normal eye recognition, suspected glaucomatous eye recognition, and glaucomatous eye recognition. Fuzzy degrees associated with weighted values are calculated to determine the level of concentration between the fuzzy partition and the retinal images. The proposed model was used to diagnose glaucoma using retinal images and involved utilizing the Convolutional Neural Network (CNN) and deep learning to identify the fuzzy weighted regularization between images. This methodology was used to clarify the input images and make them adequate for the process of glaucoma detection. The objective of this study was to propose a novel approach to the early diagnosis of glaucoma using the Fuzzy Expert System (FES) and Fuzzy differential equation (FDE). The intensities of the different regions in the images and their respective peak levels were determined. Once the peak regions were identified, the recurrence relationships among those peaks were then measured. Image partitioning was done due to varying degrees of similar and dissimilar concentrations in the image. Similar and dissimilar concentration levels and spatial frequency generated a threshold image from the combined fuzzy matrix and FDE. This distinguished between a normal and abnormal eye condition, thus detecting patients with glaucomatous eyes.
Meta-learning of dental X-rays is a machine learning technique that can be used to train models to perform new tasks quickly and with minimal input. Instead of just memorizing a task, this is accomplished by teaching a model how to learn. Algorithms for meta-learning are typically trained on a collection of training problems, each of which has a limited number of labelled instances. Multiple X-ray classification tasks, including the detection of pneumonia, coronavirus disease 2019, and other disorders, have demonstrated the effectiveness of meta-learning. Meta-learning has the benefit of allowing models to be trained on dental X-ray datasets that are too small for more conventional machine learning methods. Because dental imaging datasets are costly and slow to collect, this is significant for dental X-ray classification tasks. The ability to train models that are more robust to new input is another benefit of meta-learning.
When high compression rates are applied to Joint Photographic Experts Group (JPEG) images through lossy compression techniques, image-blocking artifacts may appear. This necessitates the restoration of the image to its original quality. The challenge lies in regenerating heavily compressed images into a state in which they become identifiable. Therefore, this study focuses on the restoration of JPEG images subjected to substantial degradation caused by maximum lossy compression using Generative Adversarial Networks (GAN). The generator in this network is based on the U-Net architecture. It features a new hourglass structure that preserves the characteristics of the deep layers. In addition, the network incorporates two loss functions to generate natural and high-quality images: Low Frequency (LF) loss and High Frequency (HF) loss. HF loss uses a pretrained VGG-16 network and is configured using a specific layer that best represents features. This can enhance performance in the high-frequency region. In contrast, LF loss is used to handle the low-frequency region. The two loss functions help the generator produce images that can mislead the discriminator while accurately generating high- and low-frequency regions. Consequently, by removing the blocking effects from maximum lossy compressed images, images in which identities can be recognized are generated. This study represents a significant improvement over previous research in terms of image resolution performance.
1 Introduction. Chinese medicine has a long and rich history, dating back to the classics of the Qin and Han dynasties and extending to the integration of Chinese and Western medicine in the modern era. The vast amount of literature and scholarly works in this field makes it essential to thoroughly study the history of traditional Chinese medicine (TCM) in order to understand its development path throughout the ages and boost innovation based on tradition. This is why the sages emphasized the importance of “classifying the works into different schools and tracing back to their origins” (辨章学术,考镜源流).
Recovering high-quality inscription images from unknown and complex noisy inscription images is a challenging research issue. Unlike natural images, character images depend more heavily on stroke information. However, existing models mainly consider pixel-level information while ignoring the structural information of the character, such as its edge and glyph, resulting in reconstructed images with mottled local structure and character damage. To solve these problems, we propose a novel generative adversarial network (GAN) framework based on an edge-guided generator and a discriminator constructed by a dual-domain U-Net framework, i.e., EDU-GAN. Unlike existing frameworks, the generator introduces an edge extraction module, guiding it into the denoising process through the attention mechanism, which maintains the edge detail of the restored inscription image. Moreover, a dual-domain U-Net-based discriminator is proposed to learn the global and local discrepancy between the denoised and the label images in both image and morphological domains, which is helpful for blind denoising tasks. The proposed dual-domain discriminator and generator for adversarial training can reduce local artifacts and keep the denoised character structure intact. Due to the lack of real-inscription images, we built a real-inscription dataset to provide an effective benchmark for studying inscription image denoising. The experimental results show the superiority of our method on both the synthetic and real-inscription datasets.
AIM: To establish pupil diameter measurement algorithms based on infrared images that can be used in real-world clinical settings. METHODS: A total of 188 patients from the outpatient clinic at He Eye Specialist Shenyang Hospital from September to December 2022 were included, and 13470 infrared pupil images were collected for the study. All infrared images for pupil segmentation were labeled using the Labelme software. The computation of pupil diameter is divided into four steps: image pre-processing, pupil identification and localization, pupil segmentation, and diameter calculation. Two major models are used in the computation process: the modified YoloV3 and DeeplabV3+ models, which must be trained beforehand. RESULTS: The test dataset included 1348 infrared pupil images. On the test dataset, the modified YoloV3 model had a detection rate of 99.98% and an average precision (AP) of 0.80 for pupils. The DeeplabV3+ model achieved a background intersection over union (IOU) of 99.23%, a pupil IOU of 93.81%, and a mean IOU of 96.52%. The pupil diameters in the test dataset ranged from 20 to 56 pixels, with a mean of 36.06±6.85 pixels. The absolute error in pupil diameters between predicted and actual values ranged from 0 to 7 pixels, with a mean absolute error (MAE) of 1.06±0.96 pixels. CONCLUSION: This study successfully demonstrates a robust infrared image-based pupil diameter measurement algorithm, proven to be highly accurate and reliable for clinical application.
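The two evaluation measures reported above, intersection over union and mean absolute error, can be written down in a few lines. The sketch below is illustrative only: the function names and the flat 0/1 mask representation are assumptions, not the study's code. Note that the reported mean IOU is the average of the per-class values, (99.23% + 93.81%) / 2 = 96.52%.

```python
def iou(predicted_mask, true_mask):
    """Intersection over union of two binary masks given as flat 0/1 lists."""
    pairs = list(zip(predicted_mask, true_mask))
    intersection = sum(1 for p, t in pairs if p and t)
    union = sum(1 for p, t in pairs if p or t)
    # Two empty masks agree perfectly; treat that case as IoU = 1.0.
    return intersection / union if union else 1.0

def mean_absolute_error(predicted, actual):
    """Average absolute difference, e.g. between pupil diameters in pixels."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)
```

Per-class IoU values are obtained by calling `iou` once with the pupil mask and once with the inverted (background) mask; averaging the two gives the mean IoU.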
In the intelligent medical diagnosis area, Artificial Intelligence (AI)’s trustworthiness, reliability, and interpretability are critical, especially in cancer diagnosis. Traditional neural networks, while excellent at processing natural images, often lack interpretability and adaptability when processing high-resolution digital pathological images. This limitation is particularly evident in pathological diagnosis, which is the gold standard of cancer diagnosis and relies on a pathologist’s careful examination and analysis of digital pathological slides to identify the features and progression of the disease. Therefore, the integration of interpretable AI into smart medical diagnosis is not only an inevitable technological trend but also a key to improving diagnostic accuracy and reliability. In this paper, we introduce an innovative Multi-Scale Multi-Branch Feature Encoder (MSBE) and present the design of the CrossLinkNet Framework. The MSBE enhances the network’s capability for feature extraction by allowing the adjustment of hyperparameters to configure the number of branches and modules. The CrossLinkNet Framework, serving as a versatile image segmentation network architecture, employs cross-layer encoder-decoder connections for multi-level feature fusion, thereby enhancing feature integration and segmentation accuracy. Comprehensive quantitative and qualitative experiments on two datasets demonstrate that CrossLinkNet, equipped with the MSBE encoder, not only achieves accurate segmentation results but is also adaptable to various tumor segmentation tasks and scenarios by replacing different feature encoders. Crucially, CrossLinkNet emphasizes the interpretability of the AI model, a crucial aspect for medical professionals, providing an in-depth understanding of the model’s decisions and thereby enhancing trust and reliability in AI-assisted diagnostics.
Introduction: Rapid recent developments in artificial intelligence (AI) have multiplied concerns regarding the future of robotic autonomy in surgery. However, the literature on the topic is still scarce. Aim: To test a novel, commercially available AI tool for image analysis on a series of laparoscopic scenes. Methods: The research tools included OPENAI CHATGPT 4.0 with its corresponding image recognition plugin, which was fed a list of 100 selected laparoscopic snapshots from common surgical procedures. To score the reliability of the responses received from the image-recognition bot, two corresponding scales ranging from 0 to 5 were developed. The set of images was divided into two groups, unlabeled (Group A) and labeled (Group B), and further according to the type of surgical procedure or image resolution. Results: The AI correctly recognized the context of surgery-related images in 97% of its reports. For the labeled surgical pictures, the image-processing bot scored 3.95/5 (79%), whilst for the unlabeled ones it scored 2.905/5 (58.1%). Phases of the procedure were commented on in detail after all successful interpretations. With ratings of 4-5/5, the chatbot was able to talk in detail about the indications, contraindications, stages, instrumentation, complications, and outcome rates of the operation discussed. Conclusion: Interaction between surgeon and chatbot appears to be an interesting front end for further research by clinicians, in parallel with the evolution of its complex underlying infrastructure. In this early phase of using artificial intelligence for image recognition in surgery, no safe conclusions can be drawn from small cohorts with commercially available software. Further development of medically oriented AI software and awareness in the clinical world are expected to bring fruitful information on the topic in the years to come.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 42322408, 42188101, 41974211, and 42074202); the Key Research Program of Frontier Sciences, Chinese Academy of Sciences (Grant No. QYZDJ-SSW-JSC028); the Strategic Priority Program on Space Science, Chinese Academy of Sciences (Grant Nos. XDA15052500, XDA15350201, and XDA15014800); and the Youth Innovation Promotion Association of the Chinese Academy of Sciences (Grant No. Y202045).
Abstract: Astronomical imaging technologies are basic tools for the exploration of the universe, providing basic data for research in astronomy and space physics. The Soft X-ray Imager (SXI) carried by the Solar wind Magnetosphere Ionosphere Link Explorer (SMILE) aims to capture two-dimensional (2-D) images of the Earth’s magnetosheath by using soft X-ray imaging. However, the observed 2-D images are affected by many noise factors that destroy the information they contain, which hinders the subsequent reconstruction of the three-dimensional (3-D) structure of the magnetopause. The analysis of SXI-simulated observation images shows that such damage cannot be evaluated with traditional restoration models. This makes it difficult to establish the mapping relationship between SXI-simulated observation images and target images by using mathematical models. We propose an image restoration algorithm for SXI-simulated observation images that can recover large-scale structure information on the magnetosphere. The idea is to train a patch estimator by selecting noise–clean patch pairs with the same distribution through the Classification–Expectation Maximization algorithm; the patch estimator then establishes the mapping between the SXI-simulated observation image and the target image and produces the restoration estimate. The Classification–Expectation Maximization algorithm is used to select multiple patch clusters with the same distribution and then train a different patch estimator for each, so as to improve the accuracy of the estimator. Experimental results showed that our image restoration algorithm is superior to other classical image restoration algorithms in the SXI-simulated observation image restoration task, according to the peak signal-to-noise ratio and structural similarity. The restoration results of SXI-simulated observation images are used in the tangent fitting approach and the computed tomography approach toward magnetospheric reconstruction techniques, significantly improving the reconstruction results. Hence, the proposed technology may be feasible for processing SXI-simulated observation images.
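The peak signal-to-noise ratio cited in the abstract above can be illustrated with a minimal sketch. This is not the authors' evaluation code; the function name `psnr`, the nested-list image representation, and the default peak value of 255 are assumptions made only for illustration.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2-D images.

    Images are given as nested lists of pixel values (an assumption for this
    sketch); max_value is the assumed peak pixel value.
    """
    squared_errors = [
        (r - s) ** 2
        for row_ref, row_res in zip(reference, restored)
        for r, s in zip(row_ref, row_res)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher value indicates that the restored image is closer to the target image; identical images give an infinite PSNR.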
Funding: Supported by the Research Council of Norway under contracts 223252/F50 and 300844/F50 and by the Trond Mohn Foundation.
Abstract: Global images of auroras obtained by cameras on spacecraft are a key tool for studying the near-Earth environment. However, the cameras are sensitive not only to auroral emissions produced by precipitating particles, but also to dayglow emissions produced by photoelectrons induced by sunlight. Nightglow emissions and scattered sunlight can also contribute to the background signal. To fully utilize such images in space science, background contamination must be removed to isolate the auroral signal. Here we outline a data-driven approach to modeling the background intensity in multiple images by formulating linear inverse problems based on B-splines and spherical harmonics. The approach is robust and flexible, and it iteratively deselects outliers, such as auroral emissions. The final model is smooth across the terminator and accounts for slow temporal variations and large-scale asymmetries in the dayglow. We demonstrate the model by using the three far ultraviolet cameras on the Imager for Magnetopause-to-Aurora Global Exploration (IMAGE) mission. The method can be applied to historical missions and is relevant for upcoming missions, such as the Solar wind Magnetosphere Ionosphere Link Explorer (SMILE) mission.
Funding: Supported by the National Natural Science Foundation of China (62375144 and 61875092); the Tianjin Foundation of Natural Science (21JCYBJC00260); and the Beijing-Tianjin-Hebei Basic Research Cooperation Special Program (19JCZDJC65300).
Abstract: Because the dynamic range of the detector is limited, saturation artifacts usually occur in optical coherence tomography (OCT) imaging of highly scattering media. Available methods struggle to remove saturation artifacts and restore texture completely in OCT images. In this paper, we propose a deep learning-based method for inpainting saturation artifacts. The generation mechanism of saturation artifacts was analyzed, and experimental and simulated datasets were built based on this mechanism. Enhanced super-resolution generative adversarial networks were trained on clear–saturated phantom image pairs. The perfect reconstruction results on experimental zebrafish and thyroid OCT images demonstrated the method's feasibility, strong generalization, and robustness.
Funding: This research was funded by the Natural Science Foundation of Hebei Province (F2021506004).
Abstract: Transformer-based models have facilitated significant advances in object detection. However, their extensive computational consumption and suboptimal detection of dense small objects curtail their applicability in unmanned aerial vehicle (UAV) imagery. Addressing these limitations, we propose a hybrid transformer-based detector, H-DETR, and enhance it for dense small objects, leading to an accurate and efficient model. First, we introduce a hybrid transformer encoder, which integrates a convolutional neural network-based cross-scale fusion module with the original encoder to handle multi-scale feature sequences more efficiently. Furthermore, we propose two novel strategies to enhance detection performance without incurring additional inference computation. The query filter is designed to cope with the dense clustering inherent in drone-captured images by counteracting similar queries with a training-aware non-maximum suppression. Adversarial denoising learning is a novel enhancement method inspired by adversarial learning, which improves the detection of numerous small targets by counteracting the effects of artificial spatial and semantic noise. Extensive experiments on the VisDrone and UAVDT datasets substantiate the effectiveness of our approach, achieving a significant improvement in accuracy with a reduction in computational complexity. Our method achieves 31.9% and 21.1% AP on the VisDrone and UAVDT datasets, respectively, and has a faster inference speed, making it a competitive model in UAV image object detection.
Funding: This research was funded by the National Natural Science Foundation of China (Nos. 71762010, 62262019, 62162025, 61966013, 12162012); the Hainan Provincial Natural Science Foundation of China (Nos. 823RC488, 623RC481, 620RC603, 621QN241, 620RC602, 121RC536); the Haikou Science and Technology Plan Project of China (No. 2022-016); and the Education Department of Hainan Province (No. Hnky2021-23).
Abstract: Artificial Intelligence (AI) is being increasingly used for diagnosing Vision-Threatening Diabetic Retinopathy (VTDR), which is a leading cause of visual impairment and blindness worldwide. However, previous automated VTDR detection methods have mainly relied on manual feature extraction and classification, leading to errors. This paper proposes a novel VTDR detection and classification model that combines different models through majority voting. Our proposed methodology involves preprocessing, data augmentation, feature extraction, and classification stages. We use a hybrid convolutional neural network-singular value decomposition (CNN-SVD) model for feature extraction and selection and an improved SVM-RBF with a Decision Tree (DT) and K-Nearest Neighbor (KNN) for classification. We tested our model on the IDRiD dataset and achieved an accuracy of 98.06%, a sensitivity of 83.67%, and a specificity of 100% for DR detection and evaluation tests, respectively. Our proposed approach outperforms baseline techniques and provides a more robust and accurate method for VTDR detection.
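As a rough illustration of combining classifiers through majority voting, the sketch below picks the label predicted by the most classifiers for a single image. It is a generic example and not the paper's implementation; the function name `majority_vote` and the example labels are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the largest number of classifiers.

    `predictions` holds one label per classifier for a single sample.
    On a tie, the label that appears first in `predictions` wins
    (Counter.most_common keeps first-seen order for equal counts).
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical outputs of three classifiers (e.g. SVM-RBF, DT, KNN) for one image.
label = majority_vote(["VTDR", "non-VTDR", "VTDR"])
```

Using an odd number of classifiers for a two-class problem, as in this example, avoids ties altogether.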
Funding: Supported by a grant from the Basic Science Research Program through the National Research Foundation (NRF) (2021R1F1A1063634), funded by the Ministry of Science and ICT (MSIT), Republic of Korea. The authors are thankful to the Deanship of Scientific Research at Najran University for funding this work under the Research Group Funding Program Grant Code (NU/RG/SERC/13/40). The authors are also thankful to Prince Satam bin Abdulaziz University for supporting this study via funding from Prince Satam bin Abdulaziz University project number (PSAU/2024/R/1445). This work was also supported by Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2023R54), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Abstract: Road traffic monitoring is an important topic widely discussed among researchers. Systems used to monitor traffic frequently rely on cameras mounted on bridges or roadsides. However, aerial images provide the flexibility to use mobile platforms to detect the location and motion of vehicles over a larger area. To this end, different models have shown the ability to recognize and track vehicles. However, these methods are not mature enough to produce accurate results in complex road scenes. Therefore, this paper presents an algorithm that combines state-of-the-art techniques for identifying and tracking vehicles in conjunction with image bursts. The extracted frames were converted to grayscale, followed by the application of a georeferencing algorithm to embed coordinate information into the images. A masking technique eliminated irrelevant data and reduced the computational cost of the overall monitoring system. Next, Sobel edge detection combined with Canny edge detection and the Hough line transform was applied for noise reduction. After preprocessing, a blob detection algorithm was used to detect the vehicles. Vehicles of varying sizes were detected by implementing a dynamic thresholding scheme. Detection was done on the first image of every burst. Then, to track vehicles, a model of each vehicle was built and matched in the succeeding images using the template matching algorithm. To further improve the tracking accuracy by incorporating motion information, Scale Invariant Feature Transform (SIFT) features were used to find the best possible match among multiple matches. An accuracy of 87% for detection and 80% for tracking was achieved on the A1 Motorway Netherland dataset. For the Vehicle Aerial Imaging from Drone (VAID) dataset, an accuracy of 86% for detection and 78% for tracking was achieved.
Funding: The authors extend their appreciation to the Deanship of Postgraduate Studies and Scientific Research at Majmaah University for funding this research work through Project Number R-2024-922.
Abstract: This paper emphasizes a faster digital processing time while presenting an accurate method for identifying spine fractures in X-ray pictures. The study focuses on efficiency by utilizing several methods, including picture segmentation, feature reduction, and image classification. Two important elements are investigated to reduce the classification time: using feature reduction software and leveraging the capabilities of sophisticated digital processing hardware. The researchers use different algorithms for picture enhancement, including the Wiener and Kalman filters, and they look into two background correction techniques. The article presents a technique for extracting textural features and evaluates three picture segmentation algorithms and three fractured spine detection algorithms using transform domain, Power Density Spectrum (PDS), and Higher-Order Statistics (HOS) for feature extraction. With an emphasis on reducing digital processing time, this all-encompassing method helps to create a simplified system for classifying spine fractures. A feature reduction program has been built to improve the processing speed for picture classification. Overall, the proposed approach shows great potential for significantly reducing classification time in clinical settings where time is critical. In comparison to other transform domains, the discrete cosine transform (DCT) of the texture features yielded an exceptional classification rate, and the process of extracting features from the transform domain took less time. More capable hardware can also result in quicker execution times for the feature extraction algorithms.
Abstract: Transmitting photos via the Internet has become a routine and significant activity. Enhancing the security measures that safeguard these images from counterfeiting and modification is a critical domain that can still be improved. This study presents a system that employs a range of approaches and algorithms to ensure the security of transmitted venous images. The main goal of this work is to create a very effective system for compressing individual biometrics in order to improve the overall accuracy and security of digital photographs by means of image compression. This paper introduces a content-based image authentication mechanism that is suitable for use across an untrusted network and resistant to data loss during transmission. By employing scale attributes and a key-dependent parametric Long Short-Term Memory (LSTM), it is feasible to improve the resilience of digital signatures against image deterioration and strengthen their security against malicious actions. Furthermore, biometric data have been successfully transmitted in a compressed format over a wireless network. For applications involving the transmission and sharing of images across a network, the suggested technique utilizes the scalability of a structural digital signature to attain a satisfactory equilibrium between security and picture transfer. An effective adaptive compression strategy was created to lengthen the overall lifetime of the network by sharing the processing responsibilities. This scheme ensures a large reduction in computational and energy requirements while minimizing image quality loss. This approach employs multi-scale characteristics to improve the resistance of signatures against image deterioration. The proposed system attained a Gaussian noise value of 98% and a rotation accuracy surpassing 99%.
Funding: Funded by the Researchers Supporting Program at King Saud University (RSPD2024R809).
Abstract: Leukemia is a form of cancer of the blood or bone marrow. A person with leukemia has an expansion of white blood cells (WBCs). It primarily affects children and rarely affects adults. Treatment depends on the type of leukemia and the extent to which the cancer has spread throughout the body. Identifying leukemia in the initial stage is vital to providing timely patient care. Medical image-analysis-related approaches offer safer, quicker, and less costly solutions while avoiding the difficulties of invasive procedures. Computer vision (CV)-based and image-processing techniques can be readily generalized and can eliminate human error. Many researchers have implemented computer-aided diagnostic methods and machine learning (ML) for laboratory image analysis, hoping to overcome the limitations of late leukemia detection and to determine its subgroups. This study establishes a Marine Predators Algorithm with Deep Learning Leukemia Cancer Classification (MPADL-LCC) algorithm on medical images. The proposed MPADL-LCC system uses a bilateral filtering (BF) technique to pre-process medical images. The MPADL-LCC system uses Faster SqueezeNet for feature extraction, with the Marine Predators Algorithm (MPA) as a hyperparameter optimizer. Lastly, the denoising autoencoder (DAE) methodology is executed to accurately detect and classify leukemia cancer. The hyperparameter tuning process using MPA helps enhance leukemia cancer classification performance. Simulation results are compared with other recent approaches on various measures, and the MPADL-LCC algorithm exhibits the best results among them.
Funding: Supported by the National Natural Science Foundation of China (Nos. 12373073, U2031104, and 12173015) and the Guangdong Basic and Applied Basic Research Foundation (No. 2023A1515011340).
Abstract: Obtaining high precision is an important consideration for astrometric studies using images from the Narrow Angle Camera (NAC) of the Cassini Imaging Science Subsystem (ISS). Selecting the best centering algorithm is key to enhancing astrometric accuracy. In this study, we compared the accuracy of five centering algorithms: Gaussian fitting, the modified moments method, and three point-spread function (PSF) fitting methods (effective PSF (ePSF), PSFEx, and extended PSF (xPSF) from the Cassini Imaging Central Laboratory for Operations (CICLOPS)). We assessed these algorithms using 70 ISS NAC star field images taken with CL1 and CL2 filters across different stellar magnitudes. The ePSF method consistently demonstrated the highest accuracy, achieving precision below 0.03 pixels for stars of magnitude 8-9. Compared to the modified moments method, previously considered the best, the ePSF method improved overall accuracy by about 10% and 21% in the sample and line directions, respectively. Surprisingly, the xPSF model provided by CICLOPS had lower precision than the ePSF. Conversely, the ePSF exhibits an improvement in measurement precision of 23% and 17% in the sample and line directions, respectively, over the xPSF. This discrepancy might be attributed to the xPSF focusing on photometry rather than astrometry. These findings highlight the necessity of constructing PSF models specifically tailored for astrometric purposes in NAC images and provide guidance for enhancing astrometric measurements using these ISS NAC images.
Funding: This work was supported by Special Funds for the Construction of an Innovative Province of Hunan (Grant No. 2020GK2028); the Natural Science Foundation of Hunan Province (Grant No. 2022JJ30002); the Scientific Research Project of Hunan Provincial Education Department (Grant No. 21B0833); the Scientific Research Key Project of Hunan Education Department (Grant No. 21A0592); and the Scientific Research Project of Hunan Provincial Education Department (Grant No. 22A0663).
Abstract: Deep Convolutional Neural Networks (CNNs) have achieved high accuracy in image classification tasks; however, most existing models are trained on high-quality images that are not subject to image degradation. In practice, images are often affected by various types of degradation, which can significantly impact the performance of CNNs. In this work, we investigate the influence of image degradation on three typical image classification CNNs and propose a Degradation Type Adaptive Image Classification Model (DTA-ICM) to improve the existing CNNs’ classification accuracy on degraded images. The proposed DTA-ICM comprises two key components: a Degradation Type Predictor (DTP) and a Degradation Type Specified Image Classifier (DTS-IC) set, which is trained on existing CNNs for specified types of degradation. The DTP predicts the degradation type of a test image, and the corresponding DTS-IC is then selected to classify the image. We evaluate the performance of both the proposed DTP and the DTA-ICM on the Caltech 101 database. The experimental results demonstrate that the proposed DTP achieves an average accuracy of 99.70%. Moreover, the proposed DTA-ICM, based on AlexNet, VGG19, and ResNet152, exhibits an average accuracy improvement of 20.63%, 18.22%, and 12.9%, respectively, compared with the original CNNs in classifying degraded images. This suggests that the proposed DTA-ICM can effectively improve the classification performance of existing CNNs on degraded images, which has important practical implications.
Funding: Project supported by the Natural Science Foundation of Shandong Province, China (Grant No. ZR2021MF049) and the Joint Fund of Natural Science Foundation of Shandong Province (Grant Nos. ZR2022LLZ012 and ZR2021LLZ001).
Abstract: We redesign the parameterized quantum circuit in the quantum deep neural network, construct a three-layer structure as the hidden layer, and then use classical optimization algorithms to train the parameterized quantum circuit, thereby proposing a novel hybrid quantum deep neural network (HQDNN) for image classification. After bilinear interpolation reduces the original image to a suitable size, an improved novel enhanced quantum representation (INEQR) is used to encode it into quantum states as the input of the HQDNN. Multi-layer parameterized quantum circuits are used as the main structure to implement feature extraction and classification. The output results of the parameterized quantum circuits are converted into classical data through quantum measurements and then optimized on a classical computer. To verify the performance of the HQDNN, we conduct binary classification and three-class classification experiments on the MNIST (Modified National Institute of Standards and Technology) data set. In the first binary classification, the accuracy for 0 and 4 exceeds 98%. We then compare the performance of three-class classification with other algorithms; the results on two datasets show that the classification accuracy is higher than that of the quantum deep neural network and the general quantum convolutional neural network.
Funding: Publication of this research was funded through the Researchers Supporting Program (RSPD2023R809), King Saud University, Riyadh, Saudi Arabia.
Abstract: The intuitive fuzzy set has found important application in decision-making and machine learning. To enrich and utilize the intuitive fuzzy set, this study designed and developed a deep neural network-based glaucoma eye detection method using fuzzy difference equations in the domain where the retinal images converge. Retinal image detections are categorized as normal eye recognition, suspected glaucomatous eye recognition, and glaucomatous eye recognition. Fuzzy degrees associated with weighted values are calculated to determine the level of concentration between the fuzzy partition and the retinal images. The proposed model was used to diagnose glaucoma using retinal images and involved utilizing the Convolutional Neural Network (CNN) and deep learning to identify the fuzzy weighted regularization between images. This methodology was used to clarify the input images and make them adequate for the process of glaucoma detection. The objective of this study was to propose a novel approach to the early diagnosis of glaucoma using the Fuzzy Expert System (FES) and Fuzzy differential equation (FDE). The intensities of the different regions in the images and their respective peak levels were determined. Once the peak regions were identified, the recurrence relationships among those peaks were then measured. Image partitioning was done according to the varying degrees of similar and dissimilar concentrations in the image. Similar and dissimilar concentration levels and spatial frequency generated a threshold image from the combined fuzzy matrix and FDE. This distinguished between normal and abnormal eye conditions, thus detecting patients with glaucomatous eyes.
Abstract: Meta-learning of dental X-rays is a machine learning technique that can be used to train models to perform new tasks quickly and with minimal input. Instead of just memorizing a task, this is accomplished by teaching a model how to learn. Algorithms for meta-learning are typically trained on a collection of training problems, each of which has a limited number of labelled instances. Multiple X-ray classification tasks, including the detection of pneumonia, coronavirus disease 2019, and other disorders, have demonstrated the effectiveness of meta-learning. Meta-learning has the benefit of allowing models to be trained on dental X-ray datasets that are too small for more conventional machine learning methods. Because dental imaging datasets are costly and slow to collect, this is significant for dental X-ray classification tasks. The ability to train models that are more robust to new input is another benefit of meta-learning.
Funding: Supported by the Technology Development Program (S3344882) funded by the Ministry of SMEs and Startups (MSS, Korea).
Abstract: When high compression rates are applied to Joint Photographic Experts Group (JPEG) images through lossy compression techniques, image-blocking artifacts may appear. This necessitates the restoration of the image to its original quality. The challenge lies in regenerating significantly compressed images into a state in which they become identifiable. Therefore, this study focuses on the restoration of JPEG images subjected to substantial degradation caused by maximum lossy compression using Generative Adversarial Networks (GAN). The generator in this network is based on the U-Net architecture. It features a new hourglass structure that preserves the characteristics of the deep layers. In addition, the network incorporates two loss functions to generate natural and high-quality images: Low Frequency (LF) loss and High Frequency (HF) loss. HF loss uses a pretrained VGG-16 network and is configured using a specific layer that best represents features. This can enhance the performance in the high-frequency region. In contrast, LF loss is used to handle the low-frequency region. The two loss functions help the generator produce images that can mislead the discriminator while accurately generating high- and low-frequency regions. Consequently, by removing the blocking effects from maximum lossy compressed images, images in which identities can be recognized are generated. This study represents a significant improvement over previous research in terms of the image resolution performance.
Funding: Financed by a grant from Beijing Social Science (No. 18LSB002).
Abstract: 1 Introduction. Chinese medicine has a long and rich history, dating back to the classics of the Qin and Han dynasties and extending to the integration of Chinese and Western medicine in the modern era. The vast amount of literature and scholarly works in this field makes it essential to thoroughly study the history of traditional Chinese medicine (TCM) in order to understand its development path throughout the ages and boost innovation based on tradition. This is why the sages emphasized the importance of “classifying the works into different schools and tracing back to their origins” (辨章学术,考镜源流).
Funding: Supported by the Key R&D Program of Shaanxi Province, China (Grant Nos. 2022GY-274, 2023-YBSF-505) and the National Natural Science Foundation of China (Grant No. 62273273).
Abstract: Recovering high-quality inscription images from unknown and complex noisy inscription images is a challenging research issue. Unlike natural images, character images depend more heavily on stroke information. However, existing models mainly consider pixel-level information while ignoring structural information of the character, such as its edge and glyph, resulting in reconstructed images with mottled local structure and character damage. To solve these problems, we propose a novel generative adversarial network (GAN) framework based on an edge-guided generator and a discriminator constructed by a dual-domain U-Net framework, i.e., EDU-GAN. Unlike existing frameworks, the generator introduces an edge extraction module and guides its output into the denoising process through the attention mechanism, which maintains the edge detail of the restored inscription image. Moreover, a dual-domain U-Net-based discriminator is proposed to learn the global and local discrepancy between the denoised and the label images in both image and morphological domains, which is helpful for blind denoising tasks. Adversarial training with the proposed dual-domain discriminator and generator can reduce local artifacts and keep the denoised character structure intact. Because real-inscription images are lacking, we built a real-inscription dataset to provide an effective benchmark for studying inscription image denoising. The experimental results show the superiority of our method on both the synthetic and real-inscription datasets.
Abstract: AIM: To establish pupil diameter measurement algorithms based on infrared images that can be used in real-world clinical settings. METHODS: A total of 188 patients from the outpatient clinic at He Eye Specialist Shenyang Hospital from September to December 2022 were included, and 13470 infrared pupil images were collected for the study. All infrared images for pupil segmentation were labeled using the Labelme software. The computation of pupil diameter is divided into four steps: image pre-processing, pupil identification and localization, pupil segmentation, and diameter calculation. Two major models are used in the computation process: the modified YoloV3 and DeeplabV3+ models, which must be trained beforehand. RESULTS: The test dataset included 1348 infrared pupil images. On the test dataset, the modified YoloV3 model had a detection rate of 99.98% and an average precision (AP) of 0.80 for pupils. The DeeplabV3+ model achieved a background intersection over union (IOU) of 99.23%, a pupil IOU of 93.81%, and a mean IOU of 96.52%. The pupil diameters in the test dataset ranged from 20 to 56 pixels, with a mean of 36.06±6.85 pixels. The absolute error in pupil diameters between predicted and actual values ranged from 0 to 7 pixels, with a mean absolute error (MAE) of 1.06±0.96 pixels. CONCLUSION: This study successfully demonstrates a robust infrared image-based pupil diameter measurement algorithm, proven to be highly accurate and reliable for clinical application.
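The segmentation metrics reported above can be illustrated with a small sketch. It shows how intersection over union is computed for one class of a binary mask and checks that the reported mean IOU is the average of the two per-class values. The helper names `iou` and `mean_iou` and the flat 0/1 mask representation are assumptions for illustration, not the study's code.

```python
def iou(predicted, actual):
    """Intersection over union for two flat binary masks (sequences of 0/1)."""
    intersection = sum(1 for p, a in zip(predicted, actual) if p and a)
    union = sum(1 for p, a in zip(predicted, actual) if p or a)
    if union == 0:
        return 1.0  # both masks empty: treated as perfect agreement (a convention)
    return intersection / union


def mean_iou(per_class_values):
    """Arithmetic mean of per-class IOU values."""
    return sum(per_class_values) / len(per_class_values)


# Per-class values reported in the abstract, in percent: background and pupil.
reported_mean = round(mean_iou([99.23, 93.81]), 2)
```

With the two reported per-class values, the average is (99.23 + 93.81) / 2 = 96.52, which matches the mean IOU stated in the abstract.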
Funding: Supported by the National Natural Science Foundation of China (Grant Numbers: 62372083, 62072074, 62076054, 62027827, 62002047); the Sichuan Provincial Science and Technology Innovation Platform and Talent Program (Grant Number: 2022JDJQ0039); the Sichuan Provincial Science and Technology Support Program (Grant Numbers: 2022YFQ0045, 2022YFS0220, 2021YFG0131, 2023YFS0020, 2023YFS0197, 2023YFG0148); and the CCF-Baidu Open Fund (Grant Number: 202312).
Abstract: In intelligent medical diagnosis, the trustworthiness, reliability, and interpretability of Artificial Intelligence (AI) are critical, especially in cancer diagnosis. Traditional neural networks, while excellent at processing natural images, often lack interpretability and adaptability when processing high-resolution digital pathological images. This limitation is particularly evident in pathological diagnosis, which is the gold standard of cancer diagnosis and relies on a pathologist’s careful examination and analysis of digital pathological slides to identify the features and progression of the disease. Therefore, the integration of interpretable AI into smart medical diagnosis is not only an inevitable technological trend but also a key to improving diagnostic accuracy and reliability. In this paper, we introduce an innovative Multi-Scale Multi-Branch Feature Encoder (MSBE) and present the design of the CrossLinkNet Framework. The MSBE enhances the network’s capability for feature extraction by allowing the adjustment of hyperparameters to configure the number of branches and modules. The CrossLinkNet Framework, serving as a versatile image segmentation network architecture, employs cross-layer encoder-decoder connections for multi-level feature fusion, thereby enhancing feature integration and segmentation accuracy. Comprehensive quantitative and qualitative experiments on two datasets demonstrate that CrossLinkNet, equipped with the MSBE encoder, not only achieves accurate segmentation results but is also adaptable to various tumor segmentation tasks and scenarios by replacing different feature encoders. Crucially, CrossLinkNet emphasizes the interpretability of the AI model, a crucial aspect for medical professionals, providing an in-depth understanding of the model’s decisions and thereby enhancing trust and reliability in AI-assisted diagnostics.
Abstract: Introduction: Recent, very rapid developments in artificial intelligence (AI) have multiplied concerns regarding the future of robotic autonomy in surgery. However, the literature on the topic is still scarce. Aim: To test a novel, commercially available AI tool for image analysis on a series of laparoscopic scenes. Methods: The research tools included OpenAI ChatGPT 4.0 with its corresponding image recognition plugin, which was fed a list of 100 selected laparoscopic snapshots from common surgical procedures. In order to score the reliability of responses received from the image-recognition bot, two corresponding scales were developed, ranging from 0 to 5. The set of images was divided into two groups, unlabeled (Group A) and labeled (Group B), and also according to the type of surgical procedure or image resolution. Results: The AI correctly recognized the context of surgery-related images in 97% of its reports. For the labeled surgical pictures, the image-processing bot scored 3.95/5 (79%), whilst for the unlabeled, it scored 2.905/5 (58.1%). Phases of the procedure were commented on in detail after all successful interpretations. With ratings of 4 to 5 out of 5, the chatbot was able to talk in detail about the indications, contraindications, stages, instrumentation, complications, and outcome rates of the operation discussed. Conclusion: Interaction between surgeon and chatbot appears to be an interesting frontend for further research by clinicians, in parallel with the evolution of its complex underlying infrastructure. In this early phase of using artificial intelligence for image recognition in surgery, no safe conclusions can be drawn from small cohorts with commercially available software. Further development of medically oriented AI software and awareness in the clinical world are expected to bring fruitful information on the topic in the years to come.