Algal blooms, the spread of algae on the surface of water bodies, have adverse effects not only on aquatic ecosystems but also on human life. The adverse effects of harmful algal blooms (HABs) necessitate a convenient solution for detection and monitoring. Unmanned aerial vehicles (UAVs) have recently emerged as a tool for algal bloom detection, efficiently providing on-demand images at high spatiotemporal resolutions. This study developed an image processing method for estimating algal bloom area from aerial images (obtained from the internet) captured using UAVs. As a remote sensing method of HAB detection, analysis, and monitoring, a combination of histogram and texture analyses was used to efficiently estimate the area of HABs. Statistical features such as entropy (using the Kullback-Leibler method) were emphasized with the aid of a gray-level co-occurrence matrix. The results showed that the orthogonal images demonstrated fewer errors, and the morphological filter best detected algal blooms in real time, with a precision of 80%. This study provided efficient image processing approaches using on-board UAVs for HAB monitoring.
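As a rough illustration of the texture analysis mentioned in the abstract above, the sketch below builds a normalized gray-level co-occurrence matrix (GLCM) for one pixel offset and computes its Shannon entropy. This is an assumption-laden toy: the abstract names a Kullback-Leibler-based entropy, whereas plain Shannon entropy is swapped in here, and the function names `glcm` and `glcm_entropy` are hypothetical rather than taken from the study.

```python
import math
from collections import Counter

def glcm(image, dx=1, dy=0):
    """Normalized co-occurrence matrix as {(gray_a, gray_b): probability}.

    `image` is a list of rows of integer gray levels; (dx, dy) is the
    pixel offset between the two members of each counted pair.
    """
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[(image[r][c], image[r2][c2])] += 1
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}

def glcm_entropy(p):
    """Shannon entropy (bits) of a normalized GLCM (assumed stand-in measure)."""
    return -sum(v * math.log2(v) for v in p.values() if v > 0)

# Hypothetical 2x3 patches: a flat one and a checkerboard-like one.
uniform = [[0, 0, 0], [0, 0, 0]]
textured = [[0, 1, 0], [1, 0, 1]]
print(glcm_entropy(glcm(uniform)), glcm_entropy(glcm(textured)))
```

A flat patch concentrates all co-occurrences in one cell and yields zero entropy, while a patch with alternating gray levels spreads them out and yields a higher value, which is the kind of contrast a texture-based bloom detector could exploit.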
In the context of the accelerated pace of daily life and the development of e-commerce, online shopping is a mainstream way for consumers to access products and services. To understand consumers' emotional expressions across different shopping experience scenarios, this paper presents a sentiment analysis method that combines e-commerce review keyword-generated images with a hybrid machine learning-based model, in which Word2Vec-TextRank is used to extract keywords that serve as inputs for generating related images by generative Artificial Intelligence (AI). Subsequently, a hybrid Convolutional Neural Network and Support Vector Machine (CNNSVM) model is applied for sentiment classification of those keyword-generated images. For method validation, a dataset of 5000 randomly selected Amazon reviews was analyzed. With superior keyword extraction capability, the proposed method achieves an accuracy of up to 97.13% on sentiment classification. This performance demonstrates the advantages of the text-to-image approach, which provides a unique perspective for sentiment analysis of e-commerce review data compared with existing works. Thus, the proposed method enhances the reliability and insights of customer feedback surveys, and could also establish a novel direction in similar cases, such as social media monitoring and market trend research.
The Ki67 index (KI) is a standard clinical marker for tumor proliferation; however, its application is hindered by intratumoral heterogeneity. In this study, we used digital image analysis to comprehensively analyze Ki67 heterogeneity and distribution patterns in breast carcinoma. Using Smart Pathology software, we digitized and analyzed 42 excised breast carcinoma Ki67 slides. Boxplots, histograms, and heat maps were generated to illustrate the KI distribution. We found that 30% of cases (13/42) exhibited discrepancies between global and hotspot KI when using a 14% KI threshold for classification. Patients with higher global or hotspot KI values displayed greater heterogeneity. Ki67 distribution patterns were categorized as randomly distributed (52%, 22/42), peripheral (43%, 18/42), and centered (5%, 2/42). Our sampling simulator indicated that analyzing more than 10 high-power fields was typically required to accurately estimate global KI, with sampling size being correlated with heterogeneity. In conclusion, using digital image analysis in whole-slide images allows for comprehensive Ki67 profile assessment, shedding light on heterogeneity and distribution patterns. This spatial information can facilitate KI surveys of breast cancer and other malignancies.
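To make the global-versus-hotspot comparison in the abstract above concrete, here is a minimal sketch. It assumes that global KI is the pooled fraction of positive nuclei over all fields, that hotspot KI is the highest single-field fraction, and that a case is "high" when KI is at or above the 14% threshold; the abstract does not spell out these definitions, and `ki67_summary` is a hypothetical name.

```python
def ki67_summary(fields, threshold=0.14):
    """Summarize Ki67 counts from several high-power fields.

    `fields` is a list of (positive_nuclei, total_nuclei) tuples.
    Returns (global_ki, hotspot_ki, discordant), where `discordant`
    is True when the two indices fall on different sides of the threshold.
    """
    global_ki = sum(p for p, _ in fields) / sum(t for _, t in fields)
    hotspot_ki = max(p / t for p, t in fields)
    discordant = (global_ki >= threshold) != (hotspot_ki >= threshold)
    return global_ki, hotspot_ki, discordant

# Hypothetical counts: a mostly quiet tumor with one proliferative field.
print(ki67_summary([(5, 100), (5, 100), (20, 100)]))
```

In this made-up case the pooled index falls below the threshold while the hotspot exceeds it, which is the kind of classification discrepancy the study reports for a subset of cases.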
Objective: To analyze the differences in the correlation of tongue image indicators among patients with benign lung nodules and lung cancer. Methods: From July 1, 2020 to March 31, 2022, clinical information of lung cancer patients and benign lung nodule patients was collected at the Oncology Department of Longhua Hospital Affiliated to Shanghai University of Traditional Chinese Medicine and the Physical Examination Center of Shuguang Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, respectively. We obtained tongue images from patients with benign lung nodules and lung cancer using the TFDA-1 digital tongue diagnosis instrument, and analyzed these images with the TDAS V2.0 software. The extracted indicators included color space parameters in the Lab system for both the tongue body (TB) and tongue coating (TC) (TB/TC-L, TB/TC-a, and TB/TC-b); textural parameters [TB/TC-contrast (CON), TB/TC-angular second moment (ASM), TB/TC-entropy (ENT), and TB/TC-MEAN]; as well as TC parameters (perAll and perPart). The bivariate correlation of TB and TC features was analyzed using Pearson's or Spearman's correlation analysis, and the overall correlation was analyzed using canonical correlation analysis (CCA). Results: Samples from 307 patients with benign lung nodules and 276 lung cancer patients were included after excluding outliers and extreme values. Simple correlation analysis indicated that the correlation of TB-L with TC-L, TB-b with TC-b, and TB-b with perAll in the lung cancer group was higher than that in the benign nodules group. Moreover, the correlation of TB-a with TC-a, TB-a with perAll, and the texture parameters of the TB (TB-CON, TB-ASM, TB-ENT, and TB-MEAN) with the texture parameters of the TC (TC-CON, TC-ASM, TC-ENT, and TC-MEAN) in the benign nodules group was higher than that in the lung cancer group. CCA further demonstrated a strong correlation between the TB and TC parameters in the lung cancer group, with the first and second pairs of canonical variables in the benign nodules and lung cancer groups showing correlation coefficients of 0.918 and 0.817 (P<0.05), and 0.940 and 0.822 (P<0.05), respectively. Conclusion: Benign lung nodule and lung cancer patients exhibited differences in correlation in the L, a, and b values of the TB and TC, as well as the perAll value of the TC, and the texture parameters (TB/TC-CON, TB/TC-ASM, TB/TC-ENT, and TB/TC-MEAN) between the TB and TC. Additionally, there were differences in the overall correlation of the TB and TC between the two groups. Objective tongue diagnosis indicators can effectively assist in the diagnosis of benign lung nodules and lung cancer, thereby providing a scientific basis for the early detection, diagnosis, and treatment of lung cancer.
The motivation for this study is that the quality of deep fakes is constantly improving, which leads to the need to develop new methods for their detection. The proposed Customized Convolutional Neural Network method involves extracting structured data from video frames using facial landmark detection, which is then used as input to the CNN. The customized Convolutional Neural Network method is a data augmentation-based CNN model that generates 'fake data' or 'fake images'. This study was carried out using Python and its libraries. We used 242 videos from the dataset gathered by the Deep Fake Detection Challenge, of which 199 were fake and the remaining 53 were real. Each video was limited to ten seconds. There were 318 videos used in all, 199 of which were fake and 119 of which were real. Our proposed method achieved a testing accuracy of 91.47%, a loss of 0.342, and an AUC score of 0.92, outperforming two alternative approaches, CNN and MLP-CNN. Furthermore, our method achieved higher accuracy than contemporary models such as XceptionNet, Meso-4, EfficientNet-B0, MesoInception-4, VGG-16, and DST-Net. The novelty of this investigation is the development of a new Convolutional Neural Network (CNN) learning model that can accurately detect deep fake face photos.
In recent years, more and more directors of culture and tourism have taken part in the promotion of local cultural tourism by cross-dressing, talent shows, and pushing their limits on self-media platforms. This study investigates short videos of Lingnan culture promoted by directors general and deputy directors general of the Culture, Radio, Television, Tourism, and Sports Bureau of counties and cities in Guangdong Province on social media by the method of multimodal critical discourse analysis. The analysis of 33 videos shows that Lingnan culture is a domineering and confident culture, historical culture, graceful and elegant culture, and vibrant and active culture. Domineering and confident culture is embedded in the utterances and behaviors of the directors general or deputy directors general in the video. Historical culture is realized through the conversation with historical figures through time travel. Graceful and elegant culture is constructed in the depiction of sceneries and the depiction of characters' manners. Vibrant and active culture is represented in the depiction of the characters' actional process and analytical process.
In today's information age, video data, as an important carrier of information, is growing explosively in terms of production volume. The quick and accurate extraction of useful information from massive video data has become a focus of research in the field of computer vision. AI dynamic recognition technology has become one of the key technologies to address this issue due to its powerful data processing capabilities and intelligent recognition functions. Based on this, this paper first elaborates on the development of intelligent video AI dynamic recognition technology, then proposes several optimization strategies for intelligent video AI dynamic recognition technology, and finally analyzes the performance of intelligent video AI dynamic recognition technology for reference.
Constructing a corporate identity through an external publicity image is an important part of enterprise development. Based on Wodak's discourse-historical approach, this study takes the text of COFCO's English promotional video as the research object and analyzes the corporate brand image, media image, organizational image, and environmental image constructed by the enterprise in three steps (linguistic expression, discourse strategy, and theme), in order to provide references for Chinese enterprises to enhance their international influence.
The continuous growth in the scale of unmanned aerial vehicle (UAV) applications in transmission line inspection has resulted in a corresponding increase in the demand for UAV inspection image processing. Owing to its excellent performance in computer vision, deep learning has been applied to UAV inspection image processing tasks such as power line identification and insulator defect detection. Despite their excellent performance, electric power UAV inspection image processing models based on deep learning face several problems such as a small application scope, the need for constant retraining and optimization, and high R&D monetary and time costs due to the black-box and scene data-driven characteristics of deep learning. In this study, an automated deep learning system for electric power UAV inspection image analysis and processing is proposed as a solution to the aforementioned problems. This system design is based on the three critical design principles of generalizability, extensibility, and automation. Pre-trained models, fine-tuning (downstream task adaptation), and automated machine learning, which are closely related to these design principles, are reviewed. In addition, an automated deep learning system architecture for electric power UAV inspection image analysis and processing is presented. A prototype system was constructed and experiments were conducted on the two electric power UAV inspection image analysis and processing tasks of insulator self-detonation and bird nest recognition. The models constructed using the prototype system achieved 91.36% and 86.13% mAP for insulator self-detonation and bird nest recognition, respectively. This demonstrates that the system design concept is reasonable and the system architecture feasible.
An important index for evaluating the process efficiency of coal preparation is the mineral liberation degree of pulverized coal, which is greatly influenced by the particle size and shape distribution acquired by image segmentation. However, the agglomeration effect of fine powders and the edge effect of granular images caused by scanning electron microscopy greatly affect the precision of particle image segmentation. In this study, we propose a novel image segmentation method derived from the mask regional convolutional neural network, based on deep learning, for recognizing fine coal powders. First, an atrous convolution is introduced into our network to learn the image features of multi-sized powders, which can reduce the missed segmentation of small-sized agglomerated particles. Then, a new mask loss function combining focal loss and the dice coefficient is used to overcome the false segmentation caused by the edge effect. The final comparative experimental results show that our method achieves the best results among the compared algorithms, with 94.43% and 91.44% on AP50 and AP75, respectively. In addition, in order to provide an effective method for particle size analysis of coal particles, we study the particle size distribution of coal powders based on the proposed image segmentation method and obtain a good curve relationship between cumulative mass fraction and particle size.
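The mask loss described in the abstract above combines focal loss with the dice coefficient. The sketch below shows one plausible way to do that on a flattened binary mask, as an equally weighted sum of the two terms. The weighting, the focusing parameter `gamma`, and all function names are assumptions for illustration; the abstract does not state how the authors combine the terms.

```python
import math

def focal_loss(probs, targets, gamma=2.0, eps=1e-7):
    """Mean binary focal loss; down-weights pixels that are already easy."""
    total = 0.0
    for p, t in zip(probs, targets):
        p_t = p if t == 1 else 1.0 - p      # probability given to the true class
        p_t = min(max(p_t, eps), 1.0)       # clamp to avoid log(0)
        total += -((1.0 - p_t) ** gamma) * math.log(p_t)
    return total / len(probs)

def dice_loss(probs, targets, eps=1e-7):
    """1 - soft dice coefficient; rewards overlap between mask and target."""
    inter = sum(p * t for p, t in zip(probs, targets))
    return 1.0 - (2.0 * inter + eps) / (sum(probs) + sum(targets) + eps)

def mask_loss(probs, targets, weight=0.5):
    """Assumed combination: weighted sum of focal and dice terms."""
    return weight * focal_loss(probs, targets) + (1.0 - weight) * dice_loss(probs, targets)

# Hypothetical three-pixel mask: a confident correct prediction vs. a poor one.
targets = [1, 0, 1]
print(mask_loss([0.9, 0.1, 0.8], targets), mask_loss([0.1, 0.9, 0.2], targets))
```

The focal term concentrates the gradient on hard pixels such as blurred particle edges, while the dice term measures region overlap and is less sensitive to the imbalance between particle and background pixels.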
The use of unmanned aerial vehicles (UAVs) for forest monitoring has grown significantly in recent years, providing information with high spatial resolution and temporal versatility. UAVs with multispectral sensors allow the use of indexes such as the normalized difference vegetation index (NDVI), which indicates the vigor, physiological stress, and photosynthetic activity of vegetation. This study aimed to analyze the spectral responses and variations of NDVI in tree crowns, as well as their correlation with climatic factors over the course of one year. The study area encompassed a 1.6-ha site in Durango, Mexico, where Pinus cembroides, Pinus engelmannii, and Quercus grisea coexist. Multispectral images were acquired with a UAV, and information on meteorological variables was obtained from the NASA/POWER database. An ANOVA explored possible differences in NDVI among the three species. Pearson correlation was performed to identify the linear relationship between NDVI and meteorological variables. Significant differences in NDVI values were found at the genus level (Pinus and Quercus), possibly related to the physiological features of the species and their phenology. Quercus grisea had the lowest NDVI values throughout the year, which may be attributed to its sensitivity to relative humidity and temperature. Although the use of a UAV with a multispectral sensor for NDVI monitoring allowed genera differentiation, in more complex forest analyses hyperspectral and LiDAR sensors should be integrated, and other vegetation indexes should also be considered.
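For readers unfamiliar with NDVI, the index referred to in the abstract above is computed per pixel as (NIR - Red) / (NIR + Red). The sketch below applies that formula and averages it over a tree-crown mask. The band values, the mask, and the helper name `mean_crown_ndvi` are hypothetical; the study's actual crown delineation procedure is not described in the abstract.

```python
def ndvi(nir, red):
    """NDVI for one pixel; returns 0.0 when both reflectances are zero."""
    denom = nir + red
    return (nir - red) / denom if denom else 0.0

def mean_crown_ndvi(nir_band, red_band, crown_mask):
    """Average NDVI over the pixels where `crown_mask` is truthy."""
    values = [
        ndvi(n, r)
        for n_row, r_row, m_row in zip(nir_band, red_band, crown_mask)
        for n, r, m in zip(n_row, r_row, m_row)
        if m
    ]
    return sum(values) / len(values) if values else None

# Hypothetical 2x2 tile; only the top row belongs to the crown.
nir = [[3, 4], [1, 0]]
red = [[1, 4], [3, 0]]
mask = [[1, 1], [0, 0]]
print(mean_crown_ndvi(nir, red, mask))
```

Values near 1 indicate dense, photosynthetically active foliage, values near 0 indicate bare or senescent surfaces, and per-crown means like this one are what would be compared across species and correlated with weather variables.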
Medical image analysis is an active research topic, with thousands of studies published in the past few years. Transfer learning (TL) with convolutional neural networks (CNNs) aims to improve performance on a new task by using knowledge learned in advance from similar tasks. It has played a major role in medical image analysis because it addresses the data scarcity issue and saves hardware resources and time. This study develops an Enhanced Tunicate Swarm Optimization with Transfer Learning Enabled Medical Image Analysis System (ETSOTL-MIAS). The goal of the ETSOTL-MIAS technique lies in the identification and classification of diseases through medical imaging. The ETSOTL-MIAS technique uses the Chan-Vese segmentation technique to identify the affected regions in the medical image. For feature extraction, the ETSOTL-MIAS technique designs a modified DarkNet-53 model. To avoid manual hyperparameter adjustment, the ETSOTL-MIAS technique exploits the ETSO algorithm, which constitutes the novelty of the work. Finally, the medical images are classified by a random forest (RF) classifier. The performance of the ETSOTL-MIAS technique is validated on a benchmark medical image database. The extensive experimental analysis showed the promising performance of the ETSOTL-MIAS technique under different measures.
Biomedical image processing is widely utilized for disease detection and classification of biomedical images. Tongue color image analysis is an effective and non-invasive tool for carrying out secondary detection at any time and anywhere. To remove the qualitative aspect, tongue images are inspected quantitatively, and an automated disease classification model is preferable. This article introduces a novel political optimizer with deep learning enabled tongue color image analysis (PODL-TCIA) technique. The presented PODL-TCIA model aims to detect the occurrence of disease by examining the color of the tongue. To attain this, the PODL-TCIA model initially performs image pre-processing to enhance medical image quality. Next, the Inception with ResNet-v2 model is employed for feature extraction. In addition, a political optimizer (PO) with a twin support vector machine (TSVM) model is used for image classification, which constitutes the novelty of the work. The PO algorithm assists in the optimal parameter selection of the TSVM model. To confirm the enhanced outcomes of the PODL-TCIA model, a wide-ranging experimental analysis was performed, and the outcomes showed that the PODL-TCIA model improves on recent approaches.
In this paper, motion analysis methods based on moment features and flicker frequency features are proposed for early fire flame detection from an ordinary CCD video camera. To further describe the changes of flames and the disturbance of non-flame phenomena, the average changing pixel number of the first-order moments of consecutive flames is also defined in the moment analysis. The first-order moments of all kinds of flames used in our experiments flicker irregularly, and their average changing pixel numbers of first-order moments are greater than those of fire-like disturbances. The flicker frequency of the flame is extracted and calculated in the spatial domain, so it is computationally simple and fast. The method of extracting flicker frequency from video images is not affected by the type of combustion material or by distance. In the experiments, we adopted two kinds of flames, i.e., fixed flame and movable flame. Many comparison and disturbance experiments were carried out, and they verified that the methods can be used as criteria for early fire detection.
This work focuses on methods and procedures for three-dimensional (3D) characterization of the pore structure features in a packed ore particle bed. X-ray computed tomography was applied to derive the cross-sectional images of specimens with single particle sizes of 1-2, 2-3, 3-4, 4-5, 5-6, 6-7, 7-8, 8-9, and 9-10 mm. Based on in-house developed 3D image analysis programs using Matlab, the volume porosity, pore size distribution, and degree of connectivity were calculated and analyzed in detail. The results indicate that the volume porosity, the mean diameter of pores, and the effective pore size (d50) increase with increasing particle size. A lognormal distribution or Gaussian distribution is most suitable for modeling the pore size distribution. The degree of connectivity, investigated on the basis of a cluster-labeling algorithm, also increases approximately with increasing particle size.
To develop a quick, accurate, and noise-resistant automated image registration technique for infrared images, wavelet analysis was used to extract the feature points in two images, followed by compensation of the input image for the angle difference between them. A hierarchical feature matching algorithm was adopted to obtain the final transform parameters between the two images. The simulation results for two infrared images show that the method can register images effectively, quickly, and accurately, and is noise-resistant to some extent.
Artificial Intelligence (AI) is being increasingly used for diagnosing Vision-Threatening Diabetic Retinopathy (VTDR), which is a leading cause of visual impairment and blindness worldwide. However, previous automated VTDR detection methods have mainly relied on manual feature extraction and classification, leading to errors. This paper proposes a novel VTDR detection and classification model that combines different models through majority voting. Our proposed methodology involves preprocessing, data augmentation, feature extraction, and classification stages. We use a hybrid convolutional neural network-singular value decomposition (CNN-SVD) model for feature extraction and selection and an improved SVM-RBF with a Decision Tree (DT) and K-Nearest Neighbor (KNN) for classification. We tested our model on the IDRiD dataset and achieved an accuracy of 98.06%, a sensitivity of 83.67%, and a specificity of 100% for DR detection and evaluation tests, respectively. Our proposed approach outperforms baseline techniques and provides a more robust and accurate method for VTDR detection.
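The majority-voting step named in the abstract above can be sketched in a few lines. The example assumes each of the three classifiers (SVM-RBF, DT, KNN) has already produced one label per image; the label lists below are invented, and `majority_vote` is a hypothetical helper rather than the authors' code.

```python
from collections import Counter

def majority_vote(predictions):
    """Fuse per-classifier label lists into one label per sample.

    `predictions` is a list of equally long label lists, one per classifier.
    On a tie, the label seen first (i.e. from the earliest classifier) wins.
    """
    fused = []
    for votes in zip(*predictions):
        label, _ = Counter(votes).most_common(1)[0]
        fused.append(label)
    return fused

# Hypothetical outputs for four fundus images (1 = VTDR, 0 = no VTDR).
svm_rbf = [1, 0, 1, 1]
decision_tree = [1, 1, 0, 1]
knn = [0, 0, 1, 1]
print(majority_vote([svm_rbf, decision_tree, knn]))
```

With three voters and two classes a tie cannot occur, so the tie-breaking rule only matters if the ensemble is extended to an even number of classifiers or to more than two grades.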
Multimodal medical image fusion has attained immense popularity in recent years due to its robust technology for clinical diagnosis. It fuses multiple images into a single image to improve image quality by retaining significant information, aiding diagnostic practitioners in diagnosing and treating many diseases. However, recent image fusion techniques have encountered several challenges, including fusion artifacts, algorithm complexity, and high computing costs. To solve these problems, this study presents a novel medical image fusion strategy that combines the benefits of pixel significance with edge-preserving processing to achieve the best fusion performance. First, the method employs a cross-bilateral filter (CBF) that utilizes one image to determine the kernel and the other for filtering, and vice versa, by considering both geometric closeness and the gray-level similarities of neighboring pixels of the images without smoothing edges. The outputs of the CBF are then subtracted from the original images to obtain detail images. The method further uses edge-preserving processing that combines linear low-pass filtering with a non-linear technique, enabling the selection of relevant regions in the detail images while maintaining structural properties. These regions are selected using morphologically processed linear filter residuals to identify the significant regions with high-amplitude edges and adequate size. The outputs of low-pass filtering are fused with the meaningfully restored regions to reconstruct the original shape of the edges. In addition, weight computations are performed using these reconstructed images, and these weights are then fused with the original input images to produce a final fusion result by estimating the strength of horizontal and vertical details. Numerous standard quality evaluation metrics with complementary properties are used to compare the fusion results objectively with existing, well-known algorithms. The experimental results exhibit superior performance compared to other competing techniques in both qualitative and quantitative evaluation. In addition, the proposed method offers lower computational complexity and execution time while improving diagnostic computing accuracy. Owing to the lower complexity of the fusion algorithm, its efficiency is high in practical applications. The results reveal that the proposed method exceeds the latest state-of-the-art methods in terms of providing detailed information, edge contour, and overall contrast.
Facial beauty analysis is an important topic in human society. It may be used as guidance for face beautification applications such as cosmetic surgery. Deep neural networks (DNNs) have recently been adopted for facial beauty analysis and have achieved remarkable performance. However, most existing DNN-based models regard facial beauty analysis as a normal classification task. They ignore important prior knowledge from traditional machine learning models, which illustrates the significant contribution of geometric features to facial beauty analysis. To be specific, landmarks of the whole face and facial organs are introduced to extract geometric features to make the decision. Inspired by this, we introduce a novel dual-branch network for facial beauty analysis: one branch takes the Swin Transformer as the backbone to model the full face and global patterns, and another branch focuses on the masked facial organs with the residual network to model the local patterns of certain facial parts. Additionally, the designed multi-scale feature fusion module can further facilitate our network to learn complementary semantic information between the two branches. In model optimisation, we propose a hybrid loss function, in which geometric regularisation is introduced by regressing the facial landmarks, forcing the extracted features to convey facial geometric features. Experiments performed on the SCUT-FBP5500 dataset and the SCUT-FBP dataset demonstrate that our model outperforms state-of-the-art convolutional neural network models, which proves the effectiveness of the proposed geometric regularisation and dual-branch structure with the hybrid network. To the best of our knowledge, this is the first study to introduce a Vision Transformer into the facial beauty analysis task.
The Internet of Multimedia Things (IoMT) refers to a network of interconnected multimedia devices that communicate with each other over the Internet. Recently, smart healthcare has emerged as a significant application of the IoMT, particularly in the context of knowledge-based learning systems. Smart healthcare systems leverage knowledge-based learning to become more context-aware, adaptable, and auditable while maintaining the ability to learn from historical data. In smart healthcare systems, devices capture images such as X-rays and Magnetic Resonance Imaging scans. The security and integrity of these images are crucial for the databases used in knowledge-based learning systems to foster structured decision-making and enhance the learning abilities of AI. Moreover, in knowledge-driven systems, the storage and transmission of HD medical images place a burden on the limited bandwidth of the communication channel, leading to data transmission delays. To address the security and latency concerns, this paper presents a lightweight medical image encryption scheme utilising bit-plane decomposition and chaos theory. The results of the experiment yield entropy, energy, and correlation values of 7.999, 0.0156, and 0.0001, respectively. This validates the effectiveness of the encryption system proposed in this paper, which offers high-quality encryption, a large key space, key sensitivity, and resistance to statistical attacks.
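Two building blocks mentioned in the abstract above can be illustrated without reproducing the cipher itself: splitting 8-bit pixels into bit planes, and the Shannon entropy used to judge the encrypted image (the ideal value for 8-bit data is 8, which the reported 7.999 approaches). The sketch below is only that illustration; the chaotic key stream and the authors' actual scrambling steps are not shown, and the function names are hypothetical.

```python
import math
from collections import Counter

def bit_planes(pixels):
    """Split 8-bit pixel values into 8 binary planes (plane 0 = least significant bit)."""
    return [[(p >> k) & 1 for p in pixels] for k in range(8)]

def merge_planes(planes):
    """Inverse of bit_planes: rebuild pixel values from the 8 planes."""
    return [sum(bit << k for k, bit in enumerate(bits)) for bits in zip(*planes)]

def shannon_entropy(pixels):
    """Entropy in bits per pixel of the gray-level histogram."""
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(pixels).values())

# Hypothetical flattened image fragment.
pixels = [0, 255, 170, 5]
planes = bit_planes(pixels)
print(merge_planes(planes) == pixels, shannon_entropy(list(range(256))))
```

An encryption scheme built on this decomposition would permute or mask individual planes with a key-dependent sequence before merging them again; a ciphertext whose histogram is nearly flat then scores an entropy close to 8 bits per pixel.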
Funding: Supported in part by the Guangzhou Science and Technology Plan Project under Grants 2024B03J1361, 2023B03J1327, and 2023A04J0361; in part by the Open Fund Project of Hubei Province Key Laboratory of Occupational Hazard Identification and Control under Grant OHIC2023Y10; in part by the Guangdong Province Ordinary Colleges and Universities Young Innovative Talents Project under Grant 2023KQNCX036; in part by the Special Fund for Science and Technology Innovation Strategy of Guangdong Province (Climbing Plan) under Grant pdjh2024a226; in part by the Key Discipline Improvement Project of Guangdong Province under Grant 2022ZDJS015; and in part by the Research Fund of Guangdong Polytechnic Normal University under Grants 22GPNUZDJS17 and 2022SDKYA015.
Abstract: In the context of the accelerated pace of daily life and the development of e-commerce, online shopping is a mainstream way for consumers to access products and services. To understand their emotional expressions in different shopping experience scenarios, this paper presents a sentiment analysis method that combines e-commerce review keyword-generated images with a hybrid machine learning-based model, in which Word2Vec-TextRank is used to extract keywords that act as the inputs for generating the related images by generative Artificial Intelligence (AI). Subsequently, a hybrid Convolutional Neural Network and Support Vector Machine (CNNSVM) model is applied for sentiment classification of those keyword-generated images. For method validation, a random sample of 5000 reviews from Amazon was analyzed. With superior keyword extraction capability, the proposed method achieves an accuracy of up to 97.13% on sentiment classification. Such performance demonstrates the advantages of the text-to-image approach, which provides a unique perspective for sentiment analysis of e-commerce review data compared to existing works. Thus, the proposed method enhances the reliability and insights of customer feedback surveys, and could also establish a novel direction in similar cases, such as social media monitoring and market trend research.
Abstract: The Ki67 index (KI) is a standard clinical marker for tumor proliferation; however, its application is hindered by intratumoral heterogeneity. In this study, we used digital image analysis to comprehensively analyze Ki67 heterogeneity and distribution patterns in breast carcinoma. Using Smart Pathology software, we digitized and analyzed 42 excised breast carcinoma Ki67 slides. Boxplots, histograms, and heat maps were generated to illustrate the KI distribution. We found that 30% of cases (13/42) exhibited discrepancies between global and hotspot KI when using a 14% KI threshold for classification. Patients with higher global or hotspot KI values displayed greater heterogeneity. Ki67 distribution patterns were categorized as randomly distributed (52%, 22/42), peripheral (43%, 18/42), and centered (5%, 2/42). Our sampling simulator indicated that analyzing more than 10 high-power fields was typically required to accurately estimate global KI, with sampling size being correlated with heterogeneity. In conclusion, using digital image analysis in whole-slide images allows for comprehensive Ki67 profile assessment, shedding light on heterogeneity and distribution patterns. This spatial information can facilitate KI surveys of breast cancer and other malignancies.
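The comparison between global and hotspot KI described in this abstract can be sketched as follows. This is a minimal illustration, not the Smart Pathology implementation: the function name `ki67_summary`, the per-field input format, the choice of the single highest field as the hotspot, and the use of `>=` at the 14% cut-off are all assumptions.

```python
def ki67_summary(fields, threshold=0.14):
    """Summarize Ki67 over a list of (positive_nuclei, total_nuclei) fields.

    Assumptions (not from the abstract): the hotspot is the single field
    with the highest KI, and a case is "high" when KI >= threshold.
    """
    total_positive = sum(p for p, _ in fields)
    total_nuclei = sum(t for _, t in fields)
    if total_nuclei == 0:
        raise ValueError("no nuclei counted")
    # Global KI pools all nuclei on the slide.
    global_ki = total_positive / total_nuclei
    # Hotspot KI takes the most proliferative field.
    hotspot_ki = max(p / t for p, t in fields if t > 0)
    return {
        "global_ki": global_ki,
        "hotspot_ki": hotspot_ki,
        # Discordant: the two indices fall on opposite sides of the cut-off.
        "discordant": (global_ki >= threshold) != (hotspot_ki >= threshold),
    }
```

A slide whose fields average below the cut-off but contain one strongly positive field is flagged as discordant, which is the kind of case the abstract counts.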
Funding: National Natural Science Foundation of China (82305090); Science and Technology Commission of Shanghai Municipality (22YF1448900); Shanghai Municipal Health Commission (20234Y0168).
Abstract: Objective: To analyze the differences in the correlation of tongue image indicators among patients with benign lung nodules and lung cancer. Methods: From July 1, 2020 to March 31, 2022, clinical information of lung cancer patients and benign lung nodule patients was collected at the Oncology Department of Longhua Hospital Affiliated to Shanghai University of Traditional Chinese Medicine and the Physical Examination Center of Shuguang Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, respectively. We obtained tongue images from patients with benign lung nodules and lung cancer using the TFDA-1 digital tongue diagnosis instrument, and analyzed these images with the TDAS V2.0 software. The extracted indicators included color space parameters in the Lab system for both the tongue body (TB) and tongue coating (TC) (TB/TC-L, TB/TC-a, and TB/TC-b); textural parameters [TB/TC-contrast (CON), TB/TC-angular second moment (ASM), TB/TC-entropy (ENT), and TB/TC-MEAN]; and TC parameters (perAll and perPart). The bivariate correlation of TB and TC features was analyzed using Pearson's or Spearman's correlation analysis, and the overall correlation was analyzed using canonical correlation analysis (CCA). Results: Samples from 307 patients with benign lung nodules and 276 lung cancer patients were included after excluding outliers and extreme values. Simple correlation analysis indicated that the correlation of TB-L with TC-L, TB-b with TC-b, and TB-b with perAll in the lung cancer group was higher than that in the benign nodules group. Moreover, the correlation of TB-a with TC-a, TB-a with perAll, and the texture parameters of the TB (TB-CON, TB-ASM, TB-ENT, and TB-MEAN) with the texture parameters of the TC (TC-CON, TC-ASM, TC-ENT, and TC-MEAN) in the benign nodules group was higher than that in the lung cancer group. CCA further demonstrated a strong correlation between the TB and TC parameters in the lung cancer group, with the first and second pairs of canonical variables in the benign nodules and lung cancer groups indicating correlation coefficients of 0.918 and 0.817 (P<0.05), and 0.940 and 0.822 (P<0.05), respectively. Conclusion: Benign lung nodule and lung cancer patients exhibited differences in correlation in the L, a, and b values of the TB and TC, as well as the perAll value of the TC, and the texture parameters (TB/TC-CON, TB/TC-ASM, TB/TC-ENT, and TB/TC-MEAN) between the TB and TC. Additionally, there were differences in the overall correlation of the TB and TC between the two groups. Objective tongue diagnosis indicators can effectively assist in the diagnosis of benign lung nodules and lung cancer, thereby providing a scientific basis for the early detection, diagnosis, and treatment of lung cancer.
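The texture parameters named in this abstract (CON, ASM, ENT) are commonly derived from a gray-level co-occurrence matrix (GLCM). The sketch below uses the textbook definitions and is only an illustration: how TDAS V2.0 actually computes these values, including its MEAN parameter, is not stated in the abstract, and the function names, the pixel offset, and the base-2 logarithm are assumptions.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix for one pixel offset (dx, dy).

    `image` is a list of rows of integer gray levels in [0, levels).
    """
    counts = [[0] * levels for _ in range(levels)]
    height, width = len(image), len(image[0])
    total = 0
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                counts[image[y][x]][image[ny][nx]] += 1
                total += 1
    if total == 0:
        raise ValueError("image too small for the chosen offset")
    return [[c / total for c in row] for row in counts]


def texture_features(p):
    """Contrast, angular second moment, and entropy of a normalized GLCM."""
    n = len(p)
    con = sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))
    asm = sum(p[i][j] ** 2 for i in range(n) for j in range(n))
    ent = -sum(v * math.log2(v) for row in p for v in row if v > 0)
    return {"CON": con, "ASM": asm, "ENT": ent}
```

A uniform patch gives zero contrast and entropy with ASM equal to 1, while a checkerboard raises contrast and entropy and lowers ASM.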
Funding: Science and Technology Funds from the Liaoning Education Department (Serial Number: LJKZ0104).
Abstract: The motivation for this study is that the quality of deep fakes is constantly improving, which leads to the need to develop new methods for their detection. The proposed Customized Convolutional Neural Network method involves extracting structured data from video frames using facial landmark detection, which is then used as input to the CNN. The customized Convolutional Neural Network method is a data-augmentation-based CNN model that generates 'fake data' or 'fake images'. This study was carried out using Python and its libraries. We used 242 films from the dataset gathered by the Deep Fake Detection Challenge, of which 199 were fake and the remaining 53 were real. Ten seconds were allotted for each video. There were 318 videos used in all, 199 of which were fake and 119 of which were real. Our proposed method achieved a testing accuracy of 91.47%, a loss of 0.342, and an AUC score of 0.92, outperforming two alternative approaches, CNN and MLP-CNN. Furthermore, our method achieved higher accuracy than contemporary models such as XceptionNet, Meso-4, EfficientNet-B0, MesoInception-4, VGG-16, and DST-Net. The novelty of this investigation is the development of a new Convolutional Neural Network (CNN) learning model that can accurately detect deep fake face photos.
Funding: Guangzhou Municipality's Philosophy and Social Sciences Development "14th Five-Year Plan" 2021 Annual Young Scholars Research Project (2021GZQN15).
Abstract: In recent years, more and more directors of culture and tourism have taken part in the promotion of local cultural tourism by cross-dressing, performing in talent shows, and pushing their limits on self-media platforms. Using multimodal critical discourse analysis, this study investigates short videos of Lingnan culture promoted on social media by directors general and deputy directors general of the Culture, Radio, Television, Tourism, and Sports Bureaus of counties and cities in Guangdong Province. The analysis of 33 videos shows that Lingnan culture is a domineering and confident culture, a historical culture, a graceful and elegant culture, and a vibrant and active culture. Domineering and confident culture is embedded in the utterances and behaviors of the directors general or deputy directors general in the videos. Historical culture is realized through conversations with historical figures through time travel. Graceful and elegant culture is constructed in the depiction of scenery and of characters' manners. Vibrant and active culture is represented in the depiction of the characters' actional process and analytical process.
Abstract: In today's information age, video data, as an important carrier of information, is growing explosively in production volume. The quick and accurate extraction of useful information from massive video data has become a focus of research in the field of computer vision. AI dynamic recognition technology has become one of the key technologies for addressing this issue because of its powerful data processing capabilities and intelligent recognition functions. On this basis, this paper first elaborates on the development of intelligent video AI dynamic recognition technology, then proposes several optimization strategies for it, and finally analyzes its performance for reference.
Abstract: Corporate identity construction of the external publicity image is an important part of the development of enterprises. Based on Wodak's discourse-historical approach, this study takes the text of COFCO's English promotional video as the research object and analyzes the corporate brand image, media image, organizational image, and environmental image constructed by the enterprise in three steps (linguistic expression, discourse strategy, and theme) to provide references for Chinese enterprises to enhance their international influence.
Funding: This work was supported by the Science and Technology Project of State Grid Corporation "Research on Key Technologies of Power Artificial Intelligence Open Platform" (5700-202155260A-0-0-00).
Abstract: The continuous growth in the scale of unmanned aerial vehicle (UAV) applications in transmission line inspection has resulted in a corresponding increase in the demand for UAV inspection image processing. Owing to its excellent performance in computer vision, deep learning has been applied to UAV inspection image processing tasks such as power line identification and insulator defect detection. Despite their excellent performance, electric power UAV inspection image processing models based on deep learning face several problems such as a small application scope, the need for constant retraining and optimization, and high R&D monetary and time costs due to the black-box and scene data-driven characteristics of deep learning. In this study, an automated deep learning system for electric power UAV inspection image analysis and processing is proposed as a solution to the aforementioned problems. This system design is based on the three critical design principles of generalizability, extensibility, and automation. Pre-trained models, fine-tuning (downstream task adaptation), and automated machine learning, which are closely related to these design principles, are reviewed. In addition, an automated deep learning system architecture for electric power UAV inspection image analysis and processing is presented. A prototype system was constructed and experiments were conducted on the two electric power UAV inspection image analysis and processing tasks of insulator self-detonation and bird nest recognition. The models constructed using the prototype system achieved 91.36% and 86.13% mAP for insulator self-detonation and bird nest recognition, respectively. This demonstrates that the system design concept is reasonable and the system architecture feasible.
Funding: Supported by the Research and Development Project of Experimental Technology, China University of Mining and Technology (Study on mineral occurrence in coal based on SEM and EDS, S2023Y018), and the National Natural Science Foundation of China under Grant 62371451.
Abstract: An important index for evaluating the process efficiency of coal preparation is the mineral liberation degree of pulverized coal, which is greatly influenced by the particle size and shape distribution acquired by image segmentation. However, the agglomeration effect of fine powders and the edge effect of granular images caused by scanning electron microscopy greatly affect the precision of particle image segmentation. In this study, we propose a novel deep learning image segmentation method, derived from the mask regional convolutional neural network, for recognizing fine coal powders. First, an atrous convolution is introduced into our network to learn the image features of multi-sized powders, which can reduce the missed segmentation of small-sized agglomerated particles. Then, a new mask loss function combining focal loss and the dice coefficient is used to overcome the false segmentation caused by the edge effect. The final comparative experimental results show that our method achieves the best results among the comparison algorithms, 94.43% and 91.44% on AP50 and AP75, respectively. In addition, to provide an effective method for particle size analysis of coal particles, we study the particle size distribution of coal powders based on the proposed image segmentation method and obtain a good curve relationship between cumulative mass fraction and particle size.
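A mask loss that combines focal loss with the dice coefficient, as this abstract describes, might look like the sketch below. The abstract does not give the exact combination, so the weighted sum, the `weight`, `gamma`, `alpha`, and `smooth` values, and all function names are assumptions chosen for illustration.

```python
import math


def focal_loss(probs, targets, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss over flattened mask probabilities."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
        if t == 1:
            total += -alpha * (1.0 - p) ** gamma * math.log(p)
        else:
            total += -(1.0 - alpha) * p ** gamma * math.log(1.0 - p)
    return total / len(probs)


def dice_loss(probs, targets, smooth=1.0):
    """One minus the (smoothed) dice coefficient of prediction and target."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denominator = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)


def mask_loss(probs, targets, weight=0.5):
    """Hypothetical weighted sum of the two terms (weighting is assumed)."""
    return weight * focal_loss(probs, targets) + (1.0 - weight) * dice_loss(probs, targets)
```

The focal term down-weights easy pixels so that hard boundary pixels dominate, while the dice term scores region overlap directly, which is a plausible reason for pairing them against edge-effect errors.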
Funding: Supported by the National Council of Science and Technology of Mexico (CONACyT), which provided financial support through scholarships for postgraduate studies to J.L.G.S. (815176) and M.R.C. (507523).
Abstract: The use of unmanned aerial vehicles (UAVs) for forest monitoring has grown significantly in recent years, providing information with high spatial resolution and temporal versatility. UAVs with multispectral sensors allow the use of indexes such as the normalized difference vegetation index (NDVI), which indicates the vigor, physiological stress, and photosynthetic activity of vegetation. This study aimed to analyze the spectral responses and variations of NDVI in tree crowns, as well as their correlation with climatic factors over the course of one year. The study area encompassed a 1.6-ha site in Durango, Mexico, where Pinus cembroides, Pinus engelmannii, and Quercus grisea coexist. Multispectral images were acquired with a UAV, and information on meteorological variables was obtained from the NASA/POWER database. An ANOVA explored possible differences in NDVI among the three species. Pearson correlation was performed to identify the linear relationship between NDVI and meteorological variables. Significant differences in NDVI values were found at the genus level (Pinus and Quercus), possibly related to the physiological features of the species and their phenology. Quercus grisea had the lowest NDVI values throughout the year, which may be attributed to its sensitivity to relative humidity and temperature. Although the use of a UAV with a multispectral sensor for NDVI monitoring allowed genera differentiation, more complex forest analyses should integrate hyperspectral and LiDAR sensors and consider other vegetation indexes.
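NDVI is defined from the near-infrared and red reflectance of a pixel as NDVI = (NIR - Red) / (NIR + Red). The sketch below applies that standard formula per pixel and averages it over a tree-crown mask; the function names, the nested-list band layout, and the handling of empty pixels are assumptions for illustration, not the study's processing chain.

```python
def ndvi(nir, red):
    """Standard NDVI for one pixel from NIR and red reflectance."""
    total = nir + red
    if total == 0:
        return 0.0  # assumption: treat pixels with no signal as 0
    return (nir - red) / total


def mean_crown_ndvi(nir_band, red_band, crown_mask):
    """Mean NDVI over pixels where crown_mask is truthy; None if no pixels."""
    values = [
        ndvi(n, r)
        for nir_row, red_row, mask_row in zip(nir_band, red_band, crown_mask)
        for n, r, m in zip(nir_row, red_row, mask_row)
        if m
    ]
    return sum(values) / len(values) if values else None
```

Values close to 1 indicate dense, vigorous canopy, and values near 0 indicate sparse or stressed vegetation, which is the scale on which the genus-level differences were compared.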
Funding: Supported by the Deanship of Scientific Research (DSR), University of Tabuk, Tabuk, Saudi Arabia, under grant number S-1440-0262.
Abstract: Medical image analysis is an active research topic, with thousands of studies published in the past few years. Transfer learning (TL) with convolutional neural networks (CNNs) aims to improve performance on a new task by using knowledge of similar tasks learnt in advance. It has played a major role in medical image analysis because it alleviates the data scarcity issue and saves hardware resources and time. This study develops an Enhanced Tunicate Swarm Optimization with Transfer Learning Enabled Medical Image Analysis System (ETSOTL-MIAS). The goal of the ETSOTL-MIAS technique is the identification and classification of diseases through medical imaging. The ETSOTL-MIAS technique uses the Chan-Vese segmentation technique to identify the affected regions in the medical image. For feature extraction, the ETSOTL-MIAS technique designs a modified DarkNet-53 model. To avoid manual hyperparameter adjustment, the ETSOTL-MIAS technique exploits the ETSO algorithm, which constitutes the novelty of the work. Finally, the medical images are classified by a random forest (RF) classifier. The performance of the ETSOTL-MIAS technique is validated on a benchmark medical image database. The extensive experimental analysis showed the promising performance of the ETSOTL-MIAS technique under different measures.
Funding: The authors extend their appreciation to the Deanship of Scientific Research at King Khalid University for funding this work under grant number (RGP 2/158/43), and to Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2022R161), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. The authors would like to thank the Deanship of Scientific Research at Umm Al-Qura University for supporting this work by Grant Code: (22UQU4340237DSR11).
Abstract: Biomedical image processing is widely utilized for disease detection and classification of biomedical images. Tongue color image analysis is an effective and non-invasive tool for carrying out secondary detection anytime and anywhere. To remove the qualitative aspect, tongue images are inspected quantitatively, and an automated disease classification model is preferable. This article introduces a novel political optimizer with deep learning enabled tongue color image analysis (PODL-TCIA) technique. The presented PODL-TCIA model aims to detect the occurrence of disease by examining the color of the tongue. To attain this, the PODL-TCIA model initially performs image pre-processing to enhance medical image quality. Next, the Inception with ResNet-v2 model is employed for feature extraction. In addition, a political optimizer (PO) with a twin support vector machine (TSVM) model is exploited for the image classification process, which constitutes the novelty of the work. The design of the PO algorithm assists in the optimal parameter selection of the TSVM model. To confirm the enhanced outcomes of the PODL-TCIA model, a wide-ranging experimental analysis was carried out, and the outcomes showed the improvement of the PODL-TCIA model over recent approaches.
Funding: Supported by "Experimental Scale Studies in Smoke Control Strategy in Large Linear Atria in HKSAR" (B Q372).
Abstract: In this paper, motion analysis methods based on moment features and flicker frequency features are proposed for detecting early fire flames with an ordinary CCD video camera. To further describe the changing of the flame and the disturbance from non-flame phenomena, the average changing pixel number of the first-order moments of consecutive flames is also defined in the moment analysis. The first-order moments of all kinds of flames used in our experiments flicker irregularly, and their average changing pixel numbers are greater than those of fire-like disturbances. The flicker frequency of the flame is extracted and calculated in the spatial domain, so it is computationally simple and fast. The method of extracting flicker frequency from video images is not affected by the category of combustion material or by distance. In the experiments, we adopted two kinds of flames, i.e., fixed flame and movable flame. Many comparison and disturbance experiments were performed and verified that the methods can be used as criteria for early fire detection.
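The first-order moments of a segmented flame region give its centroid, and the frame-to-frame movement of that centroid is one way to read the "average changing pixel number" defined in this abstract. The sketch below follows that reading; the abstract does not give the exact definition, so the Euclidean displacement, the function names, and the skipping of empty frames are assumptions.

```python
import math


def centroid(mask):
    """Centroid (x, y) of a binary mask from its zeroth and first moments."""
    m00 = sum(v for row in mask for v in row)
    if m00 == 0:
        return None  # no candidate flame pixels in this frame
    m10 = sum(x * v for row in mask for x, v in enumerate(row))
    m01 = sum(y * v for y, row in enumerate(mask) for v in row)
    return (m10 / m00, m01 / m00)


def mean_centroid_shift(masks):
    """Mean centroid displacement, in pixels, between consecutive frames.

    Frames without any foreground pixels are skipped (an assumption).
    """
    centroids = [c for c in map(centroid, masks) if c is not None]
    if len(centroids) < 2:
        return 0.0
    shifts = [math.dist(a, b) for a, b in zip(centroids, centroids[1:])]
    return sum(shifts) / len(shifts)
```

Under this reading, a flickering flame yields a larger mean shift than a steady fire-like light source, which matches the comparison reported in the abstract.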
Funding: Projects (50934002, 51074013, 51304076, 51104100) supported by the National Natural Science Foundation of China; Project (IRT0950) supported by the Program for Changjiang Scholars Innovative Research Team in Universities, China; Project (2012M510007) supported by China Postdoctoral Science Foundation.
Abstract: This work focuses on methods and procedures for three-dimensional (3D) characterization of the pore structure features in a packed ore particle bed. X-ray computed tomography was applied to derive cross-sectional images of specimens with single particle sizes of 1-2, 2-3, 3-4, 4-5, 5-6, 6-7, 7-8, 8-9, and 9-10 mm. Using in-house 3D image analysis programs developed in Matlab, the volume porosity, pore size distribution, and degree of connectivity were calculated and analyzed in detail. The results indicate that the volume porosity, the mean diameter of pores, and the effective pore size (d50) increase with increasing particle size. A lognormal or Gaussian distribution is generally the most suitable model for the pore size distribution. The degree of connectivity, investigated on the basis of a cluster-labeling algorithm, also increases approximately with increasing particle size.
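Volume porosity and a cluster-labeling measure of connectivity can be computed from a binarized CT volume roughly as sketched below. This is not the authors' Matlab code: the nested-list voxel layout, the 6-neighbour rule, and the definition of connectivity as the largest pore cluster's share of all pore voxels are assumptions made for illustration.

```python
from collections import deque

# Face-sharing (6-connected) neighbour offsets in (z, y, x) order.
NEIGHBOURS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))


def porosity(volume):
    """Fraction of voxels marked as pore (1) in a 3D nested list."""
    total = sum(len(row) for plane in volume for row in plane)
    pores = sum(v for plane in volume for row in plane for v in row)
    return pores / total


def cluster_sizes(volume):
    """Sizes of connected pore clusters found by breadth-first labeling."""
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    seen = set()
    sizes = []
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                if volume[z][y][x] != 1 or (z, y, x) in seen:
                    continue
                queue = deque([(z, y, x)])
                seen.add((z, y, x))
                size = 0
                while queue:
                    cz, cy, cx = queue.popleft()
                    size += 1
                    for dz, dy, dx in NEIGHBOURS:
                        n = (cz + dz, cy + dy, cx + dx)
                        inside = 0 <= n[0] < nz and 0 <= n[1] < ny and 0 <= n[2] < nx
                        if inside and volume[n[0]][n[1]][n[2]] == 1 and n not in seen:
                            seen.add(n)
                            queue.append(n)
                sizes.append(size)
    return sizes


def connectivity(volume):
    """Assumed measure: largest cluster volume over total pore volume."""
    sizes = cluster_sizes(volume)
    return max(sizes) / sum(sizes) if sizes else 0.0
```

For real CT volumes an array library would be far faster, but the pure-Python version keeps the labeling logic visible.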
Abstract: To develop a quick, accurate, and noise-resistant automated image registration technique for infrared images, wavelet analysis was used to extract the feature points in two images, followed by compensation of the input image for the angle difference between them. A hierarchical feature matching algorithm was adopted to obtain the final transform parameters between the two images. The simulation results for two infrared images show that the method can register images effectively, quickly, and accurately, and is resistant to noise to some extent.
Funding: This research was funded by the National Natural Science Foundation of China (Nos. 71762010, 62262019, 62162025, 61966013, 12162012); the Hainan Provincial Natural Science Foundation of China (Nos. 823RC488, 623RC481, 620RC603, 621QN241, 620RC602, 121RC536); the Haikou Science and Technology Plan Project of China (No. 2022-016); and the Project supported by the Education Department of Hainan Province (No. Hnky2021-23).
Abstract: Artificial Intelligence (AI) is being increasingly used for diagnosing Vision-Threatening Diabetic Retinopathy (VTDR), which is a leading cause of visual impairment and blindness worldwide. However, previous automated VTDR detection methods have mainly relied on manual feature extraction and classification, leading to errors. This paper proposes a novel VTDR detection and classification model that combines different models through majority voting. Our proposed methodology involves preprocessing, data augmentation, feature extraction, and classification stages. We use a hybrid convolutional neural network-singular value decomposition (CNN-SVD) model for feature extraction and selection, and an improved SVM-RBF with a Decision Tree (DT) and K-Nearest Neighbor (KNN) for classification. We tested our model on the IDRiD dataset and achieved an accuracy of 98.06%, a sensitivity of 83.67%, and a specificity of 100% in DR detection and evaluation tests. Our proposed approach outperforms baseline techniques and provides a more robust and accurate method for VTDR detection.
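Majority voting over the outputs of several classifiers, the fusion step this abstract names, can be sketched as below. The function name, the input layout, and the tie-breaking rule are assumptions; the abstract does not say how ties are resolved or whether votes are weighted.

```python
from collections import Counter


def majority_vote(predictions):
    """Fuse per-classifier label lists into one label per sample.

    `predictions` holds one list of labels per classifier, all of equal
    length. Ties go to the earliest classifier in the list (an assumption).
    """
    fused = []
    for labels in zip(*predictions):
        counts = Counter(labels)
        top = max(counts.values())
        # First label, in classifier order, that reaches the top count.
        fused.append(next(label for label in labels if counts[label] == top))
    return fused
```

With three voters such as SVM-RBF, DT, and KNN on a binary task, a tie cannot occur, so the tie-breaking rule only matters for multi-class labels or an even number of classifiers.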
Abstract: Multimodal medical image fusion has attained immense popularity in recent years as a robust technology for clinical diagnosis. It fuses multiple images into a single image to improve image quality by retaining significant information, aiding diagnostic practitioners in diagnosing and treating many diseases. However, recent image fusion techniques have encountered several challenges, including fusion artifacts, algorithm complexity, and high computing costs. To solve these problems, this study presents a novel medical image fusion strategy that combines the benefits of pixel significance with edge-preserving processing to achieve the best fusion performance. First, the method employs a cross-bilateral filter (CBF) that utilizes one image to determine the kernel and the other for filtering, and vice versa, by considering both geometric closeness and the gray-level similarities of neighboring pixels of the images without smoothing edges. The outputs of the CBF are then subtracted from the original images to obtain detailed images. The method further uses edge-preserving processing that combines linear low-pass filtering with a non-linear technique that enables the selection of relevant regions in detailed images while maintaining structural properties. These regions are selected using morphologically processed linear filter residuals to identify the significant regions with high-amplitude edges and adequate size. The outputs of low-pass filtering are fused with meaningfully restored regions to reconstruct the original shape of the edges. In addition, weight computations are performed using these reconstructed images, and these weights are then fused with the original input images to produce a final fusion result by estimating the strength of horizontal and vertical details. Numerous standard quality evaluation metrics with complementary properties are used to objectively compare the fusion results with existing, well-known algorithms. Experimental results exhibit superior performance compared to other competing techniques in both qualitative and quantitative evaluation. In addition, the proposed method requires less computational complexity and execution time while improving diagnostic computing accuracy. Because of the lower complexity of the fusion algorithm, the method is efficient in practical applications. The results reveal that the proposed method exceeds the latest state-of-the-art methods in terms of providing detailed information, edge contour, and overall contrast.
Funding: Shenzhen Science and Technology Program, Grant/Award Number: ZDSYS20211021111415025; Shenzhen Institute of Artificial Intelligence and Robotics for Society; Youth Science and Technology Talents Development Project of Guizhou Education Department, Grant/Award Number: QianJiaoheKYZi[2018]459.
Abstract: Facial beauty analysis is an important topic in human society. It may be used as guidance for face beautification applications such as cosmetic surgery. Deep neural networks (DNNs) have recently been adopted for facial beauty analysis and have achieved remarkable performance. However, most existing DNN-based models regard facial beauty analysis as a normal classification task. They ignore important prior knowledge from traditional machine learning models, which illustrates the significant contribution of geometric features to facial beauty analysis. Specifically, landmarks of the whole face and facial organs are introduced to extract geometric features to make the decision. Inspired by this, we introduce a novel dual-branch network for facial beauty analysis: one branch takes the Swin Transformer as the backbone to model the full face and global patterns, and another branch focuses on the masked facial organs with a residual network to model the local patterns of certain facial parts. Additionally, the designed multi-scale feature fusion module further helps our network learn complementary semantic information between the two branches. In model optimisation, we propose a hybrid loss function in which geometric regularisation is introduced by regressing the facial landmarks, forcing the extracted features to convey facial geometric features. Experiments performed on the SCUT-FBP5500 dataset and the SCUT-FBP dataset demonstrate that our model outperforms state-of-the-art convolutional neural network models, which proves the effectiveness of the proposed geometric regularisation and the dual-branch structure with the hybrid network. To the best of our knowledge, this is the first study to introduce a Vision Transformer into the facial beauty analysis task.
Abstract: The Internet of Multimedia Things (IoMT) refers to a network of interconnected multimedia devices that communicate with each other over the Internet. Recently, smart healthcare has emerged as a significant application of the IoMT, particularly in the context of knowledge-based learning systems. Smart healthcare systems leverage knowledge-based learning to become more context-aware, adaptable, and auditable while maintaining the ability to learn from historical data. In smart healthcare systems, devices capture images such as X-rays and magnetic resonance imaging (MRI) scans. The security and integrity of these images are crucial for the databases used in knowledge-based learning systems to foster structured decision-making and enhance the learning abilities of AI. Moreover, in knowledge-driven systems, the storage and transmission of HD medical images burden the limited bandwidth of the communication channel, leading to data transmission delays. To address the security and latency concerns, this paper presents a lightweight medical image encryption scheme utilising bit-plane decomposition and chaos theory. The experiments yield entropy, energy, and correlation values of 7.999, 0.0156, and 0.0001, respectively. This validates the effectiveness of the proposed encryption system, which offers high-quality encryption, a large key space, key sensitivity, and resistance to statistical attacks.
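To make the two ingredients named in this abstract concrete, the sketch below splits 8-bit pixels into bit planes and XORs each plane with a keystream drawn from the logistic map, a common chaotic generator. It is a toy illustration only: the abstract does not specify the chaotic system, key schedule, or any permutation stage, so the logistic map, the parameters `x0` and `r`, and every function name here are assumptions, and the code is not suitable for protecting real data.

```python
import math


def logistic_keystream(x0, r, n, burn_in=100):
    """Generate n key bits by thresholding logistic-map iterates."""
    x = x0
    for _ in range(burn_in):  # discard the transient
        x = r * x * (1.0 - x)
    bits = []
    for _ in range(n):
        x = r * x * (1.0 - x)
        bits.append(1 if x >= 0.5 else 0)
    return bits


def to_bit_planes(pixels):
    """Split 8-bit pixels into 8 bit planes (plane 0 = least significant)."""
    return [[(p >> k) & 1 for p in pixels] for k in range(8)]


def from_bit_planes(planes):
    """Rebuild 8-bit pixels from 8 bit planes."""
    return [sum(planes[k][i] << k for k in range(8)) for i in range(len(planes[0]))]


def xor_cipher(pixels, x0=0.3141, r=3.99):
    """XOR every bit plane with the keystream; applying it twice decrypts."""
    n = len(pixels)
    stream = logistic_keystream(x0, r, 8 * n)
    planes = to_bit_planes(pixels)
    mixed = [
        [bit ^ stream[k * n + i] for i, bit in enumerate(plane)]
        for k, plane in enumerate(planes)
    ]
    return from_bit_planes(mixed)


def shannon_entropy(pixels):
    """Shannon entropy in bits per pixel of the gray-level histogram."""
    counts = {}
    for p in pixels:
        counts[p] = counts.get(p, 0) + 1
    n = len(pixels)
    return -sum(c / n * math.log2(c / n) for c in counts.values())
```

Here `(x0, r)` plays the role of the key. The entropy of an 8-bit image cannot exceed 8 bits per pixel, which is the scale against which the reported value of 7.999 should be read.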