A large number of network security breaches in IoT networks have demonstrated the unreliability of current Network Intrusion Detection Systems (NIDSs). Consequently, network interruptions and loss of sensitive data have occurred, which has made improving NIDS technologies an active research area. In an analysis of related works, it was observed that most researchers aim to obtain better classification results by using a set of untried combinations of Feature Reduction (FR) and Machine Learning (ML) techniques on NIDS datasets. However, these datasets differ in feature sets, attack types, and network design. Therefore, this paper aims to discover whether these techniques can be generalised across various datasets. Six ML models are utilised: a Deep Feed Forward (DFF) network, Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Decision Tree (DT), Logistic Regression (LR), and Naive Bayes (NB). The accuracy of three Feature Extraction (FE) algorithms, Principal Component Analysis (PCA), Auto-encoder (AE), and Linear Discriminant Analysis (LDA), is evaluated using three benchmark datasets: UNSW-NB15, ToN-IoT and CSE-CIC-IDS2018. Although PCA and AE algorithms have been widely used, the determination of their optimal number of extracted dimensions has been overlooked. The results indicate that no single FE method or ML model achieves the best scores for all datasets. The optimal number of extracted dimensions has been identified for each dataset, and LDA degrades the performance of the ML models on two datasets. The variance is used to analyse the extracted dimensions of LDA and PCA. Finally, this paper concludes that the choice of datasets significantly alters the performance of the applied techniques. We believe that a universal (benchmark) feature set is needed to facilitate further advancement and progress of research in this field.
Line drawings of cultural relics are a crucial form of traditional artifact documentation: a simple and intuitive product that costs less to display than 3D models. Dimensionality reduction is undoubtedly necessary for line drawings. However, most existing methods for artifact drawing rely on orthographic projection, which cannot avoid angle occlusion and data overlapping when the surface of a cultural relic is complex. Therefore, conformal mapping was introduced as a dimensionality reduction method to compensate for the limitations of orthographic projection. Based on the given criteria for assessing surface complexity, this paper proposes a three-dimensional feature guideline extraction method for complex cultural relic surfaces. A combined 2D and 3D factor, vertex weight, was designed to measure the importance of points in describing surface features. The selection threshold for feature guideline extraction was then determined based on the differences between the vertex weight and shape index distributions. The feasibility and stability were verified through experiments conducted on real cultural relic surface data. The results demonstrate the ability of the method to address the challenges associated with the automatic generation of line drawings for complex surfaces. The extraction method and the obtained results will be useful for the line drawing, display, and promotion of cultural relics.
In the IoT (Internet of Things) domain, the increased use of encryption protocols such as SSL/TLS, VPN (Virtual Private Network), and Tor has led to a rise in attacks leveraging encrypted traffic. While research on anomaly detection using AI (Artificial Intelligence) is actively progressing, the encrypted nature of the data poses challenges for labeling, resulting in data imbalance and biased feature extraction toward specific nodes. This study proposes a reconstruction error-based anomaly detection method using an autoencoder (AE) that utilizes packet metadata excluding specific node information. The proposed method omits biased packet metadata such as IP and port and trains the detection model using only normal data, leveraging a small amount of packet metadata. Its low resource consumption makes it well-suited for direct application in IoT environments. In experiments comparing feature extraction methods for AE-based anomaly detection, we found that using flow-based features significantly improves accuracy, precision, F1 score, and AUC (Area Under the Receiver Operating Characteristic Curve) score compared to packet-based features. Additionally, for flow-based features, the proposed method showed a 30.17% increase in F1 score and improved false positive rates compared to Isolation Forest and OneClassSVM. Furthermore, the proposed method demonstrated a 32.43% higher AUC when using packet features and a 111.39% higher AUC when using flow features, compared to previously proposed oversampling methods. This study highlights the impact of feature extraction methods on attack detection in imbalanced, encrypted traffic environments and emphasizes that the one-class method using AE is more effective for attack detection and reducing false positives compared to traditional oversampling methods.
In minimally invasive surgery, endoscopes or laparoscopes equipped with miniature cameras and tools are used to enter the human body for therapeutic purposes through small incisions or natural cavities. However, in clinical operating environments, endoscopic images often suffer from challenges such as low texture, uneven illumination, and non-rigid structures, which affect feature observation and extraction. Missing feature points in endoscopic images can severely impact surgical navigation or clinical diagnosis, leading to treatment and postoperative recovery issues for patients. To address these challenges, this paper introduces, for the first time, a Cross-Channel Multi-Modal Adaptive Spatial Feature Fusion (ASFF) module based on the lightweight architecture of EfficientViT. Additionally, a novel lightweight feature extraction and matching network based on an attention mechanism is proposed. This network dynamically adjusts attention weights for cross-modal information from grayscale images and optical flow images through a dual-branch Siamese network. It extracts static and dynamic information features ranging from low-level to high-level, and from local to global, ensuring robust feature extraction across different widths, noise levels, and blur scenarios. Global and local matching are performed through a multi-level cascaded attention mechanism, with cross-channel attention introduced to simultaneously extract low-level and high-level features. Extensive ablation experiments and comparative studies are conducted on the HyperKvasir, EAD, M2caiSeg, CVC-ClinicDB, and UCL synthetic datasets. Experimental results demonstrate that the proposed network improves upon the baseline EfficientViT-B3 model by 75.4% in accuracy (Acc), while also enhancing runtime performance and storage efficiency. When compared with the complex DenseDescriptor feature extraction network, the difference in Acc is less than 7.22%, and IoU calculation results on specific datasets outperform complex dense models. Furthermore, this method increases the F1 score by 33.2% and accelerates runtime by 70.2%. It is noteworthy that the speed of CMMCAN surpasses that of comparative lightweight models, with feature extraction and matching performance comparable to existing complex models but with faster speed and higher cost-effectiveness.
This paper proposes a novel open set recognition method, the Spatial Distribution Feature Extraction Network (SDFEN), to address the problem of electromagnetic signal recognition in an open environment. The spatial distribution feature extraction layer in SDFEN replaces the output of convolutional neural networks with spatial distribution features that focus more on inter-sample information by incorporating class center vectors. The designed hybrid loss function considers both intra-class distance and inter-class distance, thereby enhancing the similarity among samples of the same class and increasing the dissimilarity between samples of different classes during training. Consequently, this method allows unknown classes to occupy a larger region of the feature space. This reduces the possibility of overlap with known class samples and makes the boundaries between known and unknown samples more distinct. Additionally, the feature comparator threshold can be used to reject unknown samples. For signal open set recognition, seven methods, including the proposed method, are applied to two kinds of electromagnetic signal data: modulation signal and real-world emitter. The experimental results demonstrate that the proposed method outperforms the other six methods overall in a simulated open environment. Specifically, compared to the state-of-the-art Openmax method, the novel method achieves up to 8.87% and 5.25% higher micro-F-measures, respectively.
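The idea of rejecting unknown samples by comparing features against class center vectors can be illustrated with a minimal sketch. This is not SDFEN itself: the functions `class_centers` and `classify_open_set`, the Euclidean distance, and the fixed `threshold` are all assumptions chosen for illustration, and the features are taken as given rather than learned by a network.

```python
import math


def class_centers(features, labels):
    """Compute the mean feature vector (class center) of each known class."""
    sums, counts = {}, {}
    for vec, label in zip(features, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def classify_open_set(vec, centers, threshold):
    """Return the nearest known class, or "unknown" if no center is close enough.

    `threshold` is a hypothetical rejection distance; in practice it would be
    tuned on validation data.
    """
    best_label, best_dist = None, math.inf
    for label, center in centers.items():
        dist = math.dist(vec, center)  # Euclidean distance (assumed metric)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label if best_dist <= threshold else "unknown"
```

The tighter the known classes cluster around their centers, the smaller the threshold can be, which leaves more of the feature space available for rejecting unknown samples.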
Biometric recognition is a widely used technology for user authentication. In the application of this technology, biometric security and recognition accuracy are two important issues that should be considered. In terms of biometric security, cancellable biometrics is an effective technique for protecting biometric data. Regarding recognition accuracy, feature representation plays a significant role in the performance and reliability of cancellable biometric systems. How to design good feature representations for cancellable biometrics is a challenging topic that has attracted a great deal of attention from the computer vision community, especially from researchers of cancellable biometrics. The goal of feature extraction and learning in cancellable biometrics is to find suitable feature representations that achieve satisfactory recognition performance while protecting the privacy of biometric data. This survey reviews the progress, trends, and challenges of feature extraction and learning for cancellable biometrics, thus shedding light on the latest developments and future research in this area.
Cleats are the dominant micro-fracture network controlling the macro-mechanical behavior of coal. Improved understanding of the spatial characteristics of cleat networks is therefore important to the coal mining industry. Discrete fracture networks (DFNs) are increasingly used in engineering analyses to spatially model fractures at various scales. The reliability of coal DFNs largely depends on the confidence in the input cleat statistics. Estimates of these parameters can be made from image-based three-dimensional (3D) characterization of coal cleats using X-ray micro-computed tomography (μCT). One key step in this process, after cleat extraction, is the separation of individual cleats, without which the cleats are a connected network and statistics for different cleat sets cannot be measured. In this paper, a feature extraction-based image processing method is introduced to identify and separate distinct cleat groups from 3D X-ray μCT images. Kernels (filters) representing explicit cleat features of coal are built, and cleat separation is successfully achieved by convolutional operations on 3D coal images. The new method is applied to a coal specimen 80 mm in diameter and 100 mm in length acquired from an Anglo American Steelmaking Coal mine in the Bowen Basin, Queensland, Australia. It is demonstrated that the new method produces reliable cleat separation capable of defining individual cleats and preserving 3D topology after separation. Bedding-parallel fractures, which have historically been challenging to delineate and are rarely reported, are also identified and separated. A variety of cleat/fracture statistics are measured, which not only quantitatively characterize the cleat/fracture system but can also be used for DFN modeling. Finally, variability and heterogeneity with respect to the core axis are investigated. Significant heterogeneity is observed, which suggests that the representative elementary volume (REV) of the cleat groups for engineering purposes may be a complex problem requiring careful consideration.
Maintaining a steady power supply requires accurate forecasting of solar irradiance, since clean energy resources do not provide steady power. Existing forecasting studies have examined the effects of a limited set of weather conditions, such as temperature and precipitation, on solar radiation using a convolutional neural network (CNN), but no comprehensive study has been conducted on concentrations of air pollutants along with weather conditions. This paper proposes a hybrid approach based on deep learning, expanding the feature set by adding new air pollution concentrations, and ranking these features to select and reduce their size to improve efficiency. In order to improve the accuracy of feature selection, a minimum-redundancy maximum-relevance (mRMR) criterion is applied to the constructed feature space to identify and rank the features. The combination of air pollution data with weather conditions data has enabled the prediction of solar irradiance with a higher accuracy. An evaluation of the proposed approach is conducted in Istanbul over 12 months for 43791 discrete times, with the main purpose of analyzing air data, including particulate matter (PM10 and PM2.5), carbon monoxide (CO), nitric oxide (NOx), nitrogen dioxide (NO₂), ozone (O₃), and sulfur dioxide (SO₂), using a CNN, a long short-term memory network (LSTM), and mRMR feature extraction. Compared with the benchmark models, with root mean square error (RMSE) results of 76.2, 60.3, 41.3, and 32.4, there is a significant improvement with an RMSE result of 5.536. The hybrid model presented here offers high prediction accuracy, a wider feature set, and a novel approach based on air concentrations combined with weather conditions for solar irradiance prediction.
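A greedy mRMR ranking can be sketched as follows. This is only an illustration under stated assumptions: the features are already discretised, mutual information is estimated from raw counts, and the "relevance minus mean redundancy" score is one common mRMR variant, not necessarily the one used in the study. The names `mutual_information` and `mrmr_rank` are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def mrmr_rank(features, target, k):
    """Greedily pick k feature names from `features` (name -> discrete column).

    Each step picks the feature maximising relevance to the target minus the
    mean redundancy with the features already selected.
    """
    relevance = {name: mutual_information(col, target) for name, col in features.items()}
    selected = []
    remaining = list(features)

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(
            mutual_information(features[name], features[s]) for s in selected
        ) / len(selected)
        return relevance[name] - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

A feature that duplicates an already selected one scores poorly in later rounds even if it is individually relevant, which is the behaviour that keeps the selected set compact.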
Addressing the challenges posed by the nonlinear and non-stationary vibrations in rotating machinery, where weak fault characteristic signals hinder accurate fault state representation, we propose a novel feature extraction method that combines the Flexible Analytic Wavelet Transform (FAWT) with Nonlinear Quantum Permutation Entropy. FAWT, leveraging fractional orders and arbitrary scaling and translation factors, exhibits superior translational invariance and adjustable fundamental oscillatory characteristics. This flexibility enables FAWT to provide well-suited wavelet shapes, effectively matching subtle fault components and avoiding performance degradation associated with fixed frequency partitioning and low-oscillation bases in detecting weak faults. In our approach, gearbox vibration signals undergo FAWT to obtain sub-bands. Quantum theory is then introduced into permutation entropy to propose Nonlinear Quantum Permutation Entropy, a feature that more accurately characterizes the operational state of vibration simulation signals. The nonlinear quantum permutation entropy extracted from sub-bands is utilized to characterize the operating state of rotating machinery. A comprehensive analysis of vibration signals from rolling bearings and gearboxes validates the feasibility of the proposed method. Comparative assessments with parameters derived from traditional permutation entropy, sample entropy, wavelet transform (WT), and empirical mode decomposition (EMD) underscore the superior effectiveness of this approach in fault detection and classification for rotating machinery.
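For reference, the traditional permutation entropy used as a comparison baseline can be sketched in a few lines. This is the classical Bandt-Pompe definition, not the Nonlinear Quantum Permutation Entropy proposed in the paper; the function name and the default `order` and `delay` values are assumptions for illustration.

```python
import math
from collections import Counter


def permutation_entropy(signal, order=3, delay=1, normalize=True):
    """Classical (Bandt-Pompe) permutation entropy of a 1-D sequence."""
    n_windows = len(signal) - (order - 1) * delay
    if n_windows <= 0:
        raise ValueError("signal too short for the chosen order and delay")
    patterns = Counter()
    for start in range(n_windows):
        window = [signal[start + k * delay] for k in range(order)]
        # Ordinal pattern: window indices sorted by value (ties keep position order).
        pattern = tuple(sorted(range(order), key=lambda k: window[k]))
        patterns[pattern] += 1
    entropy = -sum(
        (c / n_windows) * math.log(c / n_windows) for c in patterns.values()
    )
    if normalize:
        # Divide by the maximum possible entropy, log(order!).
        entropy /= math.log(math.factorial(order))
    return entropy
```

In a sub-band pipeline such as the one described above, a function like this would be applied to each sub-band separately, yielding one entropy value per sub-band as a feature vector.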
One of the biggest dangers to society today is terrorism, where attacks have become one of the most significant risks to international peace and national security. Big data, information analysis, and artificial intelligence (AI) have become the basis for making strategic decisions in many sensitive areas, such as fraud detection, risk management, medical diagnosis, and counter-terrorism. However, there is still a need to assess how terrorist attacks are related, initiated, and detected. For this purpose, we propose a novel framework for classifying and predicting terrorist attacks. The proposed framework posits that neglected text attributes included in the Global Terrorism Database (GTD) can influence the accuracy of the model’s classification of terrorist attacks, where each part of the data can provide vital information to enrich the ability of classifier learning. Each data point in a multiclass taxonomy has one or more tags attached to it, referred to as “related tags.” We applied machine learning classifiers to classify terrorist attack incidents obtained from the GTD. A transformer-based technique called DistilBERT extracts and learns contextual features from text attributes to acquire more information from text data. The extracted contextual features are combined with the “key features” of the dataset and used to perform the final classification. The study explored different experimental setups with various classifiers to evaluate the model’s performance. The experimental results show that the proposed framework outperforms the latest techniques for classifying terrorist attacks with an accuracy of 98.7% using a combined feature set and extreme gradient boosting classifier.
Among all the plagues threatening cocoa cultivation in general, and particularly in West Africa, the swollen shoot viral disease is currently the most dangerous. The greatest challenge in the fight to eradicate this disease remains its early detection. Traditional methods of swollen shoot detection are mostly based on visual observations, leading to late detection and/or diagnostic errors. The use of machine learning algorithms is now an alternative for effective plant disease detection. It is therefore crucial to provide efficient solutions to farmers’ cooperatives. In our study, we built a database of healthy and diseased cocoa leaves. We then explored the power of feature extractors based on convolutional neural networks such as VGG 19, Inception V3, DenseNet 201, and a custom CNN, combining their strengths with the XGBoost classifier. The results of our experiments showed that this fusion of methods with XGBoost yielded highly promising scores, outperforming the results of algorithms using the sigmoid function. These results were further consolidated by the use of evaluation metrics such as accuracy, mean squared error, F score, recall, and the Matthews correlation coefficient. The proposed approach, combining state-of-the-art feature extractors and the XGBoost classifier, offers an efficient and reliable solution for the early detection of swollen shoot. Its implementation could significantly assist West African cocoa farmers in combating this devastating disease and preserving their crops.
In stock market forecasting, the identification of critical features that affect the performance of machine learning (ML) models is crucial to achieve accurate stock price predictions. Several review papers in the literature have focused on various ML, statistical, and deep learning-based methods used in stock market forecasting. However, no survey study has explored feature selection and extraction techniques for stock market forecasting. This survey presents a detailed analysis of 32 research works that use a combination of feature study and ML approaches in various stock market applications. We conduct a systematic search for articles in the Scopus and Web of Science databases for the years 2011–2022. We review a variety of feature selection and feature extraction approaches that have been successfully applied in the stock market analyses presented in the articles. We also describe the combination of feature analysis techniques and ML methods and evaluate their performance. Moreover, we present other survey articles, stock market input and output data, and analyses based on various factors. We find that correlation criteria, random forest, principal component analysis, and autoencoder are the most widely used feature selection and extraction techniques with the best prediction accuracy for various stock market applications.
Breast cancer is the most prevalent cancer among women, and diagnosing it early is vital for successful treatment. The examination of images captured during biopsies plays an important role in determining whether a patient has cancer or not. However, the stochastic patterns, varying intensities of colors, and the large sizes of these images make it challenging to identify and mark malignant regions in them. Against this backdrop, this study proposes an approach to pixel categorization based on the genetic algorithm (GA) and principal component analysis (PCA). The spatial features of the images were extracted using various filters, and the most prevalent ones were selected using the GA and fed into the classifiers for pixel-level categorization. Three classifiers, random forest (RF), decision tree (DT), and extra tree (ET), were used in the proposed model. The parameters of all models were separately tuned, and their performance was tested. The results show that the features extracted by using the GA+PCA in the proposed model are influential and reliable for pixel-level classification in service of image annotation and tumor identification. Further, an image from each of the benign, malignant, and normal classes was randomly selected and used to test the proposed model. The proposed model GA-PCA-DT delivered accuracies between 0.99 and 1.0 on a reduced feature set. The predicted pixel sets were also compared with their respective ground-truth values to assess the overall performance of the method on two metrics: the universal image quality index (UIQI) and the structural similarity index (SSI). Both quality measures delivered excellent results.
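As an illustration of how a predicted pixel set can be scored against ground truth, the sketch below computes the universal image quality index in its global form over two flattened pixel sequences. The original index is usually averaged over sliding windows; applying it once to the whole image, and the function name `uiqi`, are simplifying assumptions here.

```python
def uiqi(x, y):
    """Global universal image quality index between two equal-length pixel lists.

    Q = 4 * cov(x, y) * mean(x) * mean(y)
        / ((var(x) + var(y)) * (mean(x)**2 + mean(y)**2))
    Q is 1 only when the two sequences are identical.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mean_y) ** 2 for b in y) / (n - 1)
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    denominator = (var_x + var_y) * (mean_x ** 2 + mean_y ** 2)
    if denominator == 0:
        raise ValueError("index undefined for constant or all-zero inputs")
    return 4 * cov_xy * mean_x * mean_y / denominator
```

The index combines loss of correlation, luminance distortion, and contrast distortion in a single value, so a score close to 1 indicates that the prediction matches the ground truth on all three.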
This paper expounds in detail the principle of energy spectrum analysis based on the discrete wavelet transform and multiresolution analysis. For feature extraction, after investigating the impact factor in vibration signals and considering the non-stationary and nonlinear nature of vibration diagnosis signals, the authors introduce wavelet analysis and fractal theory as tools for describing fault signal features. Experimental results prove the validity of this method. To some extent, this method provides a good approach to the overall problem of describing fault feature symptoms.
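The wavelet energy spectrum idea can be sketched with a multilevel Haar decomposition: the signal is split into detail sub-bands and a final approximation, and the share of total energy in each sub-band forms the feature vector. The Haar wavelet and the function names are assumptions made for a self-contained example; the paper does not state which wavelet it uses.

```python
import math


def haar_dwt(signal):
    """One level of the orthonormal Haar DWT; len(signal) must be even."""
    scale = 1 / math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return approx, detail


def wavelet_energy_spectrum(signal, levels):
    """Relative energy per sub-band, ordered [D1, D2, ..., D_levels, A_levels]."""
    if levels < 1 or len(signal) == 0 or len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be a non-zero multiple of 2**levels")
    energies = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        energies.append(sum(d * d for d in detail))
    energies.append(sum(a * a for a in approx))
    total = sum(energies)
    # Because the transform is orthonormal, `total` equals the signal energy.
    return [e / total for e in energies] if total else energies
```

A slowly varying signal concentrates its energy in the final approximation band, while rapid alternations (such as impacts) shift energy towards the first detail bands, which is what makes the spectrum usable as a fault feature.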
Semantic communication, as a critical component of artificial intelligence (AI), has gained increasing attention in recent years due to its significant impact on various fields. In this paper, we focus on the applications of semantic feature extraction, a key step in semantic communication, in several areas of artificial intelligence, including natural language processing, medical imaging, remote sensing, autonomous driving, and other image-related applications. Specifically, we discuss how semantic feature extraction can enhance the accuracy and efficiency of natural language processing tasks, such as text classification, sentiment analysis, and topic modeling. In the medical imaging field, we explore how semantic feature extraction can be used for disease diagnosis, drug development, and treatment planning. In addition, we investigate the applications of semantic feature extraction in remote sensing and autonomous driving, where it can facilitate object detection, scene understanding, and other tasks. By providing an overview of the applications of semantic feature extraction in various fields, this paper aims to provide insights into the potential of this technology to advance the development of artificial intelligence.
Relative radiometric normalization (RRN) minimizes radiometric differences among images caused by inconsistencies in acquisition conditions rather than changes in the surface. The scale invariant feature transform (SIFT) can automatically extract control points (CPs) and is commonly used for remote sensing images. However, its results are mostly inaccurate and sometimes contain incorrect matches caused by generating a small number of false CP pairs. These CP pairs have a high false alarm matching rate. This paper presents a modified method to improve the performance of SIFT CP matching by applying the sum of absolute differences (SAD) in a different manner for the new generation of optical satellites, called near-equatorial orbit satellites, and for multi-sensor images. The proposed method, which has a significantly high rate of correct matches, improves CP matching. The data in this study were obtained from the RazakSAT satellite, a new near-equatorial satellite system. The proposed method involves six steps: 1) data reduction, 2) applying SIFT to automatically extract CPs, 3) refining CP matching by using the SAD algorithm with an empirical threshold, 4) calculating the intensity values of the true CPs over all image bands, 5) fitting a linear regression model between the intensity values of the CPs located in the reference and sensed image bands, and 6) conducting relative radiometric normalization using the regression transformation functions. Two thresholds (50 and 70) were tested experimentally in this study. Following the proposed method, the false SIFT CP pairs were removed, reducing the 775, 1125, 883, 804, 883 and 681 extracted pairs to 342, 424, 547, 706, 547, and 469 correctly matched pairs, respectively.
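Step 3, refining control-point matches with a SAD threshold, can be sketched as below. The abstract does not give the patch size or how the threshold is scaled, so the 3×3 patch (`half=1`), the raw-intensity SAD, the default `threshold=50`, and all function names are assumptions for illustration only. Images are plain nested lists of intensities, and matches are pairs of `(row, col)` coordinates.

```python
def extract_patch(image, row, col, half):
    """Return the (2*half+1)-square patch centred on (row, col), or None at borders."""
    if row - half < 0 or col - half < 0:
        return None
    if row + half >= len(image) or col + half >= len(image[0]):
        return None
    return [r[col - half: col + half + 1] for r in image[row - half: row + half + 1]]


def sad(patch_a, patch_b):
    """Sum of absolute differences between two equally sized patches."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(patch_a, patch_b)
        for a, b in zip(row_a, row_b)
    )


def refine_matches(ref_image, sensed_image, matches, half=1, threshold=50):
    """Keep only candidate CP pairs whose local patches have SAD <= threshold."""
    kept = []
    for ref_pt, sensed_pt in matches:
        patch_ref = extract_patch(ref_image, ref_pt[0], ref_pt[1], half)
        patch_sensed = extract_patch(sensed_image, sensed_pt[0], sensed_pt[1], half)
        if patch_ref is None or patch_sensed is None:
            continue  # too close to the border to compare
        if sad(patch_ref, patch_sensed) <= threshold:
            kept.append((ref_pt, sensed_pt))
    return kept
```

The surviving pairs would then supply the intensity samples for the per-band linear regression in steps 4 to 6.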
Photovoltaic (PV) panels are an excellent way to generate eco-friendly power from daylight. Defects in PV panels are caused by various conditions, and defective PV panels need continuous monitoring. The recent development of PV panel monitoring systems provides a modest and viable approach to monitoring and managing the condition of PV plants. In general, conventional procedures are used to identify faulty modules early and to avoid declines in power generation. Existing deep learning architectures predict faulty PV panels with lower accuracy and a more time-consuming process. To increase the accuracy and reduce the processing time, a new Convolutional Neural Network (CNN) architecture is required. Hence, in the present work, a new Real-time Multi Variant Deep learning Model (RMVDM) architecture is proposed, which extracts image features and classifies defects in PV panels quickly and with high accuracy. The defects that arise in PV panels are identified by the CNN-based RMVDM using RGB images. The biggest difference between a CNN and its predecessors is that a CNN automatically extracts image features without human intervention. The technique is quantitatively assessed and compared with existing faulty PV panel identification approaches on a large real-time dataset. The results show accuracy and recall values of 98% in the fault detection and classification process.
The performance of a speech emotion recognition (SER) system is heavily influenced by the efficacy of its feature extraction techniques. The study was designed to advance the field of SER by optimizing feature extraction techniques, specifically through the incorporation of high-resolution Mel-spectrograms and the expedited calculation of Mel Frequency Cepstral Coefficients (MFCC). This initiative aimed to refine the system’s accuracy by identifying and mitigating the shortcomings commonly found in current approaches. Ultimately, the primary objective was to elevate both the intricacy and effectiveness of our SER model, with a focus on augmenting its proficiency in the accurate identification of emotions in spoken language. The research employed a dual-strategy approach for feature extraction. Firstly, a rapid computation technique for MFCC was implemented and integrated with a Bi-LSTM layer to optimize the encoding of MFCC features. Secondly, a pretrained ResNet model was utilized in conjunction with feature Stats pooling and dense layers for the effective encoding of Mel-spectrogram attributes. These two sets of features underwent separate processing before being combined in a Convolutional Neural Network (CNN) outfitted with a dense layer, with the aim of enhancing their representational richness. The model was rigorously evaluated using two prominent databases: CMU-MOSEI and RAVDESS. Notable findings include an accuracy rate of 93.2% on the CMU-MOSEI database and 95.3% on the RAVDESS database. Such exceptional performance underscores the efficacy of this innovative approach, which not only meets but also exceeds the accuracy benchmarks established by traditional models in the field of speech emotion recognition.
The traditional feature-extraction method of oriented FAST and rotated BRIEF(ORB)detects image features based on a fixed threshold;however,ORB descriptors do not distinguish features well in capsule endoscopy images.T...The traditional feature-extraction method of oriented FAST and rotated BRIEF(ORB)detects image features based on a fixed threshold;however,ORB descriptors do not distinguish features well in capsule endoscopy images.Therefore,a new feature detector that uses a new method for setting thresholds,called the adaptive threshold FAST and FREAK in capsule endoscopy images(AFFCEI),is proposed.This method,first constructs an image pyramid and then calculates the thresholds of pixels based on the gray value contrast of all pixels in the local neighborhood of the image,to achieve adaptive image feature extraction in each layer of the pyramid.Subsequently,the features are expressed by the FREAK descriptor,which can enhance the discrimination of the features extracted from the stomach image.Finally,a refined matching is obtained by applying the grid-based motion statistics algorithm to the result of Hamming distance,whereby mismatches are rejected using the RANSAC algorithm.Compared with the ASIFT method,which previously had the best performance,the average running time of AFFCEI was 4/5 that of ASIFT,and the average matching score improved by 5%when tracking features in a moving capsule endoscope.展开更多
The high bandwidth and low latency of 6G network technology enable the successful application of monocular 3D object detection on vehicle platforms. Monocular 3D object detection based on Pseudo-LiDAR is a low-cost, low-power alternative to LiDAR solutions in autonomous driving. However, this technique has two problems: (1) the poor quality of generated Pseudo-LiDAR point clouds, resulting from the nonlinear error distribution of monocular depth estimation, and (2) the weak representation capability of point cloud features, because existing LiDAR-based 3D detection networks neglect the global geometric structure of point clouds. Therefore, we propose a Pseudo-LiDAR confidence sampling strategy and a hierarchical geometric feature extraction module for monocular 3D object detection. We first design a point cloud confidence sampling strategy based on a 3D Gaussian distribution that assigns low confidence to points with large depth-estimation error and filters them out according to that confidence. Then, we present a hierarchical geometric feature extraction module that aggregates local neighborhood features and uses a dual transformer to capture the global geometric features of the point cloud. Finally, our detection framework is based on Point-Voxel-RCNN (PV-RCNN), with high-quality Pseudo-LiDAR and enriched geometric features as input. Experimental results show that our method achieves satisfactory results in monocular 3D object detection.
Abstract: A large number of network security breaches in IoT networks have demonstrated the unreliability of current Network Intrusion Detection Systems (NIDSs). Consequently, network interruptions and loss of sensitive data have occurred, which has made improving NIDS technologies an active research area. In an analysis of related works, it was observed that most researchers aim to obtain better classification results by applying untried combinations of Feature Reduction (FR) and Machine Learning (ML) techniques to NIDS datasets. However, these datasets differ in feature sets, attack types, and network design. Therefore, this paper aims to discover whether these techniques can be generalised across various datasets. Six ML models are utilised: a Deep Feed Forward (DFF), Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Decision Tree (DT), Logistic Regression (LR), and Naive Bayes (NB). Three Feature Extraction (FE) algorithms, Principal Component Analysis (PCA), Auto-encoder (AE), and Linear Discriminant Analysis (LDA), are evaluated using three benchmark datasets: UNSW-NB15, ToN-IoT, and CSE-CIC-IDS2018. Although PCA and AE algorithms have been widely used, the determination of their optimal number of extracted dimensions has been overlooked. The results indicate that no single FE method or ML model achieves the best scores on all datasets. The optimal number of extracted dimensions has been identified for each dataset, and LDA degrades the performance of the ML models on two datasets. The variance is used to analyse the extracted dimensions of LDA and PCA. Finally, this paper concludes that the choice of dataset significantly alters the performance of the applied techniques. We believe that a universal (benchmark) feature set is needed to facilitate further progress of research in this field.
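The abstract above notes that the number of extracted PCA dimensions is often left unexamined and that variance is used to analyse the extracted dimensions. One common way to reason about this is the cumulative explained variance of the leading components. The sketch below is only an illustration under assumptions: the eigenvalues are made-up numbers, and the 0.9 cut-off and the function names are hypothetical, not the selection procedure used in the paper.

```python
def explained_variance_ratio(eigenvalues):
    """Fraction of total variance carried by each component."""
    total = sum(eigenvalues)
    return [ev / total for ev in eigenvalues]


def components_for_variance(eigenvalues, target=0.95):
    """Smallest number of leading components whose cumulative
    explained variance reaches `target` (a hypothetical cut-off)."""
    ordered = sorted(eigenvalues, reverse=True)
    cumulative = 0.0
    for k, ratio in enumerate(explained_variance_ratio(ordered), start=1):
        cumulative += ratio
        # Small tolerance guards against floating-point round-off.
        if cumulative >= target - 1e-12:
            return k
    return len(ordered)


# Illustrative covariance eigenvalues (made-up numbers, not paper data).
eigenvalues = [4.0, 3.0, 2.0, 1.0]
k = components_for_variance(eigenvalues, target=0.9)
print(k)
```

A fixed variance cut-off is only one heuristic; the abstract reports that the best dimension count differs per dataset, which suggests validating the choice against classifier performance as well.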
Funding: National Natural Science Foundation of China (Nos. 42071444, 42101444).
Abstract: Line graphics of cultural relics are a crucial form of traditional artifact documentation; they are simple, intuitive, and cheaper to display than 3D models. Dimensionality reduction is necessary to produce line drawings. However, most existing methods for artifact drawing rely on orthographic projection, which cannot avoid angle occlusion and data overlap when the surface of a cultural relic is complex. Therefore, conformal mapping was introduced as a dimensionality reduction method to compensate for the limitations of orthographic projection. Based on given criteria for assessing surface complexity, this paper proposes a three-dimensional feature guideline extraction method for complex cultural relic surfaces. Vertex weight, a combined 2D and 3D factor that measures the importance of points in describing surface features, was designed. The selection threshold for feature guideline extraction was then determined from the differences between the vertex weight and shape index distributions. Feasibility and stability were verified through experiments on real cultural relic surface data. The results demonstrate that the method can address the challenges of automatically generating line drawings for complex surfaces. The extraction method and the results obtained will be useful for the line graphic drawing, display, and promotion of cultural relics.
Funding: Supported by an Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. RS-2023-00235509, Development of Security Monitoring Technology Based Network Behavior against Encrypted Cyber Threats in ICT Convergence Environment).
Abstract: In the IoT (Internet of Things) domain, the increased use of encryption protocols such as SSL/TLS, VPN (Virtual Private Network), and Tor has led to a rise in attacks leveraging encrypted traffic. While research on anomaly detection using AI (Artificial Intelligence) is actively progressing, the encrypted nature of the data poses challenges for labeling, resulting in data imbalance and feature extraction biased toward specific nodes. This study proposes a reconstruction error-based anomaly detection method using an autoencoder (AE) that utilizes packet metadata excluding node-specific information. The proposed method omits biased packet metadata such as IP and port and trains the detection model using only normal data, leveraging a small amount of packet metadata. Its low resource consumption makes it well suited for direct application in IoT environments. In experiments comparing feature extraction methods for AE-based anomaly detection, we found that using flow-based features significantly improves accuracy, precision, F1 score, and AUC (Area Under the Receiver Operating Characteristic Curve) compared to packet-based features. Additionally, for flow-based features, the proposed method showed a 30.17% increase in F1 score and improved false positive rates compared to Isolation Forest and OneClassSVM. Furthermore, the proposed method demonstrated a 32.43% higher AUC when using packet features and a 111.39% higher AUC when using flow features, compared to previously proposed oversampling methods. This study highlights the impact of feature extraction methods on attack detection in imbalanced, encrypted traffic environments and emphasizes that the one-class method using AE is more effective for attack detection and for reducing false positives than traditional oversampling methods.
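Reconstruction error-based detection, as described in the abstract above, scores each sample by how poorly a model trained only on normal data reproduces it. The sketch below covers only the scoring and thresholding step and leaves out the autoencoder itself. The mean-plus-three-standard-deviations rule, the error values, and all function names are assumptions made for illustration; the abstract does not state how the threshold is chosen.

```python
import statistics


def reconstruction_error(original, reconstructed):
    """Mean squared error between a feature vector and its reconstruction."""
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)


def fit_threshold(normal_errors, k=3.0):
    """Threshold = mean + k * std of the errors seen on normal traffic.
    The rule and k = 3 are assumptions, not taken from the paper."""
    return statistics.mean(normal_errors) + k * statistics.pstdev(normal_errors)


def is_anomalous(error, threshold):
    """Flag a sample whose reconstruction error exceeds the threshold."""
    return error > threshold


# Made-up reconstruction errors observed on normal training flows.
normal_errors = [0.010, 0.012, 0.009, 0.011, 0.013]
threshold = fit_threshold(normal_errors)

# One error close to the normal range and one far above it.
flags = [is_anomalous(e, threshold) for e in (0.011, 0.250)]
print(flags)
```

Because the threshold is fitted on normal data alone, no attack labels are needed, which matches the one-class setting the abstract describes.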
Funding: This work was supported by the Science and Technology Cooperation Special Project of Shijiazhuang (SJZZXA23005).
Abstract: In minimally invasive surgery, endoscopes or laparoscopes equipped with miniature cameras and tools enter the human body through small incisions or natural cavities for therapeutic purposes. However, in clinical operating environments, endoscopic images often suffer from low texture, uneven illumination, and non-rigid structures, which affect feature observation and extraction. Missing feature points in endoscopic images can severely impact surgical navigation or clinical diagnosis, leading to treatment and postoperative recovery issues for patients. To address these challenges, this paper introduces, for the first time, a Cross-Channel Multi-Modal Adaptive Spatial Feature Fusion (ASFF) module based on the lightweight architecture of EfficientViT. Additionally, a novel lightweight feature extraction and matching network based on an attention mechanism is proposed. This network dynamically adjusts attention weights for cross-modal information from grayscale images and optical flow images through a dual-branch Siamese network. It extracts static and dynamic information features ranging from low-level to high-level and from local to global, ensuring robust feature extraction across different widths, noise levels, and blur scenarios. Global and local matching are performed through a multi-level cascaded attention mechanism, with cross-channel attention introduced to extract low-level and high-level features simultaneously. Extensive ablation experiments and comparative studies are conducted on the HyperKvasir, EAD, M2caiSeg, CVC-ClinicDB, and UCL synthetic datasets. Experimental results demonstrate that the proposed network improves upon the baseline EfficientViT-B3 model by 75.4% in accuracy (Acc), while also enhancing runtime performance and storage efficiency. Compared with the complex DenseDescriptor feature extraction network, the difference in Acc is less than 7.22%, and IoU results on specific datasets outperform complex dense models. Furthermore, this method increases the F1 score by 33.2% and accelerates runtime by 70.2%. Notably, CMMCAN is faster than the comparative lightweight models, and its feature extraction and matching performance is comparable to that of existing complex models at higher speed and greater cost-effectiveness.
Abstract: This paper proposes a novel open set recognition method, the Spatial Distribution Feature Extraction Network (SDFEN), to address the problem of electromagnetic signal recognition in an open environment. The spatial distribution feature extraction layer in SDFEN replaces the output of convolutional neural networks with spatial distribution features that, by incorporating class center vectors, focus more on inter-sample information. The designed hybrid loss function considers both intra-class and inter-class distance, thereby enhancing the similarity among samples of the same class and increasing the dissimilarity between samples of different classes during training. Consequently, this method allows unknown classes to occupy a larger region of the feature space. This reduces the possibility of overlap with known class samples and makes the boundaries between known and unknown samples more distinct. Additionally, the feature comparator threshold can be used to reject unknown samples. For signal open set recognition, seven methods, including the proposed method, are applied to two kinds of electromagnetic signal data: modulation signals and real-world emitters. The experimental results demonstrate that the proposed method outperforms the other six methods overall in a simulated open environment. Specifically, compared to the state-of-the-art Openmax method, the novel method achieves up to 8.87% and 5.25% higher micro-F-measures, respectively.
Funding: Australian Research Council, Grant/Award Numbers: DP190103660, DP200103207, LP180100663; UniSQ Capacity Building Grants, Grant/Award Number: 1008313.
Abstract: Biometric recognition is a widely used technology for user authentication. In applying this technology, biometric security and recognition accuracy are two important issues that should be considered. In terms of biometric security, cancellable biometrics is an effective technique for protecting biometric data. Regarding recognition accuracy, feature representation plays a significant role in the performance and reliability of cancellable biometric systems. How to design good feature representations for cancellable biometrics is a challenging topic that has attracted a great deal of attention from the computer vision community, especially from researchers of cancellable biometrics. Feature extraction and learning in cancellable biometrics aims to find suitable feature representations that achieve satisfactory recognition performance while protecting the privacy of biometric data. This survey reports on the progress, trends, and challenges of feature extraction and learning for cancellable biometrics, thus shedding light on the latest developments and future research in this area.
Abstract: Cleats are the dominant micro-fracture network controlling the macro-mechanical behavior of coal. Improved understanding of the spatial characteristics of cleat networks is therefore important to the coal mining industry. Discrete fracture networks (DFNs) are increasingly used in engineering analyses to spatially model fractures at various scales. The reliability of coal DFNs largely depends on the confidence in the input cleat statistics. Estimates of these parameters can be made from image-based three-dimensional (3D) characterization of coal cleats using X-ray micro-computed tomography (μCT). One key step in this process, after cleat extraction, is the separation of individual cleats, without which the cleats form a connected network and statistics for different cleat sets cannot be measured. In this paper, a feature extraction-based image processing method is introduced to identify and separate distinct cleat groups from 3D X-ray μCT images. Kernels (filters) representing explicit cleat features of coal are built, and cleat separation is successfully achieved by convolutional operations on 3D coal images. The new method is applied to a coal specimen 80 mm in diameter and 100 mm in length acquired from an Anglo American Steelmaking Coal mine in the Bowen Basin, Queensland, Australia. It is demonstrated that the new method produces reliable cleat separation capable of defining individual cleats and preserving 3D topology after separation. Bedding-parallel fractures, which have historically been challenging to delineate and are rarely reported, are also identified and separated. A variety of cleat/fracture statistics is measured, which not only quantitatively characterize the cleat/fracture system but can also be used for DFN modeling. Finally, variability and heterogeneity with respect to the core axis are investigated. Significant heterogeneity is observed, suggesting that the representative elementary volume (REV) of the cleat groups for engineering purposes may be a complex problem requiring careful consideration.
Abstract: Maintaining a steady power supply requires accurate forecasting of solar irradiance, since clean energy resources do not provide steady power. Existing forecasting studies have used convolutional neural networks (CNNs) to examine the effects on solar radiation of a limited set of weather conditions, such as temperature and precipitation, but no comprehensive study has been conducted on concentrations of air pollutants along with weather conditions. This paper proposes a hybrid approach based on deep learning that expands the feature set by adding air pollution concentrations and then ranks these features to select a reduced subset and improve efficiency. To improve the accuracy of feature selection, a maximum-dependency and minimum-redundancy (mRMR) criterion is applied to the constructed feature space to identify and rank the features. Combining air pollution data with weather condition data enables the prediction of solar irradiance with higher accuracy. The proposed approach is evaluated in Istanbul over 12 months for 43791 discrete times, with the main purpose of analyzing air data, including particulate matter (PM10 and PM2.5), carbon monoxide (CO), nitric oxide (NOX), nitrogen dioxide (NO₂), ozone (O₃), and sulfur dioxide (SO₂), using a CNN, a long short-term memory network (LSTM), and mRMR feature extraction. Compared with the benchmark models, which have root mean square error (RMSE) results of 76.2, 60.3, 41.3, and 32.4, the proposed approach achieves a significant improvement, with an RMSE of 5.536. The hybrid model presented here offers high prediction accuracy, a wider feature set, and a novel approach based on air concentrations combined with weather conditions for solar irradiance prediction.
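The mRMR ranking mentioned in the abstract above can be sketched as a greedy loop: at each step, pick the feature whose relevance to the target is highest after subtracting its average redundancy with the features already chosen. The version below uses the common difference form and assumes the relevance and redundancy scores are already computed (for example, as mutual information). The scores, the feature ordering, and the function name are hypothetical and are not taken from the paper.

```python
def mrmr_rank(relevance, redundancy, n_select):
    """Greedy mRMR ranking (difference form).

    relevance[i]     : dependency between feature i and the target
    redundancy[i][j] : dependency between features i and j
    Both are assumed to be precomputed, e.g. as mutual information.
    """
    remaining = set(range(len(relevance)))
    selected = []
    while remaining and len(selected) < n_select:
        def score(i):
            if not selected:
                return relevance[i]
            mean_red = sum(redundancy[i][j] for j in selected) / len(selected)
            return relevance[i] - mean_red

        # sorted() makes tie-breaking deterministic (lowest index wins).
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Made-up scores: features 0 and 1 are both relevant but nearly redundant,
# feature 2 is less relevant but carries independent information.
relevance = [0.90, 0.85, 0.40]
redundancy = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
ranking = mrmr_rank(relevance, redundancy, n_select=3)
print(ranking)
```

In this toy example the less relevant but independent feature is ranked ahead of the redundant one, which is the behaviour that distinguishes mRMR from ranking by relevance alone.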
Funding: Supported financially by the Fundamental Research Program of Shanxi Province (No. 202103021223056).
Abstract: To address the challenges posed by nonlinear and non-stationary vibrations in rotating machinery, where weak fault characteristic signals hinder accurate fault state representation, we propose a novel feature extraction method that combines the Flexible Analytic Wavelet Transform (FAWT) with Nonlinear Quantum Permutation Entropy. FAWT, leveraging fractional orders and arbitrary scaling and translation factors, exhibits superior translational invariance and adjustable fundamental oscillatory characteristics. This flexibility enables FAWT to provide well-suited wavelet shapes, effectively matching subtle fault components and avoiding the performance degradation associated with fixed frequency partitioning and low-oscillation bases in detecting weak faults. In our approach, gearbox vibration signals undergo FAWT to obtain sub-bands. Quantum theory is then introduced into permutation entropy to propose Nonlinear Quantum Permutation Entropy, a feature that more accurately characterizes the operational state of vibration simulation signals. The nonlinear quantum permutation entropy extracted from the sub-bands is used to characterize the operating state of rotating machinery. A comprehensive analysis of vibration signals from rolling bearings and gearboxes validates the feasibility of the proposed method. Comparative assessments against parameters derived from traditional permutation entropy, sample entropy, wavelet transform (WT), and empirical mode decomposition (EMD) underscore the superior effectiveness of this approach in fault detection and classification for rotating machinery.
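For orientation, the traditional permutation entropy that the abstract above uses as a baseline can be computed in a few lines. The sketch below implements the classical Bandt-Pompe definition only; it is not the Nonlinear Quantum Permutation Entropy proposed in the paper, whose formulation is not given in the abstract. The function name, parameter defaults, and sample series are illustrative assumptions.

```python
import math
from collections import Counter


def permutation_entropy(series, order=3, delay=1, normalize=True):
    """Classical (Bandt-Pompe) permutation entropy of a 1-D sequence."""
    n_windows = len(series) - (order - 1) * delay
    if n_windows <= 0:
        raise ValueError("series too short for the chosen order and delay")
    patterns = Counter()
    for start in range(n_windows):
        window = [series[start + i * delay] for i in range(order)]
        # The ordinal pattern is the index order that sorts the window.
        pattern = tuple(sorted(range(order), key=lambda i: window[i]))
        patterns[pattern] += 1
    probs = [count / n_windows for count in patterns.values()]
    entropy = -sum(p * math.log2(p) for p in probs)
    if normalize:
        # Divide by the maximum entropy log2(order!) to map into [0, 1].
        entropy /= math.log2(math.factorial(order))
    return entropy


# A strictly increasing series has a single ordinal pattern (entropy 0);
# an irregular series spreads over several patterns.
regular = permutation_entropy([1, 2, 3, 4, 5, 6])
irregular = permutation_entropy([4, 7, 9, 10, 6, 11, 3], normalize=False)
print(regular, irregular)
```

In a fault-diagnosis pipeline such a value would be computed per sub-band, giving one feature per band; the choice of order and delay is application-dependent.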
Abstract: One of the biggest dangers to society today is terrorism, where attacks have become one of the most significant risks to international peace and national security. Big data, information analysis, and artificial intelligence (AI) have become the basis for making strategic decisions in many sensitive areas, such as fraud detection, risk management, medical diagnosis, and counter-terrorism. However, there is still a need to assess how terrorist attacks are related, initiated, and detected. For this purpose, we propose a novel framework for classifying and predicting terrorist attacks. The proposed framework posits that neglected text attributes included in the Global Terrorism Database (GTD) can influence the accuracy of the model's classification of terrorist attacks, where each part of the data can provide vital information to enrich the ability of classifier learning. Each data point in a multiclass taxonomy has one or more tags attached to it, referred to as "related tags." We applied machine learning classifiers to classify terrorist attack incidents obtained from the GTD. A transformer-based technique called DistilBERT extracts and learns contextual features from text attributes to acquire more information from text data. The extracted contextual features are combined with the "key features" of the dataset and used to perform the final classification. The study explored different experimental setups with various classifiers to evaluate the model's performance. The experimental results show that the proposed framework outperforms the latest techniques for classifying terrorist attacks, with an accuracy of 98.7% using a combined feature set and an extreme gradient boosting classifier.
Abstract: Among all the plagues threatening cocoa cultivation in general, and particularly in West Africa, the swollen shoot viral disease is currently the most dangerous. The greatest challenge in the fight to eradicate this disease remains its early detection. Traditional methods of swollen shoot detection are mostly based on visual observation, leading to late detection and/or diagnostic errors. Machine learning algorithms now offer an alternative for effective plant disease detection. It is therefore crucial to provide efficient solutions to farmers' cooperatives. In our study, we built a database of healthy and diseased cocoa leaves. We then explored the power of feature extractors based on convolutional neural networks, such as VGG 19, Inception V3, DenseNet 201, and a custom CNN, combining their strengths with the XGBoost classifier. The results of our experiments showed that this fusion of methods with XGBoost yielded highly promising scores, outperforming the results of algorithms using the sigmoid function. These results were further consolidated by evaluation metrics such as accuracy, mean squared error, F score, recall, and Matthews's correlation coefficient. The proposed approach, combining state-of-the-art feature extractors and the XGBoost classifier, offers an efficient and reliable solution for the early detection of swollen shoot. Its implementation could significantly assist West African cocoa farmers in combating this devastating disease and preserving their crops.
Funding: Funded by the University of Groningen and the Prospect Burma organization.
Abstract: In stock market forecasting, identifying the critical features that affect the performance of machine learning (ML) models is crucial to achieving accurate stock price predictions. Several review papers in the literature have focused on various ML, statistical, and deep learning-based methods used in stock market forecasting. However, no survey study has explored feature selection and extraction techniques for stock market forecasting. This survey presents a detailed analysis of 32 research works that use a combination of feature study and ML approaches in various stock market applications. We conduct a systematic search for articles in the Scopus and Web of Science databases for the years 2011–2022. We review a variety of feature selection and feature extraction approaches that have been successfully applied in the stock market analyses presented in the articles. We also describe the combination of feature analysis techniques and ML methods and evaluate their performance. Moreover, we present other survey articles, stock market input and output data, and analyses based on various factors. We find that correlation criteria, random forest, principal component analysis, and autoencoder are the most widely used feature selection and extraction techniques, with the best prediction accuracy for various stock market applications.
Abstract: Breast cancer is the most prevalent cancer among women, and diagnosing it early is vital for successful treatment. The examination of images captured during biopsies plays an important role in determining whether a patient has cancer. However, the stochastic patterns, varying color intensities, and large sizes of these images make it challenging to identify and mark malignant regions in them. Against this backdrop, this study proposes an approach to pixel categorization based on the genetic algorithm (GA) and principal component analysis (PCA). The spatial features of the images were extracted using various filters, and the most prevalent ones were selected using the GA and fed into the classifiers for pixel-level categorization. Three classifiers, random forest (RF), decision tree (DT), and extra tree (ET), were used in the proposed model. The parameters of all models were tuned separately, and their performance was tested. The results show that the features extracted by using GA+PCA in the proposed model are influential and reliable for pixel-level classification in service of image annotation and tumor identification. Further, one image from each of the benign, malignant, and normal classes was randomly selected and used to test the proposed model. The proposed model, GA-PCA-DT, delivered accuracies between 0.99 and 1.0 on a reduced feature set. The predicted pixel sets were also compared with their respective ground-truth values to assess the overall performance of the method on two metrics: the universal image quality index (UIQI) and the structural similarity index (SSI). Both quality measures delivered excellent results.
Abstract: This paper explains in detail the principle of energy spectrum analysis based on the discrete wavelet transform and multiresolution analysis. For the study of feature extraction methods, after investigating the characteristics of the impact factor in vibration signals and considering the non-stationarity and nonlinearity of vibration diagnosis signals, the authors introduce wavelet analysis and fractal theory as tools for describing fault signal features. Experimental results proved the validity of this method. To some extent, this method provides a good approach to the general problem of describing fault feature symptoms.
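The energy spectrum analysis described above can be illustrated with a multilevel discrete wavelet decomposition in which the energy of the coefficients at each level forms the feature vector. The sketch below assumes the orthonormal Haar wavelet purely for simplicity; the abstract does not say which wavelet the authors used, and the function names and test signal are hypothetical.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar DWT: (approximation, detail)."""
    s = 1 / math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def wavelet_energy_spectrum(signal, levels):
    """Energy of the detail coefficients at each level, followed by the
    energy of the final approximation. Length must divide by 2**levels."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    energies = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        energies.append(sum(d * d for d in detail))
    energies.append(sum(a * a for a in approx))
    return energies


# Made-up signal; with an orthonormal wavelet the band energies sum to the
# total signal energy, so they can be read as an energy distribution.
signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
energies = wavelet_energy_spectrum(signal, levels=3)
total = sum(x * x for x in signal)
print(energies, total)
```

Normalizing each band energy by the total gives a scale-free feature vector, which is a common way to compare signals of different amplitude.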
Abstract: Semantic communication, as a critical component of artificial intelligence (AI), has gained increasing attention in recent years due to its significant impact on various fields. In this paper, we focus on the applications of semantic feature extraction, a key step in semantic communication, in several areas of artificial intelligence, including natural language processing, medical imaging, remote sensing, autonomous driving, and other image-related applications. Specifically, we discuss how semantic feature extraction can enhance the accuracy and efficiency of natural language processing tasks, such as text classification, sentiment analysis, and topic modeling. In the medical imaging field, we explore how semantic feature extraction can be used for disease diagnosis, drug development, and treatment planning. In addition, we investigate the applications of semantic feature extraction in remote sensing and autonomous driving, where it can facilitate object detection, scene understanding, and other tasks. By providing an overview of the applications of semantic feature extraction in various fields, this paper aims to provide insights into the potential of this technology to advance the development of artificial intelligence.
Abstract: Relative radiometric normalization (RRN) minimizes radiometric differences among images caused by inconsistencies in acquisition conditions rather than by changes in the surface. The scale invariant feature transform (SIFT) can automatically extract control points (CPs) and is commonly used for remote sensing images. However, its results are often inaccurate and sometimes contain incorrect matches caused by a small number of false CP pairs, which produce a high false alarm rate in matching. This paper presents a modified method that improves SIFT CP matching by applying the sum of absolute differences (SAD) in a different manner, for a new generation of optical satellites called near-equatorial orbit satellites and for multi-sensor images. The proposed method, which has a significantly high rate of correct matches, improves CP matching. The data in this study were obtained from RazakSAT, a new near-equatorial satellite system. The proposed method involves six steps: 1) data reduction; 2) applying SIFT to automatically extract CPs; 3) refining CP matching by using the SAD algorithm with an empirical threshold; 4) calculating the intensity values of the true CPs over all image bands; 5) fitting a linear regression model between the intensity values of the CPs located in the reference and sensed image bands; and 6) conducting relative radiometric normalization using the regression transformation functions. Two thresholds (50 and 70) were tested experimentally. Following the proposed method, falsely extracted SIFT CPs were removed, reducing the initial 775, 1125, 883, 804, 883, and 681 pairs containing false matches to 342, 424, 547, 706, 547, and 469 correctly matched pairs, respectively.
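Step 3 of the method above, refining candidate control-point pairs with SAD and an empirical threshold, can be sketched as follows. This is a generic patch-based SAD filter under assumptions: the abstract says SAD is applied "in a different manner" without giving details, so the patch size, the images, and the function names here are hypothetical, and the threshold of 50 is reused from the abstract only as an illustrative value.

```python
def sad(patch_a, patch_b):
    """Sum of absolute differences between two equally sized 2-D patches."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(patch_a, patch_b)
        for a, b in zip(row_a, row_b)
    )


def patch_at(image, row, col, half=1):
    """Square patch of side 2*half+1 centred on (row, col).
    The caller must ensure the patch lies fully inside the image."""
    return [r[col - half: col + half + 1] for r in image[row - half: row + half + 1]]


def refine_matches(ref_img, sensed_img, pairs, threshold, half=1):
    """Keep only candidate pairs whose patch SAD is at or below threshold."""
    kept = []
    for (r1, c1), (r2, c2) in pairs:
        score = sad(patch_at(ref_img, r1, c1, half), patch_at(sensed_img, r2, c2, half))
        if score <= threshold:
            kept.append(((r1, c1), (r2, c2)))
    return kept


# Made-up 4x4 images: the sensed image is the reference plus a small
# uniform brightness offset, as might result from acquisition differences.
ref = [
    [10, 10, 10, 10],
    [10, 50, 60, 10],
    [10, 70, 80, 10],
    [10, 10, 10, 10],
]
sensed = [[v + 2 for v in row] for row in ref]

# First pair points at the same location, second pair is a false match.
candidates = [((1, 1), (1, 1)), ((1, 1), (2, 2))]
kept = refine_matches(ref, sensed, candidates, threshold=50)
print(kept)
```

The surviving pairs would then feed the per-band linear regression in steps 4 to 6.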
Abstract: Photovoltaic (PV) panels are an effective way to generate eco-friendly power from daylight. Defects in PV panels are caused by various conditions, and defective panels need continuous monitoring. The recent development of PV panel monitoring systems provides a modest and viable approach to monitoring and managing the condition of PV plants. In general, conventional procedures are used to identify faulty modules early and to avoid declines in power generation. Existing deep learning architectures predict faulty PV panels with limited accuracy and a time-consuming process. To increase the accuracy and reduce the processing time, a new Convolutional Neural Network (CNN) architecture is required. Hence, in the present work, a new Real-time Multi Variant Deep learning Model (RMVDM) architecture is proposed; it extracts image features and classifies defects in PV panels quickly and with high accuracy. The defects that arise in PV panels are identified by the CNN-based RMVDM using RGB images. The biggest difference between a CNN and its predecessors is that a CNN extracts image features automatically, without human intervention. The technique is quantitatively assessed and compared with existing faulty PV panel identification approaches on a large real-time dataset. The results show accuracy and recall values of 98% in the fault detection and classification process.
Funding: Supported by the GRRC program of Gyeonggi Province (GRRC-Gachon2023(B02), Development of AI-based medical service technology).
Abstract: The performance of a speech emotion recognition (SER) system is heavily influenced by the efficacy of its feature extraction techniques. The study was designed to advance the field of SER by optimizing feature extraction techniques, specifically through the incorporation of high-resolution Mel-spectrograms and the expedited calculation of Mel Frequency Cepstral Coefficients (MFCC). This initiative aimed to refine the system's accuracy by identifying and mitigating the shortcomings commonly found in current approaches. Ultimately, the primary objective was to elevate both the intricacy and effectiveness of our SER model, with a focus on augmenting its proficiency in the accurate identification of emotions in spoken language. The research employed a dual-strategy approach for feature extraction. Firstly, a rapid computation technique for MFCC was implemented and integrated with a Bi-LSTM layer to optimize the encoding of MFCC features. Secondly, a pretrained ResNet model was utilized in conjunction with feature Stats pooling and dense layers for the effective encoding of Mel-spectrogram attributes. These two sets of features underwent separate processing before being combined in a Convolutional Neural Network (CNN) outfitted with a dense layer, with the aim of enhancing their representational richness. The model was rigorously evaluated using two prominent databases: CMU-MOSEI and RAVDESS. Notable findings include an accuracy rate of 93.2% on the CMU-MOSEI database and 95.3% on the RAVDESS database. Such performance underscores the efficacy of this approach, which not only meets but also exceeds the accuracy benchmarks established by traditional models in the field of speech emotion recognition.
Funding: The National Natural Science Foundation of China, No. 62172190; the “Double Creation” Plan of Jiangsu Province, No. JSSCRC2021532; and the “Taihu Talent-Innovative Leading Talent” Plan of Wuxi City.
Abstract: The traditional feature-extraction method of oriented FAST and rotated BRIEF (ORB) detects image features based on a fixed threshold; however, ORB descriptors do not distinguish features well in capsule endoscopy images. Therefore, a new feature detector that uses a new method for setting thresholds, called the adaptive threshold FAST and FREAK in capsule endoscopy images (AFFCEI), is proposed. This method first constructs an image pyramid and then calculates the thresholds of pixels based on the gray-value contrast of all pixels in the local neighborhood of the image, to achieve adaptive image feature extraction in each layer of the pyramid. Subsequently, the features are expressed by the FREAK descriptor, which can enhance the discrimination of the features extracted from the stomach image. Finally, refined matches are obtained by applying the grid-based motion statistics algorithm to the Hamming-distance matching results, and mismatches are rejected using the RANSAC algorithm. Compared with the ASIFT method, which previously had the best performance, the average running time of AFFCEI was 4/5 that of ASIFT, and the average matching score improved by 5% when tracking features in a moving capsule endoscope.
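The abstract above describes per-pixel thresholds derived from local gray-value contrast but does not give the formula. The following is a minimal sketch of the idea, assuming contrast is measured as the mean absolute deviation of the neighborhood from its mean gray value. The function name and the `radius`, `scale`, and `floor` parameters are hypothetical and are not taken from AFFCEI.

```python
def local_contrast_threshold(image, row, col, radius=3, scale=0.5, floor=2.0):
    """Return a per-pixel corner-test threshold from local gray-value contrast.

    ASSUMPTION: contrast is the mean absolute deviation of the neighborhood
    from its mean gray value; the paper's exact formula is not given in the
    abstract. `image` is a list of rows of gray values.
    """
    height, width = len(image), len(image[0])
    # Collect the neighborhood, clipped at the image borders.
    values = [
        image[r][c]
        for r in range(max(0, row - radius), min(height, row + radius + 1))
        for c in range(max(0, col - radius), min(width, col + radius + 1))
    ]
    mean = sum(values) / len(values)
    contrast = sum(abs(v - mean) for v in values) / len(values)
    # A floor keeps the threshold above zero in flat regions.
    return max(floor, scale * contrast)


if __name__ == "__main__":
    flat = [[50] * 5 for _ in range(5)]
    checker = [[200 if (r + c) % 2 == 0 else 0 for c in range(5)] for r in range(5)]
    print(local_contrast_threshold(flat, 2, 2))
    print(local_contrast_threshold(checker, 2, 2, radius=2))
```

In this sketch a low-contrast region yields a small threshold, so weak corners can still be detected, while a high-contrast region yields a larger one. A fixed-threshold detector cannot make that distinction.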
Funding: Supported by the National Key Research and Development Program of China (2020YFB1807500); the National Natural Science Foundation of China (62072360, 62001357, 62172438, 61901367); the Key Research and Development Plan of Shaanxi Province (2021ZDLGY02-09, 2023-GHZD-44, 2023-ZDLGY-54); the Natural Science Foundation of Guangdong Province of China (2022A1515010988); the Key Project on Artificial Intelligence of Xi'an Science and Technology Plan (2022JH-RGZN-0003, 2022JH-RGZN-0103, 2022JH-CLCJ-0053); the Xi'an Science and Technology Plan (20RGZN0005); and the Proof-of-concept fund from Hangzhou Research Institute of Xidian University (GNYZ2023QC0201).
Abstract: The high bandwidth and low latency of 6G network technology enable the successful application of monocular 3D object detection on vehicle platforms. Monocular 3D object detection based on Pseudo-LiDAR is a low-cost, low-power alternative to LiDAR solutions in the field of autonomous driving. However, this technique has two problems: (1) the poor quality of generated Pseudo-LiDAR point clouds, resulting from the nonlinear error distribution of monocular depth estimation, and (2) the weak representation capability of point cloud features, because existing LiDAR-based 3D detection networks neglect the global geometric structure features of point clouds. Therefore, we propose a Pseudo-LiDAR confidence sampling strategy and a hierarchical geometric feature extraction module for monocular 3D object detection. We first design a point cloud confidence sampling strategy based on a 3D Gaussian distribution that assigns low confidence to points with large depth-estimation error and filters them out according to the confidence. Then, we present a hierarchical geometric feature extraction module that aggregates local neighborhood features and uses a dual transformer to capture the global geometric features in the point cloud. Finally, our detection framework is based on Point-Voxel-RCNN (PV-RCNN), with high-quality Pseudo-LiDAR and enriched geometric features as input. The experimental results show that our method achieves satisfactory results in monocular 3D object detection.
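The confidence sampling step described in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation. It assumes an axis-aligned 3D Gaussian (diagonal covariance) around a given center, and the `center`, `sigma`, and `min_confidence` values are hypothetical, since the abstract does not say how they are chosen.

```python
import math


def gaussian_confidence(point, center, sigma):
    """Unnormalized axis-aligned 3D Gaussian confidence in (0, 1].

    ASSUMPTION: diagonal covariance with per-axis standard deviations
    `sigma`; the paper's exact parameterization is not in the abstract.
    """
    exponent = sum(((p - c) / s) ** 2 for p, c, s in zip(point, center, sigma))
    return math.exp(-0.5 * exponent)


def filter_by_confidence(points, center, sigma, min_confidence=0.3):
    """Keep only points whose confidence reaches min_confidence (hypothetical cutoff)."""
    return [
        p for p in points
        if gaussian_confidence(p, center, sigma) >= min_confidence
    ]


if __name__ == "__main__":
    # (x, y, z) points around an assumed object center; the last is an outlier.
    cloud = [(0.0, 0.0, 10.0), (0.5, 0.0, 10.0), (5.0, 5.0, 30.0)]
    kept = filter_by_confidence(cloud, center=(0.0, 0.0, 10.0), sigma=(1.0, 1.0, 2.0))
    print(kept)
```

Points far from the assumed center, which in this sketch stand in for points with large depth-estimation error, receive confidence near zero and are dropped before detection.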