Accurate prediction of shield tunneling-induced settlement is a complex problem that requires consideration of many influential parameters. Recent studies reveal that machine learning (ML) algorithms can predict the settlement caused by tunneling. However, well-performing ML models are usually less interpretable, and irrelevant input features decrease both the performance and the interpretability of an ML model. Nonetheless, feature selection, a critical step in the ML pipeline, is usually ignored in most studies focused on predicting tunneling-induced settlement. This study applies four techniques, i.e. the Pearson correlation method, sequential forward selection (SFS), sequential backward selection (SBS) and the Boruta algorithm, to investigate the effect of feature selection on model performance when predicting the tunneling-induced maximum surface settlement (S_(max)). The data set used in this study was compiled from two metro tunnel projects excavated in Hangzhou, China using earth pressure balance (EPB) shields, and consists of 14 input features and a single output (i.e. S_(max)). The ML model trained on the features selected by the Boruta algorithm demonstrates the best performance in both the training and testing phases. The relevant features chosen by the Boruta algorithm further indicate that tunneling-induced settlement is affected by parameters related to tunnel geometry, geological conditions and shield operation. The recently proposed Shapley additive explanations (SHAP) method explores how the input features contribute to the output of a complex ML model. It is observed that larger settlements are induced during shield tunneling in silty clay. Moreover, the SHAP analysis reveals that low magnitudes of face pressure at the top of the shield increase the model's output.
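The shadow-feature test at the heart of the Boruta algorithm can be illustrated with a minimal sketch. This toy version is a deliberate simplification: it uses an absolute-covariance proxy for feature importance instead of the random-forest importances and statistical tests the real algorithm relies on, and all names below are illustrative.

```python
import random

def shadow_feature_selection(X, y, n_trials=20, seed=0):
    """Toy sketch of Boruta's shadow-feature idea: a real feature is kept
    only if its importance beats the best importance achieved by any
    permuted (shadow) copy of the features, in most trials."""
    rng = random.Random(seed)
    n_features = len(X[0])

    def importance(col):
        # Simple importance proxy: absolute covariance with the target.
        mc = sum(col) / len(col)
        my = sum(y) / len(y)
        return abs(sum((c - mc) * (t - my) for c, t in zip(col, y)) / len(col))

    hits = [0] * n_features
    for _ in range(n_trials):
        # Build shadow features by permuting each real column and record
        # the best importance any shadow achieves in this trial.
        shadow_best = 0.0
        for j in range(n_features):
            col = [row[j] for row in X]
            rng.shuffle(col)
            shadow_best = max(shadow_best, importance(col))
        # A real feature scores a hit when it beats every shadow.
        for j in range(n_features):
            col = [row[j] for row in X]
            if importance(col) > shadow_best:
                hits[j] += 1
    # Keep features that beat the shadows in the majority of trials.
    return [j for j in range(n_features) if hits[j] > n_trials // 2]
```

On data where the target depends only on the first feature, only that feature should survive the shadow test.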
Colletotrichum kahawae (coffee berry disease, CBD) spreads through spores that can be carried by wind, rain, and insects, affecting coffee plantations and causing up to 80% yield losses and poor-quality coffee beans. The disease is hard to control precisely because its spores travel so readily. Colombian researchers previously used a deep learning system to identify CBD in coffee cherries at three growth stages, classifying photographs of infected and uninfected cherries with 93% accuracy using a random forest method. However, if the dataset is too small and noisy, such an algorithm may not learn the data patterns or generate accurate predictions. To overcome this challenge, early detection of C. kahawae disease in coffee cherries requires automated processing, prompt recognition, and accurate classification. The proposed methodology selects CBD image datasets through four different stages for training and testing. XGBoost is used to train a model on datasets of coffee berries, with each image labeled as healthy or diseased. Once the model is trained, the SHAP algorithm is applied to determine which features were essential to the model's predictions; these included the cherry's colour, whether it had spots or other damage, and how big the lesions were. Visualization is important for classification, making it possible to see how the colour of the berry correlates with the presence of disease. To evaluate the model's performance and mitigate overfitting, a 10-fold cross-validation approach is employed: the dataset is partitioned into ten subsets, and the model is trained and evaluated ten times, each time holding out a different subset for testing. In comparison to other contemporary methodologies, the proposed model achieved an accuracy of 98.56%.
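The 10-fold cross-validation procedure described above (train on nine folds, hold out the tenth, rotate through all ten) can be sketched as a plain index splitter; fold sizes and ordering here are illustrative, not necessarily the paper's exact protocol.

```python
def k_fold_splits(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation:
    each fold serves once as the held-out test set while the model is
    trained on the remaining k-1 folds."""
    indices = list(range(n_samples))
    # Distribute any remainder across the first few folds.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size
```

Each sample appears in exactly one test fold, so averaging the per-fold accuracies gives an estimate of generalization performance that uses the whole dataset.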
A lightweight malware detection and family classification system for the Internet of Things (IoT) was designed to solve the difficulty of deploying defense models caused by the limited computing and storage resources of IoT devices. By training complex models with IoT software gray-scale images and utilizing the gradient-weighted class activation mapping technique, the system can identify the key codes that influence model decisions. This allows gray-scale images to be reconstructed to train a lightweight model, called LMDNet, for malware detection. Additionally, a multi-teacher knowledge distillation method is employed to train KD-LMDNet, which focuses on classifying malware families. The results indicate that the model's identification speed surpasses that of traditional methods by 23.68%. Moreover, the accuracy achieved on the Malimg dataset for family classification is an impressive 99.07%. Furthermore, with a model size of only 0.45M, it appears to be well-suited to the IoT environment. Thus, the presented approach can address the challenges associated with malware detection and family classification on IoT devices.
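The multi-teacher knowledge distillation step can be sketched in miniature: soften each teacher's logits with a temperature, average the resulting distributions, and train the student against that soft target. The temperature value and the plain cross-entropy form below are illustrative assumptions, not the exact KD-LMDNet loss.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature > 1 flattens the distribution, exposing "dark knowledge".
    exps = [math.exp(z / temperature) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def multi_teacher_soft_targets(teacher_logits, temperature=4.0):
    """Average the temperature-softened distributions of several teachers
    to form a single soft target for the student."""
    probs = [softmax(logits, temperature) for logits in teacher_logits]
    n = len(probs)
    return [sum(p[i] for p in probs) / n for i in range(len(probs[0]))]

def distillation_loss(student_logits, soft_target, temperature=4.0):
    """Cross-entropy between the soft target and the student's softened
    prediction (the soft-label half of a typical distillation loss)."""
    student_probs = softmax(student_logits, temperature)
    return -sum(t * math.log(p) for t, p in zip(soft_target, student_probs))
```

A student whose logits agree with the teachers incurs a lower distillation loss than one that contradicts them, which is what drives the lightweight model toward the teachers' behavior.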
Dear Editor, Scene understanding is an essential task in computer vision. The ultimate objective of scene understanding is to instruct computers to understand and reason about scenes as humans do. Parallel vision is a research framework that unifies the explanation and perception of dynamic and complex scenes.
Cybersecurity increasingly relies on machine learning (ML) models to detect and respond to attacks. However, the rapidly changing data environment makes model life-cycle management after deployment essential. Real-time detection of drift signals from various threats is fundamental for effectively managing deployed models, yet detecting drift in unsupervised environments can be challenging. This study introduces a novel approach leveraging Shapley additive explanations (SHAP), a widely recognized explainability technique in ML, to address drift detection in unsupervised settings. The proposed method incorporates a range of plots and statistical techniques to enhance drift detection reliability, and introduces a drift suspicion metric that considers the explanatory aspects absent from current approaches. To validate the effectiveness of the proposed approach in a real-world scenario, we applied it to an environment designed to detect domain generation algorithms (DGAs). The dataset was obtained from various types of DGAs provided by NetLab. Based on this dataset composition, we sought to validate the proposed SHAP-based approach through drift scenarios that occur when a previously deployed model encounters new data types in an environment that detects real-world DGAs. The results revealed that more than 90% of the drift data exceeded the threshold, demonstrating the high reliability of the approach for detecting drift in an unsupervised environment. The proposed method distinguishes itself from existing approaches by employing explainable artificial intelligence (XAI)-based detection, which is not limited by model or system environment constraints; as a result, it can be applied in critical domains that require adaptation to continuous change, such as cybersecurity. It is versatile and suitable for various real-time data analysis contexts beyond DGA detection environments, and is anticipated to emerge as a new approach to protecting essential systems and infrastructure from attacks. This study thus contributes to the ML community by addressing the critical issue of managing ML models in real-world cybersecurity settings.
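A drift suspicion metric of the kind described can be sketched as follows. This is a hypothetical simplification: it treats each sample's explanation as a plain vector of SHAP-like attribution values and scores drift as the fraction of new samples whose explanations sit far from the baseline mean, which is only one of the statistical views the paper combines.

```python
def drift_suspicion(baseline_explanations, new_explanations, threshold):
    """Toy drift-suspicion score: L1 distance of each new explanation
    vector from the mean baseline explanation; the score is the fraction
    of new samples whose distance exceeds the threshold."""
    n_features = len(baseline_explanations[0])
    mean = [sum(e[j] for e in baseline_explanations) / len(baseline_explanations)
            for j in range(n_features)]

    def dist(e):
        return sum(abs(e[j] - mean[j]) for j in range(n_features))

    flags = [dist(e) > threshold for e in new_explanations]
    return sum(flags) / len(flags)
```

Under this reading, the paper's ">90% of drift data exceeded the threshold" corresponds to a suspicion score above 0.9 on drifted batches, while in-distribution batches should score near zero.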
Nowadays, deepfake is wreaking havoc on society. Deepfake content is created with the help of artificial intelligence and machine learning to replace one person's likeness with another person's in pictures or recorded videos. Although visual media manipulations are not new, the introduction of deepfakes has marked a breakthrough in creating fake media and information. These manipulated pictures and videos will undoubtedly have an enormous societal impact. Deepfakes use the latest technology, such as Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL), to construct automated methods for creating fake content that is becoming increasingly difficult to detect with the human eye. Therefore, automated solutions employing DL can be an efficient approach for detecting deepfakes. Though the “black-box” nature of DL systems allows for robust predictions, they cannot be completely trusted. Explainability is the first step toward achieving transparency, but the existing incapacity of DL to explain its own decisions to human users limits the efficacy of these systems; Explainable Artificial Intelligence (XAI) can solve this problem by interpreting their predictions. This work provides a comprehensive study of deepfake detection using DL methods and analyzes the results of the most effective algorithm with Local Interpretable Model-Agnostic Explanations (LIME) to assure its validity and reliability. This study identifies real and deepfake images using different Convolutional Neural Network (CNN) models to get the best accuracy. It also explains which parts of the image caused the model to make a specific classification, using the LIME algorithm. For the CNN models, the dataset is taken from Kaggle and includes 70k real images from the Flickr dataset collected by Nvidia and 70k fake faces of 256 px in size generated by StyleGAN. For the experiments, Jupyter Notebook, TensorFlow, NumPy, and Pandas were used as software, and InceptionResNetV2, DenseNet201, InceptionV3, and ResNet152V2 were used as CNN models. All these models performed well: InceptionV3 gained 99.68% accuracy, ResNet152V2 achieved 99.19%, and DenseNet201 reached 99.81%. However, InceptionResNetV2 achieved the highest accuracy of 99.87%, which was verified later with the LIME algorithm for XAI, where the proposed method performed best. The obtained results and their dependability demonstrate its suitability for detecting deepfake images effectively.
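The LIME step can be illustrated with a heavily simplified sketch: switch interpretable regions of the input on and off, query the black-box model, and score each region by how much its presence moves the prediction. Real LIME fits a locally weighted linear surrogate over superpixels; the mean-difference scoring below is an illustrative stand-in, and the black-box function is a placeholder.

```python
import random

def lime_importances(predict, n_features, n_samples=500, seed=0):
    """Greatly simplified LIME-style attribution: randomly switch input
    regions (features) on/off, query the black-box model on each masked
    input, and score each region by how much its presence changes the
    prediction on average."""
    rng = random.Random(seed)
    on_scores = [[] for _ in range(n_features)]
    off_scores = [[] for _ in range(n_features)]
    for _ in range(n_samples):
        mask = [rng.random() < 0.5 for _ in range(n_features)]
        p = predict(mask)
        for j, bit in enumerate(mask):
            # Record the prediction under the region's on/off state.
            (on_scores if bit else off_scores)[j].append(p)
    return [sum(on_scores[j]) / len(on_scores[j])
            - sum(off_scores[j]) / len(off_scores[j])
            for j in range(n_features)]
```

A region whose presence is decisive for the classification (e.g. the face area in a deepfake) receives a large importance, while irrelevant regions score near zero.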
Recently, convolutional neural network (CNN)-based visual inspection has been developed to detect defects on building surfaces automatically. The CNN model demonstrates remarkable accuracy in image data analysis; however, the predicted results carry uncertainty in providing accurate information to users because of the “black box” problem of deep learning models. Therefore, this study proposes a visual explanation method to overcome this uncertainty limitation of CNN-based defect identification. The visually representative gradient-weighted class activation mapping (Grad-CAM) method is adopted to provide visually explainable information. A visualizing evaluation index is proposed to quantitatively analyze visual representations; this index reflects a rough estimate of the concordance rate between the visualized heat map and the intended defects. In addition, an ablation study, adopting three-branch combinations with VGG16, is implemented to identify performance variations by visualizing the predicted results. Experiments reveal that the proposed model, combined with hybrid pooling, batch normalization, and multi-attention modules, achieves the best performance with an accuracy of 97.77%, corresponding to an improvement of 2.49% compared with the baseline model. Consequently, this study demonstrates that reliable results from an automatic defect classification model can be provided to an inspector through the visual representation of the predicted results of CNN models.
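The visualizing evaluation index is described only as a rough concordance rate between the heat map and the intended defects; one plausible reading (an assumption, not the paper's exact formula) is the fraction of hot pixels that land inside the annotated defect region:

```python
def concordance_rate(heatmap, defect_mask, threshold=0.5):
    """Rough concordance index between a Grad-CAM-style heat map and the
    intended defect region: the fraction of hot pixels (activation above
    threshold) that fall inside the annotated defect mask."""
    hot = [(i, j) for i, row in enumerate(heatmap)
           for j, v in enumerate(row) if v >= threshold]
    if not hot:
        return 0.0
    inside = sum(1 for (i, j) in hot if defect_mask[i][j])
    return inside / len(hot)
```

A rate near 1.0 means the model's attention sits on the actual defect; a low rate flags a prediction whose evidence an inspector should distrust.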
Ultrasonic testing (UT) is increasingly combined with machine learning (ML) techniques for intelligently identifying damage. Extracting significant features from UT data is essential for efficient defect characterization. Moreover, the hidden physics behind ML is unexplained, reducing the generalization capability and versatility of ML methods in UT. In this paper, a generally applicable ML framework based on a model interpretation strategy is proposed to improve the detection accuracy and computational efficiency of UT. Firstly, multi-domain features are extracted from the UT signals with signal processing techniques to construct an initial feature space. Subsequently, a feature selection method based on a model-interpretable strategy (FS-MIS) is innovatively developed by integrating Shapley additive explanations (SHAP), the filter method, the embedded method and the wrapper method. The most effective ML model and the optimal feature subset with better correlation to the target defects are determined self-adaptively. The proposed framework is validated by identifying and locating side-drilled holes (SDHs) with 0.5λ central distance and different depths. An ultrasonic array probe is adopted to acquire FMC datasets from several aluminum alloy specimens containing two SDHs in experiments. The optimal feature subset selected by FS-MIS is set as the input of the chosen ML model to train and predict the times of arrival (ToAs) of the scattered waves emitted by adjacent SDHs. The experimental results demonstrate that the relative errors of the predicted ToAs are all below 3.67% with an average error of 0.25%, significantly improving the time resolution of UT signals. On this basis, the predicted ToAs are assigned to the corresponding original signals to decouple overlapped pulse-echoes and reconstruct high-resolution FMC datasets. The imaging resolution is enhanced to 0.5λ by implementing the total focusing method (TFM). The relative errors of the hole depths and central distance are no more than 0.51% and 3.57%, respectively. Finally, the superior performance of the proposed FS-MIS is validated by comparing it with the initial feature space and conventional dimensionality reduction techniques.
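The total focusing method step can be sketched directly from its definition: every transmit-receive pair of the FMC dataset contributes the amplitude recorded at the round-trip time of flight to the focused pixel. The element geometry, sampling rate, and wave speed below are illustrative placeholders, and envelope detection and apodization are omitted.

```python
import math

def tfm_pixel(fmc, elements, x, z, c, fs):
    """Total focusing method intensity at one pixel (x, z): for every
    transmit-receive pair, look up the recorded A-scan at the round-trip
    time of flight and sum the amplitudes.

    fmc[tx][rx] is a sampled A-scan, elements is a list of element
    x-positions on the array surface (depth 0), c is the wave speed,
    fs the sampling rate."""
    total = 0.0
    for tx, ex_t in enumerate(elements):
        t_tx = math.hypot(x - ex_t, z) / c       # transmit leg
        for rx, ex_r in enumerate(elements):
            t_rx = math.hypot(x - ex_r, z) / c   # receive leg
            sample = int(round((t_tx + t_rx) * fs))
            ascan = fmc[tx][rx]
            if 0 <= sample < len(ascan):
                total += ascan[sample]
    return total
```

At a true scatterer location all pairs' echoes add coherently, which is why decoupling overlapped pulse-echoes (sharpening each pair's ToA) directly sharpens the TFM image.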
The flow regimes of a GLCC with a horizontal inlet and a vertical pipe are investigated experimentally, and velocity and pressure-drop data labeled with the corresponding flow regimes are collected. Combined with flow regime data for other GLCC positions from the existing literature, the gas and liquid superficial velocities and the pressure drops are used, respectively, as inputs to the machine learning algorithms applied to identify the flow regimes. The choice of input data types takes into consideration the availability of data in practical industrial settings, and twelve machine learning algorithms are chosen from the classical and popular classification algorithms, including typical ensemble models, SVM, KNN, Bayesian models and MLP. The flow regime identification results show that gas and liquid superficial velocities are the ideal type of input data for identifying flow regimes by machine learning: most of the ensemble models can identify the GLCC flow regimes from gas and liquid velocities with an accuracy of 0.99 or more. Pressure drops, as inputs, are not as suitable as gas and liquid velocities; only XGBoost and bagging trees can identify the GLCC flow regimes accurately from them. The successes and confusions of each algorithm are analyzed and explained based on the experimentally observed flow regime evolution processes, the flow regime map, and the principles of the algorithms. The applicability and feasibility of each algorithm, according to the different types of data, for GLCC flow regime identification are discussed.
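As one concrete example of the classical algorithms compared, a k-nearest-neighbour classifier over the two superficial velocities takes only a few lines; the regime labels and velocity values below are made up for illustration, not taken from the study's data.

```python
import math

def knn_flow_regime(train, query, k=3):
    """Minimal k-nearest-neighbour classifier of the kind compared in the
    study: each training point is ((v_gas, v_liquid), regime_label) and a
    query point is labelled by majority vote among its k closest points."""
    ranked = sorted(train, key=lambda p: math.dist(p[0], query))
    votes = {}
    for _, label in ranked[:k]:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)
```

With velocities as inputs the regimes form well-separated clusters on the flow regime map, which is why even this simple distance-based vote can perform well.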
Existing explanation methods for Convolutional Neural Networks (CNNs) lack the pixel-level visual explanations needed to generate reliable fine-grained decision features. Since there are inconsistencies between an explanation and the actual behavior of the model being interpreted, we propose a Fine-Grained Visual Explanation for CNNs, namely F-GVE, which produces a fine-grained explanation with higher consistency with the decision of the original model. The exact backward class-specific gradients with respect to the input image are obtained to highlight the object-related pixels the model uses to make its prediction. In addition, for better visualization and less noise, F-GVE selects an appropriate threshold to filter the gradient during the calculation, and the explanation map is obtained by element-wise multiplying the gradient and the input image to show fine-grained classification decision features. Experimental results demonstrate that F-GVE has good visual performance and highlights the importance of fine-grained decision features. Moreover, the faithfulness of the explanation is high, and the method is effective and practical for troubleshooting and debugging detection.
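The core of the F-GVE map, thresholding the class-specific gradient and multiplying it element-wise with the input image, can be sketched as follows; the threshold value is an illustrative assumption, not the paper's selection rule.

```python
def fine_grained_map(gradients, image, threshold=0.1):
    """Sketch of an F-GVE-style explanation map: suppress small gradient
    magnitudes with a threshold, then multiply the gradient element-wise
    with the input image so only decision-relevant pixels stay visible."""
    return [[(g * x if abs(g) >= threshold else 0.0)
             for g, x in zip(g_row, x_row)]
            for g_row, x_row in zip(gradients, image)]
```

Pixels whose gradients fall below the threshold are zeroed out, which is what removes the background noise that plain gradient-times-input maps tend to show.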
With the integration of the global economy and the rapid growth of scientific knowledge and technology, people's consumption needs are increasingly personalized and diversified. Such a market background makes sales forecasting an indispensable part of enterprise management and development. Sales forecasting means that, based on the sales situation of the past few years, an enterprise uses systematic forecasting models to estimate the quantity and value of all, or some specific, products and services it will sell within a specific future period. Accurate sales forecasting can improve an enterprise's future revenue and can also encourage it to build and keep an efficient sales management team. This paper analyzes traditional sales forecasting methods and big-data sales forecasting methods from a machine learning perspective, and then compares them. The research shows that the two kinds of sales forecasting methods have their own advantages and disadvantages; in the future, enterprises can adopt both in parallel to maximize the benefit that sales forecasting brings them.
Our reconstructions of folk concepts are often influenced by the metaphysical and epistemological doctrines we are committed to. Surprisingly enough, this influence is rarely recognized in definitional debates and has been mostly overlooked in the literature on philosophical definitions. It is frequent for philosophers to act as if only evidential support (for example, our intuitions across real and hypothetical cases) should be considered when choosing between competing reconstructions. This programmatic paper analyzes the interplay between philosophical commitments and evidence in the reconstruction of folk concepts. It also clarifies the precise manner in which metaphysical and epistemological doctrines influence philosophical definitions, why the incidence of metaphysical and epistemological doctrines is rarely recognized, and why theoretically motivated definitions should not be assimilated to the two major forms of definitions recognized in the relevant literature (descriptive and revisionary).
This essay is based on The Libido for the Ugly by Henry Louis Mencken. The writer demonstrates the persuasiveness of the work through two different aspects: persuasive and impressionistic writing skills. After detailed analysis, the reason for the strong credibility of Mencken's subjective description can be clearly shown.
In this work, different kinds of traveling wave solutions and uncategorized soliton wave solutions are obtained for a three-dimensional (3-D) nonlinear evolution equation (NEE) through the implementation of the modified extended direct algebraic method. Bright-singular and dark-singular combo solitons, Jacobi's elliptic functions, Weierstrass elliptic functions, constant wave solutions and so on are attained, along with their existence conditions. A physical interpretation of the solutions to the 3-D modified KdV-Zakharov-Kuznetsov equation is also given.
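For reference, the modified KdV-Zakharov-Kuznetsov equation discussed here is usually written in the following standard form; the exact coefficients used in the paper may differ.

```latex
% Standard form of the 3-D modified KdV--Zakharov--Kuznetsov equation
% for u = u(x, y, z, t), with nonlinearity coefficient \alpha:
u_t + \alpha\, u^{2} u_x + u_{xxx} + u_{xyy} + u_{xzz} = 0
```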
Based on a comprehensive construction of the linguistic ontology category system, the basic categorical units for nouns and verbs in Chinese can be defined. Furthermore, logic symbols can be applied to the joining of categories and to the procedure of calculating lexical meaning, thereby externalizing and formalizing that procedure. This could contribute to Chinese computational linguistics and to international Chinese teaching.
Funding: Support provided by The Science and Technology Development Fund, Macao SAR, China (File Nos. 0057/2020/AGJ and SKL-IOTSC-2021-2023) and the Science and Technology Program of Guangdong Province, China (Grant No. 2021A0505080009).
Funding: Support from the Deanship for Research & Innovation, Ministry of Education in Saudi Arabia, under the auspices of Project Number IFP22UQU4281768DSR122.
Funding: Supported by the Natural Science Foundation for Young Scientists in Shaanxi Province of China (2023-JC-QN-0729) and the Fundamental Research Funds for the Central Universities (GK202207008).
Funding: Supported by the Institute of Information and Communications Technology Planning and Evaluation (IITP) grant funded by the Korean government (MSIT) (No. 2022-0-00089, Development of clustering and analysis technology to identify cyber attack groups based on life cycle) and by the Institute of Civil Military Technology Cooperation funded by the Defense Acquisition Program Administration and Ministry of Trade, Industry and Energy of the Korean government under Grant No. 21-CM-EC-07.
Abstract: Cybersecurity increasingly relies on machine learning (ML) models to detect and respond to attacks. However, the rapidly changing data environment makes model life-cycle management after deployment essential. Real-time detection of drift signals from various threats is fundamental to managing deployed models effectively, yet detecting drift in unsupervised environments is challenging. This study introduces a novel approach that leverages Shapley additive explanations (SHAP), a widely recognized explainability technique in ML, to address drift detection in unsupervised settings. The proposed method incorporates a range of plots and statistical techniques to enhance drift-detection reliability and introduces a drift suspicion metric that captures the explanatory aspects absent from current approaches. To validate its effectiveness in a real-world scenario, we applied the approach to an environment designed to detect domain generation algorithms (DGAs). The dataset was obtained from various types of DGAs provided by NetLab. Based on this dataset composition, we validated the proposed SHAP-based approach through drift scenarios that occur when a previously deployed model encounters new data types in a real-world DGA detection environment. The results revealed that more than 90% of the drift data exceeded the threshold, demonstrating the high reliability of the approach for detecting drift in an unsupervised environment. The proposed method distinguishes itself from existing approaches by employing explainable artificial intelligence (XAI)-based detection, which is not limited by model or system environment constraints. It is therefore versatile and suitable for various real-time data analysis contexts beyond DGA detection environments, and it can be applied in critical domains that require adaptation to continuous change, such as cybersecurity. This study contributes to the ML community by addressing the critical issue of managing ML models in real-world cybersecurity settings, and the proposed method is anticipated to emerge as a new approach for protecting essential systems and infrastructure from attacks.
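The abstract does not spell out the drift suspicion metric itself. As a rough illustration of the idea, the sketch below compares per-feature SHAP attribution distributions between a reference window and a live window using a two-sample Kolmogorov-Smirnov statistic; the metric, the window layout, and the toy values are all assumptions for illustration, not the paper's actual method.

```python
# Hypothetical sketch of a SHAP-based drift suspicion score (illustrative only;
# the paper's exact metric is not reproduced here). Assumption: SHAP values for
# a reference window and a live window are available as lists of per-sample
# attribution vectors.

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: max gap between empirical CDFs."""
    a, b = sorted(a), sorted(b)
    points = sorted(set(a) | set(b))
    cdf = lambda xs, t: sum(1 for x in xs if x <= t) / len(xs)
    return max(abs(cdf(a, t) - cdf(b, t)) for t in points)

def drift_suspicion(ref_shap, live_shap):
    """Average per-feature KS distance between SHAP attribution distributions.

    Returns a score in [0, 1]; larger values suggest the model is now
    explaining its outputs differently, i.e. suspected drift.
    """
    n_features = len(ref_shap[0])
    per_feature = [
        ks_statistic([row[j] for row in ref_shap],
                     [row[j] for row in live_shap])
        for j in range(n_features)
    ]
    return sum(per_feature) / n_features

# Toy example: feature 0's attributions shift upward in the live window,
# while feature 1's distribution is unchanged.
ref = [[0.1, 0.0], [0.2, 0.1], [0.15, 0.05], [0.12, 0.02]]
live = [[0.9, 0.0], [0.8, 0.1], [0.85, 0.05], [0.95, 0.02]]
print(drift_suspicion(ref, live))
```

Flagging drift would then amount to checking this score against a calibrated threshold, as the abstract describes with its 90% exceedance result.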
Funding: Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2022R193), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia; Taif University Researchers Supporting Project (TURSP-2020/26), Taif University, Taif, Saudi Arabia.
Abstract: Nowadays, deepfake is wreaking havoc on society. Deepfake content is created with the help of artificial intelligence and machine learning to replace one person's likeness with another in pictures or recorded videos. Although visual media manipulations are not new, the introduction of deepfakes has marked a breakthrough in creating fake media and information. These manipulated pictures and videos will undoubtedly have an enormous societal impact. Deepfake uses the latest technology, such as Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL), to construct automated methods for creating fake content that is becoming increasingly difficult to detect with the human eye. Therefore, automated DL-based solutions can be an efficient approach for detecting deepfakes. Although the "black-box" nature of DL systems allows for robust predictions, they cannot be completely trusted. Explainability is the first step toward achieving transparency, but the existing inability of DL to explain its own decisions to human users limits the efficacy of these systems. Explainable Artificial Intelligence (XAI) can solve this problem by interpreting the predictions of these systems. This work provides a comprehensive study of deepfake detection using DL methods and analyzes the results of the most effective algorithm with Local Interpretable Model-Agnostic Explanations (LIME) to assure its validity and reliability. The study identifies real and deepfake images using different Convolutional Neural Network (CNN) models to obtain the best accuracy, and explains which part of the image caused the model to make a specific classification using the LIME algorithm. The dataset is taken from Kaggle and includes 70k real images from the Flickr dataset collected by Nvidia and 70k fake faces of 256 px generated by StyleGAN. For the experiments, Jupyter Notebook, TensorFlow, NumPy, and Pandas were used as software, and InceptionResNetV2, DenseNet201, InceptionV3, and ResNet152V2 were used as CNN models. All of these models performed well: InceptionV3 gained 99.68% accuracy, ResNet152V2 reached 99.19%, and DenseNet201 achieved 99.81%. However, InceptionResNetV2 achieved the highest accuracy of 99.87%, which was later verified with the LIME algorithm for XAI, where the proposed method performed best. The obtained results and their dependability demonstrate its suitability for detecting deepfake images effectively.
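To give a flavor of how perturbation-based explanation works, the sketch below occludes image regions one at a time and records the drop in the model's score. This is a simplified occlusion approach in the spirit of LIME, not the full LIME algorithm (which fits a local linear surrogate over many random superpixel masks); the region names and the toy scoring function are invented for illustration.

```python
# Simplified perturbation-based attribution (illustrative; not the full LIME
# algorithm). Regions whose occlusion hurts the model's score most are the
# regions the classifier relied on.

def region_importance(image_regions, model_score):
    """image_regions: dict region_name -> pixel block.
    model_score: callable taking the dict of visible regions and
    returning a class probability."""
    baseline = model_score(image_regions)
    importance = {}
    for name in image_regions:
        # Score the image with this one region hidden.
        occluded = {k: v for k, v in image_regions.items() if k != name}
        importance[name] = baseline - model_score(occluded)
    return importance

# Toy "model": scores higher when the hypothetical 'face' region is visible.
regions = {"face": [1.0, 1.0], "background": [0.2, 0.1]}
score = lambda visible: 0.5 + 0.4 * ("face" in visible)
print(region_importance(regions, score))
```

A deepfake detector explained this way would be expected to attribute its decision to facial regions rather than background, which is the kind of sanity check LIME provides in the study.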
Funding: Supported by a Korea Agency for Infrastructure Technology Advancement (KAIA) grant funded by the Ministry of Land, Infrastructure, and Transport (Grant 22CTAP-C163951-02).
Abstract: Recently, convolutional neural network (CNN)-based visual inspection has been developed to detect defects on building surfaces automatically. The CNN model demonstrates remarkable accuracy in image data analysis; however, the predicted results carry uncertainty in providing accurate information to users because of the "black box" problem in the deep learning model. Therefore, this study proposes a visual explanation method to overcome the uncertainty limitation of CNN-based defect identification. The visually representative gradient-weighted class activation mapping (Grad-CAM) method is adopted to provide visually explainable information. A visualization evaluation index is proposed to quantitatively analyze visual representations; this index gives a rough estimate of the concordance rate between the visualized heat map and the intended defects. In addition, an ablation study adopting three-branch combinations with VGG16 is implemented to identify performance variations by visualizing predicted results. Experiments reveal that the proposed model, combined with hybrid pooling, batch normalization, and multi-attention modules, achieves the best performance with an accuracy of 97.77%, corresponding to an improvement of 2.49% over the baseline model. Consequently, this study demonstrates that reliable results from an automatic defect classification model can be provided to an inspector through the visual representation of predicted results from CNN models.
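The core of Grad-CAM is simple enough to sketch without a deep learning framework: each convolutional channel is weighted by the global average of its gradient, the weighted activation maps are summed, and negatives are clipped. The nested-list arrays below are illustrative placeholders; a real implementation would pull activations and gradients from TensorFlow or PyTorch.

```python
# Minimal Grad-CAM sketch (illustrative; real gradients/activations would come
# from a framework's backward pass on the target class score).

def grad_cam(activations, gradients):
    """activations, gradients: [channels][height][width] for one conv layer.

    Grad-CAM weights each channel by the global average of its gradient,
    sums the weighted activation maps, and clips negatives (ReLU) so the
    heat map highlights only regions that support the target class.
    """
    k = len(activations)
    h, w = len(activations[0]), len(activations[0][0])
    # Channel importance: global average pooling over each gradient map.
    weights = [sum(sum(row) for row in gradients[c]) / (h * w) for c in range(k)]
    heatmap = [[0.0] * w for _ in range(h)]
    for c in range(k):
        for i in range(h):
            for j in range(w):
                heatmap[i][j] += weights[c] * activations[c][i][j]
    return [[max(0.0, v) for v in row] for row in heatmap]

# Toy 2-channel, 2x2 layer: channel 0 has positive gradient, channel 1 negative.
acts = [[[4.0, 0.0], [0.0, 8.0]], [[3.0, 3.0], [3.0, 3.0]]]
grads = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
print(grad_cam(acts, grads))
```

In the study's setting, the resulting heat map is upsampled onto the input photograph so the inspector can see which surface pixels drove the defect classification.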
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. U22B2068, 52275520, 52075078) and the National Key Research and Development Program of China (Grant No. 2019YFA0709003).
Abstract: Ultrasonic testing (UT) is increasingly combined with machine learning (ML) techniques for intelligently identifying damage. Extracting significant features from UT data is essential for efficient defect characterization. Moreover, the hidden physics behind ML is unexplained, reducing the generalization capability and versatility of ML methods in UT. In this paper, a generally applicable ML framework based on a model interpretation strategy is proposed to improve the detection accuracy and computational efficiency of UT. Firstly, multi-domain features are extracted from the UT signals with signal processing techniques to construct an initial feature space. Subsequently, a feature selection method based on a model-interpretable strategy (FS-MIS) is developed by integrating Shapley additive explanation (SHAP) with filter, embedded, and wrapper methods. The most effective ML model and the optimal feature subset with better correlation to the target defects are determined self-adaptively. The proposed framework is validated by identifying and locating side-drilled holes (SDHs) with 0.5λ central distance and different depths. An ultrasonic array probe is adopted to acquire full matrix capture (FMC) datasets from several aluminum alloy specimens containing two SDHs. The optimal feature subset selected by FS-MIS is set as the input of the chosen ML model to train and predict the times of arrival (ToAs) of the scattered waves emitted by adjacent SDHs. The experimental results demonstrate that the relative errors of the predicted ToAs are all below 3.67%, with an average error of 0.25%, significantly improving the time resolution of UT signals. On this basis, the predicted ToAs are assigned to the corresponding original signals to decouple overlapped pulse-echoes and reconstruct high-resolution FMC datasets. The imaging resolution is enhanced to 0.5λ by implementing the total focusing method (TFM). The relative errors of hole depths and central distance are no more than 0.51% and 3.57%, respectively. Finally, the superior performance of the proposed FS-MIS is validated by comparing it with the initial feature space and conventional dimensionality reduction techniques.
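Of the four method families that FS-MIS integrates, the wrapper stage is the easiest to illustrate in isolation. The sketch below shows plain sequential forward selection; the `score` callable stands in for cross-validated model performance on a candidate feature subset, and the utility values are invented for the example (the full FS-MIS pipeline, with its SHAP, filter, and embedded components, is not reproduced here).

```python
# Hedged sketch of a wrapper-style feature selector (sequential forward
# selection), one building block of a pipeline like FS-MIS.

def sequential_forward_selection(features, score, k):
    """Greedily grow a feature subset: at each step add the feature that
    most improves score(subset), until k features are chosen."""
    selected = []
    remaining = list(features)
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy scorer: features 'a' and 'c' are (hypothetically) the informative ones.
utility = {"a": 0.4, "b": 0.05, "c": 0.3, "d": 0.01}
score = lambda subset: sum(utility[f] for f in subset)
print(sequential_forward_selection(["a", "b", "c", "d"], score, 2))
```

In practice the scorer would retrain the candidate ML model per subset, which is why wrapper methods are accurate but expensive, and why combining them with cheap filter and SHAP-based rankings, as FS-MIS does, is attractive.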
Abstract: The flow regimes of a gas-liquid cylindrical cyclone (GLCC) with a horizontal inlet and a vertical pipe are investigated in experiments, and velocity and pressure drop data labeled with the corresponding flow regimes are collected. Combined with flow regime data for other GLCC positions from the existing literature, the gas and liquid superficial velocities and the pressure drops are used separately as inputs to the machine learning algorithms applied to identify the flow regimes. The choice of input data types takes into account the availability of data in practical industrial settings, and twelve machine learning algorithms are chosen from the classical and popular classification algorithms, including typical ensemble models, SVM, KNN, Bayesian models, and MLP. The identification results show that gas and liquid superficial velocities are the ideal type of input data for flow regime identification by machine learning. Most of the ensemble models can identify the flow regimes of the GLCC from gas and liquid velocities with an accuracy of 0.99 or more. Pressure drops are not as suitable an input as gas and liquid velocities: only XGBoost and Bagging Tree can identify the GLCC flow regimes accurately from them. The successes and confusions of each algorithm are analyzed and explained based on the experimental phenomena of flow regime evolution, the flow regime map, and the principles of the algorithms. The applicability and feasibility of each algorithm for GLCC flow regime identification with different types of data are discussed.
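To make the classification setup concrete, the sketch below classifies a flow regime from gas and liquid superficial velocities with k-nearest neighbours, one of the twelve algorithm families mentioned. The velocity values and regime labels are invented placeholders, not data from the paper.

```python
# Illustrative k-NN flow-regime classifier on (v_gas, v_liquid) inputs.
# Training points and labels below are made up for the example.
import math

def knn_predict(train, query, k=3):
    """train: list of ((v_gas, v_liquid), regime) pairs.
    Returns the majority label among the k samples closest to `query`
    in velocity space."""
    nearest = sorted(train, key=lambda s: math.dist(s[0], query))[:k]
    labels = [regime for _, regime in nearest]
    return max(set(labels), key=labels.count)

train = [
    ((0.5, 1.0), "bubble"), ((0.7, 1.2), "bubble"), ((0.6, 0.9), "bubble"),
    ((8.0, 0.2), "annular"), ((9.0, 0.3), "annular"), ((7.5, 0.1), "annular"),
]
print(knn_predict(train, (0.6, 1.1)))   # query near the bubble cluster
```

Because regimes occupy distinct zones of the flow regime map in velocity coordinates, even such a simple distance-based vote separates well-clustered regimes, which is consistent with the abstract's finding that velocities are the more discriminative input than pressure drops.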
Funding: This work was partially supported by the Beijing Natural Science Foundation (No. 4222038), the Open Research Project of the State Key Laboratory of Media Convergence and Communication (Communication University of China), the National Key R&D Program of China (No. 2021YFF0307600), and the Fundamental Research Funds for the Central Universities.
Abstract: Existing explanation methods for Convolutional Neural Networks (CNNs) lack pixel-level visualization explanations that generate reliable fine-grained decision features. Since there are inconsistencies between such explanations and the actual behavior of the model to be interpreted, we propose a Fine-Grained Visual Explanation for CNNs, namely F-GVE, which produces a fine-grained explanation with higher consistency with the decisions of the original model. The exact backward class-specific gradients with respect to the input image are obtained to highlight the object-related pixels the model uses to make its prediction. In addition, for better visualization and less noise, F-GVE selects an appropriate threshold to filter the gradient during the calculation, and the explanation map is obtained by element-wise multiplying the gradient and the input image to show fine-grained classification decision features. Experimental results demonstrate that F-GVE has good visual performance and highlights the importance of fine-grained decision features. Moreover, the faithfulness of the explanation is high, and the method is effective and practical for troubleshooting and debugging detection models.
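The two operations the abstract names, thresholding the class-specific gradient and multiplying it element-wise with the input, can be sketched directly. The arrays are plain nested lists with made-up values; in a real setting the gradient would come from a framework's backward pass, and the threshold choice would follow the paper's (unspecified here) selection rule.

```python
# Sketch of the core F-GVE map as described in the abstract: filter small
# gradient entries as noise, then take gradient * image element-wise.

def explanation_map(image, gradient, threshold):
    """Zero out gradient entries with magnitude below `threshold`
    (noise filtering), then return the element-wise product
    gradient * image as the fine-grained explanation map."""
    return [
        [(g if abs(g) >= threshold else 0.0) * x
         for g, x in zip(grad_row, img_row)]
        for grad_row, img_row in zip(gradient, image)
    ]

# Toy 2x2 image and class-specific gradient (placeholder values).
image = [[0.2, 0.8], [0.5, 0.1]]
gradient = [[0.9, 0.01], [-0.6, 0.02]]
print(explanation_map(image, gradient, threshold=0.05))
```

Pixels with near-zero gradient contribute nothing to the map, which is what suppresses the speckle noise that plain gradient-times-input visualizations suffer from.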
Abstract: With global economic integration and the rapid growth of science and technology, consumer needs are increasingly personalized and diversified. Against this market background, sales forecasting has become an indispensable part of enterprise management and development. Sales forecasting means that, based on sales in the past few years, an enterprise uses systematic forecasting models to estimate the quantity and value of all or certain specific products and services to be sold in a specific future period. Accurate sales forecasting helps enterprises improve future revenue and encourages them to build and maintain an efficient sales management team. This paper analyzes traditional sales forecasting methods and big-data-based sales forecasting methods from the perspective of machine learning, and then compares them. The analysis shows that the two approaches have their own advantages and disadvantages. In the future, enterprises can adopt the two approaches in parallel to maximize the benefit of sales forecasting.
Abstract: Our reconstructions of folk concepts are often influenced by the metaphysical and epistemological doctrines we are committed to. Surprisingly enough, this influence is rarely recognized in definitional debates and has been mostly overlooked in the literature on philosophical definitions. Philosophers frequently act as if only evidential support (for example, our intuitions across real and hypothetical cases) should be considered when choosing between competing reconstructions. This programmatic paper analyzes the interplay between philosophical commitments and evidence in the reconstruction of folk concepts. It also clarifies the precise manner in which metaphysical and epistemological doctrines influence philosophical definitions, why the incidence of those doctrines is rarely recognized, and why theoretically motivated definitions should not be assimilated to the two major forms of definitions recognized in the relevant literature (descriptive and revisionary).
Abstract: This essay is based on The Libido for the Ugly by Henry Louis Mencken. It demonstrates the persuasiveness of the work through two aspects: persuasive and impressionistic writing techniques. The detailed analysis clearly shows why Mencken's subjective description carries such strong credibility.
Abstract: In this work, different kinds of traveling wave solutions and uncategorized soliton wave solutions are obtained for three-dimensional (3-D) nonlinear evolution equations (NEEs) through the modified extended direct algebraic method. Bright-singular and dark-singular combo solitons, Jacobi elliptic functions, Weierstrass elliptic functions, constant wave solutions, and others are obtained together with their existence conditions. A physical interpretation of the solutions to the 3-D modified KdV-Zakharov-Kuznetsov equation is also given.
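For reference, the 3-D modified KdV-Zakharov-Kuznetsov equation is commonly written in the following form (this is the standard form found in the literature, with an arbitrary nonlinearity coefficient α; the paper's own coefficient conventions may differ):

```latex
u_t + \alpha\, u^{2} u_x + u_{xxx} + u_{xyy} + u_{xzz} = 0
```

The cubic nonlinearity $u^{2}u_x$ distinguishes the modified equation from the ordinary KdV-ZK equation, whose nonlinear term is quadratic, $u\,u_x$.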
Funding: Funded by the 62nd China Postdoctoral Science Foundation grant, "Research on the Intercommunity of Chinese Nominal Verb and Subject Words and Its Semantic Category Analysis" (2017M622893).
Abstract: Based on a comprehensive construction of the linguistic ontology category system, the basic categorical units for nouns and verbs in Chinese can be defined. Furthermore, logic symbols can be applied to the joining of categories and to the procedure of calculating lexical meaning, thereby externalizing and formalizing that procedure. This could contribute to Chinese computational linguistics and international Chinese teaching.