11 articles found
Investigation of feature contribution to shield tunneling-induced settlement using Shapley additive explanations method (Cited by: 7)
1
Authors: K.K.Pabodha M.Kannangara, Wanhuan Zhou, Zhi Ding, Zhehao Hong. Journal of Rock Mechanics and Geotechnical Engineering (SCIE, CSCD), 2022, No. 4, pp. 1052-1063 (12 pages)
Accurate prediction of shield tunneling-induced settlement is a complex problem that requires consideration of many influential parameters. Recent studies reveal that machine learning (ML) algorithms can predict the settlement caused by tunneling. However, well-performing ML models are usually less interpretable. Irrelevant input features decrease the performance and interpretability of an ML model. Nonetheless, feature selection, a critical step in the ML pipeline, is usually ignored in most studies that focused on predicting tunneling-induced settlement. This study applies four techniques, i.e., the Pearson correlation method, sequential forward selection (SFS), sequential backward selection (SBS) and the Boruta algorithm, to investigate the effect of feature selection on the model’s performance when predicting the tunneling-induced maximum surface settlement (S_max). The data set used in this study was compiled from two metro tunnel projects excavated in Hangzhou, China using earth pressure balance (EPB) shields and consists of 14 input features and a single output (i.e., S_max). The ML model that is trained on features selected from the Boruta algorithm demonstrates the best performance in both the training and testing phases. The relevant features chosen from the Boruta algorithm further indicate that tunneling-induced settlement is affected by parameters related to tunnel geometry, geological conditions and shield operation. The recently proposed Shapley additive explanations (SHAP) method explores how the input features contribute to the output of a complex ML model. It is observed that the larger settlements are induced during shield tunneling in silty clay. Moreover, the SHAP analysis reveals that the low magnitudes of face pressure at the top of the shield increase the model’s output.
Keywords: Feature selection; Shield operational parameters; Pearson correlation method; Boruta algorithm; Shapley additive explanations (SHAP) analysis
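The idea behind the SHAP method named in this entry can be illustrated with a minimal sketch of exact Shapley values. This is not the paper's model or the `shap` library: the value function `toy_value` and the feature names `face_pressure`, `soil_type`, and `cover_depth` are hypothetical, chosen only to show how a prediction is split among features by averaging each feature's marginal contribution over all orderings.

```python
from itertools import permutations


def shapley_values(value_fn, features):
    """Exact Shapley values: average each feature's marginal contribution
    over every possible ordering of the features."""
    contrib = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        coalition = frozenset()
        for f in order:
            with_f = coalition | {f}
            # Marginal contribution of f when added to the current coalition.
            contrib[f] += value_fn(with_f) - value_fn(coalition)
            coalition = with_f
    return {f: total / len(orderings) for f, total in contrib.items()}


def toy_value(coalition):
    """Hypothetical value function (an assumption, not from the paper):
    two features act additively and also share an interaction term."""
    v = 0.0
    if "face_pressure" in coalition:
        v += 2.0
    if "soil_type" in coalition:
        v += 3.0
    if "face_pressure" in coalition and "soil_type" in coalition:
        v += 1.0  # interaction, split equally between the two features
    return v


phi = shapley_values(toy_value, ["face_pressure", "soil_type", "cover_depth"])
# The attributions sum to value(all features) - value(no features),
# and the feature that never changes the value receives zero.
```

Exact enumeration grows factorially with the number of features, so practical SHAP implementations rely on approximations or model-specific algorithms; the sketch is only meant to show the definition.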
Early Detection of Colletotrichum Kahawae Disease in Coffee Cherry Based on Computer Vision Techniques
2
Authors: Raveena Selvanarayanan, Surendran Rajendran, Youseef Alotaibi. Computer Modeling in Engineering & Sciences (SCIE, EI), 2024, No. 4, pp. 759-782 (24 pages)
Colletotrichum kahawae (Coffee Berry Disease, CBD) spreads through spores that can be carried by wind, rain, and insects affecting coffee plantations, and causes 80% yield losses and poor-quality coffee beans. The deadly disease is hard to control because wind, rain, and insects carry spores. Colombian researchers utilized a deep learning system to identify CBD in coffee cherries at three growth stages and classify photographs of infected and uninfected cherries with 93% accuracy using a random forest method. If the dataset is too small and noisy, the algorithm may not learn data patterns and generate accurate predictions. To overcome the existing challenge, early detection of Colletotrichum kahawae disease in coffee cherries requires automated processes, prompt recognition, and accurate classifications. The proposed methodology selects CBD image datasets through four different stages for training and testing. XGBoost is used to train a model on datasets of coffee berries, with each image labeled as healthy or diseased. Once the model is trained, the SHAP algorithm is used to figure out which features were essential for making predictions with the proposed model. Some of these characteristics were the cherry’s colour, whether it had spots or other damage, and how big the lesions were. Virtual inception is important for classification to virtualize the relationship between the colour of the berry and the presence of disease. To evaluate the model’s performance and mitigate overfitting, a 10-fold cross-validation approach is employed. This involves partitioning the dataset into ten subsets, training the model on each subset, and evaluating its performance. In comparison to other contemporary methodologies, the model put forth achieved an accuracy of 98.56%.
Keywords: Computer vision; Coffee berry disease; Colletotrichum kahawae; XGBoost; Shapley additive explanations
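The 10-fold cross-validation mentioned in this abstract can be sketched as follows. This is a generic illustration in plain Python, not the authors' pipeline: the helper name `k_fold_indices` is hypothetical, and no shuffling is applied. In the usual formulation each of the ten subsets serves once as the held-out test set while the model is trained on the other nine.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Each fold is used exactly once as the held-out test set; the remaining
    k - 1 folds form the training set. Indices are kept in order here, so
    shuffle the data first if its ordering carries information.
    """
    indices = list(range(n_samples))
    # Spread the remainder so fold sizes differ by at most one sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# Usage: fit on train_idx and score on test_idx in each round, then
# average the ten scores to estimate out-of-sample performance.
folds = list(k_fold_indices(25, k=10))
```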
Dynamic Forecasting of Traffic Event Duration in Istanbul:A Classification Approach with Real-Time Data Integration
3
Authors: Mesut Ulu, Yusuf Sait Türkan, Kenan Menguc, Ersin Namlı, Tarık Kucukdeniz. Computers, Materials & Continua (SCIE, EI), 2024, No. 8, pp. 2259-2281 (23 pages)
Today, urban traffic, growing populations, and dense transportation networks are contributing to an increase in traffic incidents. These incidents include traffic accidents, vehicle breakdowns, fires, and traffic disputes, resulting in long waiting times, high carbon emissions, and other undesirable situations. It is vital to estimate incident response times quickly and accurately after traffic incidents occur for the success of incident-related planning and response activities. This study presents a model for forecasting the traffic incident duration of traffic events with high precision. The proposed model goes through a 4-stage process using various features to predict the duration of four different traffic events and presents a feature reduction approach to enable real-time data collection and prediction. In the first stage, the dataset consisting of 24,431 data points and 75 variables is prepared by data collection, merging, missing data processing and data cleaning. In the second stage, models such as Decision Trees (DT), K-Nearest Neighbour (KNN), Random Forest (RF) and Support Vector Machines (SVM) are used and hyperparameter optimisation is performed with GridSearchCV. In the third stage, feature selection and reduction are performed and real-time data are used. In the last stage, model performance with 14 variables is evaluated with metrics such as accuracy, precision, recall, F1-score, MCC, confusion matrix and SHAP. The RF model outperforms other models with an accuracy of 98.5%. The study’s prediction results demonstrate that the proposed dynamic prediction model can achieve a high level of success.
Keywords: Traffic event duration forecasting; Machine learning; Feature reduction; Shapley additive explanations (SHAP)
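The evaluation metrics listed in this abstract (accuracy, precision, recall, F1-score, MCC) all derive from confusion-matrix counts. The sketch below computes them for the binary case only, as a simplifying assumption; the study itself classifies multiple event-duration classes, and the function name `binary_metrics` and the example counts are hypothetical.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Compute common classification metrics from binary confusion counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives. Undefined ratios (zero denominator) are reported as 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # Matthews correlation coefficient: balanced even when classes are skewed.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "mcc": mcc}


# Hypothetical counts, for illustration only.
m = binary_metrics(tp=40, fp=10, fn=10, tn=40)
```

Reporting MCC alongside accuracy is a reasonable choice when class sizes differ, since accuracy alone can look high for a model that mostly predicts the majority class.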
Landslide susceptibility mapping (LSM) based on different boosting and hyperparameter optimization algorithms: A case of Wanzhou District, China
4
Authors: Deliang Sun, Jing Wang, Haijia Wen, YueKai Ding, Changlin Mi. Journal of Rock Mechanics and Geotechnical Engineering (SCIE, CSCD), 2024, No. 8, pp. 3221-3232 (12 pages)
Boosting algorithms have been widely utilized in the development of landslide susceptibility mapping (LSM) studies. However, these algorithms possess distinct computational strategies and hyperparameters, making it challenging to propose an ideal LSM model. To investigate the impact of different boosting algorithms and hyperparameter optimization algorithms on LSM, this study constructed a geospatial database comprising 12 conditioning factors, such as elevation, stratum, and annual average rainfall. The XGBoost (XGB), LightGBM (LGBM), and CatBoost (CB) algorithms were employed to construct the LSM model. Furthermore, the Bayesian optimization (BO), particle swarm optimization (PSO), and Hyperband optimization (HO) algorithms were applied to optimizing the LSM model. The boosting algorithms exhibited varying performances, with CB demonstrating the highest precision, followed by LGBM, and XGB showing poorer precision. Additionally, the hyperparameter optimization algorithms displayed different performances, with HO outperforming PSO and BO showing poorer performance. The HO-CB model achieved the highest precision, boasting an accuracy of 0.764, an F1-score of 0.777, an area under the curve (AUC) value of 0.837 for the training set, and an AUC value of 0.863 for the test set. The model was interpreted using SHapley Additive exPlanations (SHAP), revealing that slope, curvature, topographic wetness index (TWI), degree of relief, and elevation significantly influenced landslides in the study area. This study offers a scientific reference for LSM and disaster prevention research. This study examines the utilization of various boosting algorithms and hyperparameter optimization algorithms in Wanzhou District. It proposes the HO-CB-SHAP framework as an effective approach to accurately forecast landslide disasters and interpret LSM models. However, limitations exist concerning the generalizability of the model and the data processing, which require further exploration in subsequent studies.
Keywords: Landslide susceptibility; Hyperparameter optimization; Boosting algorithms; Shapley additive explanations (SHAP)
Explainable Artificial Intelligence-Based Model Drift Detection Applicable to Unsupervised Environments
5
Authors: Yongsoo Lee, Yeeun Lee, Eungyu Lee, Taejin Lee. Computers, Materials & Continua (SCIE, EI), 2023, No. 8, pp. 1701-1719 (19 pages)
Cybersecurity increasingly relies on machine learning (ML) models to respond to and detect attacks. However, the rapidly changing data environment makes model life-cycle management after deployment essential. Real-time detection of drift signals from various threats is fundamental for effectively managing deployed models. However, detecting drift in unsupervised environments can be challenging. This study introduces a novel approach leveraging Shapley additive explanations (SHAP), a widely recognized explainability technique in ML, to address drift detection in unsupervised settings. The proposed method incorporates a range of plots and statistical techniques to enhance drift detection reliability and introduces a drift suspicion metric that considers the explanatory aspects absent in the current approaches. To validate the effectiveness of the proposed approach in a real-world scenario, we applied it to an environment designed to detect domain generation algorithms (DGAs). The dataset was obtained from various types of DGAs provided by NetLab. Based on this dataset composition, we sought to validate the proposed SHAP-based approach through drift scenarios that occur when a previously deployed model detects new data types in an environment that detects real-world DGAs. The results revealed that more than 90% of the drift data exceeded the threshold, demonstrating the high reliability of the approach to detect drift in an unsupervised environment. The proposed method distinguishes itself from existing approaches by employing explainable artificial intelligence (XAI)-based detection, which is not limited by model or system environment constraints. In conclusion, this paper proposes a novel approach to detect drift in unsupervised ML settings for cybersecurity. The proposed method employs SHAP-based XAI and a drift suspicion metric to improve drift detection reliability. It is versatile and suitable for various real-time data analysis contexts beyond DGA detection environments. This study significantly contributes to the ML community by addressing the critical issue of managing ML models in real-world cybersecurity settings. Our approach is distinguishable from existing techniques by employing XAI-based detection, which is not limited by model or system environment constraints. As a result, our method can be applied in critical domains that require adaptation to continuous changes, such as cybersecurity. Through extensive validation across diverse settings beyond DGA detection environments, the proposed method will emerge as a versatile drift detection technique suitable for a wide range of real-time data analysis contexts. It is also anticipated to emerge as a new approach to protect essential systems and infrastructures from attacks.
Keywords: Cybersecurity; Machine learning (ML); Model life-cycle management; Drift detection; Unsupervised environments; Shapley additive explanations (SHAP); Explainability
Bike-Sharing Demand Prediction Considering the Interactive Effects of the Built Environment
6
Authors: 魏晋, 安实, 张炎棠. 《科学技术与工程》 (PKU Core), 2023, No. 26, pp. 11424-11430 (7 pages)
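The development of bike sharing supports energy-saving, low-emission, green transportation. The built environment is an important factor affecting bike-sharing travel demand, yet few researchers have examined the interactions among its components. To analyze these interactions accurately and thereby predict bike-sharing travel demand precisely, multi-source data from Shenzhen were used, including bike-sharing trip data, point of interest (POI) data, road network data, and bus route data. A gradient boosting decision tree (GBDT) model was used to predict bike-sharing travel demand, and its results were compared with those of a back propagation (BP) neural network model. Finally, the SHAP (Shapley additive explanation) method was used to explain how each factor in the GBDT model affects bike-sharing travel demand and to analyze the factors and their interactions. The experimental results show that the GBDT model achieves a mean absolute error of 0.683 and a root mean square error of 0.728, which is more accurate than the BP neural network model. The SHAP analysis shows that transport attributes such as bike lane density and the number of bus stops have a clear effect on bike-sharing travel demand, that land-use mix does not act in a simple linear way, and that complex interactions exist among different POI types. The GBDT model combined with the SHAP method can therefore be used to predict bike-sharing travel demand and analyze its influencing factors, providing a basis for suggestions to improve bike-sharing development.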
Keywords: Bike sharing; Demand prediction; POI data; Gradient boosting decision tree; SHAP (Shapley additive explanation)
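The two error measures reported in this abstract, mean absolute error (MAE) and root mean square error (RMSE), can be sketched as follows. The sample values are hypothetical and unrelated to the study's data. Because squaring weights large errors more heavily, RMSE is never smaller than MAE on the same predictions, which is consistent with the reported pair (0.683 and 0.728).

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error: average magnitude of the prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error: penalizes large errors more than MAE does."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


# Hypothetical observed and predicted demand values, for illustration only.
observed = [1.0, 2.0, 3.0]
predicted = [2.0, 2.0, 5.0]
mae_value = mae(observed, predicted)
rmse_value = rmse(observed, predicted)
```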
Improving Ultrasonic Testing by Using Machine Learning Framework Based on Model Interpretation Strategy
7
Authors: Siqi Shi, Shijie Jin, Donghui Zhang, Jingyu Liao, Dongxin Fu, Li Lin. Chinese Journal of Mechanical Engineering (SCIE, EI, CAS, CSCD), 2023, No. 5, pp. 174-186 (13 pages)
Ultrasonic testing (UT) is increasingly combined with machine learning (ML) techniques for intelligently identifying damage. Extracting significant features from UT data is essential for efficient defect characterization. Moreover, the hidden physics behind ML is unexplained, reducing the generalization capability and versatility of ML methods in UT. In this paper, a generally applicable ML framework based on the model interpretation strategy is proposed to improve the detection accuracy and computational efficiency of UT. Firstly, multi-domain features are extracted from the UT signals with signal processing techniques to construct an initial feature space. Subsequently, a feature selection method based on model interpretable strategy (FS-MIS) is innovatively developed by integrating Shapley additive explanation (SHAP), filter method, embedded method and wrapper method. The most effective ML model and the optimal feature subset with better correlation to the target defects are determined self-adaptively. The proposed framework is validated by identifying and locating side-drilled holes (SDHs) with 0.5λ central distance and different depths. An ultrasonic array probe is adopted to acquire FMC datasets from several aluminum alloy specimens containing two SDHs by experiments. The optimal feature subset selected by FS-MIS is set as the input of the chosen ML model to train and predict the times of arrival (ToAs) of the scattered waves emitted by adjacent SDHs. The experimental results demonstrate that the relative errors of the predicted ToAs are all below 3.67% with an average error of 0.25%, significantly improving the time resolution of UT signals. On this basis, the predicted ToAs are assigned to the corresponding original signals for decoupling overlapped pulse-echoes and reconstructing high-resolution FMC datasets. The imaging resolution is enhanced to 0.5λ by implementing the total focusing method (TFM). The relative errors of hole depths and central distance are no more than 0.51% and 3.57%, respectively. Finally, the superior performance of the proposed FS-MIS is validated by comparing it with initial feature space and conventional dimensionality reduction techniques.
Keywords: Ultrasonic testing; Machine learning; Feature extraction; Feature selection; Shapley additive explanation
Forecasting and Analysis of Industrial Growth Based on Regression Tree Ensemble Learning Methods (Cited by: 1)
8
Authors: 陈磊, 李丽娟. 《计量经济学报》 (CSCD), 2024, No. 1, pp. 104-129 (26 pages)
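From a large set of variables, this paper selects 59 relevant economic indicators and examines how well traditional time series models and several regression tree ensemble learning models forecast the growth rate of China's industrial value added before and after the pandemic, using the Shapley additive explanations (SHAP) method to interpret the roles of the predictor variables. The study finds that as the forecast horizon lengthens and with the outbreak of COVID-19, the forecasting performance of traditional time series models weakens markedly, whereas ensemble learning models perform relatively well; among them, the gradient boosting tree model is more robust and accurate at longer forecast horizons. The SHAP-based analysis finds that the importance of the economic indicators used as predictors varies across periods. Besides production and investment indicators, financial variables also have some predictive power in high-risk periods, so suitable indicators for forecasting industrial growth should be chosen according to the specific period and forecasting objective. From a forecasting perspective, the results suggest to some extent that the COVID-19 shock may not change the fundamentals of the future trend of industrial growth.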
Keywords: Industrial value added forecasting; Regression tree ensemble learning; Shapley additive explanations (SHAP) method; Gradient boosting tree model
A Prognostic Model for Severe Chronic Obstructive Pulmonary Disease Based on Interpretable Machine Learning
9
Authors: 耿祺焜, 李吉利, 胡雲迪, 李润泓, 向涛, 张双弋, 李镭. 《中国呼吸与危重监护杂志》 (CAS, CSCD), 2024, No. 3, pp. 153-159 (7 pages)
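Objective: To build a machine learning model that predicts the risk of death in patients with severe chronic obstructive pulmonary disease (COPD), to explore and explain the factors associated with that risk, and to address the "black box" problem of machine learning models. Methods: A total of 8088 patients with severe COPD from the US multicenter emergency intensive care unit (eICU) database were enrolled. Data from the first 24 h of each intensive care unit admission were extracted and randomly split, with 70% used for model training and 30% for model validation. LASSO regression was used to select predictor variables and avoid overfitting. Five machine learning models were used to predict in-hospital mortality. The predictive performance of the five models and the APACHE IVa score was compared by area under the curve (AUC), and the SHAP (SHapley Additive exPlanations) method was used to interpret the predictions of the random forest (RF) model. Results: The RF model performed best among the five machine learning models and the APACHE IVa scoring system, with an AUC of 0.830 (95% confidence interval 0.806 to 0.855). The SHAP method identified the 10 most important predictor variables, of which the minimum non-invasive systolic blood pressure was the most important. Conclusion: Identifying risk factors with machine learning and interpreting the predictions with the SHAP method allows early prediction of patients' mortality risk, which helps clinicians make accurate treatment plans and allocate medical resources rationally.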
Keywords: Chronic obstructive pulmonary disease; Machine learning; eICU Collaborative Research Database; Mortality; SHAP (Shapley additive explanations) method
Machine learning-based detection of cervical spondylotic myelopathy using multiple gait parameters
10
Authors: Xinyu Ji, Wei Zeng, Qihang Dai, Yuyan Zhang, Shaoyi Du, Bing Ji. Biomimetic Intelligence & Robotics, 2023, No. 2, pp. 30-40 (11 pages)
Cervical spondylotic myelopathy (CSM) is the main cause of adult spinal cord dysfunction, mostly appearing in middle-aged and elderly patients. Currently, the diagnosis of this condition depends mainly on the available imaging tools such as X-ray, computed tomography and magnetic resonance imaging (MRI), of which MRI is the gold standard for clinical diagnosis. However, MRI data cannot clearly demonstrate the dynamic characteristics of CSM, and the overall process is far from cost-efficient. Therefore, this study proposes a new method using multiple gait parameters and shallow classifiers to dynamically detect the occurrence of CSM. In the present study, 45 patients with CSM and 45 age-matched asymptomatic healthy controls (HCs) were recruited, and a three-dimensional (3D) motion capture system was utilized to capture the locomotion data. Furthermore, 63 spatiotemporal, kinematic, and nonlinear parameters were extracted, including lower limb joint angles in the sagittal, coronal, and transverse planes. Then, the Shapley Additive exPlanations (SHAP) value was utilized for feature selection and reduction of the dimensionality of features, and five traditional shallow classifiers, including support vector machine (SVM), logistic regression (LR), k-nearest neighbor (KNN), decision tree (DT), and random forest (RF), were used to classify gait patterns between CSM patients and HCs. On the basis of the 10-fold cross-validation method, the highest average accuracy was achieved by SVM (95.56%). Our results demonstrated that the proposed method could effectively detect CSM and thus serve as an automated auxiliary tool for the clinical diagnosis of CSM.
Keywords: Cervical spondylotic myelopathy; Gait analysis; Machine learning; Shapley additive explanations
An explainable framework for load forecasting of a regional integrated energy system based on coupled features and multi-task learning (Cited by: 4)
11
Authors: Kailang Wu, Jie Gu, Lu Meng, Honglin Wen, Jinghuan Ma. Protection and Control of Modern Power Systems, 2022, No. 1, pp. 349-362 (14 pages)
To extract strong correlations between different energy loads and improve the interpretability and accuracy for load forecasting of a regional integrated energy system (RIES), an explainable framework for load forecasting of an RIES is proposed. This includes the load forecasting model of RIES and its interpretation. A coupled feature extracting strategy is adopted to construct coupled features between loads as the input variables of the model. It is designed based on multi-task learning (MTL) with a long short-term memory (LSTM) model as the sharing layer. Based on SHapley Additive exPlanations (SHAP), this explainable framework combines global and local interpretations to improve the interpretability of load forecasting of the RIES. In addition, an input variable selection strategy based on the global SHAP value is proposed to select input feature variables of the model. A case study is given to verify the effectiveness of the proposed model, constructed coupled features, and input variable selection strategy. The results show that the explainable framework intuitively improves the interpretability of the prediction model.
Keywords: Load forecasting; Regional integrated energy system; Coupled feature; Shapley additive explanations; Interpretability of deep learning