Purpose: The purpose of this study is to develop and compare model choice strategies in the context of logistic regression. Model choice means the choice of the covariates to be included in the model. Design/methodology/approach: The study is based on Monte Carlo simulations. The methods are compared in terms of three measures of accuracy: specificity and two kinds of sensitivity. A loss function combining sensitivity and specificity is introduced and used for a final comparison. Findings: The choice of method depends on how much the users emphasize sensitivity against specificity. It also depends on the sample size. For a typical logistic regression setting with a moderate sample size and a small to moderate effect size, either BIC, BICc or Lasso seems to be optimal. Research limitations: Numerical simulations cannot cover the whole range of data-generating processes occurring with real-world data; thus, more simulations are needed. Practical implications: Researchers can refer to these results if they believe that their data-generating process is somewhat similar to some of the scenarios presented in this paper. Alternatively, they could run their own simulations and calculate the loss function. Originality/value: This is a systematic comparison of model choice algorithms and heuristics in the context of logistic regression. The distinction between two types of sensitivity and a comparison based on a loss function are methodological novelties.
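As a rough illustration of the kind of comparison described in this abstract, the sketch below runs an exhaustive BIC-based covariate search for a logistic regression on simulated data and then scores the selection with a loss combining sensitivity and specificity. The simulated data, the equal weighting w, and the exhaustive search are assumptions for illustration, not the paper's protocol.

```python
# Illustrative sketch (not the paper's code): exhaustive BIC-based covariate
# selection for a logistic regression, plus a loss combining sensitivity and
# specificity of the selection. Data and the weight w are assumptions.
from itertools import combinations
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n, p = 200, 5
X = rng.normal(size=(n, p))
beta = np.array([0.8, 0.5, 0.0, 0.0, 0.0])          # only two true covariates
y = rng.binomial(1, 1 / (1 + np.exp(-(X @ beta))))

def fit_bic(cols):
    """Fit a logistic regression on the given columns and return its BIC."""
    design = sm.add_constant(X[:, list(cols)])
    return sm.Logit(y, design).fit(disp=0).bic

# Pick the covariate subset with the smallest BIC.
subsets = [c for k in range(1, p + 1) for c in combinations(range(p), k)]
best = min(subsets, key=fit_bic)

# Sensitivity/specificity of the selection itself (did we recover beta != 0?).
truth = beta != 0
chosen = np.isin(np.arange(p), best)
sensitivity = (chosen & truth).sum() / truth.sum()
specificity = (~chosen & ~truth).sum() / (~truth).sum()

w = 0.5                                              # assumed weighting
loss = w * (1 - sensitivity) + (1 - w) * (1 - specificity)
print(best, sensitivity, specificity, loss)
```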
The burning of crop residues in fields is a significant global biomass burning activity, a key element of the terrestrial carbon cycle, and an important source of atmospheric trace gases and aerosols. Accurate estimation of cropland burned area is both crucial and challenging, especially for the small and fragmented burn scars in China. Here we developed an automated burned area mapping algorithm implemented with Sentinel-2 Multi Spectral Instrument (MSI) data, and its effectiveness was tested on the Songnen Plain, Northeast China, as a case study using satellite imagery from 2020. We employed a logistic regression method to integrate multiple spectral data into a synthetic indicator, and compared the results with manually interpreted burned area reference maps and the Moderate-Resolution Imaging Spectroradiometer (MODIS) MCD64A1 burned area product. The overall accuracy of the single-variable logistic regression was 77.38% to 86.90% and 73.47% to 97.14% for the 52TCQ and 51TYM cases, respectively. In comparison, the accuracy of the burned area map was improved to 87.14% and 98.33% for the 52TCQ and 51TYM cases, respectively, by multiple-variable logistic regression of Sentinel-2 images. The balance of omission error and commission error was also improved. The integration of multiple spectral data with a logistic regression method proves to be effective for burned area detection, offering a highly automated process with an automatic threshold determination mechanism. The method exhibits excellent extensibility and flexibility, taking the image tile as the operating unit. It is suitable for burned area detection at a regional scale and can also be implemented with other satellite data.
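A minimal sketch of the idea described above follows: several per-pixel spectral indices are combined into one burn probability with logistic regression, which is then thresholded into a burned/unburned mask. The index names, the synthetic labels and the 0.5 threshold are assumptions, not the authors' pipeline.

```python
# Sketch: logistic regression as a synthetic burn indicator over spectral
# features, followed by thresholding; all inputs here are synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n_pixels = 10_000
# Stand-ins for per-pixel spectral indices (e.g. NBR, NDVI, a SWIR band).
features = rng.normal(size=(n_pixels, 3))
labels = rng.binomial(1, 1 / (1 + np.exp(-features[:, 0] + features[:, 1])))

model = LogisticRegression().fit(features, labels)        # synthetic indicator
burn_prob = model.predict_proba(features)[:, 1]
burned_mask = burn_prob > 0.5                              # assumed threshold

omission = ((labels == 1) & ~burned_mask).sum() / (labels == 1).sum()
commission = ((labels == 0) & burned_mask).sum() / max(burned_mask.sum(), 1)
print(f"omission={omission:.3f}, commission={commission:.3f}")
```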
For the composition analysis and identification of ancient glass products, L1 regularization, K-Means cluster analysis, the elbow rule and other methods were used together to build logistic regression, cluster analysis and hyper-parameter test models, and SPSS, Python and other tools were used to obtain the classification rules of glass products under different fluxes, the sub-classification under different chemical compositions, the hyper-parameter K value test and a rationality analysis. The research can provide theoretical support for the protection and restoration of ancient glass relics.
The Internet of Things (IoT) is a popular social network in which devices are virtually connected for communicating and sharing information. It is applied widely in business enterprises and government sectors for delivering services to their customers, clients and citizens. However, the interaction is successful only on the basis of the trust that each device has in another; thus trust is essential for such a social network. As the Internet of Things has access to sensitive information, it is exposed to many threats that put data management at risk. This issue is addressed by trust management, which helps to decide on the trustworthiness of the requestor and provider before communication and sharing. Several trust-based systems exist for different domains, using the dynamic weight method, fuzzy classification, Bayes inference and, in very few cases, regression analysis for IoT. The proposed algorithm is based on logistic regression, which provides a strong statistical background for trust prediction. To strengthen the case for regression-based trust, we compared its performance with an equivalent Bayes analysis using the Beta distribution. The performance is studied in a simulated IoT setup with Quality of Service (QoS) and social parameters for the nodes. The proposed model performs better in terms of various metrics. An IoT connects heterogeneous devices such as tags and sensor devices for sharing information and accessing different application services. The most salient features of an IoT system are scalability, extendibility, compatibility and resiliency against attack. The existing work finds a way to integrate direct and indirect trust to converge quickly and to estimate the bias due to attacks, in addition to the above features.
Autism spectrum disorder (ASD), classified as a developmental disability, is now more common in children than ever. A drastic increase in the rate of autism spectrum disorder in children worldwide demands early detection of autism in children. Parents can seek professional help for a better prognosis of the child's therapy when ASD is diagnosed before the age of five. This research study aims to develop an automated tool for diagnosing autism in children. The computer-aided diagnosis tool for ASD detection is designed and developed by a novel methodology that includes data acquisition, feature selection, and classification phases. The most deterministic features are selected from the self-acquired dataset by novel feature selection methods before classification. The Imperialistic Competitive Algorithm (ICA), based on empires conquering colonies, performs feature selection in this study. The performance of Logistic Regression (LR), Decision Tree, K-Nearest Neighbor (KNN), and Random Forest (RF) classifiers is experimentally studied in this research work. The experimental results prove that the logistic regression classifier exhibits the highest accuracy for the self-acquired dataset. ASD detection is also evaluated experimentally with the Least Absolute Shrinkage and Selection Operator (LASSO) feature selection method and different classifiers. The Exploratory Data Analysis (EDA) phase has uncovered crucial facts about the data, such as the correlation of the features in the dataset with the class variable.
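The ICA-based selector is the paper's own contribution and is not reproduced here; as a hedged sketch of the LASSO-selection-plus-classifier comparison that the abstract also reports, the snippet below wraps an L1-penalised logistic regression as the feature selector in front of the four classifiers. The synthetic data and classifier settings are assumptions.

```python
# Sketch: LASSO-style feature selection followed by a small classifier
# comparison; data and hyperparameters are placeholders.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

X, y = make_classification(n_samples=300, n_features=30, n_informative=6,
                           random_state=0)

# L1-penalised logistic regression acts as the LASSO-style feature selector.
selector = SelectFromModel(
    LogisticRegression(penalty="l1", solver="liblinear", C=0.5))

classifiers = {
    "LR": LogisticRegression(max_iter=1000),
    "DT": DecisionTreeClassifier(random_state=0),
    "KNN": KNeighborsClassifier(),
    "RF": RandomForestClassifier(random_state=0),
}
for name, clf in classifiers.items():
    pipe = make_pipeline(selector, clf)
    acc = cross_val_score(pipe, X, y, cv=5).mean()
    print(f"{name}: {acc:.3f}")
```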
This paper focuses on ozone prediction in the atmosphere using a machine learning approach. We utilize air pollutant and meteorological variable datasets from the El Paso area to classify ozone levels as high or low. The LR and ANN algorithms are employed to train on the datasets. The models demonstrate a remarkably high classification accuracy of 89.3% in predicting ozone levels on a given day. Evaluation metrics reveal that the ANN and LR models exhibit accuracies of 89.3% and 88.4%, respectively. Additionally, the AUC values for both models are comparable, with the ANN achieving 95.4% and the LR obtaining 95.2%. The lower the cross-entropy loss (log loss), the better the model's performance. Our ANN model yields a log loss of 3.74, while the LR model shows a log loss of 6.03. The prediction time for the ANN model is approximately 0.00 seconds, whereas the LR model takes 0.02 seconds. Our odds ratio analysis indicates that features such as “Solar radiation”, “Std. Dev. Wind Direction”, “outdoor temperature”, “dew point temperature”, and “PM10” contribute to high ozone levels in El Paso, Texas. Based on metrics such as accuracy, error rate, log loss, and prediction time, the ANN model proves to be faster and more suitable for ozone classification in the El Paso, Texas area.
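The odds-ratio and metric comparison described above can be sketched as follows; the feature names and data here are placeholders, not the El Paso dataset, and the exponentiated logistic regression coefficients give the per-unit odds ratios.

```python
# Sketch: odds ratios from logistic regression coefficients plus AUC and
# log loss on a held-out split; all data are synthetic stand-ins.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, log_loss
from sklearn.model_selection import train_test_split

feature_names = ["solar_radiation", "sd_wind_dir", "temperature",
                 "dew_point", "pm10"]
X, y = make_classification(n_samples=1000, n_features=5, n_informative=4,
                           n_redundant=1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

lr = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
proba = lr.predict_proba(X_te)[:, 1]

# Odds ratio for each feature: exp(coefficient) per unit increase.
for name, coef in zip(feature_names, lr.coef_[0]):
    print(f"{name}: OR = {np.exp(coef):.2f}")

print("AUC:", roc_auc_score(y_te, proba))
print("log loss:", log_loss(y_te, proba))
```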
In this paper, a weighted maximum likelihood technique (WMLT) for the logistic regression model is presented. The method depends on a weight function that adapts continuously using Mahalanobis distances of the predictor variables. Under the model, the asymptotic consistency of the suggested estimator is demonstrated, and its finite-sample properties are also investigated via simulation. In simulation studies and on real data sets, the newly proposed technique demonstrated the best performance among all estimators compared.
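One plausible instantiation of a Mahalanobis-weighted logistic fit is sketched below for illustration only: the paper's exact weight function is not reproduced, so the downweighting rule (weights shrink for high-leverage predictor rows) and the simulated data are assumptions.

```python
# Sketch: Mahalanobis-distance-based observation weights fed to a logistic
# regression, compared with the unweighted MLE; the weight form is assumed.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, p = 300, 3
X = rng.normal(size=(n, p))
X[:5] += 8                                   # a few leverage points
y = rng.binomial(1, 1 / (1 + np.exp(-X @ np.array([1.0, -0.5, 0.25]))))

# Squared Mahalanobis distance of each row from the predictor mean.
mu = X.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
d2 = np.einsum("ij,jk,ik->i", X - mu, cov_inv, X - mu)

# Smooth downweighting of outlying predictors (assumed form).
weights = np.minimum(1.0, p / d2)

wmlt = LogisticRegression(max_iter=1000).fit(X, y, sample_weight=weights)
mle = LogisticRegression(max_iter=1000).fit(X, y)
print("weighted coefs:", wmlt.coef_[0])
print("ordinary coefs:", mle.coef_[0])
```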
This paper presents a case study on the IPUMS NHIS database, which provides data from censuses and surveys on the health of the U.S. population, including data related to COVID-19. By addressing gaps in previous studies, we propose a machine learning approach to train predictive models for identifying and measuring factors that affect the severity of COVID-19 symptoms. Our experiments focus on four groups of factors: demographic, socio-economic, health-condition, and COVID-19-vaccination related. By analysing the sensitivity of the variables used to train the models and the VEC (variable effect characteristics) analysis of the variable values, we identify and measure the importance of various factors that influence the severity of COVID-19 symptoms.
Internal solitary wave propagation over a submarine ridge results in energy dissipation, in which the hydrodynamic interaction between the wave and the ridge affects the marine environment. This study analyzes the effects of ridge height and potential energy during wave-ridge interaction with binary and cumulative logistic regression models. In testing the global null hypothesis, all values are p < 0.001 for three statistical methods: likelihood ratio, score, and Wald. Comparing the two kinds of models, the test values obtained by the cumulative logistic regression models are better than those by the binary logistic regression models. Although this study employed the cumulative logistic regression model, three probability functions p1, p2 and p3 are utilized for investigating the weighted influence of factors on wave reflection. Deviance and Pearson tests are applied to check the goodness-of-fit of the proposed model. The analytical results demonstrate that both ridge height (X1) and potential energy (X2) significantly impact (p < 0.0001) the amplitude-based reflected rate; the P-values for the deviance and Pearson tests are both > 0.05 (0.2839 and 0.3438, respectively). That is, the goodness-of-fit between ridge height (X1) and potential energy (X2) can further predict parameters under the scenario of the best parsimonious model. Investigation of six predictive powers (R^2, max-rescaled R^2, Somers' D, Gamma, Tau-a, and c) indicates that these predictive estimates of the proposed model have better predictive ability than ridge height alone, and are very similar to the interaction of ridge height and potential energy. It can be concluded that the goodness-of-fit and prediction ability of the cumulative logistic regression model are better than those of the binary logistic regression model.
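In the spirit of the binary-versus-cumulative comparison above, the hedged sketch below fits both model types to simulated data, with X1 and X2 standing in for ridge height and potential energy; it assumes the OrderedModel class available in recent statsmodels releases and is not the paper's SAS-style analysis.

```python
# Sketch: binary logit vs cumulative (ordinal) logit on simulated data.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.miscmodels.ordinal_model import OrderedModel

rng = np.random.default_rng(0)
n = 400
X1 = rng.uniform(0, 1, n)            # ridge height (stand-in)
X2 = rng.uniform(0, 1, n)            # potential energy (stand-in)
latent = 2.0 * X1 + 1.5 * X2 + rng.logistic(size=n)

# Ordered response with three levels, and a binary collapse of it.
y_ord = np.digitize(latent, bins=[1.5, 3.0])    # 0, 1, 2
y_bin = (y_ord == 2).astype(int)

exog = pd.DataFrame({"X1": X1, "X2": X2})

binary = sm.Logit(y_bin, sm.add_constant(exog)).fit(disp=0)
cumulative = OrderedModel(y_ord, exog, distr="logit").fit(method="bfgs",
                                                          disp=0)
print("binary logit params:", binary.params.values)
print("cumulative logit params:", cumulative.params)
```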
With the well-being trend of pursuing a healthy life, mountain ginseng (Panax ginseng) is rising as one of the most profitable forest products in South Korea. This study was aimed at evaluating a new methodology for identifying suitable sites for mountain ginseng cultivation in the country. Forest vegetation data were collected from 46 sites, and the spatial distribution of all sites was analyzed using GIS data for topographic position, landform, solar radiation, and topographic wetness. The physical and chemical properties of the soil samples, including moisture content, pH, organic matter, total nitrogen, exchangeable cations, available phosphorous, and soil texture, were analyzed. The cultivation suitability at each site was assessed based on the environmental conditions using logistic regression (LR) and geographically weighted logistic regression (GWLR), and the results of both methods were compared. The results show that areas with a northern aspect and higher levels of solar radiation, moisture content, total nitrogen, and sand ratio are more likely to be identified as suitable sites for ginseng cultivation. In contrast to the LR, the spatial modeling with the GWLR results in an increase in model fitness and indicates that a significant portion of the spatial autocorrelation in the data decreases. A higher value of the area under the receiver operating characteristic (ROC) curve indicates a better prediction accuracy of site suitability by the GWLR. The geographically weighted coefficient estimates of the model are nonstationary and reveal that different site suitability is associated with the geographical location of the forest stands. The GWLR increases the accuracy of selecting suitable sites by considering the geographical variations in the characteristics of the cultivation sites.
The Bailongjiang watershed in southern Gansu Province, China, is one of the most landslide-prone regions in China, characterized by a very high frequency of landslide occurrence. In order to predict landslide occurrence, a comprehensive landslide susceptibility map is required, which may be significantly helpful in reducing loss of property and human life. In this study, an integrated model of the information value method and logistic regression is proposed, using their merits to the maximum and overcoming their weaknesses, which may enhance the precision and accuracy of landslide susceptibility assessment. A detailed and reliable landslide inventory with 1587 landslides was prepared and randomly divided into two groups: (i) a training dataset and (ii) a testing dataset. Eight distinct landslide conditioning factors, including lithology, slope gradient, aspect, elevation, distance to drainages, distance to faults, distance to roads and vegetation coverage, were selected for landslide susceptibility mapping. The produced landslide susceptibility maps were validated by the success rate and prediction rate curves. The validation results show that the success rate and the prediction rate of the integrated model are 81.7% and 84.6%, respectively, which indicates that the proposed integrated method is reliable for producing an accurate landslide susceptibility map and that the results may be used for landslide management and mitigation.
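One common way to couple the information value (IV) method with logistic regression is sketched below: each categorical conditioning factor is re-coded by its class-wise information value and the IV-coded factors are then fed to a logistic regression. Whether this matches the paper's exact integration is an assumption, and the data are synthetic.

```python
# Sketch: information-value re-coding of categorical factors, followed by
# logistic regression on the IV-coded columns; synthetic data throughout.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({
    "lithology": rng.integers(0, 4, n),        # categorical factor classes
    "slope_class": rng.integers(0, 5, n),
})
df["landslide"] = rng.binomial(1, 0.1 + 0.1 * (df["slope_class"] >= 3))

def information_value(factor, label):
    """IV per class: ln((landslides in class / all landslides) /
    (cells in class / all cells))."""
    tab = pd.crosstab(factor, label)
    p_slide = tab[1] / tab[1].sum()
    p_cells = tab.sum(axis=1) / len(factor)
    return np.log(p_slide / p_cells)

iv_coded = pd.DataFrame({
    col: df[col].map(information_value(df[col], df["landslide"]))
    for col in ["lithology", "slope_class"]
})

model = LogisticRegression().fit(iv_coded, df["landslide"])
susceptibility = model.predict_proba(iv_coded)[:, 1]
print(model.coef_, susceptibility[:5])
```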
The Wenchuan earthquake on May 12, 2008 caused numerous collapses, landslides, barrier lakes, and debris flows. Landslide susceptibility mapping is important for evaluating environmental capacity and also as a guide for post-earthquake reconstruction. In this paper, a logistic regression model was developed within the framework of GIS to map landslide susceptibility. Qingchuan County, a heavily affected area, was selected for the study. The distribution of landslides was prepared by interpretation of multi-temporal and multi-resolution remote sensing images (ADS40 aerial imagery, SPOT5 imagery and TM imagery, etc.) and field surveys. The Certainty Factor method was used to find the influential factors, indicating that lithologic groups, distance from major faults, slope angle, profile curvature, and altitude are the dominant factors influencing landslides. The weight of each factor was determined using a binomial logistic regression model. Landslide susceptibility mapping was based on spatial overlay analysis and divided into five classes. Major faults have the most significant impact, and landslides will most likely occur in areas near the faults. One-third of the area has a high or very high susceptibility, located in the northeast, south and southwest, including 65.3% of all landslides coincident with the earthquake. The susceptibility map can reveal the likelihood of future failures, and it will be useful for planners during the rebuilding process and for future zoning issues.
The landslide susceptibility map is one of the study fields portraying the spatial distribution of future slope failure susceptibility. This paper reviews past methods for producing landslide susceptibility maps and divides these methods into three types. The logistic linear regression approach is further elaborated with the crosstabs method, which is used to analyze the relationship between the categorical or binary response variable and one or more continuous, categorical or binary explanatory variables derived from samples. It is an objective assignment of coefficients serving as weights of the various factors under consideration, whereas expert opinions make a great difference in heuristic approaches. Different from the deterministic approach, it is very applicable at the regional scale. In this study, double logistic regression is applied in the study area. The entire study area is analyzed first. The logistic regression equation showed that elevation and proximity to roads, rivers and residential areas are the main factors triggering landslide occurrence in this area. The prediction accuracy of the first landslide susceptibility map was shown to be 80%. Along the roads and residential areas, almost all areas are in the high landslide susceptibility zone. Some non-landslide areas are incorrectly assigned to the high and medium landslide susceptibility zones. To improve on this, a second logistic regression was done in the high landslide susceptibility zone using landslide cells and non-landslide sample cells in this area. In the second logistic regression analysis, only engineering and geological conditions are important in these areas and are entered into the new logistic regression equation, indicating that only areas with unstable engineering and geological conditions are prone to landslides during large-scale engineering activity. Taking these two logistic regression results into account yields a new landslide susceptibility map. Double logistic regression analysis improved the non-landslide prediction accuracy. During the calculation of parameters for the logistic regression, landslide density is used to transform nominal variables into numeric variables, which avoids the creation of an excessively high number of dummy variables.
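The density-encoding idea mentioned at the end of this abstract can be sketched as follows: a nominal factor is replaced by the landslide density observed in each of its classes, avoiding a large set of dummy variables. The synthetic data and the single factor used here are assumptions for illustration.

```python
# Sketch: landslide-density encoding of a nominal factor before fitting a
# logistic regression; data are synthetic.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
land_use = rng.integers(0, 8, n)                       # nominal classes 0..7
slide = rng.binomial(1, 0.05 + 0.04 * (land_use % 3))  # synthetic labels

cells = pd.DataFrame({"land_use": land_use, "slide": slide})

# Landslide density per class = landslide cells / total cells in that class.
density = cells.groupby("land_use")["slide"].mean()
cells["land_use_density"] = cells["land_use"].map(density)

model = LogisticRegression().fit(cells[["land_use_density"]], cells["slide"])
print(density)
print("coefficient on density-encoded factor:", model.coef_[0][0])
```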
Landslide susceptibility mapping is the first step in regional hazard management, as it helps to understand the spatial distribution of the probability of slope failure in an area. An attempt is made to map the landslide susceptibility in the Tevankarai Ar subwatershed, Kodaikkanal, India using binary logistic regression analysis. A Geographic Information System is used to prepare the database of the predictor variables and the landslide inventory map, which is used to build the spatial model of landslide susceptibility. The model describes the relationship between the dependent variable (presence or absence of landslide) and the independent variables selected for study (predictor variables) by the best-fitting function. A forward stepwise logistic regression model using maximum likelihood estimation is used in the regression analysis. An inventory of 84 landslides and cells within a buffer distance of 10 m around the landslides is used as the dependent variable. Relief, slope, aspect, plan curvature, profile curvature, land use, soil, topographic wetness index, proximity to roads and proximity to lineaments are taken as independent variables. The constant and the coefficients of the predictor variables retained by the regression model are used to calculate the probability of slope failure and to analyze the effect of each predictor variable on landslide occurrence in the study area. The model shows that the most significant parameter contributing to landslides is slope. The other significant parameters are profile curvature, soil, road, wetness index and relief. The predictive logistic regression model is validated using a temporal validation dataset of known landslide locations and shows an accuracy of 85.29%.
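A minimal sketch of forward stepwise selection for a logistic regression by maximum likelihood follows: at each step the candidate variable that lowers the AIC the most is added, stopping when no candidate improves it. The simulated data and the AIC stopping rule are assumptions made for illustration, not the paper's exact entry criterion.

```python
# Sketch: forward stepwise logistic regression selection driven by AIC.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
X = pd.DataFrame(rng.normal(size=(n, 6)),
                 columns=["slope", "curvature", "soil", "road",
                          "wetness", "relief"])
logit_p = 1.2 * X["slope"] + 0.6 * X["road"] - 0.5 * X["wetness"]
y = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

def aic_of(cols):
    """AIC of a logistic regression on the given predictor names."""
    design = sm.add_constant(X[cols]) if cols else np.ones((n, 1))
    return sm.Logit(y, design).fit(disp=0).aic

selected, remaining = [], list(X.columns)
current_aic = aic_of(selected)
while remaining:
    scores = {c: aic_of(selected + [c]) for c in remaining}
    best = min(scores, key=scores.get)
    if scores[best] >= current_aic:
        break
    selected.append(best)
    remaining.remove(best)
    current_aic = scores[best]

print("retained predictors:", selected)
print(sm.Logit(y, sm.add_constant(X[selected])).fit(disp=0).params)
```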
A detailed landslide susceptibility map was produced for the Youfang catchment using the logistic regression method with datasets developed for a geographic information system (GIS). Known as one of the most landslide-prone areas in China, the Youfang catchment of the Longnan mountain region, which lies in the transitional area among the Qinghai-Tibet Plateau, the Loess Plateau and the Sichuan Basin, was selected as a representative case to evaluate the frequency and distribution of landslides. Statistical relationships for landslide susceptibility assessment were developed using landslide and landslide causative factor databases. Logistic regression (LR) was used to create the landslide susceptibility maps based on a series of available data sources: landslide inventory; distance to drainage systems, faults and roads; slope angle and aspect; topographic elevation and topographic wetness index; and land use. The quality of the landslide susceptibility map produced in this paper was validated, and the result can be used for designing protective and mitigation measures against landslide hazards. The landslide susceptibility map is expected to provide a fundamental tool for landslide hazard assessment and risk management in the Youfang catchment.
This study aimed to assess the potential of in-situ measured soil and vegetation characteristics in landslide susceptibility analyses. First, data for eight independent variables, i.e., soil moisture content, soil organic content, compaction of soil (soil toughness), plant root strength, crop biomass, tree diameter at knee height, and the Shannon Wiener Index (SWI) for trees and herbs, were assembled from field tests at two historic landslide locations: Aranayaka and Kurukudegama, Sri Lanka. An economical, finer-resolution database was obtained as the field tests were not cost-prohibitive. The logistic regression (LR) analysis showed that soil moisture content, compaction of soil, and the SWI for trees and herbs were statistically significant at P < 0.05. The variance inflation factors (VIFs) were computed to test for multicollinearity. VIF values (< 2) confirmed the absence of multicollinearity between the four independent variables in the LR model. Receiver Operating Characteristic (ROC) curve and Confusion Matrix (CM) methods were used to validate the model. In the ROC analysis, the areas under the success rate curve and the prediction rate curve were 84.5% and 96.6%, respectively, demonstrating the model's excellent compatibility and predictability. According to the CM, the model demonstrated 79.6% accuracy, 63.6% precision, 100% recall, and an F-measure of 77.8%. The model coefficients revealed that the vegetation cover has a more significant contribution to landslide susceptibility than the soil characteristics. Finally, the susceptibility map, classified into low, medium, and high susceptibility areas based on the natural breaks (Jenks) method, was generated using geographical information system (GIS) techniques. All the historic landslide locations fell into the high susceptibility areas. Thus, validation of the model and inspection of the susceptibility map indicated that the in-situ soil and vegetation characteristics used in the model could be employed to demarcate historical landslide patches and identify landslide-susceptible locations with high confidence.
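The validation workflow described above (VIFs for multicollinearity, ROC AUC, and confusion-matrix metrics for a fitted logistic regression) can be sketched as follows; the synthetic predictors merely stand in for the field-measured variables.

```python
# Sketch: VIF check plus ROC/confusion-matrix validation of a logistic fit.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor
from sklearn.metrics import (roc_auc_score, accuracy_score, precision_score,
                             recall_score, f1_score)

rng = np.random.default_rng(0)
n = 300
X = pd.DataFrame({
    "moisture": rng.normal(size=n),
    "compaction": rng.normal(size=n),
    "swi_trees": rng.normal(size=n),
    "swi_herbs": rng.normal(size=n),
})
y = rng.binomial(1, 1 / (1 + np.exp(-(X["moisture"] - 0.8 * X["swi_trees"]))))

# Variance inflation factors (computed on the design matrix with a constant).
design = sm.add_constant(X)
vifs = {col: variance_inflation_factor(design.values, i)
        for i, col in enumerate(design.columns) if col != "const"}
print("VIFs:", vifs)

model = sm.Logit(y, design).fit(disp=0)
prob = model.predict(design)
pred = (prob >= 0.5).astype(int)

print("AUC:", roc_auc_score(y, prob))
print("accuracy:", accuracy_score(y, pred),
      "precision:", precision_score(y, pred),
      "recall:", recall_score(y, pred),
      "F-measure:", f1_score(y, pred))
```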
The currently prevalent machine performance degradation assessment techniques involve estimating a machine's current condition based upon the recognition of indications of failure features, which requires complete data collected under different conditions. However, failure data are always hard to acquire, thus making those techniques hard to apply. In this paper, a novel method which does not need failure history data is introduced. Wavelet packet decomposition (WPD) is used to extract features from raw signals, principal component analysis (PCA) is utilized to reduce feature dimensions, and a Gaussian mixture model (GMM) is then applied to approximate the feature space distributions. A single-channel confidence value (SCV) is calculated from the overlap between the GMM of the monitoring condition and that of the normal condition, which can indicate the performance of a single channel. Furthermore, a multi-channel confidence value (MCV), which can be deemed the overall performance index across channels, is calculated via logistic regression (LR), and the task of decision-level sensor fusion is thereby also completed. Both SCV and MCV can serve as the basis on which proactive maintenance measures can be taken, thus preventing machine breakdown. The method has been adopted to assess the performance of the turbine of a centrifugal compressor in a factory of PetroChina, and the result shows that it can effectively complete this task. The proposed method has engineering significance for machine performance degradation assessment.
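A hedged sketch of the SCV/MCV idea follows: per channel, a GMM fitted on normal-condition features scores monitoring windows (the score ratio used here is only a stand-in for the paper's GMM-overlap measure, which is not reproduced exactly), and a logistic regression fuses the per-channel values into a multi-channel index. The data, channel count and labels are synthetic assumptions.

```python
# Sketch: GMM-based per-channel confidence values fused by logistic regression.
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_channels, n_windows, n_feat = 3, 200, 4
labels = np.repeat([0, 1], n_windows // 2)          # 0 = normal, 1 = degraded
scvs = np.zeros((n_windows, n_channels))

for ch in range(n_channels):
    normal = rng.normal(size=(500, n_feat))          # normal-condition features
    gmm = GaussianMixture(n_components=2, random_state=0).fit(normal)
    base = gmm.score(normal)                         # mean log-likelihood
    for w in range(n_windows):
        window = rng.normal(loc=1.5 * labels[w], size=(30, n_feat))
        # Proxy SCV: likelihood of the window relative to normal data.
        scvs[w, ch] = np.exp(gmm.score(window) - base)

# Decision-level fusion: logistic regression maps channel SCVs to an MCV.
fusion = LogisticRegression().fit(scvs, labels)
mcv = fusion.predict_proba(scvs)[:, 1]               # degradation probability
print("mean MCV (normal vs degraded):",
      mcv[labels == 0].mean(), mcv[labels == 1].mean())
```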
Ecological land is an important guarantee for maintaining urban ecological security and sustainable development. Although increasing attention has been paid to ecological land, there have been few explorations of the relative importance of anthropogenic and natural factors and of how they interact to induce the evolution of ecological land. This research sought to fill this gap. In this study, 18 factors, including the risk of goaf collapse, faults and prime croplands, were selected from six aspects: topography, geology, climate, accessibility, socio-economics and land control policies. Logistic regression (LR) and random forest (RF) models were adopted to identify the anthropogenic and biophysical factors behind the dynamic change of ecological land in Mentougou, Beijing, from 1990 to 2018. The results show that there was a significant increase in ecological land from 1990 to 2018. The increased area of ecological land reached 102.11 km2 with an increase rate of 0.78, and the gravity center of ecological land gradually moved to the northwest. The impact of anthropogenic factors on ecological land was greater than that of natural factors; ecological land change was mainly driven by the proportion of prime cropland, per capita GDP, land urbanization, temperature, per capita rural income, elevation and aspect. Additionally, slope and precipitation were also identified as important predictors of ecological land change. The model comparison suggested that RF can better identify the relationship between ecological land and the explanatory variables than the LR model. Based on our findings, the implementation of government policies along with anthropogenic factors are the most important variables influencing ecological land change, and rational planning and allocation of ecological land by the Mentougou government are still needed.
Landslide distribution and susceptibility mapping are the fundamental steps for landslide-related hazard and disaster risk management activities, especially in the Himalaya region, where landslides have caused a great deal of death and damage to property. To better understand landslide conditions in the Nepal Himalaya, we investigated landslide distribution and susceptibility using landslide inventory data and 12 different contributing factors in the Dailekh district, Western Nepal. Based on the evaluation of the frequency distribution of the landslides, the relationship between the landslides and the various contributing factors was determined. Then, the landslide susceptibility was calculated using logistic regression and statistical index methods along with different topographic factors (slope, aspect, relative relief, plan curvature, altitude, topographic wetness index) and non-topographic factors (distance from rivers, normalized difference vegetation index (NDVI), distance from roads, precipitation, land use and land cover, and geology), and 470 (70%) of the total 658 landslides. The receiver operating characteristic (ROC) curve analysis using 198 (30%) of the total landslides showed that the prediction rate (area under the curve, AUC) values for the two methods (logistic regression and statistical index) were 0.826 and 0.823, with success rates of 0.793 and 0.811, respectively. The values of the R-Index for the logistic regression and statistical index methods were 83.66 and 88.54, respectively, for the high-susceptibility hazard classes. In general, this research concluded that the cohesive and coherent natural interplay of topographic and non-topographic factors strongly affects landslide occurrence, distribution, and susceptibility conditions in the Nepal Himalaya region. Furthermore, the reliability of these two methods is verified for landslide susceptibility mapping in Nepal's central mountain region.
In order to improve classification accuracy, regularized logistic regression is used to classify single-trial electroencephalogram (EEG) data. A novel approach, named local sparse logistic regression (LSLR), is proposed. LSLR integrates a locality preserving projection regularization term into the framework of sparse logistic regression. It tries to maintain the neighborhood information of the original feature space while keeping sparsity. A bound optimization algorithm with component-wise updates is used to compute the weight vector on the training data, thus overcoming the disadvantages of the Newton-Raphson method and iteratively reweighted least squares (IRLS). A classification accuracy of 80% is achieved using ten-fold cross-validation on the self-paced finger tapping data set. The results of LSLR are compared with SLR, showing the effectiveness of the proposed method.
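LSLR itself is the authors' custom algorithm and is not reproduced here; as a point of reference, the sketch below shows only a minimal sparse logistic regression (L1-penalised) baseline of the kind it is compared against, evaluated with ten-fold cross-validation on synthetic stand-in EEG features.

```python
# Sketch: L1-penalised (sparse) logistic regression baseline with 10-fold CV.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Stand-in for single-trial EEG feature vectors (e.g. band-power features).
X, y = make_classification(n_samples=200, n_features=100, n_informative=10,
                           random_state=0)

slr = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
acc = cross_val_score(slr, X, y, cv=10).mean()

slr.fit(X, y)
print(f"10-fold accuracy: {acc:.3f}")
print("non-zero weights:", int(np.count_nonzero(slr.coef_)), "of", X.shape[1])
```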
文摘Purpose:The purpose of this study is to develop and compare model choice strategies in context of logistic regression.Model choice means the choice of the covariates to be included in the model.Design/methodology/approach:The study is based on Monte Carlo simulations.The methods are compared in terms of three measures of accuracy:specificity and two kinds of sensitivity.A loss function combining sensitivity and specificity is introduced and used for a final comparison.Findings:The choice of method depends on how much the users emphasize sensitivity against specificity.It also depends on the sample size.For a typical logistic regression setting with a moderate sample size and a small to moderate effect size,either BIC,BICc or Lasso seems to be optimal.Research limitations:Numerical simulations cannot cover the whole range of data-generating processes occurring with real-world data.Thus,more simulations are needed.Practical implications:Researchers can refer to these results if they believe that their data-generating process is somewhat similar to some of the scenarios presented in this paper.Alternatively,they could run their own simulations and calculate the loss function.Originality/value:This is a systematic comparison of model choice algorithms and heuristics in context of logistic regression.The distinction between two types of sensitivity and a comparison based on a loss function are methodological novelties.
基金Under the auspices of National Natural Science Foundation of China(No.42101414)Natural Science Found for Outstanding Young Scholars in Jilin Province(No.20230508106RC)。
文摘The burning of crop residues in fields is a significant global biomass burning activity which is a key element of the terrestrial carbon cycle,and an important source of atmospheric trace gasses and aerosols.Accurate estimation of cropland burned area is both crucial and challenging,especially for the small and fragmented burned scars in China.Here we developed an automated burned area mapping algorithm that was implemented using Sentinel-2 Multi Spectral Instrument(MSI)data and its effectiveness was tested taking Songnen Plain,Northeast China as a case using satellite image of 2020.We employed a logistic regression method for integrating multiple spectral data into a synthetic indicator,and compared the results with manually interpreted burned area reference maps and the Moderate-Resolution Imaging Spectroradiometer(MODIS)MCD64A1 burned area product.The overall accuracy of the single variable logistic regression was 77.38%to 86.90%and 73.47%to 97.14%for the 52TCQ and 51TYM cases,respectively.In comparison,the accuracy of the burned area map was improved to 87.14%and 98.33%for the 52TCQ and 51TYM cases,respectively by multiple variable logistic regression of Sentind-2 images.The balance of omission error and commission error was also improved.The integration of multiple spectral data combined with a logistic regression method proves to be effective for burned area detection,offering a highly automated process with an automatic threshold determination mechanism.This method exhibits excellent extensibility and flexibility taking the image tile as the operating unit.It is suitable for burned area detection at a regional scale and can also be implemented with other satellite data.
文摘In view of the composition analysis and identification of ancient glass products, L1 regularization, K-Means cluster analysis, elbow rule and other methods were comprehensively used to build logical regression, cluster analysis, hyper-parameter test and other models, and SPSS, Python and other tools were used to obtain the classification rules of glass products under different fluxes, sub classification under different chemical compositions, hyper-parameter K value test and rationality analysis. Research can provide theoretical support for the protection and restoration of ancient glass relics.
文摘Internet of Things(IoT)is a popular social network in which devices are virtually connected for communicating and sharing information.This is applied greatly in business enterprises and government sectors for delivering the services to their customers,clients and citizens.But,the interaction is success-ful only based on the trust that each device has on another.Thus trust is very much essential for a social network.As Internet of Things have access over sen-sitive information,it urges to many threats that lead data management to risk.This issue is addressed by trust management that help to take decision about trust-worthiness of requestor and provider before communication and sharing.Several trust-based systems are existing for different domain using Dynamic weight meth-od,Fuzzy classification,Bayes inference and very few Regression analysis for IoT.The proposed algorithm is based on Logistic Regression,which provide strong statistical background to trust prediction.To make our stand strong on regression support to trust,we have compared the performance with equivalent sound Bayes analysis using Beta distribution.The performance is studied in simu-lated IoT setup with Quality of Service(QoS)and Social parameters for the nodes.The proposed model performs better in terms of various metrics.An IoT connects heterogeneous devices such as tags and sensor devices for sharing of information and avail different application services.The most salient features of IoT system is to design it with scalability,extendibility,compatibility and resiliency against attack.The existing worksfinds a way to integrate direct and indirect trust to con-verge quickly and estimate the bias due to attacks in addition to the above features.
基金The authors extend their appreciation to the Deputyship for Research&Innovation,Ministry of Education in Saudi Arabia for funding this research work through the Project Number(IF2-PSAU-2022/01/22043)。
文摘Autism spectrum disorder(ASD),classified as a developmental disability,is now more common in children than ever.A drastic increase in the rate of autism spectrum disorder in children worldwide demands early detection of autism in children.Parents can seek professional help for a better prognosis of the child’s therapy when ASD is diagnosed under five years.This research study aims to develop an automated tool for diagnosing autism in children.The computer-aided diagnosis tool for ASD detection is designed and developed by a novel methodology that includes data acquisition,feature selection,and classification phases.The most deterministic features are selected from the self-acquired dataset by novel feature selection methods before classification.The Imperialistic competitive algorithm(ICA)based on empires conquering colonies performs feature selection in this study.The performance of Logistic Regression(LR),Decision tree,K-Nearest Neighbor(KNN),and Random Forest(RF)classifiers are experimentally studied in this research work.The experimental results prove that the Logistic regression classifier exhibits the highest accuracy for the self-acquired dataset.The ASD detection is evaluated experimentally with the Least Absolute Shrinkage and Selection Operator(LASSO)feature selection method and different classifiers.The Exploratory Data Analysis(EDA)phase has uncovered crucial facts about the data,like the correlation of the features in the dataset with the class variable.
文摘This paper focuses on ozone prediction in the atmosphere using a machine learning approach. We utilize air pollutant and meteorological variable datasets from the El Paso area to classify ozone levels as high or low. The LR and ANN algorithms are employed to train the datasets. The models demonstrate a remarkably high classification accuracy of 89.3% in predicting ozone levels on a given day. Evaluation metrics reveal that both the ANN and LR models exhibit accuracies of 89.3% and 88.4%, respectively. Additionally, the AUC values for both models are comparable, with the ANN achieving 95.4% and the LR obtaining 95.2%. The lower the cross-entropy loss (log loss), the higher the model’s accuracy or performance. Our ANN model yields a log loss of 3.74, while the LR model shows a log loss of 6.03. The prediction time for the ANN model is approximately 0.00 seconds, whereas the LR model takes 0.02 seconds. Our odds ratio analysis indicates that features such as “Solar radiation”, “Std. Dev. Wind Direction”, “outdoor temperature”, “dew point temperature”, and “PM10” contribute to high ozone levels in El Paso, Texas. Based on metrics such as accuracy, error rate, log loss, and prediction time, the ANN model proves to be faster and more suitable for ozone classification in the El Paso, Texas area.
文摘In this paper, a weighted maximum likelihood technique (WMLT) for the logistic regression model is presented. This method depended on a weight function that is continuously adaptable using Mahalanobis distances for predictor variables. Under the model, the asymptotic consistency of the suggested estimator is demonstrated and properties of finite-sample are also investigated via simulation. In simulation studies and real data sets, it is observed that the newly proposed technique demonstrated the greatest performance among all estimators compared.
文摘This paper presents a case study on the IPUMS NHIS database,which provides data from censuses and surveys on the health of the U.S.population,including data related to COVID-19.By addressing gaps in previous studies,we propose a machine learning approach to train predictive models for identifying and measuring factors that affect the severity of COVID-19 symptoms.Our experiments focus on four groups of factors:demographic,socio-economic,health condition,and related to COVID-19 vaccination.By analysing the sensitivity of the variables used to train the models and the VEC(variable effect characteristics)analysis on the variable values,we identify and measure importance of various factors that influence the severity of COVID-19 symptoms.
基金This paper was financially supported by NSC96-2628-E-366-004-MY2 and NSC96-2628-E-132-001-MY2
文摘Internal solitary wave propagation over a submarine ridge results in energy dissipation, in which the hydrodynamic interaction between a wave and ridge affects marine environment. This study analyzes the effects of ridge height and potential energy during wave-ridge interaction with a binary and cumulative logistic regression model. In testing the Global Null Hypothesis, all values are p 〈0.001, with three statistical methods, such as Likelihood Ratio, Score, and Wald. While comparing with two kinds of models, tests values obtained by cumulative logistic regression models are better than those by binary logistic regression models. Although this study employed cumulative logistic regression model, three probability functions p^1, p^2 and p^3, are utilized for investigating the weighted influence of factors on wave reflection. Deviance and Pearson tests are applied to cheek the goodness-of-fit of the proposed model. The analytical results demonstrated that both ridge height (X1 ) and potential energy (X2 ) significantly impact (p 〈 0. 0001 ) the amplitude-based refleeted rate; the P-values for the deviance and Pearson are all 〉 0.05 (0.2839, 0.3438, respectively). That is, the goodness-of-fit between ridge height ( X1 ) and potential energy (X2) can further predict parameters under the scenario of the best parsimonious model. Investigation of 6 predictive powers ( R2, Max-rescaled R^2, Sorners' D, Gamma, Tau-a, and c, respectively) indicate that these predictive estimates of the proposed model have better predictive ability than ridge height alone, and are very similar to the interaction of ridge height and potential energy. It can be concluded that the goodness-of-fit and prediction ability of the cumulative logistic regression model are better than that of the binary logistic regression model.
基金R&D Program for Forestry Technology funded by Korea Forest Service(Project No.S121012L100100)the framework of international cooperation program funded by National Research Foundation of Korea(2013K2A2A4000649,FY2013)
文摘With the well-being trends to pursue a healthy life, mountain ginseng(Panax ginseng) is rising as one of the most profitable forest products in South Korea. This study was aimed at evaluating a new methodology for identifying suitable sites for mountain ginseng cultivation in the country. Forest vegetation data were collected from 46 sites and the spatial distribution of all sites was analyzed using GIS data for topographic position, landform, solar radiation, and topographic wetness. The physical and chemical properties of the soil samples, including moisture content, p H, organic matter, total nitrogen, exchangeable cations, available phosphorous, and soil texture, were analyzed. The cultivation suitability at each site was assessed based on the environmental conditions using logistic regression(LR) and geographically weighted logistic regression(GWLR) and the results of both methods were compared. The results show that the areas with northern aspect and higher levels of solar radiation, moisture content, total nitrogen, and sand ratio are more likely to be identified as suitable sites for ginseng cultivation. In contrast to the LR, the spatial modeling with the GWLR results in an increase in the model fitness and indicates that a significant portion of spatialautocorrelation in the data decreases. A higher value of the area under the receiver operating characteristic(ROC) curve presents a better prediction accuracy of site suitability by the GWLR. The geographically weighted coefficient estimates of the model are nonstationary, and reveal that different site suitability is associated with the geographical location of the forest stands. The GWLR increases the accuracy of selecting suitable sites by considering the geographical variations in the characteristics of the cultivation sites.
基金supported by the Project of the 12th Five-year National Sci-Tech Support Plan of China(2011BAK12B09)China Special Project of Basic Work of Science and Technology(2011FY110100-2)
文摘Bailongjiang watershed in southern Gansu province, China, is one of the most landslide-prone regions in China, characterized by very high frequency of landslide occurrence. In order to predict the landslide occurrence, a comprehensive map of landslide susceptibility is required which may be significantly helpful in reducing loss of property and human life. In this study, an integrated model of information value method and logistic regression is proposed by using their merits at maximum and overcoming their weaknesses, which may enhance precision and accuracy of landslide susceptibility assessment. A detailed and reliable landslide inventory with 1587 landslides was prepared and randomly divided into two groups,(i) training dataset and(ii) testing dataset. Eight distinct landslide conditioning factors including lithology, slope gradient, aspect, elevation, distance to drainages,distance to faults, distance to roads and vegetation coverage were selected for landslide susceptibility mapping. The produced landslide susceptibility maps were validated by the success rate and prediction rate curves. The validation results show that the success rate and the prediction rate of the integrated model are 81.7 % and 84.6 %, respectively, which indicate that the proposed integrated method is reliable to produce an accurate landslide susceptibility map and the results may be used for landslides management and mitigation.
基金supported by State Key Fundamental Research Program (973) project (2008CB425802)the National natural Science Foundation of China (Grant No. 40801009)
文摘The Wenchuan earthquake on May 12,2008 caused numerous collapses,landslides,barrier lakes,and debris flows.Landslide susceptibility mapping is important for evaluation of environmental capacity and also as a guide for post-earthquake reconstruction.In this paper,a logistic regression model was developed within the framework of GIS to map landslide susceptibility.Qingchuan County,a heavily affected area,was selected for the study.Distribution of landslides was prepared by interpretation of multi-temporal and multi-resolution remote sensing images(ADS40 aerial imagery,SPOT5 imagery and TM imagery,etc.) and field surveys.The Certainly Factor method was used to find the influencial factors,indicating that lithologic groups,distance from major faults,slope angle,profile curvature,and altitude are the dominant factors influencing landslides.The weight of each factor was determined using a binomial logistic regression model.Landslide susceptibility mapping was based on spatial overlay analysis and divided into five classes.Major faults have the most significant impact,and landslides will occur most likely in areas near the faults.Onethird of the area has a high or very high susceptibility,located in the northeast,south and southwest,including 65.3% of all landslides coincident with the earthquake.The susceptibility map can reveal the likelihood of future failures,and it will be useful for planners during the rebuilding process and for future zoning issues.
基金Project supported by the Natural Science Foundation of ZhejiangProvince (No. 30295) and the Key Project of Zhejiang Province (No.011103192), China
文摘Landslide susceptibility map is one of the study fields portraying the spatial distribution of future slope failure sus- ceptibility. This paper deals with past methods for producing landslide susceptibility map and divides these methods into 3 types. The logistic linear regression approach is further elaborated on by crosstabs method, which is used to analyze the relationship between the categorical or binary response variable and one or more continuous or categorical or binary explanatory variables derived from samples. It is an objective assignment of coefficients serving as weights of various factors under considerations while expert opinions make great difference in heuristic approaches. Different from deterministic approach, it is very applicable to regional scale. In this study, double logistic regression is applied in the study area. The entire study area is first analyzed. The logistic regression equation showed that elevation, proximity to road, river and residential area are main factors triggering land- slide occurrence in this area. The prediction accuracy of the first landslide susceptibility map was showed to be 80%. Along the road and residential area, almost all areas are in high landslide susceptibility zone. Some non-landslide areas are incorrectly divided into high and medium landslide susceptibility zone. In order to improve the status, a second logistic regression was done in high landslide susceptibility zone using landslide cells and non-landslide sample cells in this area. In the second logistic regression analysis, only engineering and geological conditions are important in these areas and are entered in the new logistic regression equation indicating that only areas with unstable engineering and geological conditions are prone to landslide during large scale engineering activity. Taking these two logistic regression results into account yields a new landslide susceptibility map. Double logistic regression analysis improved the non-landslide prediction accuracy. During calculation of parameters for logistic regres- sion, landslide density is used to transform nominal variable to numeric variable and this avoids the creation of an excessively high number of dummy variables.
文摘Landslide susceptibility mapping is the first step in regional hazard management as it helps to understand the spatial distribution of the probability of slope failure in an area.An attempt is made to map the landslide susceptibility in Tevankarai Ar subwatershed,Kodaikkanal,India using binary logistic regression analysis.Geographic Information System is used to prepare the database of the predictor variables and landslide inventory map,which is used to build the spatial model of landslide susceptibility.The model describes the relationship between the dependent variable(presence and absence of landslide) and the independent variables selected for study(predictor variables) by the best fitting function.A forward stepwise logistic regression model using maximum likelihood estimation is used in the regression analysis.An inventory of 84 landslides and cells within a buffer distance of 10m around the landslide is used as the dependent variable.Relief,slope,aspect,plan curvature,profile curvature,land use,soil,topographic wetness index,proximity to roads and proximity to lineaments are taken as independent variables.The constant and the coefficient of the predictor variable retained by the regression model are used to calculate the probability of slope failure and analyze the effect of each predictor variable on landslide occurrence in thestudy area.The model shows that the most significant parameter contributing to landslides is slope.The other significant parameters are profile curvature,soil,road,wetness index and relief.The predictive logistic regression model is validated using temporal validation data-set of known landslide locations and shows an accuracy of 85.29 %.
基金supported by the Priority Academic Program Development of Jiangsu Higher Education Institutions(164320H101)the Opening Fund of State Key Laboratory of Geohazard Prevention and Geoenvironment Protection of Chengdu University of Technology,China(SKLGP2012K012)+4 种基金the Opening Fund of Key Laboratory for Geo-hazards in Loess area(GLA2014005)the National Natural Science Foundation of China(No.40801212 and No.41201424)the 973 National Basic Research Program(Nos.2013CB733203,2013CB733204)the 863 National High-Tech Rand D Program(No.2012AA121302)the FP6 project"Mountain Risks"of the European Commission(No.MRTNCT-2006-035798)
文摘A detailed landslide susceptibility map was produced in the Youfang catchment using logistic regression method with datasets developed for a geographic information system(GIS).Known as one of the most landslide-prone areas in China, the Youfang catchment of Longnan mountain region,which lies in the transitional area among QinghaiTibet Plateau, loess Plateau and Sichuan Basin, was selected as a representative case to evaluate the frequency and distribution of landslides.Statistical relationships for landslide susceptibility assessment were developed using landslide and landslide causative factor databases.Logistic regression(LR)was used to create the landslide susceptibility maps based on a series of available data sources: landslide inventory; distance to drainage systems, faults and roads; slope angle and aspect; topographic elevation and topographical wetness index, and land use.The quality of the landslide susceptibility map produced in this paper was validated and the result can be used fordesigning protective and mitigation measures against landslide hazards.The landslide susceptibility map is expected to provide a fundamental tool for landslide hazards assessment and risk management in the Youfang catchment.
Funding: Funded by the National Research Council, Sri Lanka [NRC 17-066].
Abstract: This study aimed to assess the potential of in-situ measured soil and vegetation characteristics in landslide susceptibility analyses. First, data for eight independent variables, i.e., soil moisture content, soil organic content, compaction of soil (soil toughness), plant root strength, crop biomass, tree diameter at knee height, and the Shannon-Wiener Index (SWI) for trees and for herbs, were assembled from field tests at two historic landslide locations: Aranayaka and Kurukudegama, Sri Lanka. An economical, finer-resolution database was obtained, as the field tests were not cost-prohibitive. The logistic regression (LR) analysis showed that soil moisture content, compaction of soil, and SWI for trees and herbs were statistically significant at P < 0.05. Variance inflation factors (VIFs) were computed to test for multicollinearity, and VIF values (< 2) confirmed the absence of multicollinearity among the four independent variables in the LR model. Receiver Operating Characteristic (ROC) curve and Confusion Matrix (CM) methods were used to validate the model. In the ROC analysis, the areas under the Success Rate Curve and the Prediction Rate Curve were 84.5% and 96.6%, respectively, demonstrating the model's excellent compatibility and predictability. According to the CM, the model demonstrated 79.6% accuracy, 63.6% precision, 100% recall, and an F-measure of 77.8%. The model coefficients revealed that vegetation cover makes a more significant contribution to landslide susceptibility than soil characteristics. Finally, the susceptibility map, classified into low, medium, and highly susceptible areas based on the natural breaks (Jenks) method, was generated using geographic information system (GIS) techniques. All the historic landslide locations fell into the high-susceptibility areas. Thus, validation of the model and inspection of the susceptibility map indicated that the in-situ soil and vegetation characteristics used in the model could be employed to demarcate historical landslide patches and identify landslide-susceptible locations with high confidence.
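The two validation pieces used above are straightforward to reproduce. A minimal sketch, with synthetic stand-ins for the field variables and model predictions, of how VIFs and the confusion-matrix scores (accuracy, precision, recall, F-measure) are typically computed:

```python
# Minimal sketch: variance inflation factors for multicollinearity screening
# and confusion-matrix scores for model validation. Variable names and data
# are illustrative placeholders, not the study's field measurements.
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

def vif(df):
    """VIF_j = 1 / (1 - R^2_j), regressing each column on the others."""
    out = {}
    for col in df.columns:
        X = df.drop(columns=col)
        r2 = LinearRegression().fit(X, df[col]).score(X, df[col])
        out[col] = 1.0 / (1.0 - r2)
    return pd.Series(out)

rng = np.random.default_rng(1)
predictors = pd.DataFrame({
    "soil_moisture": rng.normal(size=200),
    "soil_compaction": rng.normal(size=200),
    "swi_trees": rng.normal(size=200),
    "swi_herbs": rng.normal(size=200),
})
print(vif(predictors))  # values < 2 would indicate no serious collinearity

y_true = rng.integers(0, 2, 200)
y_pred = rng.integers(0, 2, 200)  # stand-in for the model's class predictions
print(accuracy_score(y_true, y_pred), precision_score(y_true, y_pred),
      recall_score(y_true, y_pred), f1_score(y_true, y_pred))
```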
Funding: Supported by the National Key Natural Science Foundation of China (Grant No. 50635010).
Abstract: The currently prevalent machine performance degradation assessment techniques estimate a machine's current condition based on the recognition of failure features, which requires complete data collected under different conditions. However, failure data are always hard to acquire, making those techniques difficult to apply. In this paper, a novel method which does not need failure history data is introduced. Wavelet packet decomposition (WPD) is used to extract features from raw signals, principal component analysis (PCA) is utilized to reduce feature dimensions, and a Gaussian mixture model (GMM) is then applied to approximate the feature space distributions. A single-channel confidence value (SCV) is calculated from the overlap between the GMM of the monitoring condition and that of the normal condition, which indicates the performance of a single channel. Furthermore, a multi-channel confidence value (MCV), which can be regarded as the overall performance index over multiple channels, is calculated via logistic regression (LR), thereby also completing the task of decision-level sensor fusion. Both SCV and MCV can serve as the basis on which proactive maintenance measures can be taken, thus preventing machine breakdown. The method has been adopted to assess the performance of the turbine of a centrifugal compressor in a PetroChina factory, and the result shows that it can effectively complete this task. The proposed method has engineering significance for machine performance degradation assessment.
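A simplified sketch of the feature pipeline described above (WPD energies, then PCA, then a GMM fitted on normal-condition data). The paper's SCV is based on the overlap between two GMMs; here the log-likelihood under the normal-condition model is used as a rough stand-in, which is an assumption of this sketch rather than the paper's exact measure:

```python
# Sketch only: WPD node energies -> PCA -> GMM of the normal condition; a new
# monitoring segment is scored against that model. The GMM-overlap-based SCV
# of the paper is replaced here by a log-likelihood score for simplicity.
import numpy as np
import pywt
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

def wpd_energy_features(signal, wavelet="db4", level=3):
    """Energy of each terminal wavelet-packet node as a feature vector."""
    wp = pywt.WaveletPacket(data=signal, wavelet=wavelet, maxlevel=level)
    nodes = wp.get_level(level, order="natural")
    return np.array([np.sum(node.data ** 2) for node in nodes])

rng = np.random.default_rng(2)
# Stand-in vibration segments recorded under the normal condition.
normal_segments = [rng.normal(size=1024) for _ in range(60)]
features = np.vstack([wpd_energy_features(s) for s in normal_segments])

pca = PCA(n_components=3).fit(features)
gmm = GaussianMixture(n_components=2, random_state=0).fit(pca.transform(features))

# Score a new monitoring segment against the normal-condition model.
new_segment = rng.normal(size=1024)
score = gmm.score_samples(pca.transform(wpd_energy_features(new_segment).reshape(1, -1)))
print(score)
```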
Funding: Funded by the National Natural Science Foundation of China (Grant No. 41877533).
Abstract: Ecological land is an important guarantee of urban ecological security and sustainable development. Although ecological land has received increasing research attention, few studies have explored the relative importance of anthropogenic and natural factors and how they interact to induce ecological land evolution. This research sought to fill this gap. In this study, 18 factors, including the risk of goaf collapse, faults, and prime cropland, were selected from six aspects: topography, geology, climate, accessibility, socio-economics, and land control policies. Logistic regression (LR) and random forest (RF) models were adopted to identify the anthropogenic and biophysical factors behind the dynamic change of ecological land in Mentougou, Beijing from 1990 to 2018. The results show that there was a significant increase in ecological land from 1990 to 2018. The increased area of ecological land reached 102.11 km2, with an increase rate of 0.78, and the gravity center of ecological land gradually moved to the northwest. The impact of anthropogenic factors on ecological land was greater than that of natural factors; ecological land change was mainly driven by the proportion of prime cropland, per capita GDP, land urbanization, temperature, per capita rural income, elevation, and aspect. Additionally, slope and precipitation were also identified as important predictors of ecological land change. The model comparison suggested that RF can better identify the relationship between ecological land and the explanatory variables than the LR model. Based on our findings, the implementation of government policies, along with other anthropogenic factors, is the most important variable influencing ecological land change, and rational planning and allocation of ecological land by the Mentougou government are still needed.
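A minimal sketch of the kind of model comparison described above: an LR and an RF model fitted to the same driver table, with hold-out AUC and RF feature importances inspected side by side. The driver names and data below are synthetic placeholders, not the Mentougou dataset:

```python
# Sketch only: compare logistic regression and random forest on a common
# driver table for a binary "ecological land change" label.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
n = 1000
drivers = pd.DataFrame({
    "prime_cropland_share": rng.uniform(0, 1, n),
    "per_capita_gdp": rng.lognormal(10, 0.5, n),
    "land_urbanization": rng.uniform(0, 1, n),
    "temperature": rng.normal(11, 2, n),
    "elevation": rng.uniform(100, 1500, n),
})
# Synthetic change labels for illustration.
logit = 3 * drivers["prime_cropland_share"] - 2 * drivers["land_urbanization"] - 0.5
change = (rng.uniform(size=n) < 1 / (1 + np.exp(-logit))).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(drivers, change, test_size=0.3, random_state=0)
lr = LogisticRegression(max_iter=2000).fit(X_tr, y_tr)
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

print("LR AUC:", roc_auc_score(y_te, lr.predict_proba(X_te)[:, 1]))
print("RF AUC:", roc_auc_score(y_te, rf.predict_proba(X_te)[:, 1]))
print(pd.Series(rf.feature_importances_, index=drivers.columns).sort_values(ascending=False))
```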
Funding: Under the auspices of the CAS Overseas Institutions Platform Project (No. 131C11KYSB20200033), the National Natural Science Foundation of China (No. 42071349), and the Sichuan Science and Technology Program (No. 2020JDJQ0003).
Abstract: Landslide distribution and susceptibility mapping are the fundamental steps for landslide-related hazard and disaster risk management activities, especially in the Himalaya region, where landslides have caused a great deal of death and damage to property. To better understand landslide conditions in the Nepal Himalaya, we carried out an investigation of landslide distribution and susceptibility using landslide inventory data and 12 different contributing factors in the Dailekh district, Western Nepal. Based on the evaluation of the frequency distribution of the landslides, the relationship between the landslides and the various contributing factors was determined. Then, landslide susceptibility was calculated using logistic regression and statistical index methods, together with topographic factors (slope, aspect, relative relief, plan curvature, altitude, topographic wetness index), non-topographic factors (distance from rivers, normalized difference vegetation index (NDVI), distance from roads, precipitation, land use and land cover, and geology), and 470 (70%) of the total 658 landslides. The receiver operating characteristic (ROC) curve analysis using 198 (30%) of the total landslides showed that the prediction rate (area under the curve, AUC) values for the two methods (logistic regression and statistical index) were 0.826 and 0.823, with success rates of 0.793 and 0.811, respectively. The R-Index values for the logistic regression and statistical index methods were 83.66 and 88.54, respectively, for the highly susceptible hazard classes. In general, this research concluded that the cohesive and coherent natural interplay of topographic and non-topographic factors strongly affects landslide occurrence, distribution, and susceptibility conditions in the Nepal Himalaya region. Furthermore, the reliability of these two methods is verified for landslide susceptibility mapping in Nepal's central mountain region.
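The statistical index method used alongside logistic regression above is, in one common formulation, a class-wise weight: SI = ln(landslide density within a factor class / landslide density over the whole area). A minimal sketch with a hypothetical slope-class table (not the Dailekh data):

```python
# Sketch of the statistical index (SI) weighting in one common formulation;
# the class counts below are invented for illustration.
import numpy as np
import pandas as pd

classes = pd.DataFrame({
    "slope_class": ["0-15", "15-30", "30-45", ">45"],
    "class_cells": [40000, 30000, 20000, 10000],
    "landslide_cells": [20, 120, 220, 110],
})
overall_density = classes["landslide_cells"].sum() / classes["class_cells"].sum()
classes["SI"] = np.log(
    (classes["landslide_cells"] / classes["class_cells"]) / overall_density
)
print(classes)
# Summing the SI values of all factor classes at each cell yields a
# susceptibility index, which can then be validated with ROC/AUC on the
# hold-out landslides, as described above.
```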
Funding: The National Natural Science Foundation of China (No. 61075009), the Natural Science Foundation of Jiangsu Province (No. BK2011595), the Program for New Century Excellent Talents in University of China, and the Qing Lan Project of Jiangsu Province.
Abstract: In order to improve classification accuracy, regularized logistic regression is used to classify single-trial electroencephalogram (EEG) signals. A novel approach, named local sparse logistic regression (LSLR), is proposed. LSLR integrates a locality-preserving projection regularization term into the framework of sparse logistic regression. It maintains the neighborhood information of the original feature space while keeping the solution sparse. A bound optimization algorithm with component-wise updates is used to compute the weight vector on the training data, thus overcoming the disadvantages of the Newton-Raphson method and iteratively reweighted least squares (IRLS). A classification accuracy of 80% is achieved using ten-fold cross-validation on the self-paced finger-tapping dataset. The results of LSLR are compared with those of SLR, showing the effectiveness of the proposed method.
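A rough sketch of the kind of objective such a method combines: the logistic negative log-likelihood, an L1 sparsity penalty, and a locality-preserving term that penalizes large differences in the linear scores of neighboring samples. The exact formulation and optimizer of LSLR (bound optimization with component-wise updates) differ; this is only meant to make the ingredients concrete, with invented data and weights:

```python
# Sketch only: an assumed LSLR-style objective combining logistic loss,
# an L1 sparsity term, and a locality-preserving penalty over a
# neighborhood similarity matrix S.
import numpy as np

def lslr_objective(w, X, y, S, lam_sparse, lam_local):
    """y in {0, 1}; S[i, j] is a neighborhood similarity weight."""
    z = X @ w
    nll = np.sum(np.log1p(np.exp(z)) - y * z)      # logistic negative log-likelihood
    sparsity = lam_sparse * np.sum(np.abs(w))       # L1 sparsity term
    diffs = z[:, None] - z[None, :]                 # w^T x_i - w^T x_j
    locality = lam_local * np.sum(S * diffs ** 2)   # locality-preserving term
    return nll + sparsity + locality

rng = np.random.default_rng(4)
X = rng.normal(size=(30, 5))
y = rng.integers(0, 2, 30)
S = np.exp(-np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2))  # RBF similarities
print(lslr_objective(rng.normal(size=5), X, y, S, lam_sparse=0.1, lam_local=0.01))
```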