Forecasting river flow is crucial for the optimal planning, management, and sustainable use of freshwater resources. Many machine learning (ML) approaches have been enhanced to improve streamflow prediction. Hybrid techniques are viewed as a viable way to improve the accuracy of univariate streamflow estimation compared with standalone approaches, and recent research has emphasised hybrid models for this purpose. Accordingly, this paper presents an updated literature review of applications of hybrid models to streamflow estimation over the last five years, summarising data preprocessing, univariate ML modelling strategies, the advantages and disadvantages of standalone ML techniques, hybrid models, and performance metrics. The study focuses on two types of hybrid models: parameter optimisation-based hybrid models (OBH), and hybrids of parameter optimisation-based and preprocessing-based models (HOPH). Overall, this research supports the idea that meta-heuristic approaches improve ML techniques. It is also one of the first efforts to comprehensively examine the efficiency of various meta-heuristic approaches (classified into four primary classes) hybridised with ML techniques. The review found that previous studies applied swarm, evolutionary, physics-based, and hybrid metaheuristics in 77%, 61%, 12%, and 12% of cases, respectively. Finally, there is still room to improve OBH and HOPH models by examining different data preprocessing techniques and metaheuristic algorithms.
High-quality data play a paramount role in the monitoring, control, and prediction of the wastewater treatment process (WWTP) and help ensure the efficient and stable operation of the system. Missing values caused by network collapses, connection errors, and data transformation failures seriously degrade the accuracy, reliability, and completeness of the data. In these cases, it is infeasible to recover missing data from correlations with other variables. To tackle this issue, a univariate imputation method (UIM) that integrates a decomposition method with imputation algorithms is proposed for WWTP. First, seasonal-trend decomposition based on loess is used to decompose the original time series into seasonal, trend, and remainder components, which deals with the nonstationary characteristics of WWTP data. Second, support vector regression is used to approximate the nonlinearity of the trend and remainder components separately and to estimate their missing values, while a self-similarity decomposition fills the seasonal component based on its periodic pattern. Third, all the imputed components are merged to obtain the final imputation result. Finally, six WWTP time series are used to evaluate the imputation performance of the proposed UIM in comparison with seven existing methods on two indicators. The experimental results show that the proposed UIM is effective for WWTP time series under different missing ratios. The proposed UIM is therefore a promising method for imputing WWTP time series.
As an extended period of unusually dry weather without sufficient rain, drought poses an enormous risk to societies. Characterized by the absence of precipitation for long periods, often resulting in water scarcity, droughts increasingly pose significant environmental challenges. Drought is therefore an important consideration in the management of water resources, especially groundwater resources. This study investigated rainfall variability and the frequency of drought in Bamako for the period 1991 to 2020, based on monthly rainfall data from the Bamako-Senou gauge station. The standardized precipitation index (SPI) at 12-month, 6-month, and 3-month timescales, together with the SPI for annual totals, was used to characterize drought in the study area. Univariate parametric probability distributions (Normal, Log-normal, Gumbel type I, and Pearson type III (P3)) were fitted to the drought variables (severity and duration) for future planning and management. The non-parametric Mann-Kendall trend test was also used to detect trends in the annual rainfall data. The results showed that, based on the 12-month SPI, Bamako experienced two extreme droughts, one in July 2002 (SPI = -2.2165) and another in June 2015 (SPI = -2.0598). According to the SPI for annual totals, drought years represented 46.67% of the overall period. The goodness-of-fit test further indicated that the P3 distribution is the best-fitting distribution for both drought severity and duration over Bamako. Bamako is expected to experience several severe droughts of higher severity and shorter duration in the future. Severities associated with durations of 1, 2, 6, and 10 months had return periods ranging from 2.4 to 3.8 years, while the 5-, 10-, 20-, 25-, 50-, and 100-year return periods corresponded to severities of 18.51, 26.08, 33.25, 35.50, 42.38, and 49.14, respectively, with associated durations of 19.8, 26.9, 33.5, 35.6, 42, and 48.2 months, respectively.
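The Mann-Kendall trend test named in the Bamako abstract can be sketched as follows. This is a minimal illustration, assuming no tied values (so the tie correction to the variance is omitted); the rainfall figures are hypothetical and are not taken from the study.

```python
import math

def mann_kendall(values):
    """Return (S, Z) for the Mann-Kendall trend test, assuming no ties."""
    n = len(values)
    # S is the number of increasing pairs minus the number of decreasing pairs.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S under the no-trend null hypothesis (no tie correction).
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    # Continuity-corrected, approximately standard normal statistic.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, z

# Hypothetical annual rainfall totals (mm), for illustration only.
rain = [820.0, 790.0, 905.0, 760.0, 880.0, 930.0, 850.0, 970.0]
s_stat, z_stat = mann_kendall(rain)
print(s_stat, round(z_stat, 3))
```

A positive Z suggests an upward trend and a negative Z a downward one; |Z| is usually compared with a standard normal quantile such as 1.96 for a two-sided 5% test.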
The time domain approach to time series analysis, i.e. autoregressive (AR) processes, is applied to the monsoon rainfall series of India and of its two major regions, North-West India and Central India. Since the original time series shows no modelable structure owing to high interannual variability, a 3-point running filter is applied before exploring and fitting appropriate stochastic models. Of the several parsimonious models fitted, AR(3) is found to be the most suitable. The usefulness of the fitted model is validated on an independent data set of 18 years, and some skill is noted. These models can therefore be used for low-skill, longer-lead-time forecasts of the monsoon. Furthermore, the forecasts produced by such models can be combined with other forecasts to increase the skill of monsoon forecasts.
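The two steps described in the monsoon abstract (a 3-point running filter followed by an AR(3) fit) might look like the sketch below. It assumes a centred, equally weighted filter and an ordinary least-squares fit, neither of which is confirmed by the abstract; the function names and the rainfall values are hypothetical.

```python
import numpy as np

def running_mean3(x):
    """Centred 3-point running mean; the output is two samples shorter."""
    x = np.asarray(x, dtype=float)
    return (x[:-2] + x[1:-1] + x[2:]) / 3.0

def fit_ar(x, order=3):
    """Least-squares AR(order) fit; returns (intercept, coefficients)."""
    x = np.asarray(x, dtype=float)
    # Row for time t is [1, x[t-1], ..., x[t-order]]; the target is x[t].
    rows = [np.concatenate(([1.0], x[t - order:t][::-1]))
            for t in range(order, len(x))]
    design = np.array(rows)
    target = x[order:]
    beta, *_ = np.linalg.lstsq(design, target, rcond=None)
    return beta[0], beta[1:]

def forecast_one(x, intercept, coefs):
    """One-step-ahead forecast from the last len(coefs) observations."""
    order = len(coefs)
    recent = np.asarray(x[-order:], dtype=float)[::-1]
    return float(intercept + np.dot(coefs, recent))

# Hypothetical seasonal rainfall totals (mm), for illustration only.
rainfall = [852, 790, 910, 870, 760, 905, 880, 840, 920, 800, 865, 895]
smoothed = running_mean3(rainfall)
c, phi = fit_ar(smoothed, order=3)
print(forecast_one(smoothed, c, phi))
```

Smoothing before fitting introduces serial correlation by construction, so any skill should be judged on independent data, as the abstract does with its 18-year validation period.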
Time-series-based forecasting is essential for determining how past events affect future events. This paper compares the accuracy of different time-series models for oil prices. Three types of univariate models are discussed: the exponential smoothing (ES), Holt-Winters (HW), and autoregressive integrated moving average (ARIMA) models. To determine the best model, six different selection criteria were applied to quantify the models' prediction accuracy. This comparison should help policy makers and industry marketing strategists select the best forecasting method for the oil market. The three models were compared by applying them to the time series of regular oil prices for West Texas Intermediate (WTI) crude. The comparison indicated that the HW model performed better than the ES model for a prediction with a 95% confidence interval. However, the ARIMA(2, 1, 2) model yielded the best results, leading us to conclude that this sophisticated and robust model outperformed the simpler yet flexible models in the oil market.
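To make the kind of comparison in the oil-price abstract concrete, the sketch below scores simple exponential smoothing against Holt's linear-trend method (the non-seasonal core of Holt-Winters) using mean absolute error. It is only an illustration: the smoothing parameters, the price values, and the choice of MAE as the criterion are assumptions, and ARIMA is omitted to keep the example dependency-free.

```python
def simple_exp_smoothing(series, alpha):
    """One-step-ahead forecasts; forecasts[t] predicts series[t]."""
    level = series[0]
    forecasts = [level]  # no history before t = 0, so reuse the first value
    for value in series[:-1]:
        level = alpha * value + (1 - alpha) * level
        forecasts.append(level)
    return forecasts

def holt_linear(series, alpha, beta):
    """Holt's linear-trend method; forecasts[t] predicts series[t]."""
    # The trend is initialised from the first two points, so the forecast
    # for t = 1 is exact by construction.
    level, trend = series[0], series[1] - series[0]
    forecasts = [series[0]]
    for value in series[1:]:
        forecasts.append(level + trend)
        new_level = alpha * value + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        level = new_level
    return forecasts

def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical monthly prices (USD per barrel), for illustration only.
prices = [61.2, 62.0, 63.1, 63.9, 65.2, 66.0, 66.8, 68.1]
print("ES  MAE:", mae(prices[2:], simple_exp_smoothing(prices, 0.5)[2:]))
print("HW  MAE:", mae(prices[2:], holt_linear(prices, 0.5, 0.5)[2:]))
```

The first two forecasts are excluded from the score because they depend on the initialisation rather than on the method.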
In this work, a new approach is proposed for constructing splines with tension. The basic idea is to use distribution theory, which allows us to define suitable Hilbert spaces in which the tension spline minimizes an energy functional. Classical orthogonality conditions and characterizations of the spline in terms of a fundamental solution of a differential operator are provided. An explicit representation of the tension spline is given, and the spline can be computed by solving a linear system. Some numerical examples are given to illustrate this approach.
This study aimed to evaluate the quality of life (QOL) of individuals living with human immunodeficiency virus (HIV)/acquired immunodeficiency syndrome (AIDS) in Hubei Province, central China, using the WHOQOL-BREF instrument (Chinese version). One hundred and thirty-six respondents (HIV/AIDS individuals) attending the out-patient department of the Chinese Center for Disease Control and Prevention (Chinese CDC) were administered a structured questionnaire developed by the investigators. The results showed that the mean score of overall QOL on a scale of 0-100 was 25.8. The mean scores in the four domains of QOL on a scale of 0-100 were 82.9 (social domain), 27.5 (psychological domain), 17.7 (physical domain), and 11.65 (environmental domain). A significant difference in the physical domain score was noted between asymptomatic (14.6) and early symptomatic individuals (12) (P=0.014), and between patients with early symptoms (12) and those with AIDS (10.43) (P<0.001). QOL in the psychological domain was significantly lower in early symptomatic patients (12.1) (P<0.05) and AIDS patients (12.4) (P<0.006) than in asymptomatic individuals (14.2). The difference in QOL scores in the psychological domain was significant with respect to patient income (P<0.048) and educational status (P<0.037). Significantly better QOL scores in the physical domain (P<0.040) and environmental domain (P<0.017) were noted with respect to patient occupation. Patients with family support had better QOL scores in the environmental domain. In this research, QOL for HIV/AIDS individuals was associated with education, occupation, income, family support, and the clinical category of the patients. It was concluded that the WHOQOL-BREF Chinese version was successfully used to evaluate the QOL of HIV/AIDS individuals in a Chinese population and proved to be a reliable and useful tool.
In medical research, many variables often have to be analyzed for their importance in terms of significant contribution and predictability. One possible analytical tool is multiple linear regression analysis. However, research papers usually report both univariate and multivariate regression analyses of the data, and the biostatistician sometimes faces practical difficulties in selecting the independent variables to include in the multivariate analysis. The usual selection criterion is that a variable should have a regression coefficient with p < 0.20 at the univariate level. However, an independent variable with p > 0.20 in the univariate regression may become significant in the multivariate regression analysis, and vice versa, if this criterion is not strictly adhered to. We undertook both univariate and multivariate linear regression analyses on data from two multi-centric clinical trials. We recommend that there is no need to restrict inclusion to variables with p ≤ 0.20. Given high-speed computers and the availability of statistical software, the desired results can be achieved by considering all relevant independent variables in the multivariate regression analysis.
To support the perception of and defense against operational risk in medium- and low-voltage distribution systems, it is crucial to mine the time series generated by the system to learn anomalous patterns, and to carry out accurate and timely anomaly detection so that anomalous conditions are discovered and alerted early. Edge computing has also been widely used in the processing of Internet of Things (IoT) data. The key challenge in univariate time series anomaly detection is how to model complex nonlinear time dependence. However, most previous works model only the short-term time dependence, without considering the periodic long-term time dependence. We therefore propose a new Hierarchical Attention Network (HAN), which introduces seven day-level attention networks to capture fine-grained short-term time dependence and uses a week-level attention network to model the periodic long-term time dependence. We then combine the day-level features learned by the day-level attention networks with the week-level features learned by the week-level attention network to obtain a high-level time feature, from which we calculate the anomaly probability and detect anomalies. Extensive experiments on a public anomaly detection dataset, and deployment in a real-world medium- and low-voltage distribution system, show the superiority of the proposed framework over state-of-the-art methods.
Background: Chickenpox is a highly infectious disease caused by the varicella-zoster virus and is influenced by seasonal and spatial factors. Dealing with varicella-zoster epidemics can be a substantial drain on health-authority resources. Methods that improve the ability to predict local case numbers every week from time-series data sets are therefore worth developing. Methods: Simple-to-extract trend attributes from published univariate weekly case-number data sets were used to generate multivariate data for Hungary covering 10 years. The attribute-enhanced data set was assessed by machine learning (ML) and deep learning (DL) models to generate weekly case forecasts from next week (t0) to 12 weeks forward (t+12). The ML and DL predictions were compared with those generated by multilinear regression and univariate prediction methods. Results: Support vector regression generates the best predictions for weeks t0 and t+1, whereas extreme gradient boosting generates the best predictions for weeks t+3 to t+12. Long short-term memory provides prediction accuracy comparable to the ML models only for week t+12. Multi-K-fold cross-validation reveals that, overall, the lowest prediction uncertainty is associated with the tree-ensemble ML models. Conclusion: The novel trend-attribute method offers the potential to reduce prediction errors and improve transparency for chickenpox time series.
In this paper, the notion of rational univariate representations with variables is introduced, and the ideals created by given rational univariate representations with variables are defined. One merit of these created ideals is that some of their algebraic properties can be easily decided. With the aid of the theory of valuations, some related results are established. Based on these results, a new approach is presented for decomposing the radical of a polynomial ideal into an intersection of prime ideals.
In this paper, we present a novel study of gesture identification using surface electromyography (sEMG) signals, in which the raw sEMG signal and the sEMG envelope signal are collected by the sensor at the same time. An efficient gesture identification method based on the combination of the two signals, using supervised learning and univariate feature selection, is implemented. Previous studies tend to use the raw sEMG signal and extract a fixed set of features for classification, which inevitably ignores individual differences. Our experiment shows that neither the optimal feature set nor the redundant feature set is the same across subjects. To address this problem, we extract all the common features from the two signals, up to 76 features, most of which are established as common EMG-based gesture indices. In addition, because extracting too many features in an application can reduce operational efficiency, we apply feature selection to obtain the optimal feature set and reduce the number of extracted features. As a result, the combination of the two signals performs better than a single signal, and feature selection can be used to choose the optimal feature set from all features to achieve the best classification performance for each subject. The experimental results demonstrate that the proposed method achieves a highest accuracy of 95% in identifying up to nine gestures using only two sensors. Finally, we develop a real-time intelligent sEMG-driven bionic hand system using the proposed method.
In this paper, a class of electromagnetic field frequency domain reliability problems is first defined. Frequency domain reliability refers to the probability that an electromagnetic performance indicator meets the intended requirements within a specific frequency band, considering the uncertainty of structural parameters and frequency-variant electromagnetic parameters. A frequency domain reliability analysis method based on the univariate dimension reduction method is then proposed, which provides an effective calculation tool for electromagnetic frequency domain reliability. In electromagnetic problems, performance indicators usually vary with frequency. The method first discretizes the frequency-variant performance indicator function into a series of functions at discrete frequency points, and thereby transforms the frequency domain reliability problem into a series system reliability problem over these functions. Second, the univariate dimension reduction method is introduced to solve for the probability distribution functions and correlation coefficients of the discrete frequency-point functions in the system. Finally, from these results, the series system reliability is solved to obtain the frequency domain reliability, and the cumulative distribution function of the performance indicator can also be obtained. Monte Carlo simulation is adopted to demonstrate the validity of the frequency domain reliability analysis method, and three examples are investigated to demonstrate the accuracy and efficiency of the proposed method.
In this paper, the notion of invertibility is introduced for rational univariate representations, and a characterization of invertibility is given. It is shown that the rational univariate representations obtained by both Rouillier's approach and Wu's method are invertible. Moreover, the ideal created by a given rational univariate representation is defined. Some results on invertible rational univariate representations and created ideals are established. Based on these results, a new approach is presented for decomposing the radical of a zero-dimensional polynomial ideal into an intersection of maximal ideals.
Severe matrix effects and high signal uncertainty are two key bottlenecks for the quantitative performance and wide application of laser-induced breakdown spectroscopy (LIBS). Based on the understanding that the superposition of matrix effects and signal uncertainty directly affects plasma parameters, and in turn influences spectral intensity and LIBS quantification performance, a data selection method based on plasma temperature matching (DSPTM) was proposed to reduce both matrix effects and signal uncertainty. By selecting spectra with smaller plasma temperature differences across all samples, the proposed method builds a quantification model that relies more on spectra with smaller matrix effects and signal uncertainty, thereby improving the final quantification performance. When applied to the quantitative analysis of zinc content in brass alloys, both accuracy and precision improved with either a univariate model or multiple linear regression (MLR). More specifically, for the univariate model, the root-mean-square error of prediction (RMSEP), the determination coefficient (R²), and the relative standard deviation (RSD) improved from 3.30%, 0.864, and 18.8% to 1.06%, 0.986, and 13.5%, respectively; for MLR, RMSEP, R², and RSD improved from 3.22%, 0.871, and 26.2% to 1.07%, 0.986, and 17.4%, respectively. These results show that DSPTM can be used as an effective method to reduce matrix effects and improve repeatability by selecting reliable data.
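The three figures of merit quoted in the LIBS abstract (RMSEP, R², and RSD) can be computed as in the sketch below. The definitions used here are the common textbook ones and are an assumption, since the abstract does not state exactly how each metric was calculated; the concentration and intensity values are hypothetical.

```python
import math
import statistics

def rmsep(reference, predicted):
    """Root-mean-square error of prediction."""
    squared = [(p - r) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def r_squared(reference, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_ref = statistics.mean(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot

def rsd_percent(repeats):
    """Relative standard deviation (sample std / mean), in percent."""
    return 100.0 * statistics.stdev(repeats) / statistics.mean(repeats)

# Hypothetical zinc concentrations (wt%) and model predictions.
certified = [10.0, 20.0, 30.0, 40.0]
estimated = [11.0, 19.5, 29.0, 41.5]
print(rmsep(certified, estimated), r_squared(certified, estimated))

# Hypothetical repeated line intensities for one sample.
print(rsd_percent([980.0, 1010.0, 1005.0, 995.0]))
```

RMSEP and R² describe accuracy against reference values, whereas RSD describes shot-to-shot repeatability, which is why the abstract reports them together.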
OBJECTIVE: To investigate the factors associated with sensory and motor recovery after the repair of upper limb peripheral nerve injuries. DATA SOURCES: The online PubMed database was searched for English articles describing outcomes after the repair of median, ulnar, radial, and digital nerve injuries in humans, with a publication date between 1 January 1990 and 16 February 2011. STUDY SELECTION: The following types of article were selected: (1) clinical trials describing the repair of median, ulnar, radial, and digital nerve injuries published in English; and (2) studies that reported sufficient patient information, including age, mechanism of injury, nerve injured, injury location, defect length, repair time, repair method, and repair materials. SPSS 13.0 software was used to perform univariate and multivariate logistic regression analyses and to investigate the patient and intervention factors associated with outcomes. MAIN OUTCOME MEASURES: Sensory function was assessed using the Mackinnon-Dellon scale and motor function was assessed using the manual muscle test. Satisfactory motor recovery was defined as grade M4 or M5, and satisfactory sensory recovery was defined as grade S3+ or S4. RESULTS: Seventy-one articles were included in this study. Univariate and multivariate logistic regression analyses showed that repair time, repair materials, and nerve injured were independent predictors of outcome after the repair of nerve injuries (P < 0.05), and that the nerve injured was the main factor affecting the rate of good to excellent recovery. CONCLUSION: Predictors of outcome after the repair of peripheral nerve injuries include age, gender, repair time, repair materials, nerve injured, defect length, and duration of follow-up.
In dynamic environments, it is important to track changing optimal solutions over time. The univariate marginal distribution algorithm (UMDA), a class of estimation of distribution algorithms, has attracted increasing attention in recent years. In this paper, a new multi-population and diffusion UMDA (MDUMDA) is proposed for dynamic multimodal problems. The multi-population approach is used to locate multiple local optima, which helps to find the global optimum of dynamic multimodal problems quickly. The diffusion model is used to increase diversity in a guided fashion: it makes the neighbors of previous optimal solutions move gradually away from those solutions and so enlarges the search space. This approach uses both the information in the current population and part of the history of the optimal solutions. Finally, experimental studies on the moving peaks benchmark are carried out to evaluate the proposed algorithm and to compare the performance of MDUMDA with multi-population quantum swarm optimization (MQSO) from the literature. The experimental results show that MDUMDA is effective for functions with a moving optimum and can adapt rapidly to dynamic environments.
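For readers unfamiliar with UMDA, the sketch below shows the basic single-population algorithm on a static toy problem (OneMax, which counts the ones in a bit string). It is not the MDUMDA of the abstract: there is no multi-population scheme, no diffusion model, and no dynamic environment, and the parameter values are arbitrary assumptions.

```python
import random

def umda_onemax(n_bits=20, pop_size=60, n_select=30, generations=40, seed=0):
    """Basic UMDA maximising the number of ones in a bit string."""
    rng = random.Random(seed)
    probs = [0.5] * n_bits      # independent marginal P(bit_i = 1)
    margin = 1.0 / n_bits       # keeps marginals away from 0 and 1
    best = None
    for _ in range(generations):
        # Sample a population from the product of the marginals.
        population = [[1 if rng.random() < p else 0 for p in probs]
                      for _ in range(pop_size)]
        # Truncation selection: keep the fittest individuals.
        population.sort(key=sum, reverse=True)
        if best is None or sum(population[0]) > sum(best):
            best = population[0][:]
        selected = population[:n_select]
        # Re-estimate each marginal from the selected individuals.
        probs = [sum(ind[i] for ind in selected) / n_select
                 for i in range(n_bits)]
        probs = [min(1.0 - margin, max(margin, p)) for p in probs]
    return best, probs

best_bits, marginals = umda_onemax()
print(sum(best_bits), [round(p, 2) for p in marginals])
```

Because each bit is modelled independently, the marginals tend to collapse once the population converges; loss of diversity of this kind is what makes tracking a moving optimum hard, which is the problem the abstract's diffusion model targets.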
BACKGROUND: Neuroendocrine tumors (NETs) frequently occur in the gastrointestinal tract, lung, and pancreas, and the rectum and appendix are the sites with the highest incidence. Epidemiological statistics show that an estimated 8000 people in the United States are diagnosed every year with NETs of the gastrointestinal tract, including the stomach, intestine, appendix, colon, and rectum. The pathological changes and clinical symptoms of NETs are not specific, and they are therefore frequently misdiagnosed. AIM: To investigate the clinical symptoms, pathological characteristics, treatment, and prognosis of rectal neuroendocrine tumors (RNETs) by analyzing the clinical and pathological data of 132 RNET cases at our hospital. METHODS: All RNETs were graded according to Ki-67 positivity and mitotic events. The tumors were staged as clinical stages I, II, III, and IV according to infiltrative depth and tumor size. A Cox proportional hazards model was used to assess the main risk factors for survival. RESULTS: The 132 RNETs included 83 cases of G1, 21 cases of G2, and 28 cases of G3 (neuroendocrine carcinoma) disease. Immunohistochemical staining showed that 89.4% of RNETs were positive for synaptophysin and 39.4% were positive for chromogranin A. There were 19, 85, 23, and 5 cases of clinical stages I, II, III, and IV, respectively. The median patient age was 52.96 years. Tumor diameter, depth of invasion, and pathological grade were the main reference factors for the treatment of RNETs. The survival rates at 6, 12, 36, and 60 months after operation were 98.5%, 94.6%, 90.2%, and 85.6%, respectively. Gender, tumor size, tumor grade, lymph node or distant organ metastasis, and radical resection were the main factors associated with the prognosis of RNETs. Multivariate analysis showed that tumor size and grade were independent prognostic factors. CONCLUSION: The clinical symptoms of RNETs are not specific, and RNETs are easily misdiagnosed. Surgery is the main treatment method. The grade and stage of RNETs are the main indices for evaluating prognosis.
Considering the dependence among wave height, wind speed, and current velocity, we construct novel trivariate joint probability distributions via Archimedean copula functions. A total of 30 years of wave height, wind speed, and current velocity data for the Bohai Sea are hindcast and sampled for a case study. Four distributions, namely the Gumbel, lognormal, Weibull, and Pearson Type III distributions, are candidate models for the marginal distributions of wave height, wind speed, and current velocity. The Pearson Type III distribution is selected as the optimal model. Bivariate and trivariate probability distributions of these environmental conditions are established based on four bivariate and trivariate Archimedean copulas, namely the Clayton, Frank, Gumbel-Hougaard, and Ali-Mikhail-Haq copulas. These joint probability models make full use of the marginal information and the dependence among the three variables. The design return values of the three variables can be obtained by three methods: univariate probability, conditional probability, and joint probability. The joint return periods of different load combinations are estimated with the proposed models, and platform responses (including base shear, overturning moment, and deck displacement) are then calculated. For the same return period, the design values of wave height, wind speed, and current velocity obtained from the conditional and joint probability models are much smaller than those from univariate probability. By accounting for the dependence among variables, the multivariate probability distributions provide design parameters that are close to the actual sea state for ocean platform design.
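The idea of a joint return period can be illustrated with the Gumbel-Hougaard copula, one of the four families listed in the abstract. The sketch below is bivariate rather than trivariate, takes the marginal non-exceedance probabilities as given instead of fitting Pearson Type III marginals, and uses a hypothetical dependence parameter; it is not the authors' model.

```python
import math

def gumbel_hougaard(u, v, theta):
    """Bivariate Gumbel-Hougaard copula C(u, v) for theta >= 1."""
    if theta < 1:
        raise ValueError("theta must be >= 1")
    a = (-math.log(u)) ** theta
    b = (-math.log(v)) ** theta
    return math.exp(-((a + b) ** (1.0 / theta)))

def joint_return_periods(u, v, theta, mu=1.0):
    """Return (T_or, T_and) for marginal non-exceedance probabilities u, v.

    mu is the mean interval between events (1 year for annual maxima).
    T_or:  either variable exceeds its threshold.
    T_and: both variables exceed their thresholds simultaneously.
    """
    c = gumbel_hougaard(u, v, theta)
    t_or = mu / (1.0 - c)
    t_and = mu / (1.0 - u - v + c)
    return t_or, t_and

# Two variables, each at its 10-year level (u = v = 0.9); theta = 2 is a
# hypothetical dependence strength chosen for illustration.
t_or, t_and = joint_return_periods(0.9, 0.9, theta=2.0)
print(round(t_or, 2), round(t_and, 2))
```

With theta = 1 the copula reduces to independence, C(u, v) = uv. Stronger dependence brings T_and closer to the univariate return period, which is why ignoring dependence can misstate how often design loads coincide.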
Acknowledgements (streamflow forecasting review): The authors thank the anonymous reviewers and journal editors, whose assistance improved this paper's logical organisation and content quality.
Funding (wastewater treatment imputation study): Supported by the National Key Research and Development Project (No. 2018YFC1900800-5), the National Natural Science Foundation of China (Nos. 61890930-5, 61903010, 6202100), the Beijing Outstanding Young Scientist Program (No. BJJWZYJH01201910005020), and the Beijing Natural Science Foundation (No. KZ202110005009).
Abstract: High-quality data play a paramount role in the monitoring, control, and prediction of the wastewater treatment process (WWTP) and help ensure efficient and stable system operation. Missing values, caused by network collapses, connection errors, and data transformation failures, seriously degrade the accuracy, reliability, and completeness of the data. In these cases, it is infeasible to recover missing data from correlations with other variables. To tackle this issue, a univariate imputation method (UIM) that integrates a decomposition method with imputation algorithms is proposed for WWTP. First, seasonal-trend decomposition based on loess is used to decompose the original time series into seasonal, trend, and remainder components, to deal with the nonstationary characteristics of WWTP data. Second, support vector regression is used to approximate the nonlinearity of the trend and remainder components separately and to estimate their missing values, while a self-similarity decomposition fills the seasonal component based on its periodic pattern. Third, all the imputed components are merged to obtain the final imputation result. Finally, six WWTP time series are used to evaluate the imputation performance of the proposed UIM against seven existing methods on two indicators. The experimental results show that the proposed UIM is effective for WWTP time series under different missing ratios and is therefore a promising method for imputing WWTP time series.
Abstract: Drought, an extended period of unusually dry weather without sufficient rain, poses an enormous risk to societies. Characterized by the absence of precipitation for long periods, often resulting in water scarcity, droughts increasingly pose significant environmental challenges. Drought is therefore an important element in the management of water resources, especially groundwater. This study investigates rainfall variability and drought frequency in Bamako for the period 1991 to 2020, based on monthly rainfall data from the Bamako-Senou gauge station. The standardized precipitation index (SPI) at 12-month, 6-month, and 3-month timescales and the SPI for annual totals were used to characterize drought in the study area. Univariate parametric probability distributions, namely the Normal, Log-normal, Gumbel type I, and Pearson type III (P3) distributions, were fitted to the drought variables (severity and duration) for future planning and management. The non-parametric Mann-Kendall test was also used to detect trends in the annual rainfall data. Based on the 12-month SPI, Bamako experienced two extreme droughts, one in July 2002 (SPI = -2.2165) and another in June 2015 (SPI = -2.0598). According to the SPI for annual totals, drought years represented 46.67% of the overall period. Based on the goodness-of-fit test, the P3 distribution is the best-fitting distribution for both drought severity and duration over Bamako. Bamako is expected to experience several severe droughts of both longer and shorter duration in the future. Severities with 1-, 2-, 6-, and 10-month durations had return periods ranging from 2.4 to 3.8 years, while the 5-, 10-, 20-, 25-, 50-, and 100-year return periods corresponded to severities of 18.51, 26.08, 33.25, 35.50, 42.38, and 49.14, respectively, with associated durations of 19.8, 26.9, 33.5, 35.6, 42, and 48.2 months, respectively.
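The Mann-Kendall trend test mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' code: it assumes no tied values (so the variance has no tie correction), and the rainfall numbers are hypothetical placeholders rather than Bamako-Senou data.

```python
import math

def mann_kendall(series):
    """Return (S, Z) for the Mann-Kendall trend test (no tie correction)."""
    n = len(series)
    # S counts increasing pairs minus decreasing pairs over all i < j.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S under the no-trend null hypothesis, assuming no ties.
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    # Continuity-corrected standard normal statistic.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, z

# Hypothetical annual rainfall totals (mm), for illustration only.
rainfall = [820.0, 790.5, 905.2, 760.1, 880.3, 930.7, 845.9, 910.4]
s, z = mann_kendall(rainfall)
print(f"S = {s}, Z = {z:.3f}")  # |Z| > 1.96 suggests a trend at the 5% level
```

A positive Z indicates an increasing tendency and a negative Z a decreasing one; with tied observations the variance term would need the usual tie correction.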
Abstract: The time-domain approach to time series analysis, i.e. autoregressive (AR) processes, is applied to the monsoon rainfall series of India and its two major regions, North-West India and Central India. Because the original time series shows no modelable structure owing to high interannual variability, a 3-point running filter is applied before exploring and fitting appropriate stochastic models. Of the several parsimonious models fitted, AR(3) is found to be the most suitable. The usefulness of the fitted model is validated on an independent data set of 18 years, and some skill is noted. These models can therefore be used for low-skill, longer-lead-time forecasts of the monsoon. Furthermore, forecasts produced by such models can be combined with other forecasts to increase the skill of monsoon forecasts.
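The two steps described above, a 3-point running filter followed by an AR(3) fit, can be sketched as follows. The abstract does not state the estimation method, so the Yule-Walker equations solved by the Levinson-Durbin recursion are an assumption here, and the input series is a hypothetical placeholder.

```python
def running_mean3(x):
    """3-point centred running mean; the two end points are dropped."""
    return [(x[i - 1] + x[i] + x[i + 1]) / 3.0 for i in range(1, len(x) - 1)]

def autocovariances(x, max_lag):
    """Biased sample autocovariances r[0..max_lag]."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    return [sum(d[t] * d[t - k] for t in range(k, n)) / n
            for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the Yule-Walker equations.

    Returns (phi, err) where x_t = phi[0]*x_{t-1} + ... + noise and
    err is the innovation variance.
    """
    a = [0.0] * (order + 1)        # a[1..order] hold the coefficients
    err = r[0]
    for k in range(1, order + 1):
        acc = r[k] - sum(a[j] * r[k - j] for j in range(1, k))
        kappa = acc / err          # reflection coefficient at lag k
        prev = a[:]
        a[k] = kappa
        for j in range(1, k):
            a[j] = prev[j] - kappa * prev[k - j]
        err *= 1.0 - kappa * kappa
    return a[1:], err

# Hypothetical seasonal rainfall anomalies, for illustration only.
series = [3.1, -1.2, 0.4, 2.2, -2.5, 1.0, 0.3, -0.8, 1.9, -1.4, 0.6, 2.7, -0.9, 0.1]
smoothed = running_mean3(series)
phi, sigma2 = levinson_durbin(autocovariances(smoothed, 3), 3)
print("AR(3) coefficients:", [round(p, 3) for p in phi])
```

Note that the running mean itself introduces serial correlation, so coefficients fitted to the smoothed series describe the filtered process rather than the raw rainfall.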
Abstract: Time-series-based forecasting is essential for determining how past events affect future events. This paper compares the accuracy of different time-series models for oil prices. Three types of univariate models are discussed: exponential smoothing (ES), Holt-Winters (HW), and autoregressive integrated moving average (ARIMA) models. To determine the best model, six different strategies were applied as selection criteria to quantify the models' prediction accuracies. This comparison should help policy makers and industry marketing strategists select the best forecasting method for the oil market. The three models were compared by applying them to the time series of regular oil prices for West Texas Intermediate (WTI) crude. The comparison indicated that the HW model performed better than the ES model for predictions with a 95% confidence interval. However, the ARIMA(2, 1, 2) model yielded the best results, leading us to conclude that this sophisticated and robust model outperformed the simpler yet flexible models in the oil market.
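As a rough illustration of the two smoothing models named above, the sketch below implements simple exponential smoothing and Holt's linear-trend method, which is the non-seasonal core of Holt-Winters. The smoothing constants and the price series are hypothetical; the abstract does not report the paper's settings, and a seasonal Holt-Winters or ARIMA fit would normally come from a statistics library.

```python
def simple_exp_smoothing(series, alpha):
    """Return the smoothed levels; the last level is the one-step-ahead forecast."""
    level = series[0]
    levels = [level]
    for x in series[1:]:
        level = alpha * x + (1.0 - alpha) * level
        levels.append(level)
    return levels

def holt_linear(series, alpha, beta, horizon=1):
    """Holt's linear-trend smoothing; returns the forecast `horizon` steps ahead."""
    level = series[0]
    trend = series[1] - series[0]      # simple initial trend estimate
    for x in series[1:]:
        prev_level = level
        level = alpha * x + (1.0 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1.0 - beta) * trend
    return level + horizon * trend

# Hypothetical monthly prices (USD per barrel), for illustration only.
prices = [61.2, 62.8, 60.9, 64.1, 66.3, 65.7, 68.0]
print("ES forecast:  ", round(simple_exp_smoothing(prices, alpha=0.3)[-1], 2))
print("Holt forecast:", round(holt_linear(prices, alpha=0.3, beta=0.1), 2))
```

The ES forecast is flat beyond one step, whereas the Holt forecast extrapolates the estimated trend, which is one reason the two can rank differently on a trending price series.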
Abstract: In this work, a new approach is proposed for constructing splines with tension. The basic idea is to use the theory of distributions, which allows us to define suitable Hilbert spaces in which the tension spline minimizes an energy functional. Classical orthogonality conditions and characterizations of the spline in terms of a fundamental solution of a differential operator are provided. An explicit representation of the tension spline is given, and the spline can be computed by solving a linear system. Some numerical examples illustrate the approach.
Abstract: This study aimed to evaluate the quality of life (QOL) of individuals living with human immunodeficiency virus (HIV)/acquired immunodeficiency syndrome (AIDS) in Hubei province, central China, using the WHOQOL-BREF instrument (Chinese version). One hundred and thirty-six respondents (HIV/AIDS individuals) attending the out-patient department of the Chinese Center for Disease Control and Prevention (Chinese CDC) were administered a structured questionnaire developed by the investigators. QOL was evaluated using the WHOQOL-BREF instrument (Chinese version). The mean score of overall QOL on a scale of 0-100 was 25.8. The mean scores in the four QOL domains on a scale of 0-100 were 82.9 (social domain), 27.5 (psychological domain), 17.7 (physical domain), and 11.65 (environmental domain). A significant difference in the physical domain score was noted between asymptomatic (14.6) and early symptomatic individuals (12) (P = 0.014), and between patients with early symptoms (12) and those with AIDS (10.43) (P < 0.001). QOL in the psychological domain was significantly lower in early symptomatic individuals (12.1) (P < 0.05) and AIDS patients (12.4) (P < 0.006) than in asymptomatic individuals (14.2). The difference in psychological domain scores was significant with respect to patient income (P < 0.048) and educational status (P < 0.037). Significantly better QOL scores in the physical domain (P < 0.040) and environmental domain (P < 0.017) were noted with respect to patient occupation. Patients with family support had better QOL scores in the environmental domain. In this study, QOL for HIV/AIDS individuals was associated with education, occupation, income, family support, and clinical category. It was concluded that the Chinese version of the WHOQOL-BREF was used successfully to evaluate the QOL of HIV/AIDS individuals in a Chinese population and proved to be a reliable and useful tool.
Abstract: In medical research, many variables often have to be analyzed for their importance in terms of significant contribution and predictability. One possible analytical tool is multiple linear regression analysis. However, research papers usually report both univariate and multivariate regression analyses of the data, and the biostatistician sometimes faces practical difficulties in selecting the independent variables for logical inclusion in the multivariate analysis. The usual selection criterion for including a variable in the multivariate regression is that it should have a regression coefficient with p < 0.20 at the univariate level. However, an independent variable with p > 0.20 in univariate regression may become significant in the multivariate regression analysis, and vice versa, if this criterion is not strictly adhered to. We undertook both univariate and multivariate linear regression analyses on data from two multi-centric clinical trials. We recommend that there is no need to restrict inclusion to variables with p ≤ 0.20. Given high-speed computers and the availability of statistical software, the desired results can be achieved by considering all relevant independent variables in the multivariate regression analysis.
Funding: Supported by the Science and Technology Project "Research on Risk Perception and Defense System for Medium and Low Voltage Distribution System Operation Based on Data Mining" of State Grid Beijing Electric Power Company (No. 520202220002).
Abstract: To support the perception of and defense against operational risk in medium and low voltage distribution systems, it is crucial to mine the time series generated by the system to learn anomalous patterns, and to perform accurate and timely anomaly detection so that anomalous conditions are discovered and alerted early. Edge computing has also been widely used in processing Internet of Things (IoT) data. The key challenge of univariate time series anomaly detection is how to model complex nonlinear time dependence. However, most previous work models only short-term time dependence, without considering periodic long-term time dependence. We therefore propose a new Hierarchical Attention Network (HAN), which introduces seven day-level attention networks to capture fine-grained short-term time dependence and uses a week-level attention network to model periodic long-term time dependence. We then combine the day-level features learned by the day-level attention networks with the week-level feature learned by the week-level attention network to obtain a high-level time feature, from which we calculate the anomaly probability and detect anomalies. Extensive experiments on a public anomaly detection dataset, and deployment in a real-world medium and low voltage distribution system, show the superiority of the proposed framework over state-of-the-art methods.
Abstract: Background: Chickenpox is a highly infectious disease caused by the varicella-zoster virus and is influenced by seasonal and spatial factors. Dealing with varicella-zoster epidemics can be a substantial drain on health-authority resources. Methods that improve the ability to predict local case numbers from weekly time-series data sets are therefore worth developing. Methods: Simple-to-extract trend attributes from published univariate weekly case-number data sets were used to generate multivariate data for Hungary covering 10 years. The attribute-enhanced data set was assessed by machine learning (ML) and deep learning (DL) models to generate weekly case forecasts from the next week (t0) to 12 weeks forward (t+12). The ML and DL predictions were compared with those generated by multilinear regression and univariate prediction methods. Results: Support vector regression generates the best predictions for weeks t0 and t+1, whereas extreme gradient boosting generates the best predictions for weeks t+3 to t+12. Long short-term memory provides prediction accuracy comparable to the ML models only for week t+12. Multi-K-fold cross validation reveals that, overall, the lowest prediction uncertainty is associated with the tree-ensemble ML models. Conclusion: The novel trend-attribute method offers the potential to reduce prediction errors and improve transparency for chickenpox time series.
Funding: Supported by the National Natural Science Foundation of China under Grant No. 12161057.
Abstract: In this paper, the notion of rational univariate representations with variables is introduced. Consequently, the ideals created by given rational univariate representations with variables are defined. One merit of these created ideals is that some of their algebraic properties can be easily decided. With the aid of the theory of valuations, some related results are established. Based on these results, a new approach is presented for decomposing the radical of a polynomial ideal into an intersection of prime ideals.
Abstract: In this paper, we propose a novel approach to gesture identification using surface electromyography (sEMG) signals, in which the raw sEMG signal and the sEMG envelope signal are collected by the sensor at the same time. An efficient gesture identification method based on the combination of the two signals, using supervised learning and univariate feature selection, is implemented. Previous work tends to use the raw sEMG signal and extract a fixed set of features for classification, which inevitably ignores individual differences. Our experiments show that neither the optimal feature set nor the redundant feature set is the same across subjects. To address this problem, we extract all the common features from the two signals, up to 76 features, most of which are established EMG-based gesture indices. Because extracting too many features can reduce operational efficiency in an application, we apply feature selection to obtain the optimal feature set and reduce the number of features extracted. The results show that combining the two signals is better than using a single signal, and that feature selection can choose the optimal feature set from all features to achieve the best classification performance for each subject. The proposed method achieves a highest accuracy of 95% in identifying up to nine gestures using only two sensors. Finally, we develop a real-time intelligent sEMG-driven bionic hand system using the proposed method.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 51490662) and the National Science Fund for Distinguished Young Scholars (Grant No. 51725502).
Abstract: In this paper, a class of electromagnetic field frequency domain reliability problems is first defined. Frequency domain reliability refers to the probability that an electromagnetic performance indicator meets its intended requirements within a specific frequency band, considering the uncertainty of structural parameters and frequency-variant electromagnetic parameters. A frequency domain reliability analysis method based on the univariate dimension reduction method is then proposed, providing an effective calculation tool for electromagnetic frequency domain reliability. In electromagnetic problems, performance indicators usually vary with frequency. The method first discretizes the frequency-variant performance indicator function into a series of functions at discrete frequency points, and then transforms the frequency domain reliability problem into a series system reliability problem over these functions. Second, the univariate dimension reduction method is introduced to solve for the probability distribution functions and correlation coefficients of the discrete frequency point functions in the system. Finally, from these results, the series system reliability is solved to obtain the frequency domain reliability, and the cumulative distribution function of the performance indicator can also be obtained. Monte Carlo simulation is adopted to demonstrate the validity of the method, and three examples are investigated to demonstrate its accuracy and efficiency.
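For orientation, the univariate dimension reduction method referred to above is usually written as an additive approximation of a multivariate function around the mean point. The notation below is an assumption for illustration and is not taken from the paper: g is the performance function at one frequency point f_k, X = (X_1, ..., X_n) are the random inputs with means μ_i, and failure is taken to mean g < 0.

```latex
% Univariate dimension reduction: n one-dimensional functions replace g
g(\mathbf{X}) \;\approx\; \sum_{i=1}^{n}
  g(\mu_1,\dots,\mu_{i-1},X_i,\mu_{i+1},\dots,\mu_n)
  \;-\; (n-1)\,g(\mu_1,\dots,\mu_n)

% Series-system view over m discrete frequency points f_1,\dots,f_m
R \;=\; P\!\left(\bigcap_{k=1}^{m}\bigl\{\, g(\mathbf{X}, f_k) \ge 0 \,\bigr\}\right)
```

Under this approximation, statistical moments of g reduce to one-dimensional integrals over each X_i, which is what makes the per-frequency distributions and their correlations inexpensive to estimate.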
Funding: Supported by the National Natural Science Foundation of China under Grant No. 12161057.
Abstract: In this paper, the notion of invertibility is introduced for rational univariate representations, and a characterization of invertibility is given. It is shown that the rational univariate representations obtained by both Rouillier's approach and Wu's method are invertible. Moreover, the ideal created by a given rational univariate representation is defined. Some results on invertible rational univariate representations and created ideals are established. Based on these results, a new approach is presented for decomposing the radical of a zero-dimensional polynomial ideal into an intersection of maximal ideals.
Funding: Financial support from the Scientific Research Program for Young Talents of China National Nuclear Corporation (2020), the National Natural Science Foundation of China (Nos. 51906124 and 62205172), the Shanxi Province Science and Technology Department (No. 20201101013), and Guoneng Bengbu Power Generation Co., Ltd (No. 20212000001).
Abstract: Severe matrix effects and high signal uncertainty are two key bottlenecks for the quantitative performance and wide application of laser-induced breakdown spectroscopy (LIBS). Based on the understanding that the superposition of matrix effects and signal uncertainty directly affects plasma parameters, and in turn spectral intensity and LIBS quantification performance, a data selection method based on plasma temperature matching (DSPTM) was proposed to reduce both. By selecting spectra with smaller plasma temperature differences across all samples, the method builds a quantification model that relies more on spectra with smaller matrix effects and signal uncertainty, thereby improving final quantification performance. When applied to the quantitative analysis of zinc content in brass alloys, both accuracy and precision improved with either a univariate model or multiple linear regression (MLR). Specifically, for the univariate model, the root-mean-square error of prediction (RMSEP), the coefficient of determination (R²), and the relative standard deviation (RSD) improved from 3.30%, 0.864, and 18.8% to 1.06%, 0.986, and 13.5%, respectively; for MLR, RMSEP, R², and RSD improved from 3.22%, 0.871, and 26.2% to 1.07%, 0.986, and 17.4%, respectively. These results show that DSPTM is an effective method for reducing matrix effects and improving repeatability by selecting reliable data.
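The three figures of merit quoted above can be computed as in the following sketch. The definitions are the common textbook ones and are assumed here, since the abstract does not spell them out: RSD uses the sample standard deviation, and R² is one minus the ratio of residual to total sum of squares. All numbers are hypothetical placeholders.

```python
import math

def rmsep(y_true, y_pred):
    """Root-mean-square error of prediction."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def rsd_percent(values):
    """Relative standard deviation (%) using the sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return 100.0 * math.sqrt(var) / mean

# Hypothetical zinc concentrations (wt%) and repeated line intensities.
reference = [30.0, 33.5, 36.2, 39.8]
predicted = [30.9, 32.8, 36.9, 39.1]
repeats = [1020.0, 980.0, 1105.0, 995.0]
print(round(rmsep(reference, predicted), 3),
      round(r_squared(reference, predicted), 3),
      round(rsd_percent(repeats), 1))
```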
Funding: Supported by the National High-Technology Research and Development Program of China (863 Program), No. 2012AA020507; the 985 Program of Sun Yat-sen University, No. 90035-3283312; the Specialized Research Fund for the Doctoral Program of Higher Education, No. 20120171120075; and the Doctoral Start-up Project of the Natural Science Foundation of Guangdong Province, No. S201204006336.
Abstract: OBJECTIVE: To investigate the factors associated with sensory and motor recovery after the repair of upper limb peripheral nerve injuries. DATA SOURCES: The online PubMed database was searched for English articles describing outcomes after the repair of median, ulnar, radial, and digital nerve injuries in humans with a publication date between 1 January 1990 and 16 February 2011. STUDY SELECTION: The following types of article were selected: (1) clinical trials describing the repair of median, ulnar, radial, and digital nerve injuries published in English; and (2) studies that reported sufficient patient information, including age, mechanism of injury, nerve injured, injury location, defect length, repair time, repair method, and repair materials. SPSS 13.0 software was used to perform univariate and multivariate logistic regression analyses and to investigate the patient and intervention factors associated with outcomes. MAIN OUTCOME MEASURES: Sensory function was assessed using the Mackinnon-Dellon scale and motor function was assessed using the manual muscle test. Satisfactory motor recovery was defined as grade M4 or M5, and satisfactory sensory recovery was defined as grade S3+ or S4. RESULTS: Seventy-one articles were included in this study. Univariate and multivariate logistic regression analyses showed that repair time, repair materials, and nerve injured were independent predictors of outcome after the repair of nerve injuries (P < 0.05), and that the nerve injured was the main factor affecting the rate of good to excellent recovery. CONCLUSION: Predictors of outcome after the repair of peripheral nerve injuries include age, gender, repair time, repair materials, nerve injured, defect length, and duration of follow-up.
Funding: Supported by the National Natural Science Foundation of China (Nos. 60873099, 60775013).
Abstract: In dynamic environments, it is important to track changing optimal solutions over time. The univariate marginal distribution algorithm (UMDA), a class of estimation of distribution algorithms, has attracted increasing attention in recent years. In this paper, a new multi-population and diffusion UMDA (MDUMDA) is proposed for dynamic multimodal problems. The multi-population approach is used to locate multiple local optima, which helps find the global optimum of dynamic multimodal problems quickly. The diffusion model is used to increase diversity in a guided fashion: it moves the neighbors of previous optimal solutions gradually away from those solutions and thereby enlarges the search space. This approach uses both the information in the current population and part of the history of the optimal solutions. Finally, experimental studies on the moving peaks benchmark are carried out to evaluate the proposed algorithm and to compare the performance of MDUMDA with multi-population quantum swarm optimization (MQSO) from the literature. The experimental results show that MDUMDA is effective for functions with a moving optimum and adapts rapidly to dynamic environments.
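To make the baseline concrete, the sketch below shows a plain binary UMDA on a static OneMax problem. It is only the basic algorithm that MDUMDA builds on: the multi-population and diffusion mechanisms from the abstract are not implemented, and the population size, selection size, and probability margins are illustrative assumptions.

```python
import random

def estimate_marginals(selected, margin=0.0):
    """Per-bit frequency of ones among the selected individuals, clamped to margins."""
    n_bits = len(selected[0])
    probs = []
    for i in range(n_bits):
        p = sum(ind[i] for ind in selected) / len(selected)
        probs.append(min(max(p, margin), 1.0 - margin))
    return probs

def umda(fitness, n_bits, pop_size=60, n_select=30, generations=40, seed=0):
    """Basic UMDA: sample from independent marginals, select, re-estimate."""
    rng = random.Random(seed)
    margin = 1.0 / n_bits              # keeps every bit value reachable
    probs = [0.5] * n_bits
    best = None
    for _ in range(generations):
        pop = [[1 if rng.random() < p else 0 for p in probs]
               for _ in range(pop_size)]
        pop.sort(key=fitness, reverse=True)   # truncation selection
        if best is None or fitness(pop[0]) > fitness(best):
            best = pop[0]
        probs = estimate_marginals(pop[:n_select], margin)
    return best, probs

# OneMax: fitness is the number of ones in the bit string.
best, probs = umda(sum, n_bits=20)
print("best fitness found:", sum(best))
```

Because each bit is modelled independently, the distribution collapses around a single optimum, which is the loss of diversity that the multi-population and diffusion ideas in the abstract are meant to counter when the optimum moves.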
Abstract: BACKGROUND: Neuroendocrine tumors (NETs) frequently occur in the gastrointestinal tract, lung, and pancreas, and the rectum and appendix are the sites with the highest incidence. Epidemiological statistics show that an estimated 8000 people in the United States are diagnosed every year with NETs of the gastrointestinal tract, including the stomach, intestine, appendix, colon, and rectum. The pathological changes and clinical symptoms of NETs are not specific, and they are therefore frequently misdiagnosed. AIM: To investigate the clinical symptoms, pathological characteristics, treatment, and prognosis of rectal neuroendocrine tumors (RNETs) by analyzing the clinical and pathological data of 132 RNET cases at our hospital. METHODS: All RNETs were graded according to Ki-67 positivity and mitotic events. The tumors were staged as clinical stages I, II, III, and IV according to infiltrative depth and tumor size. A Cox proportional hazards model was used to assess the main risk factors for survival. RESULTS: The 132 RNETs included 83 cases of G1, 21 cases of G2, and 28 cases of G3 (neuroendocrine carcinoma) disease. Immunohistochemical staining showed that 89.4% of RNETs were positive for synaptophysin and 39.4% were positive for chromogranin A. There were 19, 85, 23, and 5 cases of clinical stages I, II, III, and IV, respectively. The median patient age was 52.96 years. Tumor diameter, depth of invasion, and pathological grade were the main reference factors for the treatment of RNETs. The survival rates at 6, 12, 36, and 60 months after operation were 98.5%, 94.6%, 90.2%, and 85.6%, respectively. Gender, tumor size, tumor grade, lymph node or distant organ metastasis, and radical resection were the main factors associated with the prognosis of RNETs. Multivariate analysis showed that tumor size and grade were independent prognostic factors. CONCLUSION: The clinical symptoms of RNETs are not specific, and the tumors are easily misdiagnosed. Surgery is the main treatment method. The grade and stage of RNETs are the main indices for evaluating prognosis.
Funding: Partially supported by the National Natural Science Foundation of China (No. 51479183), the National Key Research and Development Program, China (Nos. 2016YFC0302301 and 2016YFC0803401), and the Fundamental Research Funds for the Central University (No. 201564003).
Abstract: Considering the dependence among wave height, wind speed, and current velocity, we construct novel trivariate joint probability distributions via Archimedean copula functions. Thirty years of wave height, wind speed, and current velocity data in the Bohai Sea are hindcast and sampled for a case study. Four distributions, namely the Gumbel, lognormal, Weibull, and Pearson Type III distributions, are candidate models for the marginal distributions of wave height, wind speed, and current velocity. The Pearson Type III distribution is selected as the optimal model. Bivariate and trivariate probability distributions of these environmental conditions are established based on four bivariate and trivariate Archimedean copulas, namely the Clayton, Frank, Gumbel-Hougaard, and Ali-Mikhail-Haq copulas. These joint probability models make the most of the marginal information and the dependence among the three variables. The design return values of the three variables can be obtained by three methods: univariate probability, conditional probability, and joint probability. The joint return periods of different load combinations are estimated by the proposed models. Platform responses (including base shear, overturning moment, and deck displacement) are further calculated. For the same return period, the design values of wave height, wind speed, and current velocity obtained by the conditional and joint probability models are much smaller than those obtained by univariate probability. By accounting for the dependence among variables, the multivariate probability distributions provide design parameters close to the actual sea state for ocean platform design.
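As an illustration of the copula construction described above, the sketch below evaluates the Clayton copula, one of the four Archimedean families named, and the two standard joint return periods for a pair of variables. The dependence parameter theta, the marginal probabilities, and the one-year mean interarrival time are hypothetical; the paper's fitted values are not given in the abstract.

```python
def clayton2(u, v, theta):
    """Bivariate Clayton copula C(u, v), valid for theta > 0 and u, v in (0, 1]."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)

def clayton3(u, v, w, theta):
    """Trivariate (exchangeable) Clayton copula C(u, v, w)."""
    return (u ** -theta + v ** -theta + w ** -theta - 2.0) ** (-1.0 / theta)

def joint_return_periods(u, v, theta, mu=1.0):
    """Return (T_or, T_and) for marginal non-exceedance probabilities u and v.

    T_or:  either variable exceeds its design value.
    T_and: both variables exceed their design values together.
    mu is the mean interarrival time of events (1 year for annual maxima).
    """
    c = clayton2(u, v, theta)
    t_or = mu / (1.0 - c)
    t_and = mu / (1.0 - u - v + c)
    return t_or, t_and

# Hypothetical example: 50-year marginal design values (u = v = 0.98).
t_or, t_and = joint_return_periods(0.98, 0.98, theta=2.0)
print(f"T_or = {t_or:.1f} years, T_and = {t_and:.1f} years")
```

The simultaneous exceedance of both 50-year values is much rarer than either alone (T_and is well above 50 years), which is consistent with the abstract's point that joint-probability design values for a given return period are smaller than univariate ones.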