Data Mining (DM) methods are increasingly used for prediction with time series data, alongside traditional statistical approaches. This paper presents a literature review of the use of DM with time series data, focusing on short-term stock prediction, an area that has attracted a great deal of attention from researchers in the field. The main contribution of this paper is an outline of the use of DM with time series data, drawing mainly on examples related to short-term stock prediction, which is important for a better understanding of the field. Some of the main trends and open issues are also introduced.
Short-term (up to 30 days) predictions of Earth Rotation Parameters (ERPs) such as Polar Motion (PM: PMX and PMY) play an essential role in real-time applications related to high-precision reference frame conversion. Currently, the least squares (LS) + auto-regressive (AR) hybrid method is one of the main techniques for PM prediction, and the weighted LS+AR hybrid method also performs well for short-term PM prediction. However, the covariance information of the LS fitting residuals deserves further exploration in the AR model. In this study, we derive a modified stochastic model for the LS+AR hybrid method, namely the weighted LS + weighted AR hybrid method. Using the PM data products of IERS EOP 14 C04, the numerical results indicate that, for short-term PM forecasting, the proposed weighted LS + weighted AR hybrid method has an advantage over both the LS+AR and the weighted LS+AR hybrid methods. Compared with the mean absolute errors (MAEs) of PMX/PMY short-term prediction of the LS+AR hybrid method and the weighted LS+AR hybrid method, the weighted LS + weighted AR hybrid method shows average improvements of 6.61%/12.08% and 0.24%/11.65%, respectively. In addition, the slopes of the linear regression lines fitted to the errors of each method show that the prediction error of the proposed method grows more slowly than that of the other two methods.
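The LS+AR idea in the abstract above can be illustrated with a minimal, unweighted sketch: a least-squares fit of a linear trend plus periodic terms, followed by an AR model fitted to the LS residuals. This is not the authors' weighted method. The function names, the AR order, and the two periods (annual, 365.25 days, and Chandler, about 433 days) are assumptions chosen for illustration, and the time axis is assumed to be in days with unit spacing.

```python
import numpy as np

def ls_design(t, periods):
    """Design matrix: bias, linear trend, and a cos/sin pair per period."""
    cols = [np.ones_like(t), t]
    for p in periods:
        w = 2.0 * np.pi / p
        cols += [np.cos(w * t), np.sin(w * t)]
    return np.column_stack(cols)

def fit_ar(resid, order):
    """Fit AR coefficients by ordinary least squares on lagged residuals."""
    n = len(resid)
    # Column i holds the lag-(i+1) values r[k-1-i] for k = order .. n-1.
    X = np.column_stack([resid[order - i - 1 : n - i - 1] for i in range(order)])
    y = resid[order:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def ls_ar_forecast(t, x, horizon, periods=(365.25, 433.0), order=5):
    """Unweighted LS+AR forecast (illustrative; assumes unit time steps)."""
    A = ls_design(t, periods)
    beta, *_ = np.linalg.lstsq(A, x, rcond=None)
    resid = x - A @ beta
    phi = fit_ar(resid, order)
    hist = list(resid[-order:])
    ar_part = []
    for _ in range(horizon):
        # Most recent residual first, matching the lag order used in fit_ar.
        nxt = float(np.dot(phi, hist[::-1][:order]))
        ar_part.append(nxt)
        hist.append(nxt)
    t_future = t[-1] + np.arange(1, horizon + 1)
    # Forecast = extrapolated LS model + recursively predicted AR residuals.
    return ls_design(t_future, periods) @ beta + np.array(ar_part)
```

A weighted variant would replace the two `lstsq` calls with weighted least squares; the choice of weights is exactly what the abstract investigates and is not reproduced here.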
Accurate forecasting of time series is crucial across various domains. Many prediction tasks rely on effectively segmenting, matching, and aligning time series data. For instance, even for time series with the same granularity, segmenting them into events of different granularity can effectively mitigate the impact of varying time scales on prediction accuracy. However, these events of varying granularity frequently intersect with each other and may have unequal durations. Even minor differences can result in significant errors when matching time series with future trends. Moreover, directly using matched but unaligned events as state vectors in machine learning-based prediction models can lead to insufficient prediction accuracy. Therefore, this paper proposes a short-term forecasting method for time series based on multi-granularity events, MGE-SP (multi-granularity event-based short-term prediction). First, a methodological framework for MGE-SP is established to guide the implementation steps. The framework consists of three key steps: multi-granularity event matching based on the LTF (latest time first) strategy, multi-granularity event alignment using a piecewise aggregate approximation based on the compression ratio, and a short-term prediction model based on XGBoost. Data from a nationwide online car-hailing service in China are used to verify the method's reliability. The average RMSE (root mean square error) and MAE (mean absolute error) of the proposed method are 3.204 and 2.360, lower than the respective values of 4.056 and 3.101 obtained using the ARIMA (autoregressive integrated moving average) method, as well as the values of 4.278 and 2.994 obtained using the k-means-SVR (support vector regression) method. The other experiment is conducted on stock data from a public data set. The proposed method achieved an average RMSE and MAE of 0.836 and 0.696, lower than the respective values of 1.019 and 0.844 obtained using the ARIMA method, as well as the values of 1.350 and 1.172 obtained using the k-means-SVR method.
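The alignment step named above, piecewise aggregate approximation (PAA), can be sketched as follows. This is a generic PAA that maps events of unequal duration to a common length by averaging consecutive chunks; it is not the paper's compression-ratio variant, and the function name `paa` and the use of `numpy.array_split` to handle lengths that do not divide evenly are assumptions made for illustration.

```python
import numpy as np

def paa(series, n_segments):
    """Reduce a series to n_segments values by averaging consecutive chunks."""
    series = np.asarray(series, dtype=float)
    if not 1 <= n_segments <= len(series):
        raise ValueError("n_segments must be between 1 and len(series)")
    # array_split tolerates lengths that are not a multiple of n_segments.
    chunks = np.array_split(series, n_segments)
    return np.array([chunk.mean() for chunk in chunks])

# Two events of unequal duration become comparable vectors of length 3.
short_event = paa([1, 2, 3, 4, 5, 6], 3)
long_event = paa(range(12), 3)
```

After alignment, both events have the same dimensionality and could be used as fixed-length state vectors for a downstream model.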
Accurate origin–destination (OD) demand prediction is crucial for the efficient operation and management of urban rail transit (URT) systems, particularly during a pandemic. However, this task faces several limitations, including real-time availability, sparsity, and high-dimensionality issues, and the impact of the pandemic. Consequently, this study proposes a unified framework called the physics-guided adaptive graph spatial–temporal attention network (PAG-STAN) for metro OD demand prediction under pandemic conditions. Specifically, PAG-STAN introduces a real-time OD estimation module to estimate real-time complete OD demand matrices. Subsequently, a novel dynamic OD demand matrix compression module is proposed to generate dense real-time OD demand matrices. Thereafter, PAG-STAN leverages various heterogeneous data to learn the evolutionary trend of future OD ridership during the pandemic. Finally, a masked physics-guided loss function (MPG-loss function) incorporates the physical quantity information between the OD demand and inbound flow into the loss function to enhance model interpretability. PAG-STAN demonstrated favorable performance on two real-world metro OD demand datasets under the pandemic and conventional scenarios, highlighting its robustness and sensitivity for metro OD demand prediction. A series of ablation studies were conducted to verify the indispensability of each module in PAG-STAN.
With the advancement of artificial intelligence, traffic forecasting is attracting growing interest as a means of optimizing route planning and enhancing service quality. Traffic volume is an influential parameter for planning and operating traffic structures. This study proposes an improved ensemble-based deep learning method to solve traffic volume prediction problems. A set of optimal hyperparameters is also applied to the suggested approach to improve the performance of the learning process. The fusion of these methodologies aims to harness ensemble empirical mode decomposition's capacity to discern complex traffic patterns and long short-term memory's proficiency in learning temporal relationships. First, a dataset for automatic vehicle identification is obtained and used in the preprocessing stage of the ensemble empirical mode decomposition model. Second, traffic volume is predicted using the long short-term memory algorithm. Next, the study employs a trial-and-error approach to select a set of optimal hyperparameters, including the lookback window, the number of neurons in the hidden layers, and the gradient descent optimization. Finally, the fusion of the obtained results leads to a final traffic volume prediction. The experimental results show that the proposed method outperforms other benchmarks on various evaluation measures, including mean absolute error, root mean squared error, mean absolute percentage error, and R-squared. The achieved R-squared value reaches 98%, and the other evaluation indices surpass those of the competing methods. These findings highlight the accuracy of traffic pattern prediction and offer promising prospects for enhancing transportation management systems and urban infrastructure planning.
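For reference, the four evaluation measures named in the abstract above follow their standard definitions and can be computed as in this short sketch. The function name `regression_metrics` and the sample values are illustrative assumptions, not data from the study; MAPE is only defined when no true value is zero.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, MAPE (in percent), and R-squared for two arrays."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    ss_res = np.sum(err ** 2)                        # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return {
        "mae": np.mean(np.abs(err)),
        "rmse": np.sqrt(np.mean(err ** 2)),
        "mape": 100.0 * np.mean(np.abs(err / y_true)),  # requires y_true != 0
        "r2": 1.0 - ss_res / ss_tot,
    }

# Hypothetical hourly traffic volumes and predictions.
metrics = regression_metrics([100, 200, 300, 400], [110, 190, 310, 390])
```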
The present study examines the impact of short-term public opinion sentiment on the secondary market, with a focus on the potential for such sentiment to cause dramatic stock price fluctuations and increase investment risk. The quantification of investment sentiment indicators and the analysis of their persistent impact have been a complex and significant area of research. In this paper, a structured multi-head attention stock index prediction method based on an adaptive public opinion sentiment vector is proposed. The proposed method transforms numerous investor comments on social platforms over time into public opinion sentiment vectors expressing complex sentiments. It then analyzes the continuous impact of these vectors on the market by aggregating them with public opinion data through a structured multi-head attention mechanism. The experimental results demonstrate that the public opinion sentiment vector can provide more comprehensive feedback on market sentiment than traditional sentiment polarity analysis. Furthermore, the multi-head attention mechanism is shown to improve prediction accuracy through attention convergence on each type of input information separately. The mean absolute percentage error (MAPE) of the proposed method is 0.463%, a reduction of 0.294% compared to the benchmark attention algorithm. Additionally, the market backtesting results indicate that the return was 24.560%, an improvement of 8.202% compared to the benchmark algorithm. These results suggest that the market trading strategy based on this method has the potential to improve trading profits.
To tackle the problem of inaccurate short-term bus load prediction, especially during holidays, a Transformer-based scheme with tailored architectural enhancements is proposed. First, the input data are clustered to reduce complexity and capture inherent characteristics more effectively. Gated residual connections are then employed to selectively propagate salient features across layers, while an attention mechanism focuses on identifying prominent patterns in multivariate time-series data. Finally, a pre-trained structure is incorporated to reduce computational complexity. Experimental results based on extensive data show that the proposed scheme consistently improves prediction accuracy over comparative algorithms by at least 32.00% across all buses evaluated, and the fitting of holiday load curves is outstanding. Meanwhile, the pre-trained structure reduces the training time of the proposed algorithm by more than 65.75%. The proposed scheme can efficiently predict bus load while enhancing robustness for holiday predictions, making it better adapted to real-world prediction scenarios.
With the continuous advancement of China's “peak carbon dioxide emissions and Carbon Neutrality” process, the proportion of wind power is increasing. To address the problem that forecasting models become outdated as wind power data are continuously updated, a short-term wind power forecasting algorithm based on Incremental Learning-Bagging Deep Hybrid Kernel Extreme Learning Machine (IL-Bagging-DHKELM) error affinity propagation cluster analysis is proposed. The algorithm effectively combines the deep hybrid kernel extreme learning machine (DHKELM) with incremental learning (IL). First, an initial wind power prediction model is trained using the Bagging-DHKELM model. Second, the affinity propagation (AP) clustering algorithm based on Euclidean morphological distance is used to cluster and analyze the wind power prediction errors obtained from the initial training model. Finally, the correlation between wind power prediction errors and Numerical Weather Prediction (NWP) data is introduced as incremental updates to the initial wind power prediction model. During the incremental learning process, multiple error performance indicators are used to measure the overall model performance, thereby enabling incremental updates of wind power models. Practical examples show that the proposed method reduces the root mean square error of the initial model by 1.9 percentage points, indicating that it is better adapted to the current scenario of continuously increasing wind power penetration. The method effectively improves the accuracy and precision of wind power generation prediction.
BACKGROUND Endometrial cancer (EC) is a common gynecological malignancy that typically requires prompt surgical intervention; however, the advantage of surgical management is limited by high postoperative recurrence rates and adverse outcomes. Previous studies have highlighted the prognostic potential of circulating tumor DNA (ctDNA) monitoring for minimal residual disease in patients with EC. AIM To develop and validate an optimized ctDNA-based model for predicting short-term postoperative EC recurrence. METHODS We retrospectively analyzed 294 EC patients treated surgically from 2015 to 2019 to devise a short-term recurrence prediction model, which was validated on 143 EC patients operated on between 2020 and 2021. Prognostic factors were identified using univariate Cox, Lasso, and multivariate Cox regressions. A nomogram was created to predict the 1-, 1.5-, and 2-year recurrence-free survival (RFS). Model performance was assessed via receiver operating characteristic (ROC), calibration, and decision curve analyses (DCA), leading to a recurrence risk stratification system. RESULTS Based on the regression analysis and the nomogram created, patients with postoperative ctDNA-negativity, postoperative carcinoembryonic antigen 125 (CA125) levels of <19 U/mL, and grade G1 tumors had improved RFS after surgery. The nomogram's efficacy for recurrence prediction was confirmed through ROC analysis, calibration curves, and DCA methods, highlighting its high accuracy and clinical utility. Furthermore, using the nomogram, the patients were successfully classified into three risk subgroups. CONCLUSION The nomogram accurately predicted RFS after EC surgery at 1, 1.5, and 2 years. This model will help clinicians personalize treatments, stratify risks, and enhance clinical outcomes for patients with EC.
Purpose - To optimize train operations, dispatchers currently rely on experience for quick adjustments when delays occur. However, delay predictions often involve imprecise shifts based on known delay times. Real-time and accurate train delay predictions, facilitated by data-driven neural network models, can significantly reduce dispatcher stress and improve adjustment plans. Leveraging current train operation data, these models enable swift and precise predictions, addressing challenges posed by train delays in high-speed rail networks during unforeseen events. Design/methodology/approach - This paper proposes CBLA-net, a neural network architecture for predicting late arrival times. It combines CNN, Bi-LSTM, and attention mechanisms to extract features, handle time series data, and enhance information utilization. Trained on operational data from the Beijing-Tianjin line, it predicts the late arrival time of a target train at the next station using multidimensional input data from the target and preceding trains. Findings - This study evaluates the model's predictive performance using two data approaches: one considering full data and another focusing only on late arrivals. Results show precise and rapid predictions. Training with full data achieves an MAE of approximately 0.54 minutes and an RMSE of 0.65 minutes, surpassing the model trained solely on delay data (MAE of about 1.02 min, RMSE of about 1.52 min). Despite the superior overall performance with full data, the model excels at predicting delays exceeding 15 minutes when trained exclusively on late arrivals. For enhanced adaptability to real-world train operations, training with full data is recommended. Originality/value - This paper introduces a novel neural network model, CBLA-net, for predicting train delay times. It compares and analyzes the model's performance using both full data and delay data formats. Additionally, the evaluation of the network's predictive capabilities considers different scenarios, providing a comprehensive demonstration of the model's predictive performance.
The research focuses on improving predictive accuracy in the financial sector through the exploration of machine learning algorithms for stock price prediction. The research follows an organized process combining Agile Scrum and the Obtain, Scrub, Explore, Model, and iNterpret (OSEMN) methodology. Six machine learning models, namely Linear Forecast, Naive Forecast, Simple Moving Average with weekly window (SMA 5), Simple Moving Average with monthly window (SMA 20), Autoregressive Integrated Moving Average (ARIMA), and Long Short-Term Memory (LSTM), are compared and evaluated through Mean Absolute Error (MAE), with the LSTM model performing the best, showcasing its potential for practical financial applications. A Django web application “Predict It” is developed to implement the LSTM model. Ethical concerns related to predictive modeling in finance are addressed. Data quality, algorithm choice, feature engineering, and preprocessing techniques are emphasized for better model performance. The research acknowledges limitations and suggests future research directions, aiming to equip investors and financial professionals with reliable predictive models for dynamic markets.
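Two of the baselines listed above, the naive forecast and the simple moving average, are easy to state precisely, and the following sketch shows one common one-step-ahead formulation scored by MAE. The function names and the toy price series are assumptions for illustration; the study's exact definitions and data are not given in the abstract.

```python
import numpy as np

def naive_forecast(prices):
    """Predict each price as the previous observed price (targets: prices[1:])."""
    return np.asarray(prices, dtype=float)[:-1]

def sma_forecast(prices, window):
    """Predict each price as the mean of the preceding `window` prices
    (targets: prices[window:])."""
    prices = np.asarray(prices, dtype=float)
    return np.array([prices[t - window:t].mean() for t in range(window, len(prices))])

def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return float(np.mean(np.abs(np.asarray(y_true, dtype=float) - y_pred)))

# Hypothetical closing prices with a steady upward drift.
prices = [10, 11, 12, 13, 14, 15]
naive_mae = mae(prices[1:], naive_forecast(prices))
sma2_mae = mae(prices[2:], sma_forecast(prices, window=2))
```

On a steadily trending series like this one, a wider averaging window lags the trend more, so the moving average scores worse than the naive forecast; that is a property of the toy data, not a general result.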
The stock market, as one of the hotspots in the financial field, forms a data system with a huge volume of data and complex relationships between various factors, making stock price prediction an area of keen interest for further in-depth mining and research. Mathematical statistics methods struggle to deal with nonlinear relationships in practical applications, making it difficult to explore deep information about stocks. Meanwhile, machine learning methods, particularly neural network models and composite models, which have achieved outstanding results in other fields, are being applied to the stock market with significant results. However, researchers have found that these methods do not grasp the essential information of the data as well as expected. In response to these issues, researchers are exploring better neural network models and combining them with other methods to analyze stock data. Thus, this paper proposes the ABiGRU composite model, which combines the attention mechanism and bidirectional gated recurrent unit (GRU) that can effectively extract data features for stock price prediction research. Models such as LSTM, GRU, and Bi-LSTM are selected for comparative experiments. To ensure the credibility and representativeness of the research data, daily stock price indices of BYD are chosen for closing price prediction studies across different models. The results show that the ABiGRU model has a lower prediction error and better fitting effect on three index-based stock prices, enhancing the learning efficiency of the neural network model and demonstrating good prediction stability. This suggests that the ABiGRU model is highly adaptable for stock price prediction.
The stock market is a vital component of the broader financial system, with its dynamics closely linked to economic growth. The challenges associated with analyzing and forecasting stock prices have persisted since the inception of financial markets. By examining historical transaction data, latent opportunities for profit can be uncovered, providing valuable insights for both institutional and individual investors to make more informed decisions. This study focuses on analyzing historical transaction data from four banks to predict closing price trends. Various models, including decision trees, random forests, and Long Short-Term Memory (LSTM) networks, are employed to forecast stock price movements. Historical stock transaction data serve as the input for training these models, which are then used to predict upward or downward stock price trends. The study's empirical results indicate that these methods are effective to a degree in predicting stock price movements. The LSTM-based deep neural network model, in particular, demonstrates a commendable level of predictive accuracy. This conclusion is reached following a thorough evaluation of model performance, highlighting the potential of LSTM models in stock market forecasting. The findings offer significant implications for advancing financial forecasting approaches, thereby improving the decision-making capabilities of investors and financial institutions.
Numerical simulation and slope stability prediction are the focus of slope disaster research. Recently, machine learning models have been commonly used in slope stability prediction. However, these machine learning models have some problems, such as poor nonlinear performance, local optima, and incomplete feature extraction of factors. These issues can affect the accuracy of slope stability prediction. Therefore, a deep learning algorithm called long short-term memory (LSTM) is proposed to predict slope stability. Taking Ganzhou City in China as the study area, the landslide inventory and its characteristics of geotechnical parameters, slope height, and slope angle are analyzed. Based on these characteristics, typical soil slopes are constructed using the Geo-Studio software. Five control factors affecting slope stability, including slope height, slope angle, internal friction angle, cohesion, and volumetric weight, are selected to form different slopes and construct model input variables. Then, the limit equilibrium method is used to calculate the stability coefficients of these typical soil slopes under different control factors. Each slope stability coefficient and its corresponding control factors form a slope sample. As a result, a total of 2160 training samples and 450 testing samples are constructed. These sample sets are imported into LSTM for modelling and compared with the support vector machine (SVM), random forest (RF), and convolutional neural network (CNN). The results show that the LSTM overcomes the problem that the commonly used machine learning models have difficulty extracting global features. Furthermore, LSTM has a better prediction performance for slope stability compared to the SVM, RF, and CNN models.
The growing global requirement for food and the need for sustainable farming in an era of a changing climate and scarce resources have inspired substantial crop yield prediction research. Deep learning (DL) and machine learning (ML) models effectively deal with such challenges. This research paper comprehensively analyses recent advancements in crop yield prediction from January 2016 to March 2024. In addition, it analyses the effectiveness of various input parameters considered in crop yield prediction models. We conducted an in-depth search and gathered studies that employed crop modeling and AI-based methods to predict crop yield. The total number of articles reviewed for crop yield prediction using ML, meta-modeling (crop models coupled with ML/DL), and DL-based prediction models and input parameter selection is 125. We set up five objectives for this research and discuss them after analyzing the selected research papers. Each study is assessed based on the crop type, the input parameters employed for prediction, the modeling techniques adopted, and the evaluation metrics used for estimating model performance. We also discuss the ethical and social impacts of AI on agriculture. Although various approaches presented in the scientific literature have delivered impressive predictions, they are complicated by intricate, multifactorial influences on crop growth and the need for accurate data-driven models. Therefore, thorough research is required to deal with challenges in predicting agricultural output.
Financial time series prediction, whether for classification or regression, has been a heated research topic over the last decade. While traditional machine learning algorithms have delivered mediocre results, deep learning has contributed substantially to improved prediction performance. An up-to-date review of advanced machine learning techniques for financial time series prediction is still lacking, making it challenging for finance domain experts and relevant practitioners to determine which model potentially performs better, what techniques and components are involved, and how the model can be designed and implemented. This review article provides an overview of techniques, components, and frameworks for financial time series prediction, with an emphasis on state-of-the-art deep learning models in the literature from 2015 to 2023, including standalone models like convolutional neural networks (CNN), which are capable of extracting spatial dependencies within data, and long short-term memory (LSTM), which is designed for handling temporal dependencies; and hybrid models integrating CNN, LSTM, attention mechanism (AM), and other techniques. For illustration and comparison purposes, models proposed in recent studies are mapped to relevant elements of a generalized framework comprising input, output, feature extraction, prediction, and related processes. Among the state-of-the-art models, hybrid models like CNN-LSTM and CNN-LSTM-AM have in general been reported to be superior in performance to standalone models like the CNN-only model. Some remaining challenges are discussed, including non-friendliness for finance domain experts, delayed prediction, neglect of domain knowledge, lack of standards, and the inability to make real-time and high-frequency predictions. The principal contributions of this paper are to provide a one-stop guide for both academia and industry to review, compare, and summarize technologies and recent advances in this area, to facilitate smooth and informed implementation, and to highlight future research directions.
The complexity of river-tide interaction poses a significant challenge in predicting discharge in tidal rivers. Long short-term memory (LSTM) networks excel in processing and predicting crucial events with extended intervals and time delays in time series data. Additionally, the sequence-to-sequence (Seq2Seq) model, known for handling temporal relationships, adapting to variable-length sequences, effectively capturing historical information, and accommodating various influencing factors, emerges as a robust and flexible tool in discharge forecasting. In this study, we introduce the application of LSTM-based Seq2Seq models for the first time in forecasting the discharge of a tidal reach of the Changjiang River (Yangtze River) Estuary. This study focuses on discharge forecasting using three key input characteristics: flow velocity, water level, and discharge, which means that a multiple-input, single-output structure is adopted. The experiment used the discharge data of the whole year of 2020, of which the first 80% is used as the training set and the last 20% as the test set. The data therefore cover different tidal cycles, which helps to test the forecasting performance of different models under different tidal cycles and different runoff. The experimental results indicate that the proposed models demonstrate advantages in long-term, mid-term, and short-term discharge forecasting. Compared with the harmonic analysis models and the improved back propagation neural network models, the Seq2Seq models improved the relative standard deviation of discharge prediction by 6%-60% and 5%-20%, respectively. In addition, the relative accuracy of the Seq2Seq model is 1% to 3% higher than that of the LSTM model. Analytical assessment of the prediction errors shows that the Seq2Seq models are insensitive to the forecast lead time and can capture characteristic values such as the maximum flood tide flow and maximum ebb tide flow in the tidal cycle well. This indicates the significance of the Seq2Seq models.
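The data setup described above (three input series, one output series, first 80% of the year for training and the last 20% for testing) can be sketched as below. This shows only the windowing and chronological split, not the Seq2Seq model itself. The function names, the lookback and horizon lengths, and the placeholder data are assumptions for illustration.

```python
import numpy as np

def make_windows(features, target, lookback, horizon):
    """Build (input window, output window) pairs from aligned series.

    features: array of shape (n, n_inputs), e.g. velocity, level, discharge
    target:   array of shape (n,), e.g. discharge
    """
    n = len(target) - lookback - horizon + 1
    X = np.stack([features[i:i + lookback] for i in range(n)])
    y = np.stack([target[i + lookback:i + lookback + horizon] for i in range(n)])
    return X, y

def chronological_split(data, train_fraction=0.8):
    """Split without shuffling so the test set lies strictly after the training set."""
    cut = int(len(data) * train_fraction)
    return data[:cut], data[cut:]

# Placeholder data: 10 time steps, 3 input variables; discharge is column 2.
features = np.arange(30.0).reshape(10, 3)
target = features[:, 2]
train_f, test_f = chronological_split(features)
X, y = make_windows(features, target, lookback=3, horizon=2)
```

Splitting by time rather than at random avoids leaking future observations into training, which matters for any forecast evaluation of this kind.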
Stock market forecasting has drawn interest from both economists and computer scientists as a classic yet difficult topic. With the objective of constructing an effective prediction model, both linear and machine learning tools have been investigated over the past couple of decades. In recent years, recurrent neural networks (RNNs) have been observed to perform well on tasks involving sequence-based data in many research domains. With this motivation, we investigated the performance of long short-term memory (LSTM) and gated recurrent units (GRU) and their combinations with the attention mechanism: LSTM+Attention, GRU+Attention, and LSTM+GRU+Attention. The methods were evaluated with stock data from three different stock indices: the KSE 100 index, the DSE 30 index, and the BSE Sensex. The results were compared to other machine learning models such as support vector regression, random forest, and k-nearest neighbor. The best results for the three datasets were obtained by the RNN-based models combined with the attention mechanism. The RNN and attention-based models perform better and would be more effective for applications in the business industry.
Stock trend prediction is a challenging problem because it involves many variables.Aiming at the problem that some existing machine learning techniques, such as random forest(RF), probabilistic random forest(PRF), k-n...Stock trend prediction is a challenging problem because it involves many variables.Aiming at the problem that some existing machine learning techniques, such as random forest(RF), probabilistic random forest(PRF), k-nearest neighbor(KNN), and fuzzy KNN(FKNN), have difficulty in accurately predicting the stock trend(uptrend or downtrend) for a given date, a generalized Heronian mean(GHM) based FKNN predictor named GHM-FKNN was proposed.GHM-FKNN combines GHM aggregation function with the ideas of the classical FKNN approach.After evaluation, the comparison results elucidated that GHM-FKNN outperformed the other best existing methods RF, PRF, KNN and FKNN on independent test datasets corresponding to three stocks, namely AAPL, AMZN and NFLX.Compared with RF, PRF, KNN and FKNN, GHM-FKNN achieved the best performance with accuracy of 62.37% for AAPL, 58.25% for AMZN, and 64.10% for NFLX.展开更多
The aim of the present work is to examine whether the price volatility of nonferrous metal futures can be used to predict the aggregate stock market returns in China. During a sample period from January of 2004 to Dec...The aim of the present work is to examine whether the price volatility of nonferrous metal futures can be used to predict the aggregate stock market returns in China. During a sample period from January of 2004 to December of 2011, empirical results show that the price volatility of basic nonferrous metals is a good predictor of value-weighted stock portfolio at various horizons in both in-sample and out-of-sample regressions. The predictive power of metal copper volatility is greater than that of aluminum. The results are robust to alternative measurements of variables and econometric approaches. After controlling several well-known macro pricing variables, the predictive power of copper volatility declines but remains statistically significant. Since the predictability exists only during our sample period, we conjecture that the stock market predictability by metal price volatility is partly driven by commodity financialization.展开更多
Abstract: Data Mining (DM) methods are being increasingly used in prediction with time series data, in addition to traditional statistical approaches. This paper presents a literature review of the use of DM with time series data, focusing on short-term stock prediction. This is an area that has been attracting a great deal of attention from researchers in the field. The main contribution of this paper is to provide an outline of the use of DM with time series data, using mainly examples related to short-term stock prediction. This is important for a better understanding of the field. Some of the main trends and open issues are also introduced.
Funding: Supported by the National Natural Science Foundation of China, China (No. 42004016); the HuBei Natural Science Fund, China (No. 2020CFB329); the HuNan Natural Science Fund, China (Nos. 2023JJ60559, 2023JJ60560); and the State Key Laboratory of Geodesy and Earth’s Dynamics self-deployment project, China (No. S21L6101).
Abstract: Short-term (up to 30 days) predictions of Earth Rotation Parameters (ERPs) such as Polar Motion (PM: PMX and PMY) play an essential role in real-time applications related to high-precision reference frame conversion. Currently, the least squares (LS) + auto-regressive (AR) hybrid method is one of the main techniques of PM prediction. In addition, the weighted LS+AR hybrid method performs well for PM short-term prediction. However, the corresponding covariance information of the LS fitting residuals deserves further exploration in the AR model. In this study, we have derived a modified stochastic model for the LS+AR hybrid method, namely the weighted LS + weighted AR hybrid method. Using the PM data products of IERS EOP 14 C04, the numerical results indicate that for PM short-term forecasting, the proposed weighted LS + weighted AR hybrid method shows an advantage over both the LS+AR hybrid method and the weighted LS+AR hybrid method. Compared to the mean absolute errors (MAEs) of PMX/PMY short-term prediction of the LS+AR hybrid method and the weighted LS+AR hybrid method, the weighted LS + weighted AR hybrid method shows average improvements of 6.61%/12.08% and 0.24%/11.65%, respectively. Moreover, judging by the slopes of the linear regression lines fitted to the errors of each method, the prediction error of the proposed method grows more slowly than that of the other two methods.
Funding: Funded by the Fujian Province Science and Technology Plan, China (Grant Number 2019H0017).
Abstract: Accurate forecasting of time series is crucial across various domains. Many prediction tasks rely on effectively segmenting, matching, and aligning time series data. For instance, even for time series with the same granularity, segmenting them into events of different granularity can effectively mitigate the impact of varying time scales on prediction accuracy. However, these events of varying granularity frequently intersect with each other and may have unequal durations. Even minor differences can result in significant errors when matching time series with future trends. In addition, directly using matched but unaligned events as state vectors in machine learning-based prediction models can lead to insufficient prediction accuracy. Therefore, this paper proposes a short-term forecasting method for time series based on multi-granularity events, MGE-SP (multi-granularity event-based short-term prediction). First, a methodological framework for MGE-SP is established to guide the implementation steps. The framework consists of three key steps: multi-granularity event matching based on the LTF (latest time first) strategy, multi-granularity event alignment using a piecewise aggregate approximation based on the compression ratio, and a short-term prediction model based on XGBoost. Data from a nationwide online car-hailing service in China are used to verify the method’s reliability. The average RMSE (root mean square error) and MAE (mean absolute error) of the proposed method are 3.204 and 2.360, lower than the respective values of 4.056 and 3.101 obtained using the ARIMA (autoregressive integrated moving average) method, as well as the values of 4.278 and 2.994 obtained using the k-means-SVR (support vector regression) method. A second experiment is conducted on stock data from a public data set. The proposed method achieved an average RMSE and MAE of 0.836 and 0.696, lower than the respective values of 1.019 and 0.844 obtained using the ARIMA method, as well as the values of 1.350 and 1.172 obtained using the k-means-SVR method.
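The alignment step named in the abstract above, piecewise aggregate approximation (PAA), can be illustrated with a minimal sketch. This is not the authors' MGE-SP implementation: the function name `paa`, the choice of a fixed target length, and the sample values are assumptions made here for illustration, and the abstract does not say how the compression ratio is chosen.

```python
def paa(series, n_segments):
    """Piecewise aggregate approximation (illustrative sketch).

    Splits `series` into `n_segments` nearly equal-width chunks and
    replaces each chunk by its mean, so events of unequal duration
    can be compressed to a common length.
    """
    n = len(series)
    if not 1 <= n_segments <= n:
        raise ValueError("n_segments must be between 1 and len(series)")
    reduced = []
    for i in range(n_segments):
        # Integer boundaries keep every chunk non-empty when n_segments <= n.
        start = i * n // n_segments
        end = (i + 1) * n // n_segments
        chunk = series[start:end]
        reduced.append(sum(chunk) / len(chunk))
    return reduced


# Two hypothetical events of unequal duration, aligned to 3 points each.
event_a = [1, 2, 3, 4, 5, 6]        # compression ratio 6:3
event_b = [2, 4, 6, 8, 10, 12, 14, 16, 18]  # compression ratio 9:3
aligned_a = paa(event_a, 3)
aligned_b = paa(event_b, 3)
```

After this step both events have the same length and could, in principle, be used as fixed-size state vectors for a downstream regressor.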
Funding: Supported by the National Natural Science Foundation of China (72288101, 72201029, and 72322022).
Abstract: Accurate origin–destination (OD) demand prediction is crucial for the efficient operation and management of urban rail transit (URT) systems, particularly during a pandemic. However, this task faces several limitations, including real-time availability, sparsity, and high-dimensionality issues, and the impact of the pandemic. Consequently, this study proposes a unified framework called the physics-guided adaptive graph spatial–temporal attention network (PAG-STAN) for metro OD demand prediction under pandemic conditions. Specifically, PAG-STAN introduces a real-time OD estimation module to estimate real-time complete OD demand matrices. Subsequently, a novel dynamic OD demand matrix compression module is proposed to generate dense real-time OD demand matrices. Thereafter, PAG-STAN leverages various heterogeneous data to learn the evolutionary trend of future OD ridership during the pandemic. Finally, a masked physics-guided loss function (MPG-loss function) incorporates the physical quantity information between the OD demand and inbound flow into the loss function to enhance model interpretability. PAG-STAN demonstrated favorable performance on two real-world metro OD demand datasets under the pandemic and conventional scenarios, highlighting its robustness and sensitivity for metro OD demand prediction. A series of ablation studies were conducted to verify the indispensability of each module in PAG-STAN.
Abstract: With the advancement of artificial intelligence, traffic forecasting is gaining more and more interest for optimizing route planning and enhancing service quality. Traffic volume is an influential parameter for planning and operating traffic structures. This study proposes an improved ensemble-based deep learning method to solve traffic volume prediction problems. A set of optimal hyperparameters is also applied to the suggested approach to improve the performance of the learning process. The fusion of these methodologies aims to harness ensemble empirical mode decomposition’s capacity to discern complex traffic patterns and long short-term memory’s proficiency in learning temporal relationships. First, a dataset for automatic vehicle identification is obtained and used in the preprocessing stage of the ensemble empirical mode decomposition model. Second, traffic volume is predicted using the long short-term memory algorithm. Next, the study employs a trial-and-error approach to select a set of optimal hyperparameters, including the lookback window, the number of neurons in the hidden layers, and the gradient descent optimization. Finally, the fusion of the obtained results leads to a final traffic volume prediction. The experimental results show that the proposed method outperforms other benchmarks on various evaluation measures, including mean absolute error, root mean squared error, mean absolute percentage error, and R-squared. The achieved R-squared value reaches 98%, while the other evaluation indices surpass those of the competing methods. These findings highlight the accuracy of traffic pattern prediction and offer promising prospects for enhancing transportation management systems and urban infrastructure planning.
Funding: Funded by the Major Humanities and Social Sciences Research Projects in Zhejiang higher education institutions, grant number 2023QN082, awarded to Cheng Zhao. The National Natural Science Foundation of China also provided funding, grant number 61902349, awarded to Cheng Zhao.
Abstract: The present study examines the impact of short-term public opinion sentiment on the secondary market, with a focus on the potential for such sentiment to cause dramatic stock price fluctuations and increase investment risk. The quantification of investment sentiment indicators and the persistent analysis of their impact has been a complex and significant area of research. In this paper, a structured multi-head attention stock index prediction method based on an adaptive public opinion sentiment vector is proposed. The proposed method transforms numerous investor comments on social platforms over time into public opinion sentiment vectors expressing complex sentiments. It then analyzes the continuous impact of these vectors on the market through the use of aggregating techniques and public opinion data via a structured multi-head attention mechanism. The experimental results demonstrate that the public opinion sentiment vector can provide more comprehensive feedback on market sentiment than traditional sentiment polarity analysis. Furthermore, the multi-head attention mechanism is shown to improve prediction accuracy through attention convergence on each type of input information separately. The mean absolute percentage error (MAPE) of the proposed method is 0.463%, a reduction of 0.294% compared to the benchmark attention algorithm. Additionally, the market backtesting results indicate that the return was 24.560%, an improvement of 8.202% compared to the benchmark algorithm. These results suggest that the market trading strategy based on this method has the potential to improve trading profits.
Abstract: To tackle the problem of inaccurate short-term bus load prediction, especially during holidays, a Transformer-based scheme with tailored architectural enhancements is proposed. First, the input data are clustered to reduce complexity and capture inherent characteristics more effectively. Gated residual connections are then employed to selectively propagate salient features across layers, while an attention mechanism focuses on identifying prominent patterns in multivariate time-series data. Finally, a pre-trained structure is incorporated to reduce computational complexity. Experimental results based on extensive data show that the proposed scheme improves prediction accuracy over comparative algorithms by at least 32.00% consistently across all buses evaluated, and that it fits holiday load curves particularly well. Meanwhile, the pre-trained structure reduces the training time of the proposed algorithm by more than 65.75%. The proposed scheme can efficiently predict bus load while enhancing robustness for holiday predictions, making it better adapted to real-world prediction scenarios.
Funding: Funded by the Liaoning Provincial Department of Science and Technology (2023JH2/101600058).
Abstract: With the continuous advancement of China’s “peak carbon dioxide emissions and Carbon Neutrality” process, the proportion of wind power is increasing. To address the problem that forecasting models become outdated as wind power data are continuously updated, this study proposes a short-term wind power forecasting algorithm based on Incremental Learning-Bagging Deep Hybrid Kernel Extreme Learning Machine (IL-Bagging-DHKELM) error affinity propagation cluster analysis. The algorithm combines the deep hybrid kernel extreme learning machine (DHKELM) with incremental learning (IL). First, an initial wind power prediction model is trained using the Bagging-DHKELM model. Second, the Euclidean morphological distance affinity propagation (AP) clustering algorithm is used to cluster and analyze the wind power prediction errors obtained from the initial training model. Finally, the correlation between wind power prediction errors and Numerical Weather Prediction (NWP) data is introduced as incremental updates to the initial wind power prediction model. During the incremental learning process, multiple error performance indicators are used to measure the overall model performance, thereby enabling incremental updates of wind power models. Practical examples show that the proposed method reduces the root mean square error of the initial model by 1.9 percentage points, indicating that it is better adapted to the current scenario of continuously increasing wind power penetration. The method effectively improves the accuracy and precision of wind power generation prediction.
Abstract: BACKGROUND: Endometrial cancer (EC) is a common gynecological malignancy that typically requires prompt surgical intervention; however, the advantage of surgical management is limited by high postoperative recurrence rates and adverse outcomes. Previous studies have highlighted the prognostic potential of circulating tumor DNA (ctDNA) monitoring for minimal residual disease in patients with EC. AIM: To develop and validate an optimized ctDNA-based model for predicting short-term postoperative EC recurrence. METHODS: We retrospectively analyzed 294 EC patients treated surgically from 2015 to 2019 to devise a short-term recurrence prediction model, which was validated on 143 EC patients operated on between 2020 and 2021. Prognostic factors were identified using univariate Cox, Lasso, and multivariate Cox regressions. A nomogram was created to predict the 1-, 1.5-, and 2-year recurrence-free survival (RFS). Model performance was assessed via receiver operating characteristic (ROC), calibration, and decision curve analyses (DCA), leading to a recurrence risk stratification system. RESULTS: Based on the regression analysis and the nomogram created, patients with postoperative ctDNA-negativity, postoperative carcinoembryonic antigen 125 (CA125) levels of <19 U/mL, and grade G1 tumors had improved RFS after surgery. The nomogram’s efficacy for recurrence prediction was confirmed through ROC analysis, calibration curves, and DCA methods, highlighting its high accuracy and clinical utility. Furthermore, using the nomogram, the patients were successfully classified into three risk subgroups. CONCLUSION: The nomogram accurately predicted RFS after EC surgery at 1, 1.5, and 2 years. This model will help clinicians personalize treatments, stratify risks, and enhance clinical outcomes for patients with EC.
Funding: Supported in part by the National Natural Science Foundation of China under Grant 62203468, in part by the Technological Research and Development Program of China State Railway Group Co., Ltd. under Grant Q2023X011, in part by the Young Elite Scientist Sponsorship Program by China Association for Science and Technology (CAST) under Grant 2022QNRC001, in part by the Youth Talent Program Supported by China Railway Society, and in part by the Research Program of China Academy of Railway Sciences Corporation Limited under Grant 2023YJ112.
Abstract: Purpose: To optimize train operations, dispatchers currently rely on experience for quick adjustments when delays occur. However, delay predictions often involve imprecise shifts based on known delay times. Real-time and accurate train delay predictions, facilitated by data-driven neural network models, can significantly reduce dispatcher stress and improve adjustment plans. Leveraging current train operation data, these models enable swift and precise predictions, addressing challenges posed by train delays in high-speed rail networks during unforeseen events. Design/methodology/approach: This paper proposes CBLA-net, a neural network architecture for predicting late arrival times. It combines CNN, Bi-LSTM, and attention mechanisms to extract features, handle time series data, and enhance information utilization. Trained on operational data from the Beijing-Tianjin line, it predicts the late arrival time of a target train at the next station using multidimensional input data from the target and preceding trains. Findings: This study evaluates the model’s predictive performance using two data approaches: one considering full data and another focusing only on late arrivals. Results show precise and rapid predictions. Training with full data achieves an MAE of approximately 0.54 minutes and an RMSE of 0.65 minutes, surpassing the model trained solely on delay data (MAE of about 1.02 min, RMSE of about 1.52 min). Despite superior overall performance with full data, the model excels at predicting delays exceeding 15 minutes when trained exclusively on late arrivals. For enhanced adaptability to real-world train operations, training with full data is recommended. Originality/value: This paper introduces a novel neural network model, CBLA-net, for predicting train delay times. It compares and analyzes the model’s performance using both full data and delay data formats. Additionally, the evaluation of the network’s predictive capabilities considers different scenarios, providing a comprehensive demonstration of the model’s predictive performance.
Abstract: The research focuses on improving predictive accuracy in the financial sector through the exploration of machine learning algorithms for stock price prediction. The research follows an organized process combining Agile Scrum and the Obtain, Scrub, Explore, Model, and iNterpret (OSEMN) methodology. Six models, namely Linear Forecast, Naive Forecast, Simple Moving Average with weekly window (SMA 5), Simple Moving Average with monthly window (SMA 20), Autoregressive Integrated Moving Average (ARIMA), and Long Short-Term Memory (LSTM), are compared and evaluated through Mean Absolute Error (MAE), with the LSTM model performing the best, showcasing its potential for practical financial applications. A Django web application “Predict It” is developed to implement the LSTM model. Ethical concerns related to predictive modeling in finance are addressed. Data quality, algorithm choice, feature engineering, and preprocessing techniques are emphasized for better model performance. The research acknowledges limitations and suggests future research directions, aiming to equip investors and financial professionals with reliable predictive models for dynamic markets.
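Three of the baselines named in the abstract above (Naive Forecast, SMA 5, SMA 20) and the MAE metric are simple enough to sketch. The code below is an illustrative sketch only, not the study's code: the function names, the one-step-ahead alignment, and the toy price list are assumptions made here.

```python
def naive_forecast(prices):
    """One-step-ahead naive forecast: tomorrow's price equals today's.

    Returns (actual, predicted) aligned lists covering days 1..n-1.
    """
    return prices[1:], prices[:-1]


def sma_forecast(prices, window):
    """One-step-ahead simple moving average forecast.

    The prediction for day t is the mean of the previous `window`
    prices, so forecasts start at day `window`.
    """
    actual = prices[window:]
    predicted = [
        sum(prices[t - window:t]) / window
        for t in range(window, len(prices))
    ]
    return actual, predicted


def mae(actual, predicted):
    """Mean absolute error between two equally long sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and equally long")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical closing prices; real data would span at least 21 days for SMA 20.
prices = [10, 11, 12, 13, 14, 15, 16]
naive_mae = mae(*naive_forecast(prices))
sma5_mae = mae(*sma_forecast(prices, window=5))
```

On a steadily trending series like this toy one, the moving average lags the trend and scores a larger MAE than the naive forecast, which is one reason such baselines are worth reporting next to a learned model.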
Abstract: The stock market, as one of the hotspots in the financial field, forms a data system with a huge volume of data and complex relationships between various factors, making stock price prediction an area of keen interest for further in-depth mining and research. Mathematical statistics methods struggle to deal with nonlinear relationships in practical applications, making it difficult to explore deep information about stocks. Meanwhile, machine learning methods, particularly neural network models and composite models, which have achieved outstanding results in other fields, are being applied to the stock market with significant results. However, researchers have found that these methods do not grasp the essential information of the data as well as expected. In response to these issues, researchers are exploring better neural network models and combining them with other methods to analyze stock data. Thus, this paper proposes the ABiGRU composite model, which combines the attention mechanism and a bidirectional gated recurrent unit (GRU) and can effectively extract data features for stock price prediction research. Models such as LSTM, GRU, and Bi-LSTM are selected for comparative experiments. To ensure the credibility and representativeness of the research data, daily stock price indices of BYD are chosen for closing price prediction studies across different models. The results show that the ABiGRU model has a lower prediction error and better fitting effect on three index-based stock prices, enhancing the learning efficiency of the neural network model and demonstrating good prediction stability. This suggests that the ABiGRU model is highly adaptable for stock price prediction.
Abstract: The stock market is a vital component of the broader financial system, with its dynamics closely linked to economic growth. The challenges associated with analyzing and forecasting stock prices have persisted since the inception of financial markets. By examining historical transaction data, latent opportunities for profit can be uncovered, providing valuable insights for both institutional and individual investors to make more informed decisions. This study focuses on analyzing historical transaction data from four banks to predict closing price trends. Various models, including decision trees, random forests, and Long Short-Term Memory (LSTM) networks, are employed to forecast stock price movements. Historical stock transaction data serve as the input for training these models, which are then used to predict upward or downward stock price trends. The study’s empirical results indicate that these methods are effective to a degree in predicting stock price movements. The LSTM-based deep neural network model, in particular, demonstrates a commendable level of predictive accuracy. This conclusion is reached following a thorough evaluation of model performance, highlighting the potential of LSTM models in stock market forecasting. The findings offer significant implications for advancing financial forecasting approaches, thereby improving the decision-making capabilities of investors and financial institutions.
Funding: Funded by the National Natural Science Foundation of China (41807285).
Abstract: Numerical simulation and slope stability prediction are the focus of slope disaster research. Recently, machine learning models have been commonly used in slope stability prediction. However, these machine learning models have some problems, such as poor nonlinear performance, local optima, and incomplete feature extraction of factors. These issues can affect the accuracy of slope stability prediction. Therefore, a deep learning algorithm called long short-term memory (LSTM) is proposed to predict slope stability. Taking Ganzhou City in China as the study area, the landslide inventory and the characteristics of its geotechnical parameters, slope height, and slope angle are analyzed. Based on these characteristics, typical soil slopes are constructed using the Geo-Studio software. Five control factors affecting slope stability, including slope height, slope angle, internal friction angle, cohesion, and volumetric weight, are selected to form different slopes and construct model input variables. Then, the limit equilibrium method is used to calculate the stability coefficients of these typical soil slopes under different control factors. Each slope stability coefficient and its corresponding control factors constitute a slope sample. As a result, a total of 2160 training samples and 450 testing samples are constructed. These sample sets are imported into LSTM for modelling and compared with the support vector machine (SVM), random forest (RF), and convolutional neural network (CNN). The results show that the LSTM overcomes the problem that the commonly used machine learning models have difficulty extracting global features. Furthermore, LSTM has a better prediction performance for slope stability compared to the SVM, RF, and CNN models.
Abstract: The growing global requirement for food and the need for sustainable farming in an era of a changing climate and scarce resources have inspired substantial crop yield prediction research. Deep learning (DL) and machine learning (ML) models effectively deal with such challenges. This research paper comprehensively analyses recent advancements in crop yield prediction from January 2016 to March 2024. In addition, it analyses the effectiveness of various input parameters considered in crop yield prediction models. We conducted an in-depth search and gathered studies that employed crop modeling and AI-based methods to predict crop yield. The total number of articles reviewed for crop yield prediction using ML, meta-modeling (crop models coupled with ML/DL), and DL-based prediction models and input parameter selection is 125. We set five objectives for this research and discuss them after analyzing the selected research papers. Each study is assessed based on the crop type, the input parameters employed for prediction, the modeling techniques adopted, and the evaluation metrics used for estimating model performance. We also discuss the ethical and social impacts of AI on agriculture. Although various approaches presented in the scientific literature have delivered impressive predictions, they are complicated by the intricate, multifactorial influences on crop growth and the need for accurate data-driven models. Therefore, thorough research is required to deal with challenges in predicting agricultural output.
Funding: Funded by the Natural Science Foundation of Fujian Province, China (Grant No. 2022J05291) and Xiamen Scientific Research Funding for Overseas Chinese Scholars.
Abstract: Financial time series prediction, whether for classification or regression, has been a heated research topic over the last decade. While traditional machine learning algorithms have experienced mediocre results, deep learning has largely contributed to the elevation of the prediction performance. Currently, an up-to-date review of advanced machine learning techniques for financial time series prediction is still lacking, making it challenging for finance domain experts and relevant practitioners to determine which model potentially performs better, what techniques and components are involved, and how the model can be designed and implemented. This review article provides an overview of techniques, components, and frameworks for financial time series prediction, with an emphasis on state-of-the-art deep learning models in the literature from 2015 to 2023, including standalone models like convolutional neural networks (CNN), which are capable of extracting spatial dependencies within data, and long short-term memory (LSTM), which is designed for handling temporal dependencies; and hybrid models integrating CNN, LSTM, attention mechanism (AM), and other techniques. For illustration and comparison purposes, models proposed in recent studies are mapped to relevant elements of a generalized framework comprised of input, output, feature extraction, prediction, and related processes. Among the state-of-the-art models, hybrid models like CNN-LSTM and CNN-LSTM-AM have in general been reported to be superior in performance to standalone models like the CNN-only model. Some remaining challenges are discussed, including non-friendliness for finance domain experts, delayed prediction, domain knowledge negligence, lack of standards, and inability to make real-time and high-frequency predictions. The principal contributions of this paper are to provide a one-stop guide for both academia and industry to review, compare, and summarize technologies and recent advances in this area, to facilitate smooth and informed implementation, and to highlight future research directions.
Funding: The National Natural Science Foundation of China under contract Nos 42266006 and 41806114; the Jiangxi Provincial Natural Science Foundation under contract Nos 20232BAB204089 and 20202ACBL214019.
Abstract: The complexity of river-tide interaction poses a significant challenge in predicting discharge in tidal rivers. Long short-term memory (LSTM) networks excel in processing and predicting crucial events with extended intervals and time delays in time series data. Additionally, the sequence-to-sequence (Seq2Seq) model, known for handling temporal relationships, adapting to variable-length sequences, effectively capturing historical information, and accommodating various influencing factors, emerges as a robust and flexible tool in discharge forecasting. In this study, we introduce the application of LSTM-based Seq2Seq models for the first time in forecasting the discharge of a tidal reach of the Changjiang River (Yangtze River) Estuary. This study focuses on discharge forecasting using three key input characteristics: flow velocity, water level, and discharge, which means that a multiple-input, single-output structure is adopted. The experiment used the discharge data of the whole year of 2020, of which the first 80% is used as the training set and the last 20% is used as the test set. This means that the data cover different tidal cycles, which helps to test the forecasting effect of different models in different tidal cycles and under different runoff. The experimental results indicate that the proposed models demonstrate advantages in long-term, mid-term, and short-term discharge forecasting. The Seq2Seq models improved the relative standard deviation by 6%-60% and 5%-20% compared to the harmonic analysis models and the improved back propagation neural network models in discharge prediction, respectively. In addition, the relative accuracy of the Seq2Seq model is 1% to 3% higher than that of the LSTM model. Analytical assessment of the prediction errors shows that the Seq2Seq models are insensitive to the forecast lead time and that they capture characteristic values such as maximum flood tide flow and maximum ebb tide flow in the tidal cycle well. This indicates the significance of the Seq2Seq models.
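The data setup described in the abstract above (three inputs, one output, first 80% for training and last 20% for testing) can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's pipeline: the function names, the `lookback` and `horizon` parameters, and the synthetic records are all hypothetical, and the abstract does not state the window length or sampling interval actually used.

```python
def make_windows(records, lookback, horizon=1):
    """Build multiple-input, single-output samples from a time-ordered list.

    `records` holds (flow_velocity, water_level, discharge) tuples.
    Each sample pairs the previous `lookback` records with the
    discharge observed `horizon` steps after the window ends.
    """
    samples = []
    for t in range(lookback, len(records) - horizon + 1):
        window = records[t - lookback:t]
        target = records[t + horizon - 1][2]  # discharge only
        samples.append((window, target))
    return samples


def chronological_split(samples, train_fraction=0.8):
    """Split without shuffling so the test set is strictly later in time."""
    cut = int(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]


# Synthetic stand-in for a year of observations (values are made up).
records = [(i * 0.1, float(i), i * 10) for i in range(10)]
samples = make_windows(records, lookback=3)
train, test = chronological_split(samples)
```

Keeping the split chronological matters for forecasting: shuffling before splitting would let the model train on observations that come after the ones it is tested on.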
Funding: Supported by NRPU Project No. 20-16091, awarded by the Higher Education Commission, Pakistan (project title: “University Education and Occupational Skills Mismatch (A Case Study of SMEs in Khyber Pakhtunkhwa)”); by the National Natural Science Foundation of China (Grant No. 61370073); by the National High Technology Research and Development Program of China; and by the project of the Science and Technology Department of Sichuan Province (Grant No. 2021YFG0322).
Abstract: Stock market forecasting has drawn interest from both economists and computer scientists as a classic yet difficult topic. With the objective of constructing an effective prediction model, both linear and machine learning tools have been investigated for the past couple of decades. In recent years, recurrent neural networks (RNNs) have been observed to perform well on tasks involving sequence-based data in many research domains. With this motivation, we investigated the performance of long short-term memory (LSTM) and gated recurrent units (GRU) and their combination with the attention mechanism: LSTM+Attention, GRU+Attention, and LSTM+GRU+Attention. The methods were evaluated with stock data from three different stock indices: the KSE 100 index, the DSE 30 index, and the BSE Sensex. The results were compared to other machine learning models such as support vector regression, random forest, and k-nearest neighbor. The best results for the three datasets were obtained by the RNN-based models combined with the attention mechanism. The RNN and attention-based models perform better and would be more effective for applications in the business industry.
Funding: Supported by the National Key Research and Development Program (No. 2019YFA0707201) and the Key Work Program of the Institute of Scientific and Technical Information of China (No. ZD2022-01, ZD2023-07).
Abstract: Stock trend prediction is a challenging problem because it involves many variables. Because some existing machine learning techniques, such as random forest (RF), probabilistic random forest (PRF), k-nearest neighbor (KNN), and fuzzy KNN (FKNN), have difficulty accurately predicting the stock trend (uptrend or downtrend) for a given date, a generalized Heronian mean (GHM) based FKNN predictor named GHM-FKNN was proposed. GHM-FKNN combines the GHM aggregation function with the ideas of the classical FKNN approach. The comparison results showed that GHM-FKNN outperformed the best existing methods, RF, PRF, KNN, and FKNN, on independent test datasets corresponding to three stocks, namely AAPL, AMZN, and NFLX. Compared with RF, PRF, KNN, and FKNN, GHM-FKNN achieved the best performance, with an accuracy of 62.37% for AAPL, 58.25% for AMZN, and 64.10% for NFLX.
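For readers unfamiliar with the aggregation function named in this abstract, the following is a minimal sketch of the generalized Heronian mean in its commonly cited form, GHM^{p,q}(x_1, ..., x_n) = ((2 / (n(n+1))) * sum over i <= j of x_i^p * x_j^q)^(1/(p+q)). The abstract does not say how GHM-FKNN applies this function inside the fuzzy KNN procedure, so the sketch covers only the mean itself; the function name and the defaults p = q = 0.5 are illustrative assumptions.

```python
import numpy as np

def generalized_heronian_mean(values, p=0.5, q=0.5):
    """Generalized Heronian mean of non-negative values with parameters p, q >= 0.

    Commonly cited form (assumed here, not taken from the paper):
        ((2 / (n (n + 1))) * sum_{i <= j} x_i**p * x_j**q) ** (1 / (p + q))
    """
    x = np.asarray(values, dtype=float)
    n = x.size
    if n == 0:
        raise ValueError("values must not be empty")
    if np.any(x < 0):
        raise ValueError("values must be non-negative")
    if p < 0 or q < 0 or p + q == 0:
        raise ValueError("p and q must be non-negative and not both zero")
    # Entry (i, j) of the outer product is x_i**p * x_j**q; the upper
    # triangle (including the diagonal) selects exactly the pairs with i <= j.
    pairwise = np.outer(x ** p, x ** q)
    total = np.triu(pairwise).sum()
    return (2.0 * total / (n * (n + 1))) ** (1.0 / (p + q))

# With p = q = 0.5 and two inputs this reduces to the classical
# Heronian mean (a + sqrt(a * b) + b) / 3.
example = generalized_heronian_mean([4.0, 9.0])
```

Unlike a plain arithmetic mean, this aggregation includes cross terms between pairs of inputs, so it reflects interactions among the aggregated values.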
Funding: Project (71071166) supported by the National Natural Science Foundation of China.
Abstract: The aim of the present work is to examine whether the price volatility of nonferrous metal futures can be used to predict aggregate stock market returns in China. Over a sample period from January 2004 to December 2011, empirical results show that the price volatility of basic nonferrous metals is a good predictor of value-weighted stock portfolio returns at various horizons in both in-sample and out-of-sample regressions. The predictive power of copper volatility is greater than that of aluminum volatility. The results are robust to alternative measurements of variables and econometric approaches. After controlling for several well-known macro pricing variables, the predictive power of copper volatility declines but remains statistically significant. Since the predictability exists only during our sample period, we conjecture that the stock market predictability by metal price volatility is partly driven by commodity financialization.
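The distinction between in-sample and out-of-sample predictive regressions can be made concrete with a small sketch. The abstract does not state which out-of-sample statistic the authors use; the code below assumes one widely used choice, an out-of-sample R-squared that compares expanding-window OLS forecasts against a historical-mean benchmark. The function name, the synthetic data, and the `first_forecast` parameter are hypothetical.

```python
import numpy as np

def out_of_sample_r2(predictor, returns, first_forecast):
    """Out-of-sample R^2 of a one-step-ahead predictive regression.

    The model r[t] = a + b * x[t-1] + e[t] is re-estimated by OLS on data
    available before each forecast date t (expanding window) and compared
    with the historical mean of returns as the benchmark forecast.
    """
    x = np.asarray(predictor, dtype=float)
    r = np.asarray(returns, dtype=float)
    if first_forecast < 3:
        raise ValueError("need at least two (x, r) pairs to fit the regression")
    model_sse = 0.0
    bench_sse = 0.0
    for t in range(first_forecast, len(r)):
        # Fit on pairs (x[s-1], r[s]) for s = 1 .. t-1 only (no look-ahead).
        slope, intercept = np.polyfit(x[:t - 1], r[1:t], deg=1)
        forecast = intercept + slope * x[t - 1]
        benchmark = r[:t].mean()
        model_sse += (r[t] - forecast) ** 2
        bench_sse += (r[t] - benchmark) ** 2
    return 1.0 - model_sse / bench_sse

# Synthetic illustration (NOT the paper's data): lagged x partly drives r.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
r = np.empty(200)
r[0] = 0.0
r[1:] = 0.5 * x[:-1] + 0.1 * rng.normal(size=199)
r2_os = out_of_sample_r2(x, r, first_forecast=50)
```

A positive value means the predictor-based forecasts have a lower squared error than the historical mean over the evaluation period; a negative value means they do worse.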