Predicting the direction of stock price movement is a challenging problem influenced by many factors and unpredictable events. Conventional machine learning models for stock price prediction rely heavily on internal financial features, especially the stock price history. However, many features external to the company interact strongly with its stock price performance, especially during the COVID period. In this study, we selected 9 COVID vaccine companies and collected their relevant features over the past 20 months. We added handcrafted external information, including COVID-related statistics and company-specific vaccine progress information. We implemented, evaluated, and compared several machine learning models, including Multilayer Perceptron Neural Networks with logistic regression and decision trees with boosting and bagging algorithms. The results suggest that feature engineering and data mining techniques can effectively enhance the performance of models predicting stock price movement during the COVID period. COVID-related handcrafted features increased model prediction accuracy by 7.3% and AUROC by 6.5% on average. Further exploration showed that, with data selection, the decision tree model with the gradient boosting algorithm achieved 70% AUROC and 66% accuracy.
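The abstract above reports its results as accuracy and AUROC for a binary up/down direction prediction. As a rough illustration of what those two metrics measure, here is a minimal sketch in plain Python. The function names, the 0.5 decision threshold, and the sample labels and scores are assumptions made for illustration; none of them come from the study.

```python
from typing import Sequence


def accuracy(labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5) -> float:
    """Fraction of days whose predicted direction matches the true one."""
    # A score at or above the (assumed) threshold is read as "price goes up".
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random 'up' day is scored above a random 'down' day.

    Ties count as half, which matches the Mann-Whitney U formulation of AUROC.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one example of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = price went up) and model scores, for illustration only.
y_true = [1, 0, 1, 1, 0, 0]
y_score = [0.9, 0.4, 0.6, 0.3, 0.5, 0.2]

print(f"accuracy = {accuracy(y_true, y_score):.3f}")
print(f"AUROC    = {auroc(y_true, y_score):.3f}")
```

Accuracy depends on the chosen threshold, while AUROC depends only on how the scores rank the examples, which is one reason the two figures can move by different amounts when features are added.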
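Research on stock price movement has relied on analyzing shallow features of financial news datasets while ignoring the structural relationships between words in financial news sentences, which leads to poor prediction of stock price movement. To address this problem, a stock price trend prediction model based on a dual-stream long short-term memory (LSTM) neural network, Sent2Vec-DLSTM, is proposed. The model makes two contributions: Sent2Vec, a sentiment word-vector generation model trained on a financial stock news dataset and the Harvard IV-4 sentiment dictionary; and a new dual-stream LSTM neural network (Dual-stream LSTM, DLSTM). In the experiments, historical S&P 500 index data and crawled financial articles are first used to predict the trend of the S&P 500 index; VietStock news and stock price data from Cophieu68 are then used to predict the trend of the VN Index. The results show that Sent2Vec-DLSTM outperforms existing models in stock price trend prediction.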