Abstract
As China's A-share market continues to develop, more and more companies are going public to raise capital after being 'packaged'. How to judge whether a listed company is worth investing in has therefore become a crucial question. Based on data from the Wind database (financial data, early-warning data, public-opinion data, technical indicator data, fundamental indicator data, and others), this paper uses tools such as SPSS and R to carry out factor analysis, clustering, and machine learning in order to study the operating risk of listed companies. The study finds: (1) for the financial factor data of listed companies, the random forest method achieves the highest learning accuracy, and the Gini coefficient index can be used to select the significant financial factors; (2) multinomial logistic regression fits the operating risk of listed companies well; (3) unstructured information such as public-opinion indicators, together with basic technical indicators and fundamental indicators, also contributes to the model; (4) key variables are selected by combining the Gini coefficient from the random forest with the conditional split points of the conditional inference tree, and weights derived from them can be used to construct a composite evaluation index that measures operating risk.
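Finding (4) can be illustrated with a minimal sketch. The abstract does not give the weighting scheme, so everything here is an assumption: the feature names and toy data are hypothetical, importance is approximated by the best single-split Gini impurity decrease (a random forest would average such decreases over many trees and nodes), and the weights are simply those decreases normalized to sum to 1. The conditional inference tree step is not shown.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum(p_k^2) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_gini_decrease(values, labels):
    """Largest impurity decrease achievable by one threshold split on a feature."""
    n = len(labels)
    parent = gini_impurity(labels)
    best = 0.0
    # Candidate thresholds: every distinct value except the largest,
    # so the right-hand side of the split is never empty.
    for t in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        child = (len(left) * gini_impurity(left)
                 + len(right) * gini_impurity(right)) / n
        best = max(best, parent - child)
    return best


def importance_weights(features, labels):
    """Normalize per-feature Gini decreases so they sum to 1 (assumed scheme)."""
    raw = {name: best_gini_decrease(vals, labels)
           for name, vals in features.items()}
    total = sum(raw.values())
    if total == 0:
        # No feature separates the classes: fall back to equal weights.
        return {name: 1.0 / len(raw) for name in raw}
    return {name: v / total for name, v in raw.items()}


def composite_score(row, weights):
    """Weighted sum of one company's indicator values."""
    return sum(weights[name] * row[name] for name in weights)


# Hypothetical toy data: two financial indicators for four companies,
# each labeled with an operating-risk class.
features = {
    "debt_ratio": [0.2, 0.3, 0.7, 0.8],
    "current_ratio": [1.5, 0.9, 1.4, 1.0],
}
labels = ["low", "low", "high", "high"]

weights = importance_weights(features, labels)
score = composite_score({"debt_ratio": 0.7, "current_ratio": 1.4}, weights)
```

In practice the indicators would need to be standardized and oriented in the same direction before being combined, which this sketch omits.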
Source
The Journal of Finance and Management Research (《金融管理研究》)
2018, Issue 2, pp. 153-176 (24 pages)
Keywords
Listed Company
Operating Risk
Big Data
Machine Learning