
Measuring Financial Information Quality and Forecasting Financial Fraud: Based on a Structural Topic Model Machine Learning Design (Cited by: 2)

Abstract: Prior research finds that latent Dirichlet allocation (LDA) topic modeling improves the prediction of corporate financial fraud. To identify where the predictive power of the topic factors comes from, this study uses a sample of 18,220 annual reports from 3,397 Chinese A-share listed companies over 2008 to 2019. Extending LDA, it adds company, manager, and macroeconomic fundamental variables as topic selection variables and a fraud label as a content variable, then analyzes the information quality of annual reports and extracts high-quality and low-quality topic factors. Results from this semi-supervised structural topic model (STM) show that the STM-based fraud prediction model outperforms models based on LDA, word frequency, and financial indicators, reducing misclassification cost by more than 13%. Further analysis shows that the predictive power of the topic factors derives more from company characteristics such as size, age, leverage, and the share of fixed assets (PPE) than from variables reflecting managerial characteristics. The model not only predicts major fraud events but also, while maintaining high precision, identifies violating companies or safe investment targets with high coverage. In practice, the findings offer a useful reference for regulators monitoring violating companies and for investors building a pool of safe investment targets.
Authors: Guangzhong LI (李广众); Qing GAO (高庆); Haisheng YANG (杨海生); Shaoling CHEN (陈少凌) (Business School, Sun Yat-sen University, Shenzhen 518107, China; Lingnan College, Sun Yat-sen University, Guangzhou 510275, China; School of Economics, Jinan University, Guangzhou 510632, China)
Source: China Journal of Econometrics (《计量经济学报》), CSSCI, CSCD, 2023, Issue 4, pp. 1032-1062 (31 pages)
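Funding: Major Program of the National Social Science Fund of China (21&ZD143); National Natural Science Foundation of China (72173141); Natural Science Foundation of Guangdong Province (2023A1515012434); Humanities and Social Sciences Research Planning Fund of the Ministry of Education (21YJA790005).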
Keywords: financial fraud; financial information quality; structural topic model; textual analysis; machine learning
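
The abstract compares models by misclassification cost but does not give the formula. As a rough illustration only, the sketch below assumes a simple cost-weighted count of missed frauds and false alarms averaged over observations; the function names, the 10:1 cost ratio, and the toy labels are all hypothetical and are not taken from the paper.

```python
def misclassification_cost(y_true, y_pred, cost_fn=10.0, cost_fp=1.0):
    """Average cost per observation (assumed definition, not the paper's).

    Labels: 1 = fraud, 0 = no fraud.
    cost_fn is charged for each missed fraud (false negative),
    cost_fp for each false alarm (false positive).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must be non-empty")
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return (cost_fn * fn + cost_fp * fp) / len(y_true)


def relative_reduction(baseline_cost, new_cost):
    """Fractional cost reduction of a new model relative to a baseline."""
    return (baseline_cost - new_cost) / baseline_cost


# Made-up labels and predictions, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
baseline_pred = [1, 0, 0, 0, 0, 1, 1, 0, 0, 0]   # 2 missed frauds, 2 false alarms
candidate_pred = [1, 1, 0, 0, 0, 1, 1, 1, 0, 0]  # 1 missed fraud, 3 false alarms

baseline_cost = misclassification_cost(y_true, baseline_pred)
candidate_cost = misclassification_cost(y_true, candidate_pred)
print(baseline_cost, candidate_cost, relative_reduction(baseline_cost, candidate_cost))
```

With asymmetric costs, a model that flags more firms can still score better if it misses fewer frauds, which is why a cost-based comparison can rank models differently from plain accuracy.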