Abstract
Using the annual reports of China's A-share listed banks from 2010 to 2019 as the sample, this paper applies the LDA topic model to mine the semantic information in the reports and construct topic measures for bank annual reports. Across several machine learning models, it compares the fraud detection performance of the topic measures with that of commonly used financial measures, text feature measures, and the combination of each with the topic measures. The study finds that the topic content of annual report text has some predictive power for fraud by listed banks and that, compared with a traditional measure used alone, adding the topic measures improves that measure's fraud detection performance. The results provide direct evidence that annual report topic information and machine learning methods are effective in detecting fraud by listed banks, offer the market an effective system of fraud detection measures, and give auditors a relatively efficient detection method, which helps to further avoid and guard against audit risk.
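The abstract does not give implementation details, so the following is only a minimal sketch of the general idea: fit LDA on tokenized reports, take each report's topic proportions as topic measures, and append them to financial measures to form a combined feature set. The function name `lda_topic_features`, the hyperparameters, the toy tokens, and the toy financial values are all assumptions made for illustration; the collapsed Gibbs sampler is one common way to fit LDA and is not necessarily the procedure the authors used.

```python
import random

def lda_topic_features(docs, num_topics, iters=100, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return one topic-proportion vector per document.

    Hypothetical helper for illustration, not the paper's implementation.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]                  # tokens per (document, topic)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]    # tokens per (topic, word)
    topic_total = [0] * num_topics                                # tokens per topic
    assign = []                                                   # topic assignment of every token

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assign.append(zs)

    # Gibbs sweeps: resample each token's topic given all other assignments.
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1

    # Smoothed topic proportions; each vector sums to 1.
    features = []
    for d, doc in enumerate(docs):
        denom = len(doc) + num_topics * alpha
        features.append([(doc_topic[d][t] + alpha) / denom for t in range(num_topics)])
    return features

# Toy, made-up inputs: tokenized report text and two financial measures per report.
reports = [
    ["loan", "risk", "loan", "credit", "risk"],
    ["penalty", "compliance", "penalty", "regulator"],
    ["credit", "loan", "risk", "credit"],
    ["regulator", "compliance", "penalty", "compliance"],
]
financial = [[0.12, 1.5], [0.08, 2.1], [0.11, 1.4], [0.07, 2.3]]

num_topics = 2
topic_measures = lda_topic_features(reports, num_topics)

# Combined feature set: financial measures followed by topic measures.
combined = [fin + top for fin, top in zip(financial, topic_measures)]
```

Each row of `combined` could then be passed to any classifier alongside a fraud label, and the same classifier could be trained on `financial` alone to compare detection performance, which mirrors the comparison the abstract describes.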
Authors
张熠
徐阳
李维萍
ZHANG Yi; XU Yang; LI Weiping (School of Information Engineering, Nanjing Audit University, Nanjing 211815, China)
Source
《审计与经济研究》
CSSCI
Peking University Core Journal
2022, No. 5, pp. 107-116 (10 pages)
Journal of Audit & Economics
Funding
Jiangsu Provincial Social Science Foundation Project (21GLD009).
Keywords
fraud detection of listed companies
annual report
LDA topic model
machine learning
fraud prediction
financial statements