Abstract
Outlier detection is a classical problem in statistics. Identifying outliers and reducing their influence on the analysis of tax assessment data is a worthwhile line of research. However, conventional outlier diagnostics generally rely on global detection methods suited to unimodal distributions. Drawing on the theory of the local correlation integral, this paper proposes a detection method based on nonparametric density estimation. The method is suited to multimodal distributions, can identify outliers that are local in nature, and retains strong detection ability on samples with a high proportion of outliers. Using a sample of 10,920 enterprises from one city, an empirical analysis compares the tax assessment method currently used by the tax bureau with the proposed method; the results show that the tax bureau's method carries a substantial tax assessment risk (risk of misjudgment).
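To illustrate why a local, density-based criterion can matter for multimodal data, the following is a minimal sketch only. The abstract gives no formulas, so this is not the paper's method and not the local correlation integral itself: it is a simplified stand-in that compares a leave-one-out Gaussian kernel density estimate at each point with the mean density of its nearest neighbours. The function names, the bandwidth, the neighbour count `k`, the threshold, and the synthetic data are all assumptions chosen for illustration.

```python
import numpy as np

def gaussian_kde_1d(x, bandwidth):
    """Leave-one-out Gaussian kernel density estimate at each sample point."""
    diffs = (x[:, None] - x[None, :]) / bandwidth
    kernel = np.exp(-0.5 * diffs**2) / np.sqrt(2 * np.pi)
    np.fill_diagonal(kernel, 0.0)  # exclude each point from its own estimate
    return kernel.sum(axis=1) / ((len(x) - 1) * bandwidth)

def local_density_ratio(x, bandwidth, k):
    """Density at each point divided by the mean density of its k nearest neighbours."""
    dens = gaussian_kde_1d(x, bandwidth)
    dist = np.abs(x[:, None] - x[None, :])
    np.fill_diagonal(dist, np.inf)           # a point is not its own neighbour
    nn = np.argsort(dist, axis=1)[:, :k]     # indices of the k nearest neighbours
    return dens / dens[nn].mean(axis=1)

# Synthetic bimodal sample (assumed data): two clusters plus one isolated point
# lying between them, close to the global mean.
rng = np.random.default_rng(0)
sample = np.concatenate([
    rng.normal(0.0, 0.5, 200),
    rng.normal(10.0, 0.5, 200),
    [5.0],
])

ratio = local_density_ratio(sample, bandwidth=0.5, k=10)
flagged = np.flatnonzero(ratio < 0.01)  # illustrative threshold, not from the paper
```

In this sketch the isolated point at 5.0 sits near the overall sample mean, so a global rule based on distance from the mean would not single it out, whereas its local density ratio is far smaller than that of any point inside the two clusters.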
Source
《数学的实践与认识》
CSCD
Peking University Core Journal
2014, Issue 16, pp. 141-149 (9 pages)
Mathematics in Practice and Theory
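Funding
National Natural Science Foundation of China (71003100)
Ministry of Education Humanities and Social Sciences Research General Project (11YJC630270)
Fundamental Research Funds for the Central Universities (11XNK027, 10XNF020)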
Keywords
outlier detection
tax assessment
nonparametric density estimation
local correlation integral