Abstract
【Objective】Soil is an important type of forensic trace evidence. It can provide valuable information and play a key role in criminal investigation and court trials. How to determine the source location of an unknown soil sample is therefore a question worth studying.【Method】At two scales, trans-provincial and intra-provincial, a random forest model was used to identify the source locations of soil samples, based on visible and near-infrared (vis-NIR) spectra and soil chemical data from three provinces: Heilongjiang, Anhui and Jiangsu. The discrimination performance of different soil datasets and their combinations was compared, and the relative importance of soil chemical properties and spectral data was analyzed. Validation accuracy was defined as the ratio of correctly identified samples to the total number of samples.【Result】On the trans-provincial scale, the model combining spectral principal components (PCs) with chemical data performed best, with a validation accuracy of 0.92. Spectral measurement requires only a small amount of sample material; when soil evidence is scarce and chemical data are hard to obtain, the model combining spectral PCs with absorption peaks gave the highest validation accuracy, 0.82. On the provincial scale, the combination of spectral PCs and chemical data was again the best, with an accuracy of 0.83. When chemical data were hard to obtain, spectral PCs and absorption peaks alone achieved a comparable accuracy (0.82), which indicates that spectra can substitute for chemical data in source discrimination on the provincial scale. The importance of the discriminant factors differed between the two scales. On the trans-provincial scale, the most important chemical variables were total potassium (TK) and total phosphorus (TP), and the most important spectral variables were the first PC and the absorption peaks in the 350~600 nm and 1 800~2 100 nm bands. On the provincial scale, the most important chemical variable was TP, and the most important spectral variables were the seventh PC and the absorption peaks in the 1 300~1 600 nm and 2 100~2 200 nm bands.【Conclusion】These findings indicate that the source location of a soil sample can be effectively identified from vis-NIR spectra and soil chemical data. When the spatial extent of the sampling sites differs between models, it is advisable to consider different discriminant factors in modeling and multiple indices in evaluating the discrimination results.
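The workflow summarized in the abstract (spectral principal components combined with chemical data, a random forest classifier, validation accuracy as the share of correctly identified samples, and a ranking of discriminant factors) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration only: the data are synthetic, scikit-learn is used as the toolkit, seven PCs are retained, the PCA is fitted on the training split, and absorption-peak features are omitted. None of this is taken from the study itself.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the study's measurements).
n_per_region, n_bands = 60, 200
regions = ["Heilongjiang", "Anhui", "Jiangsu"]
labels = np.repeat(regions, n_per_region)

# Hypothetical vis-NIR spectra: one baseline curve per region plus noise.
wavelengths = np.linspace(350, 2500, n_bands)
spectra = np.vstack([
    0.3 + 0.05 * k * np.sin(wavelengths / 400.0)
    + rng.normal(0.0, 0.02, (n_per_region, n_bands))
    for k in range(len(regions))
])

# Hypothetical chemical data: total potassium (TK) and total phosphorus (TP).
chem = np.column_stack([
    rng.normal(np.repeat([20.0, 15.0, 17.0], n_per_region), 2.0),
    rng.normal(np.repeat([0.9, 0.5, 0.7], n_per_region), 0.15),
])

# Hold out 30% of the samples for validation, stratified by region.
idx_train, idx_test = train_test_split(
    np.arange(labels.size), test_size=0.3, stratify=labels, random_state=0
)

# Compress the spectra into principal components (fitted on training data only).
n_pcs = 7  # assumed number of retained PCs
pca = PCA(n_components=n_pcs).fit(spectra[idx_train])
X_train = np.hstack([pca.transform(spectra[idx_train]), chem[idx_train]])
X_test = np.hstack([pca.transform(spectra[idx_test]), chem[idx_test]])

# Random forest on the combined "spectral PCs + chemical data" feature set.
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, labels[idx_train])

# Validation accuracy = correctly identified samples / total validation samples.
pred = model.predict(X_test)
accuracy = float(np.mean(pred == labels[idx_test]))

# Relative importance of each discriminant factor (impurity-based).
feature_names = [f"PC{i + 1}" for i in range(n_pcs)] + ["TK", "TP"]
importance = dict(zip(feature_names, model.feature_importances_))

print(f"validation accuracy: {accuracy:.2f}")
for name, value in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {value:.3f}")
```

Dropping the `chem` columns from `X_train` and `X_test` gives a spectra-only variant, which corresponds loosely to the case in which chemical data are hard to obtain.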
Authors
张欣跃
赵玉国
刘峰
曾荣
高鸿
林卡
张甘霖
ZHANG Xinyue; ZHAO Yuguo; LIU Feng; ZENG Rong; GAO Hong; LIN Ka; ZHANG Ganlin (State Key Laboratory of Soil and Sustainable Agriculture, Institute of Soil Science, Chinese Academy of Sciences, Nanjing 210008, China; University of Chinese Academy of Sciences, Beijing 100049, China; School of Geographical Sciences, Nanjing University of Information Science & Technology, Nanjing 210044, China; School of Geographic Sciences, Nanjing Normal University, Key Laboratory of Virtual Geographic Environment, Ministry of Education, Nanjing Normal University, Nanjing 210023, China)
Source
《土壤学报》
CAS
CSCD
PKU Core (Peking University Core Journals)
2019, Issue 5, pp. 1060-1071 (12 pages)
Acta Pedologica Sinica
Funding
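National Key Research and Development Program of China (2017YFC0803807)
Special Key Project for Basic Work of the Ministry of Science and Technology of China (2014FY110200)
Open Project of the National Engineering Laboratory for Crime Scene Evidence Traceability Technology, Institute of Forensic Science, Ministry of Public Security (2017NELKFKT03); Collaborative Innovation Project of the Institute of Forensic Science, Ministry of Public Security (2016XTCX03)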
Keywords
Soil spectrum
Chemical properties
Source location
Random forest