Abstract
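A method for water sample classification and fluorescence component fitting is proposed that combines two convolutional neural networks (CNNs), MobileNetV2 and VGG11 component fitting (CF-VGG11), with parallel factor analysis (PARAFAC). From a single three-dimensional excitation-emission matrix (3D-EEM) fluorescence spectrum, the method predicts the water sample type, the dissolved organic matter (DOM) mass concentration grade, and the fluorescence components. The fluorescence spectrum dataset is built from PARAFAC results, and prediction proceeds in two steps: first, MobileNetV2 predicts the type of each water sample and grades its DOM mass concentration; second, the CF-VGG11 network fits the fluorescence components. The dataset was constructed from four types of water samples: surface water, treated industrial wastewater, sewage treatment plant inlet and outlet water, and rural drinking water. The method achieved a classification accuracy of 95.83% and a component fitting accuracy of 98.11%. The experimental results show that the proposed method can accurately classify different water samples and DOM mass concentration grades, fit specific fluorescence components, precisely locate pollution sources, and provide exceedance warnings.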
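As an illustration of the first step, the Python sketch below shows how the two outputs of a multi-output classifier could be decoded into a water type and a DOM mass concentration grade. It is a sketch under assumptions, not the authors' implementation: the grade names, the example logits, and the function name decode_two_heads are hypothetical, and the MobileNetV2 network itself is not reproduced. Only the four type codes (DB, FS, WS, XCYY) are the ones this record uses for the four water types.

```python
import math

WATER_TYPES = ["DB", "FS", "WS", "XCYY"]        # type codes used in this record
DOM_GRADES = ["grade_1", "grade_2", "grade_3"]  # hypothetical: the grades are not listed in the abstract


def softmax(logits):
    """Turn raw scores into probabilities (shifted by the max for numerical stability)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_two_heads(type_logits, grade_logits):
    """Pick the most probable class from each output head independently."""
    type_probs = softmax(type_logits)
    grade_probs = softmax(grade_logits)
    t = max(range(len(type_probs)), key=type_probs.__getitem__)
    g = max(range(len(grade_probs)), key=grade_probs.__getitem__)
    return WATER_TYPES[t], DOM_GRADES[g], type_probs[t], grade_probs[g]


# Made-up logits standing in for the network outputs for one 3D-EEM input.
water_type, dom_grade, p_type, p_grade = decode_two_heads(
    [0.1, 2.3, -1.0, 0.4], [1.5, 0.2, -0.3]
)
```

Keeping the two heads separate means a sample's type and its concentration grade are predicted from the same spectrum but decoded independently, which is one common way to realise a multi-output classifier.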
Objective The treatment of organic pollutants in surface water, drinking water, and wastewater is an urgent problem for society. Three-dimensional excitation-emission matrix (3D-EEM) fluorescence spectroscopy has been widely used to detect fluorescence components in surface water, sewage, and other samples. Raw 3D-EEM data contain substantial interference noise and overlapping fluorescence signals, so a fast and accurate method is needed to extract and analyze the useful information in 3D-EEM spectra. At present, parallel factor analysis (PARAFAC) is commonly used to decompose the overlapping fluorescence signals in 3D-EEM, but its analysis process is complex and its requirements on the dataset are strict, which greatly limits online monitoring and analysis of samples. In this study, building on PARAFAC results, we propose a fast convolutional classification and recognition network model that obtains water sample types, mass concentration grades, and fluorescence component maps using only two convolutional neural network (CNN) models. It thus provides an effective technical means for rapid detection in scenarios such as surface water, drinking water, and wastewater monitoring.

Methods A method of water sample classification and fluorescence component fitting based on MobileNetV2, VGG11 component fitting (CF-VGG11) CNN, and PARAFAC is proposed. The 3D-EEM data of four types of water samples are collected: surface water (DB), treated industrial wastewater (FS), sewage treatment plant inlet and outlet water (WS), and rural drinking water (XCYY). With the PARAFAC results as labels, a multi-output classification model for the water samples and a prediction and fitting model for the fluorescence component maps are established. The prediction of types and components is completed in two steps. In the first step, the MobileNetV2 algorithm is used to predict and classify the different water samples. In the second step, the CF-VGG11 network is used to fit the fluorescence component map.

Results and Discussions The datasets of all types of water samples are analyzed by PARAFAC, and four fluorescence components are obtained (Fig. 6). The PARAFAC results are then uploaded to the OpenFluor database to identify possible substances for the fluorescence components in each type of water sample (Table 2). The similarity scores of all components exceed 95%. With the PARAFAC results as network labels, the MobileNetV2 classification network and the CF-VGG11 component fitting network obtain a classification accuracy of 95.83% and a component fitting accuracy of 98.11%, respectively (Table 3). To verify the classification and fitting performance of the trained models, a portion of 3D-EEM data not used in training is selected for testing; the results show that MobileNetV2 and CF-VGG11 classify and fit the 3D-EEM of water samples well (Fig. 7). Compared with PARAFAC, the MobileNetV2 and CF-VGG11 network models have advantages in time cost, data requirements, and analysis process (Table 4).

Conclusions A fast CNN classification and recognition algorithm based on fluorescence spectra is proposed to predict the types and mass concentration grades of different water samples, as well as the overlapping fluorescence components in 3D-EEM. The approach relies on PARAFAC for preliminary data preparation. The MobileNetV2 network classifies water sample types and mass concentration grades, which enables water pollution tracing and exceedance warning, and the CF-VGG11 network fits the fluorescence component maps of the water samples. The results show that the fast classification and identification network model built on PARAFAC results can quickly predict the type and mass concentration grade of a water sample and fit its specific fluorescence components from the 3D-EEM data of a single sample, without repeating the complex PARAFAC analysis. This study therefore provides theoretical support for detecting water pollution by three-dimensional fluorescence spectrometry and has practical significance.
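For readers unfamiliar with PARAFAC, the standard trilinear model writes each EEM as a sum of components, each being the outer product of an emission loading and an excitation loading scaled by a per-sample score. The pure-Python sketch below reconstructs one sample's EEM from such loadings and exposes the per-component maps, which is the kind of quantity a component-fitting network could be trained to reproduce. The numbers are made up and the function names are hypothetical; whether the paper builds its labels exactly this way is an assumption.

```python
# Trilinear PARAFAC model for one sample i:
#   x[j][k] = sum over f of  score[f] * emission[f][j] * excitation[f][k]
# j: emission wavelength index, k: excitation wavelength index, f: component.


def component_map(score, emission, excitation):
    """One component's contribution: scaled outer product of its two loadings."""
    return [[score * b * c for c in excitation] for b in emission]


def reconstruct_eem(scores, emission_loadings, excitation_loadings):
    """Sum the per-component maps to obtain the modelled EEM of one sample."""
    n_em = len(emission_loadings[0])
    n_ex = len(excitation_loadings[0])
    eem = [[0.0] * n_ex for _ in range(n_em)]
    for s, b, c in zip(scores, emission_loadings, excitation_loadings):
        comp = component_map(s, b, c)
        for j in range(n_em):
            for k in range(n_ex):
                eem[j][k] += comp[j][k]
    return eem


# Toy values (made up): 2 components, 3 emission and 2 excitation wavelengths.
emission_loadings = [[1.0, 2.0, 0.0], [0.0, 1.0, 3.0]]
excitation_loadings = [[1.0, 0.5], [0.0, 2.0]]
scores = [2.0, 1.0]

eem = reconstruct_eem(scores, emission_loadings, excitation_loadings)
first_component = component_map(scores[0], emission_loadings[0], excitation_loadings[0])
```

Fitting PARAFAC means solving the inverse problem, recovering the loadings and scores from a stack of measured EEMs, which is why it needs a whole dataset rather than a single spectrum.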
Authors
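Chen Qing (陈庆)
Tang Bin (汤斌)
Miao Junfeng (缪俊锋)
Zhou Yan (周彦)
Long Zourong (龙邹荣)
Zhang Jinfu (张金富)
Wang Jianxu (王建旭)
Zhou Mi (周密)
Ye Binqiang (叶彬强)
Zhao Mingfu (赵明富)
Zhong Nianbing (钟年丙)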
Chen Qing; Tang Bin; Miao Junfeng; Zhou Yan; Long Zourong; Zhang Jinfu; Wang Jianxu; Zhou Mi; Ye Binqiang; Zhao Mingfu; Zhong Nianbing (Chongqing Key Laboratory of Fiber Optic Sensor and Photodetector, Chongqing University of Technology, Chongqing 400054, China; School of Microelectronics and Communication Engineering, Chongqing University, Chongqing 400044, China; Tongliang District Environmental Protection Bureau of Chongqing, Chongqing 402560, China)
Source
《光学学报》
EI
CAS
CSCD
PKU Core (Peking University Core Journals)
2023, No. 6, pp. 318-328 (11 pages)
Acta Optica Sinica
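Funding
National Natural Science Foundation of China (61805029)
Natural Science Foundation of Chongqing, General Program (cstc2020jcyj-msxmX0879)
Chongqing University Innovation Research Group Project (CXQT21035)
Technology Innovation and Application Development Special Project of the Tongliang District Science and Technology Bureau, Chongqing (CCF20220623)
Scientific Research Start-up Fund of Chongqing University of Technology (0107210299, 0107200283)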
Keywords
spectroscopy
three-dimensional excitation-emission matrix fluorescence spectroscopy
water pollution
classification
convolutional neural network