Abstract
Information from partial least squares (PLS) modeling, such as the regression coefficients, is used to screen the original independent variables: redundant or weakly influential variables are removed without loss of predictive ability, giving a simpler model. This study proposes a new criterion for deleting variables that is simple to compute and works well in practice. The results show that variable selection based on this PLS-derived deletion criterion is practical and effective. The method is well suited to modeling problems with very large data sets or very many variables: it substantially reduces the number of variables in the final model and simplifies the model, which makes practical problems easier to analyze and solve. Applied to fingerprint data of traditional Chinese medicines, it yielded a considerably simpler model than the conventional algorithm.
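The abstract does not state the new deletion criterion itself, so the following is only a minimal sketch of the general idea it describes: fit a PLS model, drop the variable that looks least important, and keep the deletion only if predictive ability is not lost. As assumptions not taken from the paper, the sketch uses the absolute regression coefficient on autoscaled variables as a stand-in criterion, leave-one-out PRESS as the measure of predictive ability, and hypothetical names (`pls1_coefficients`, `select_variables`, `tolerance`).

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pls1_coefficients(X, y, n_components):
    """PLS1 (single response) regression coefficients for centered X and y."""
    n, p = len(X), len(X[0])
    X = [row[:] for row in X]          # working copies, deflated below
    y = y[:]
    W, P, q = [], [], []
    for _ in range(n_components):
        w = [dot([X[i][j] for i in range(n)], y) for j in range(p)]
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:               # nothing left in y to explain
            break
        w = [wj / norm for wj in w]
        t = [dot(row, w) for row in X]                 # scores
        tt = dot(t, t)
        load = [dot([X[i][j] for i in range(n)], t) / tt for j in range(p)]
        qa = dot(y, t) / tt
        for i in range(n):                             # deflate X and y
            for j in range(p):
                X[i][j] -= t[i] * load[j]
            y[i] -= qa * t[i]
        W.append(w); P.append(load); q.append(qa)
    b = [0.0] * p
    for a, w in enumerate(W):
        r = w[:]                       # weights expressed in terms of original X
        for j in range(a - 1, -1, -1):
            c = dot(P[j], r)
            r = [rk - c * wk for rk, wk in zip(r, W[j])]
        b = [bk + q[a] * rk for bk, rk in zip(b, r)]
    return b

def fit(X, y, n_components):
    """Autoscale X, center y, fit PLS1; return (predict function, coefficients)."""
    n, p = len(X), len(X[0])
    mean = [sum(row[j] for row in X) / n for j in range(p)]
    std = [math.sqrt(sum((row[j] - mean[j]) ** 2 for row in X) / (n - 1)) or 1.0
           for j in range(p)]
    y_mean = sum(y) / n
    Xs = [[(row[j] - mean[j]) / std[j] for j in range(p)] for row in X]
    b = pls1_coefficients(Xs, [v - y_mean for v in y], min(n_components, p))
    def predict(row):
        return y_mean + sum(b[j] * (row[j] - mean[j]) / std[j] for j in range(p))
    return predict, b

def loo_press(X, y, n_components):
    """Leave-one-out predicted residual sum of squares."""
    press = 0.0
    for i in range(len(X)):
        predict, _ = fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_components)
        press += (y[i] - predict(X[i])) ** 2
    return press

def select_variables(X, y, n_components=2, tolerance=1.05):
    """Backward elimination: drop the smallest-|coefficient| variable while
    PRESS stays within `tolerance` times the best PRESS seen so far."""
    def columns(cols):
        return [[row[j] for j in cols] for row in X]
    keep = list(range(len(X[0])))
    best = loo_press(X, y, n_components)
    while len(keep) > 1:
        _, b = fit(columns(keep), y, n_components)
        weakest = min(range(len(keep)), key=lambda k: abs(b[k]))
        trial = keep[:weakest] + keep[weakest + 1:]
        press = loo_press(columns(trial), y, n_components)
        if press > tolerance * best:   # predictive ability would be lost
            break
        keep, best = trial, min(best, press)
    return keep
```

Calling `select_variables(X, y)` on a list-of-rows matrix `X` and a response list `y` returns the indices of the retained columns. The number of components and the 5% tolerance are illustrative choices only; for data with many variables, such as the fingerprint data mentioned above, a numerical library would be a more practical basis than this pure-Python version.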
Source
《计算机与应用化学》
CAS
CSCD
Peking University Core Journals (北大核心)
2007, No. 6, pp. 741-745 (5 pages)
Computers and Applied Chemistry
Funding
Natural Science Foundation of Fujian Province (Z0513003)
Keywords
variable selection, partial least squares, regression analysis, discriminant analysis