Abstract
One hundred encoding proteins were sampled from SARS-coronavirus (SARS-CoV, NC_004718) and six other coronaviruses, and 23 variables were selected from 172 candidates by stepwise multiple regression (SMR). The resulting multiple linear regression (MLR) model gave a modelling correlation coefficient of R² = 0.645 and a cross-validation correlation coefficient of R_CV = 0.375. After 4 outliers were removed, the modelling and cross-validation correlation coefficients rose to R² = 0.743 and R_CV = 0.543, respectively.
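The workflow summarized above (select a subset of variables stepwise, fit an MLR model, then compare the modelling and cross-validation coefficients) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the data are synthetic, the selection is a simplified forward search that stops at a fixed number of variables rather than using F-test entry and removal thresholds, and the cross-validation coefficient is assumed to be a leave-one-out value computed as 1 - PRESS/SS_tot. All function names are hypothetical.

```python
import numpy as np


def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bk]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    """Apply fitted coefficients to a 2-D array of predictor values."""
    return coef[0] + X @ coef[1:]


def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def loo_r_squared(X, y):
    """Leave-one-out: refit without sample i, predict it, pool all predictions."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        coef = fit_mlr(X[keep], y[keep])
        preds[i] = predict(coef, X[i:i + 1])[0]
    return r_squared(y, preds)


def forward_stepwise(X, y, max_vars):
    """Simplified forward selection: repeatedly add the column that raises R^2 most.

    Assumption: a fixed variable count replaces the usual F-test stopping rule.
    """
    selected = []
    remaining = list(range(X.shape[1]))
    while remaining and len(selected) < max_vars:
        scores = []
        for j in remaining:
            cols = selected + [j]
            coef = fit_mlr(X[:, cols], y)
            scores.append((r_squared(y, predict(coef, X[:, cols])), j))
        _, best_j = max(scores)
        selected.append(best_j)
        remaining.remove(best_j)
    return selected


# Synthetic stand-in data (NOT the paper's descriptors): 100 samples, 20 candidates,
# of which only columns 3 and 7 actually drive the response.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))
y = 2.0 * X[:, 3] - 1.5 * X[:, 7] + rng.normal(scale=0.5, size=100)

selected = forward_stepwise(X, y, max_vars=2)
X_sel = X[:, selected]
r2_fit = r_squared(y, predict(fit_mlr(X_sel, y), X_sel))
r2_cv = loo_r_squared(X_sel, y)
print("selected columns:", selected)
print("modelling R^2:", round(r2_fit, 3), "leave-one-out R^2:", round(r2_cv, 3))
```

The cross-validation value never exceeds the modelling value for a least-squares fit, which is why a large gap between the two (as between 0.645 and 0.375 in the abstract) is usually read as a sign that the model depends heavily on individual samples.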
Keywords
SARS-CoV, coronavirus, multiple linear regression (MLR), stepwise multiple regression (SMR), encoding protein, identification