Phosphorylation is one of the most important post-translational modifications of proteins. The Increment of Diversity with Quadratic Discriminant analysis (IDQD) method was applied to predict phosphorylation sites for three kinase types: CK2, PKA and PKC. The k-fold cross-validation accuracies were 86%, 90% and 85%, respectively, and the accuracies on an independent test set were 86%, 88% and 84%, respectively. These results are higher than those of existing prediction methods, including support vector machines.
The Increment of Diversity with Quadratic Discriminant analysis (IDQD) method was applied to recognize Escherichia coli σ70 promoters. Performance was evaluated using receiver operating characteristic (ROC) curves and precision-recall curves (PRC). In 10-fold cross-validation with a 1:1 ratio of positive to negative samples, the area under the ROC curve and the area under the PRC were both 95%. The results indicate that the IDQD algorithm is capable of recognizing prokaryotic promoters, with a recognition accuracy higher than that of existing algorithms.
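The Increment of Diversity with Quadratic Discriminant analysis (IDQD) method was applied to recognize sequences with strong or weak nucleosome preference in the yeast genome. The prediction success rate in 10-fold cross-validation exceeded 97%, and the area under the receiver operating characteristic (ROC) curve reached 0.99; the success rate is higher than that of the existing SVM algorithm. Finally, the trained classifier was used to analyze the region from 400 nt upstream to 100 nt downstream of the start codon ATG for three classes of TATA-box-containing genes in the yeast genome. The results indicate that the IDQD algorithm is capable of recognizing nucleosome sequences in genomes.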
Functional annotation of unknown proteins is a major goal of proteomics, and one key annotation is the prediction of protein subcellular localization. The Increment of Diversity with Quadratic Discriminant analysis (IDQD) method was applied to predict protein subcellular localization. For 4 plant localization types and 3 non-plant localization types, the overall accuracies in 5-fold cross-validation were 87% and 91%, respectively. These results compare favorably with those of existing models.
The topology of a transmembrane protein is closely related to its function, so it is desirable to characterize the topology of transmembrane proteins. In this work, an effective approach
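H2A.Z, a variant of histone H2A, plays an important role in gene expression. Based on the differences in histone methylation patterns between H2A.Z and H2A nucleosomes, the authors applied the increment of diversity with quadratic discriminant (IDQD) method and successfully distinguished H2A.Z nucleosomes from H2A nucleosomes, demonstrating the effectiveness of an IDQD model that uses histone methylation information as feature parameters. By computing the flexibility of the DNA sequences, the authors found that the average flexibility of DNA sequences associated with H2A.Z nucleosomes is lower than that of DNA sequences associated with conventional H2A nucleosomes.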
In this paper, we first combine tetra-peptide structural words with contact number for protein secondary structure prediction. We use the method of increment of diversity combined with quadratic discriminant analysis to predict the structure of the central residue of a sequence fragment. The method uses tetra-peptide structural words and long-range contact number as information sources. The Q3 accuracy is over 83% on 194 proteins. The accuracies of the predicted secondary structures for the 20 amino acid residues range from 81% to 88%. Moreover, we introduce the residue long-range contact, which directly indicates the separation of contacting residues in terms of their positions in the sequence, and examine the negative influence of long-range residue interactions on secondary structure prediction in a protein. The method is also compared with existing prediction methods. The results show that our method is more effective in protein secondary structure prediction.
Funding: supported by National Natural Science Foundation (No. 90403010)
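The abstracts above name the increment of diversity but do not define it. The following is a minimal sketch, assuming the commonly used definition: for a count vector X with entries n_i and total N, the diversity is D(X) = N ln N − Σ n_i ln n_i, and the increment of diversity between two count vectors is ID(X, Y) = D(X + Y) − D(X) − D(Y). The function names and the k-mer feature choice are illustrative assumptions, not taken from the listed papers, and the quadratic discriminant step is omitted.

```python
import math
from collections import Counter
from itertools import product


def diversity(counts):
    """Diversity D(X) = N ln N - sum(n_i ln n_i), taking 0 ln 0 = 0 (assumed definition)."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return total * math.log(total) - sum(n * math.log(n) for n in counts if n > 0)


def increment_of_diversity(x, y):
    """ID(X, Y) = D(X + Y) - D(X) - D(Y) for two count vectors of equal length."""
    if len(x) != len(y):
        raise ValueError("count vectors must have the same length")
    merged = [a + b for a, b in zip(x, y)]
    return diversity(merged) - diversity(x) - diversity(y)


def kmer_counts(seq, k, alphabet="ACGT"):
    """Hypothetical feature extractor: counts of every k-mer over a fixed alphabet."""
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    seen = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return [seen[m] for m in kmers]


if __name__ == "__main__":
    # Toy sequences, invented purely for illustration.
    query = kmer_counts("ACGTACGTAC", 2)
    reference = kmer_counts("ACGTTTGACA", 2)
    print(increment_of_diversity(query, reference))
```

Under this definition, a small increment means the two count vectors have similar composition: identical proportions give 0, and vectors with no shared categories give the largest value. In an IDQD-style classifier, such increments computed against each class's reference vector would serve as input features to the discriminant function.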