
Solving the Problem of Imbalanced Dataset in the Prediction of Membrane Protein Types Based on Weighted SVM (cited by: 2)
Abstract: Imbalanced class sizes are common in membrane protein type prediction. This paper analyzes why a standard support vector machine (SVM) performs poorly on imbalanced samples and introduces a weighted SVM to compensate for the bias toward the larger classes that arises from unequal class sizes in the training set. The method was evaluated with the self-consistency test, the jackknife (cross-validation) test, and an independent dataset test. The experiments indicate that the imbalance handling works well and that the approach can serve as an effective complement to existing methods.
Source: Journal of Shanghai Jiaotong University (《上海交通大学学报》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2005, No. 10, pp. 1676-1679, 1684 (5 pages)
Funding: National Natural Science Foundation of China (50174038, 30170274); Research Program for Reserve Candidates of Outstanding Young Teachers in Shanghai Universities (03YQHB020)
Keywords: membrane protein type prediction; bioinformatics; support vector machine (SVM); imbalanced samples
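
The class-dependent penalty idea summarized in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the weighting rule C_k = C * N / (K * n_k), the function names `class_weights` and `weighted_hinge_objective`, and the toy data are all assumptions made for this example, and the objective is written for the binary case only.

```python
from collections import Counter


def class_weights(labels, base_c=1.0):
    """Assumed rule: C_k = base_c * N / (K * n_k), so rarer classes get a larger penalty."""
    counts = Counter(labels)
    n_total, n_classes = len(labels), len(counts)
    return {k: base_c * n_total / (n_classes * n_k) for k, n_k in counts.items()}


def weighted_hinge_objective(w, b, xs, ys, penalties):
    """Soft-margin SVM objective with a per-class penalty, for labels in {-1, +1}:
    0.5 * ||w||^2 + sum_i C_{y_i} * max(0, 1 - y_i * (w . x_i + b))
    """
    reg = 0.5 * sum(wj * wj for wj in w)
    loss = 0.0
    for x, y in zip(xs, ys):
        margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
        loss += penalties[y] * max(0.0, 1.0 - margin)
    return reg + loss


# Toy 1-D data (hypothetical): 8 majority samples (-1) and 2 minority samples (+1).
xs = [[0.1 * i] for i in range(8)] + [[2.0], [2.2]]
ys = [-1] * 8 + [1] * 2

uniform = {-1: 1.0, 1: 1.0}        # ordinary SVM: same penalty for every class
weighted = class_weights(ys)       # weighted SVM: penalty depends on class size

# A degenerate classifier that assigns every sample to the majority class.
w_all_majority, b_all_majority = [0.0], -1.0

print(weighted)
print(weighted_hinge_objective(w_all_majority, b_all_majority, xs, ys, uniform))
print(weighted_hinge_objective(w_all_majority, b_all_majority, xs, ys, weighted))
```

In this toy case the classifier that labels everything as the majority class costs 4.0 under a uniform penalty but 10.0 under the class-dependent penalties, so an optimizer has a stronger incentive to place the boundary where the minority samples are classified correctly.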

References (10)

  • 1. Burges C J C. A tutorial on support vector machines for pattern recognition [J]. Data Mining and Knowledge Discovery, 1998, 2(2): 955-974.
  • 2. Cai Y D, Liu X J, Xu X B, et al. Support vector machines for predicting protein structural class [J]. Bioinformatics, 2001, 221(1): 115-120.
  • 3. Hua S J, Sun Z. Support vector machine approach for protein subcellular localization prediction [J]. Bioinformatics, 2001, 17(8): 721-728.
  • 4. Guo Zongming, Zhang Zhizhou, Pan Yuxi, Huang Zhende, Feng Guoyin, He Lin. Predicting membrane protein types using support vector machines [J]. Journal of Shanghai Jiaotong University, 2004, 38(5): 806-809. (cited by: 6)
  • 5. Raskutti B, Kowalczyk A. Extreme re-balancing for SVM's: A case study [A]. Proceedings of the ICML'03 Workshop on Learning from Imbalanced Data Sets [C]. NY: ACM Press, 2004. 60-69.
  • 6. An A J, Cercone N, Huang X J. A case study for learning from imbalanced data sets [A]. Advances in Artificial Intelligence: Proceedings of the 14th Biennial Conference of the Canadian Society for Computational Studies of Intelligence [C]. London: Springer-Verlag, 2001. 1-15.
  • 7. Chou K C, Elrod D W. Prediction of membrane protein types and subcellular locations [J]. Proteins: Structure, Function, and Genetics, 1999, 34(1): 137-153.
  • 8. Chou K C. Prediction of protein cellular attributes using pseudo-amino acid composition [J]. Proteins: Structure, Function, and Genetics, 2001, 43(3): 246-255.
  • 9. Chou K C. A novel approach to predicting structural classes in a (20-1)-D amino acid composition space [J]. Proteins: Structure, Function, and Genetics, 1995, 21(4): 319-344.
  • 10. Hsu C W, Lin C J. A comparison of methods for multi-class support vector machines [J]. IEEE Transactions on Neural Networks, 2002, 13: 415-425.

