Cla_Factor: An Approach to Human Transcription Factors Classification with Support Vector Machine

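Abstract: Identifying transcription factors is important for understanding transcription mechanisms. Transcription factors can be divided into four major classes according to the structure of their DNA-binding domains. As new protein sequences accumulate rapidly in databases, it is important to design a high-throughput, highly accurate classifier that predicts whether a new protein is a transcription factor and, if so, which class it belongs to. This paper proposes Cla_Factor, a support vector machine based algorithm for classifying human transcription factors. Cla_Factor uses protein domains as the vector basis for representing protein sequences, and applies a support vector machine to this high-dimensional vector representation to classify human transcription factors. Cross-validation tests and generalization tests on data from Transfac and Swiss_Prot show that Cla_Factor achieves higher accuracy, sensitivity, specificity, and generalization ability than other algorithms.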
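The representation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the protein names, the Pfam-style domain names, the labels, and all hyperparameters below are hypothetical; the task is reduced to a binary decision (transcription factor or not) rather than the four-class problem; and the linear SVM is trained by plain subgradient descent on the hinge loss, which stands in for whatever solver and kernel the paper actually used.

```python
# Hypothetical toy data: each protein is described by the set of protein
# domains found in its sequence. Names and labels are illustrative only.
PROTEINS = {
    "tf_a": ({"zf-C2H2", "KRAB"}, 1),
    "tf_b": ({"HLH"}, 1),
    "tf_c": ({"Homeobox"}, 1),
    "tf_d": ({"zf-C2H2"}, 1),
    "non_a": ({"Pkinase"}, -1),
    "non_b": ({"Pkinase", "SH2"}, -1),
    "non_c": ({"Trypsin"}, -1),
    "non_d": ({"SH2", "SH3_1"}, -1),
}


def build_vocabulary(domain_sets):
    """Assign one vector dimension to every domain seen in the data."""
    return {d: i for i, d in enumerate(sorted(set().union(*domain_sets)))}


def encode(domains, vocab):
    """Binary vector: 1.0 in each dimension whose domain the protein has."""
    vec = [0.0] * len(vocab)
    for d in domains:
        if d in vocab:
            vec[vocab[d]] = 1.0
    return vec


def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Minimise hinge loss plus an L2 penalty by subgradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            violated = margin < 1  # the hinge term is active only here
            for j in range(len(w)):
                grad = lam * w[j] - (yi * xi[j] if violated else 0.0)
                w[j] -= lr * grad
            if violated:
                b += lr * yi
    return w, b


def predict(x, w, b):
    """Return +1 (transcription factor) or -1 (other protein)."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


domain_sets = [domains for domains, _ in PROTEINS.values()]
vocab = build_vocabulary(domain_sets)
X = [encode(domains, vocab) for domains in domain_sets]
y = [label for _, label in PROTEINS.values()]
w, b = train_linear_svm(X, y)

query = encode({"zf-C2H2", "HLH"}, vocab)  # unseen combination of domains
print("predicted label:", predict(query, w, b))
```

With a real domain database the vocabulary would contain thousands of domains, which is why the abstract calls the representation high-dimensional; each protein's vector stays sparse because a protein contains only a few domains.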

Authors: Zhou Qiang (周强), Chen Yue (陈越), Xiong Yun (熊赟), Zhu Yangyong (朱扬勇)

Affiliation: Department of Computing and Information Technology, Fudan University

Source: Journal of Computer Research and Development (计算机研究与发展; indexed in EI, CSCD, and the Peking University Core Journals list), 2007, Issue z3, pp. 279-283 (6 pages in total)

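Funding: National High Technology Research and Development Program of China (863 Program), grant 2006AA02Z329; National Natural Science Foundation of China, grant 60573093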

Keywords: transcription factor; support vector machine; protein functional domain; classification

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]
