
Feature extraction of DNA sequences based on base composition and distribution, and its applications (cited by: 1)
Abstract: Mining latent patterns in biological data through feature extraction is one of the basic problems in bioinformatics. A 24-dimensional feature vector is constructed from three kinds of DNA sequence features: base transition probabilities, base contents, and base position ratios. The vector is applied to similarity comparison of the complete coding sequences of the β-globin genes of 11 species and of the whole mitochondrial genomes of 18 eutherian mammals, and the resulting phylogenetic trees agree with the known evolutionary relationships. Combined with a support vector machine classifier, the same feature vector is used to identify essential genes in 28 bacterial strains, with an average AUC of 0.808, which is higher than that of some other identification methods. The results indicate that the transition probabilities, contents, and position ratios of the basic building blocks of a biological sequence can serve as alternative tools for related classification problems in bioinformatics.
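Authors: LI Yushuang; WEI Dong; LU Yanfen (School of Sciences, Yanshan University, Qinhuangdao, Hebei 066004, China)
Affiliation: School of Sciences, Yanshan University
Source: Journal of Yanshan University (indexed in CAS; Peking University Core Journal), 2018, No. 1, pp. 59-66, 74 (9 pages in total)
Funding: Young Top-Notch Talent Program for Higher Education Institutions of Hebei Province (BJ2014060); Yanshan University "Xinrui Project" Talent Support Program
Keywords: transition probability; feature vector; phylogenetic tree; essential gene; support vector machine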
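
The abstract names the three feature groups but not their formulas, so the following is only a minimal sketch of how a 24-dimensional vector of this kind could be assembled. The function name `feature_vector` is hypothetical, and all three definitions are assumptions rather than the authors' method: transition probabilities are estimated from adjacent base pairs, contents are plain base frequencies, and the position ratio of a base is taken to be the sum of its 1-based positions divided by the sum of all positions.

```python
from itertools import product

BASES = "ACGT"


def feature_vector(seq):
    """Return a 24-D list: 16 transition probabilities, 4 contents, 4 position ratios.

    All three definitions are illustrative assumptions, not taken from the paper.
    """
    # Keep only A/C/G/T; dropping other symbols makes their neighbours adjacent.
    seq = [b for b in seq.upper() if b in BASES]
    n = len(seq)
    if n < 2:
        raise ValueError("need at least two unambiguous bases")

    # 16 transition probabilities P(b | a), estimated from adjacent pairs (a, b).
    pair_counts = {a + b: 0 for a, b in product(BASES, repeat=2)}
    for a, b in zip(seq, seq[1:]):
        pair_counts[a + b] += 1
    from_counts = {a: sum(pair_counts[a + b] for b in BASES) for a in BASES}
    transitions = [
        pair_counts[a + b] / from_counts[a] if from_counts[a] else 0.0
        for a, b in product(BASES, repeat=2)
    ]

    # 4 base contents: the fraction of the sequence occupied by each base.
    contents = [seq.count(b) / n for b in BASES]

    # 4 position ratios (assumed definition): sum of the 1-based positions of
    # each base divided by the sum of all positions, n(n+1)/2.
    total_positions = n * (n + 1) / 2
    position_ratios = [
        sum(i for i, s in enumerate(seq, start=1) if s == b) / total_positions
        for b in BASES
    ]

    return transitions + contents + position_ratios
```

Under these assumptions, two sequences could be compared by a distance (for example Euclidean) between their vectors, and the same vectors could be passed to a classifier such as a support vector machine.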
