Analysis of DNA Sequences Using Feature Correlation of Probability Density Function (article in English)

Chinese title: 概率密度函数的特征相关法DNA序列分析

Abstract: An unbiased method of feature extraction and classification is proposed for the structural analysis of DNA sequences. Statistical and correlation features are extracted from the base codes and fragment codes of raw DNA sequence data, and the mean correlation coefficient between a sample DNA sequence and each known class is calculated. If the maximal mean correlation exceeds the mean correlation of the corresponding existing class, the sample is grouped into that class. Otherwise, it is grouped into a new class. Experiments on a set of sample DNA sequences show that the method is suitable for analysis of any DNA sequence data without a priori knowledge of functional information. Such an approach should be useful in discovering conserved sequence elements in the human genome.
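The decision rule in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the features or the correlation measure, so everything concrete here is an assumption made for illustration: features are taken to be relative frequencies of length-k fragments (`kmer_profile`), correlation is the Pearson coefficient, the "mean correlation of an existing class" is read as the mean pairwise correlation among its members, and a singleton class is given a threshold of 1.0. All function names are hypothetical and are not taken from the paper.

```python
from itertools import combinations, product
from math import sqrt

BASES = "ACGT"

def kmer_profile(seq, k=2):
    """Relative frequency of each length-k fragment (assumed feature vector)."""
    kmers = ["".join(p) for p in product(BASES, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        frag = seq[i:i + k]
        if frag in counts:          # fragments with non-ACGT symbols are skipped
            counts[frag] += 1
    total = sum(counts.values()) or 1
    return [counts[m] / total for m in kmers]

def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def mean_corr_to_class(profile, members):
    """Mean correlation between one sample profile and all members of a class."""
    return sum(pearson(profile, m) for m in members) / len(members)

def within_class_mean_corr(members):
    """Mean pairwise correlation inside a class (assumed acceptance threshold)."""
    pairs = list(combinations(members, 2))
    if not pairs:
        return 1.0                  # assumption: a singleton class has threshold 1.0
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)

def classify(seq, classes, k=2):
    """Return the name of the best-matching class, or None to signal a new class."""
    profile = kmer_profile(seq, k)
    profiles = {name: [kmer_profile(s, k) for s in seqs]
                for name, seqs in classes.items()}
    scores = {name: mean_corr_to_class(profile, ms)
              for name, ms in profiles.items()}
    best = max(scores, key=scores.get)
    if scores[best] > within_class_mean_corr(profiles[best]):
        return best
    return None

# Toy data, invented for illustration only.
example_classes = {
    "AT_rich": ["ATATATATAT", "TATATATATA", "AATTAATTAA"],
    "GC_rich": ["GCGCGCGCGC", "CGCGCGCGCG", "GGCCGGCCGG"],
}
print(classify("ATATATATATAT", example_classes))
print(classify("ACGTACGTACGT", example_classes))
```

Returning `None` rather than creating the class inside `classify` leaves the choice of how to name and seed a new class to the caller; the paper itself may handle this step differently.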

Authors: 罗代升 (Luo Daisheng), 罗辑 (Luo Ji), 谢明 (Xie Ming), 吴晓红 (Wu Xiaohong), 余艳梅 (Yu Yanmei)

Affiliation: College of Electronics and Information Engineering, Sichuan University (四川大学电子信息学院)

Source: Journal of Sichuan University (Natural Science Edition) (《四川大学学报（自然科学版）》), 2006, No. 2, pp. 334-340 (7 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list (北大核心).

Keywords: bioinformatics; DNA sequence analysis; feature extraction; pattern classification

Classification code: TP392 [Automation and Computer Technology: Computer Application Technology]
