Genome Sequence Compression Algorithm Based on the Hilbert Fractal (cited by: 2)

The Genome Sequence Compression Algorithm Based on the Hilbert Grouping

Abstract: This paper presents a genome sequence compression algorithm based on the Hilbert fractal. To make full use of the correlations among bases, the algorithm first uses a Hilbert fractal curve to map the genome sequence from one dimension to two, producing a mapped image. The mapped image is then compressed with entropy coding driven by Context weighting modeling, in which each weight is determined by the description length of the corresponding Context model. After receiving the compressed image, the receiver decodes it and uses the quasi-Hilbert inverse matrix to convert the mapped image back to one dimension, recovering the genome sequence. Experimental results show that, although two-dimensional genome Context modeling based on Hilbert space filling introduces invalid coding regions, the final compression results are slightly better than those of other algorithms that apply Context modeling directly to the sequence.
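
The one-dimension-to-two-dimension mapping described in the abstract can be illustrated with a minimal Python sketch. It uses the standard Hilbert index-to-coordinate conversion, not necessarily the quasi-Hilbert construction the paper relies on. The function names (`d2xy`, `sequence_to_image`, `image_to_sequence`) and the padding symbol `N` are assumptions made for this illustration only. The padded cells stand in for the invalid coding regions the abstract mentions, which appear whenever the sequence length does not fill the square grid exactly.

```python
def d2xy(order, d):
    """Map position d along a Hilbert curve on a 2**order x 2**order grid to (x, y)."""
    x = y = 0
    t = d
    s = 1
    n = 1 << order
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate or flip the current quadrant so sub-curves join end to end.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def sequence_to_image(seq, pad="N"):
    """Lay a base sequence along a Hilbert curve; unused cells keep the pad symbol."""
    order = 0
    while (1 << order) ** 2 < len(seq):
        order += 1
    n = 1 << order
    grid = [[pad] * n for _ in range(n)]
    for d, base in enumerate(seq):
        x, y = d2xy(order, d)
        grid[y][x] = base
    return grid, order


def image_to_sequence(grid, order, length):
    """Inverse mapping: read the first `length` cells back in curve order."""
    cells = (d2xy(order, d) for d in range(length))
    return "".join(grid[y][x] for x, y in cells)


if __name__ == "__main__":
    seq = "ACGTTGCAAC"
    grid, order = sequence_to_image(seq)
    for row in grid:
        print("".join(row))
    # The round trip recovers the original sequence.
    assert image_to_sequence(grid, order, len(seq)) == seq
```

Because consecutive positions on the curve are always adjacent grid cells, bases that are close in the sequence stay close in the image, which is the property a two-dimensional Context model would exploit. The decoder in this sketch needs the original sequence length to skip the padded cells.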

Authors: Chen Min (陈旻), Wang Kaiyun (王开云), Wu Jianguo (吴建国), Li Jianjun (李建军)

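Affiliations: School of Information, Yunnan University; Editorial Department of the Journal of Kunming University; School of Information Network Security, Yunnan Police College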

Source: Journal of Kunming University (《昆明学院学报》), 2014, No. 6, pp. 42-46, 65 (6 pages in total)

Funding: Yunnan Provincial Natural Science Foundation, Youth Fund project (2013FD042); Yunnan University Graduate Key Research Fund project (ynuy201383)

Keywords: genome sequence compression; Hilbert space filling; Context weighting; description length

Classification number: TP919.1 [Automation and Computer Technology]
