A Statistical Approach Designed for Finding Mathematically Defined Repeats in Shotgun Data and Determining the Length Distribution of Clone-Inserts (cited by: 1)
Authors: Lan Zhong, Kunlin Zhang, Xiangang Huang, Peixiang Ni, Yujun Han, Kai Wang, Jun Wang, Songgang Li. Genomics, Proteomics & Bioinformatics (SCIE, CAS, CSCD), 2003, Issue 1, pp. 43-51 (9 pages)
The large number of repeats, especially high-copy repeats, in the genomes of higher animals and plants makes whole genome assembly (WGA) quite difficult. To address this problem, we tried to identify repeats and mask them prior to assembly, even at the genome survey stage. Repeats of different copy number have different probabilities of appearing in shotgun data; based on this principle, we constructed a statistical model and inferred criteria for mathematically defined repeats (MDRs) at different shotgun coverages. According to these criteria, we developed the software MDRmasker to identify and mask MDRs in shotgun data. With repeats masked prior to assembly, assembly was faster and had a lower error probability. In addition, clone-insert size affects the accuracy of repeat assembly and scaffold construction, so we also used our model to design the length distribution of clone-inserts. In our simulated human and rice genomes, the length distributions of repeats differ, so the optimal length distributions of clone-inserts were not the same. Thus, with an optimal length distribution of clone-inserts, a given genome could be assembled better at lower coverage.
Keywords: mathematically defined repeat (MDR); clone-inserts; assembly
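The abstract states the principle (repeats of different copy number appear in shotgun data with different probabilities) but does not give the model itself. The following is only a minimal sketch of how such a coverage-dependent criterion might look. It assumes, purely for illustration, that occurrences of a single-copy k-mer follow a Poisson distribution and that repeats are hard-masked with `N`. The function names, the k-mer formulation, and the `alpha` cutoff are hypothetical and are not taken from MDRmasker.

```python
import math
from collections import Counter


def poisson_tail(lam, n):
    """Return P(X >= n) for X ~ Poisson(lam)."""
    cdf = sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(n))
    return max(0.0, 1.0 - cdf)


def repeat_threshold(coverage, read_len, k, alpha=1e-3):
    """Smallest count that a single-copy k-mer reaches with probability < alpha.

    Assumption: a unique k-mer is seen about coverage * (read_len - k + 1) / read_len
    times on average, and the count is Poisson distributed.
    """
    lam = coverage * (read_len - k + 1) / read_len
    n = 1
    while poisson_tail(lam, n) >= alpha:
        n += 1
    return n


def mask_repeats(reads, k, threshold):
    """Replace every k-mer seen at least `threshold` times with N characters."""
    counts = Counter(r[i:i + k] for r in reads for i in range(len(r) - k + 1))
    masked = []
    for r in reads:
        chars = list(r)
        for i in range(len(r) - k + 1):
            # Look up the k-mer in the original read, not the partly masked copy.
            if counts[r[i:i + k]] >= threshold:
                chars[i:i + k] = "N" * k
        masked.append("".join(chars))
    return masked
```

Under these assumptions the threshold rises with coverage, which mirrors the abstract's point that the criteria for MDRs depend on shotgun coverage. A typical call would compute `repeat_threshold(coverage, read_len, k)` once and pass the result to `mask_repeats`.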