Funding: This work has been supported by the National Key Research and Development Program of China (2021YFF1200105) and the National Natural Science Foundation of China (62172125, 62371161).
Abstract: Soybean (Glycine max) is a globally significant agricultural crop, and a complete assembly of its genome is essential for understanding its biological characteristics and evolutionary history. Previous soybean genome assemblies, however, contained gaps and incomplete regions that have constrained in-depth investigation of the species. Here, we present a telomere-to-telomere (T2T) assembly of the genome of the Chinese soybean cultivar Zhonghuang 13 (ZH13), termed ZH13-T2T, built from PacBio HiFi and ONT ultralong reads. We employed a multi-assembler approach, integrating Hifiasm, NextDenovo, and Canu, to minimize biases and enhance assembly accuracy. The assembly spans 1,015,024,879 bp and resolves all 393 gaps present in the previous reference genome. Our annotation identified 50,564 high-confidence protein-coding genes, 707 of which are novel. Compared with earlier assemblies, ZH13-T2T revealed longer chromosomes, 421 not-aligned regions (NARs), 112 structural variations (SVs), and a substantial expansion of repetitive elements. Specifically, we identified 25.67 Mb of tandem repeats and an enrichment of 5S and 48S rDNAs, and characterized their genotypic diversity. In summary, we deliver the first complete T2T genome of a Chinese soybean cultivar. The comprehensive annotation, together with precise centromere and telomere characterization and insights into structural variations, further enhances our understanding of soybean genetics and evolution.
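The claim that all 393 gaps were resolved can be checked mechanically on any assembly. The sketch below is illustrative only and is not the authors' pipeline: it assumes the common convention that gaps are written as runs of `N` in the FASTA sequence, and the names `read_fasta`, `count_gaps`, and the toy records are hypothetical.

```python
import re


def read_fasta(text):
    """Parse FASTA-formatted text into {record name: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token as the record name.
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def count_gaps(text, min_run=1):
    """Count runs of N (assumed gap placeholders) of at least min_run bases."""
    pattern = re.compile(r"[Nn]{%d,}" % min_run)
    return {n: len(pattern.findall(seq)) for n, seq in read_fasta(text).items()}


# Toy input: chr1 carries two N-runs (one spans a line wrap), chr2 is gap-free.
toy = ">chr1\nACGTNNNNACGT\nNNAC\n>chr2\nACGTACGT\n"
gap_counts = count_gaps(toy)
```

Under this convention, a gap-free assembly would report zero for every chromosome; the sequences are joined before matching so that a gap split across wrapped lines is counted once.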
Funding: Supported by the Scientific Research Project of Kweichow Moutai Liquor Co., Ltd. (MTGF2023007), the National Natural Science Foundation of China (32160459, 32172036), the Guizhou Natural Science Foundation of China (QKHJC[2023]YB169), the Innovation Capacity Building Project of Guizhou Scientific Institutions (QKFQ[2022]007]), and the Guizhou Academy of Agricultural Sciences Project (Guizhou Agricultural Germplasm Resources (2023) 06).
Abstract: Sorghum (Sorghum bicolor (L.) Moench) is a cereal crop grown worldwide and is used in China to produce Baijiu, a distilled spirit. We report a telomere-to-telomere genome assembly of the Baijiu cultivar Hongyingzi, HYZ-T2T, generated using ultralong reads. The 10 chromosome pairs contained 33,462 genes, of which 93% were functionally annotated. The 20 telomeres and 10 centromeric regions on the HYZ-T2T chromosomes were predicted, and two consecutive large inversions on chromosome 2 were characterized. A 65-gene reconstruction of the metabolic pathway of tannins, the flavor substances in Baijiu, was performed and may advance the breeding of sorghum cultivars for Baijiu production.
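Telomere prediction of the kind mentioned above is often approximated by counting telomeric repeat units near each chromosome end. The following is a minimal sketch, not the method used for HYZ-T2T: it assumes the Arabidopsis-type plant telomeric repeat TTTAGGG, and the function names, window size, and toy sequence are hypothetical.

```python
# Translation table for complementing uppercase DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_telomere_repeats(seq, window=1000, motif="TTTAGGG"):
    """Count telomeric repeat units in the first and last `window` bases.

    The 5' end is searched for the reverse complement of the motif
    (CCCTAAA for TTTAGGG) and the 3' end for the motif itself.
    """
    seq = seq.upper()
    head = seq[:window]
    tail = seq[-window:]
    return head.count(reverse_complement(motif)), tail.count(motif)


# Toy chromosome: 5 units at the 5' end, filler sequence, 4 units at the 3' end.
toy_chromosome = "CCCTAAA" * 5 + "ACGT" * 10 + "TTTAGGG" * 4
five_prime, three_prime = count_telomere_repeats(toy_chromosome, window=50)
```

A chromosome would be called telomere-to-telomere under this heuristic only if both counts exceed some chosen threshold; the threshold and window are judgment calls that real pipelines tune per species.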
Funding: This work was supported by the Strategic Priority Research Program of the Chinese Academy of Sciences (XDA28030000), the National Key Research and Development Program of China (2022YFD1500503, 2022YFF1003401), the Science & Technology Specific Projects in Agricultural High-tech Industrial Demonstration Area of the Yellow River Delta (2022SZX14), the earmarked fund for CARS-Green Manure (CARS-22), and the Youth Innovation Promotion Association of CAS (Y2022039).
Abstract: Alkaline soils pose an increasing problem for agriculture worldwide, but using stress-tolerant plants as green manure can improve marginal land. Here, we show that the legume Sesbania cannabina is highly tolerant to alkaline conditions and, when used as a green manure, substantially improves alkaline soil. To understand genome evolution and the mechanisms of stress tolerance in this allotetraploid legume, we generated the first telomere-to-telomere genome assembly of S. cannabina, spanning ~2,087 Mb. The assembly included all centromeric regions, which contain centromeric satellite repeats, and complete chromosome ends with telomeric characteristics. Further genome analysis distinguished A and B subgenomes, which diverged approximately 7.9 million years ago. Comparative genomic analysis revealed that the chromosome homoeologs underwent large-scale inversion events (>10 Mb) and a significant, transposon-driven size expansion of the chromosome 5A homoeolog. We further identified four specific alkali-induced phosphate transporter genes in S. cannabina; these may function in alkali tolerance by relieving the deficiency in available phosphorus in alkaline soil. Our work highlights the significance of S. cannabina as a green tool to improve marginal lands and sheds light on subgenome evolution and adaptation to alkaline soils.
Funding: Supported by the Science and Technology Research Project of Henan (Grant No. 232102311003) and the National Natural Science Foundation of China (Grant No. U1804282).
Abstract: Since its initial release in 2001, the human reference genome has undergone continuous improvement in quality, and the recently released telomere-to-telomere (T2T) version, T2T-CHM13, reaches its highest level of continuity and accuracy after 20 years of effort by working on a simplified, nearly homozygous genome of a hydatidiform mole cell line. Here, to provide an authentic complete diploid human genome reference for the Han Chinese, the largest population in the world, we assembled the genome of a male Han Chinese individual, T2T-YAO, which includes T2T assemblies of all the 22+X+M and 22+Y chromosomes in both haploids. The quality of T2T-YAO is much better than that of all currently available diploid assemblies, and its haploid version, T2T-YAO-hp, generated by selecting the better assembly for each autosome, reaches the top quality of fewer than one error per 29.5 Mb, even higher than that of T2T-CHM13. Derived from an individual living in the aboriginal region of the Han population, T2T-YAO shows clear ancestry and potential genetic continuity from the ancient ancestors. Each haplotype of T2T-YAO possesses 330-Mb exclusive sequences, 3100 unique genes, and tens of thousands of nucleotide and structural variations as compared with CHM13, highlighting the necessity of a population-stratified reference genome. The construction of T2T-YAO, an accurate and authentic representative of the Chinese population, would enable precise delineation of genomic variations and advance our understanding of the heritability of diseases and phenotypes, especially within the context of the unique variations of the Chinese population.
Funding: Supported by the National Key R&D Program of China (Grant No. 2022YFC3400300) and the National Natural Science Foundation of China (Grant Nos. 32125009 and 32070663).
Abstract: The high-fidelity (HiFi) long-read sequencing technology developed by PacBio has greatly improved the base-level accuracy of genome assemblies. However, these assemblies still contain base-level errors, particularly within the error-prone regions of HiFi long reads. Existing genome polishing tools usually introduce overcorrections and haplotype switch errors when correcting errors in genomes assembled from HiFi long reads. Here, we describe an upgraded genome polishing tool, NextPolish2, which can fix base errors remaining in those “highly accurate” genomes assembled from HiFi long reads without introducing excessive overcorrections and haplotype switch errors. We believe that NextPolish2 is of great significance for further improving the accuracy of telomere-to-telomere (T2T) genomes. NextPolish2 is freely available at https://github.com/Nextomics/NextPolish2.
Funding: Supported by funding for Clinical Trials from the Affiliated Drum Tower Hospital, Medical School of Nanjing University (2022-LCYJ-MS-06).
Abstract: Objective: To evaluate the performance of optical genome mapping (OGM) in identifying an inversion located in the short arm of chromosome 8 (8p, 8p23.1), flanked by regions of complex segmental duplication (SD), using the GRCh38 and telomere-to-telomere (T2T) genome references. Methods: We investigated a couple suspected of carrying the 8p23.1 inversion due to a terminal deletion combined with an interstitial duplication of 8p found in their abortus. OGM was performed on both individuals. The data were mapped to the current GRCh38 and the updated T2T genome references, respectively. Results: The 8p23.1 inversion was observed in the female when mapping OGM data to the T2T assembly. In contrast, under the GRCh38 reference, the orientation between the suspected breakpoints within the SD regions could not be distinguished. Additional variants of uncertain significance were also identified in both individuals. Conclusion: Our findings highlight the superiority of the T2T reference in recognizing structural variations involving SD regions. The enhanced SV detection using the T2T reference may contribute to a better understanding of genome instability and human diseases.
Funding: Supported by the Provincial Technology Innovation Program of Shandong, an award from the Natural Science Foundation of Shandong Province (ZR2021ZD30), the Director’s Award from the Peking University Institute of Advanced Agricultural Sciences, the National Top Young Talents Program of China, the Boya Postdoctoral Program of Peking University, the National Key R&D Program of China (2019YFD1000200), and the Youth Innovation Promotion Association CAS (2018376).
Abstract: Kiwifruit is a recently domesticated horticultural fruit crop with substantial economic and nutritional value, especially because of the high content of vitamin C in its fruit. In this study, we de novo assembled two telomere-to-telomere kiwifruit genomes from Actinidia chinensis var. ‘Donghong’ (DH) and Actinidia latifolia ‘Kuoye’ (KY), with total lengths of 608,327,852 and 640,561,626 bp for 29 chromosomes, respectively. With a burst of structural variants involving inversions, translocations, and duplications within 8.39 million years, the metabolite content of DH and KY exhibited differences in saccharides, lignans, and vitamins. A regulatory ERF098 transcription factor family has expanded in KY and Actinidia eriantha, both of which have ultra-high vitamin C content. With each assembly phased into two complete haplotypes, we identified allelic variations between the two sets of haplotypes, leading to protein sequence variations in 26,494 and 27,773 gene loci and allele-specific expression of 4,687 and 12,238 homozygous gene pairs. Synchronized metabolome and transcriptome changes during DH fruit development revealed the same dynamic patterns in expression levels and metabolite contents; free fatty acids and flavonols accumulated in the early stages, whereas sugars and amino acids accumulated in the late stages. The AcSWEET9b gene, which exhibits allelic dominance, was further found to correlate positively with high sucrose content in fruit. Compared with wild varieties and other Actinidia species, AcSWEET9b promoters were selected in red-flesh kiwifruits that have increased fruit sucrose content, providing a possible explanation for why red-flesh kiwifruits are sweeter. Collectively, these two gap-free kiwifruit genomes provide a valuable genetic resource for investigating domestication mechanisms and genome-based breeding of kiwifruit.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 32222019) and the National Key R&D Program of China (Grant No. 2021YFF1000900).
Abstract: Over the past 20 years, tremendous advances in sequencing technologies and computational algorithms have spurred plant genomic research into a thriving era, with hundreds of genomes already decoded, ranging from those of nonvascular plants to those of flowering plants. However, complex plant genome assembly is still challenging and remains difficult to fully resolve with conventional sequencing and assembly methods because of the high heterozygosity, highly repetitive sequences, or high ploidy of complex genomes. Herein, we summarize the challenges of and advances in complex plant genome assembly, including feasible experimental strategies, upgrades to sequencing technology, existing assembly methods, and different phasing algorithms. Moreover, we list actual cases of complex genome projects for readers to refer to and draw upon to solve future problems related to complex genomes. Finally, we expect that the accurate, gapless, telomere-to-telomere, and fully phased assembly of complex plant genomes could soon become routine.
Funding: Open access funding provided by Shanghai Jiao Tong University. Supported by funds from the National Natural Science Foundation of China (31972474, 31471157).
Abstract: Actinidia eriantha is a characteristic fruit tree with great potential owing to its abundant vitamin C and strong disease resistance. It has been used in a wide range of breeding programs and functional genomics studies. Previously published genome assemblies of A. eriantha are quite fragmented and not highly contiguous. Using multiple sequencing strategies, we obtained haplotype-resolved and gap-free genomes of an elite breeding line, “Midao 31” (MD), termed MDHAPA and MDHAPB. The new assemblies were anchored to 29 pseudochromosome pairs with lengths of 619.3 Mb and 611.7 Mb, and resolved 27 and 28 gap-closed chromosomes, respectively, in a telomere-to-telomere (T2T) manner. Based on the haplotype-resolved genome, we found that most alleles experienced purifying selection and were coordinately expressed. Owing to the high continuity of the assemblies, we defined the centromeric regions of A. eriantha and identified the major repeating monomer, which is designated Ae-CEN153. This resource lays a solid foundation for further functional genomics studies and the improvement of horticultural traits in kiwifruit.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 62172325 and 32070663), the China Postdoctoral Science Foundation (Grant No. 2020M673420), the Fundamental Research Funds for the Central Universities, China, the World-Class Universities (Disciplines), and the Characteristic Development Guidance Funds for the Central Universities, China.
Abstract: Arabidopsis thaliana is an important and long-established model species for plant molecular biology, genetics, epigenetics, and genomics. However, the latest version of the reference genome still contains a significant number of missing segments. Here, we report a high-quality and almost complete Col-0 genome assembly with two gaps (named Col-XJTU), produced by combining Oxford Nanopore Technologies ultra-long reads, Pacific Biosciences high-fidelity long reads, and Hi-C data. The total genome assembly size is 133,725,193 bp, introducing 14.6 Mb of novel sequences compared to the TAIR10.1 reference genome. All five chromosomes of the Col-XJTU assembly are highly accurate, with consensus quality (QV) scores >60 (ranging from 62 to 68), which are higher than those of the TAIR10.1 reference (ranging from 45 to 52). We completely resolved chromosome (Chr) 3 and Chr5 in a telomere-to-telomere manner. Chr4 was completely resolved except for the nucleolar organizing regions, which comprise long repetitive DNA fragments. The Chr1 centromere (CEN1), reportedly around 9 Mb in length, is particularly challenging to assemble because of the presence of tens of thousands of CEN180 satellite repeats. Using cutting-edge sequencing data and novel computational approaches, we assembled a 3.8-Mb-long CEN1 and a 3.5-Mb-long CEN2. We also investigated the structure and epigenetics of centromeres. Four clusters of CEN180 monomers were detected, and the centromere-specific histone H3-like protein (CENH3) exhibited a strong preference for CEN180 Cluster 3. Moreover, we observed hypomethylation patterns in CENH3-enriched regions. We believe that this high-quality genome assembly, Col-XJTU, would serve as a valuable reference to better understand the global pattern of centromeric polymorphisms, as well as the genetic and epigenetic features in plants.
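The QV figures quoted above are easier to compare once converted to per-base error rates. The sketch below assumes the usual Phred-scaled definition, QV = -10 * log10(p), where p is the per-base error probability; the abstract does not state how its QV values were computed, and the function names are hypothetical.

```python
import math


def qv_to_error_rate(qv):
    """Per-base error probability implied by a Phred-scaled QV."""
    return 10 ** (-qv / 10)


def error_rate_to_qv(rate):
    """Phred-scaled QV implied by a per-base error probability."""
    return -10 * math.log10(rate)


# QV 60 corresponds to about one error per million bases,
# while QV 45 corresponds to about one error per 31,623 bases.
bases_per_error_qv60 = 1 / qv_to_error_rate(60)
bases_per_error_qv45 = 1 / qv_to_error_rate(45)
```

On this scale every 10 QV points is a tenfold change in error rate, so the gap between the Col-XJTU range (62 to 68) and the TAIR10.1 range (45 to 52) amounts to at least a tenfold difference in estimated base errors.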