Abstract: Introduction: Genome sequence plays an important role in both basic and applied studies. Gossypium raimondii, the putative contributor of the D subgenome of upland cotton (G. hirsutum), highlights the need to improve genome quality rapidly and efficiently. Methods: We performed Hi-C sequencing of G. raimondii and reassembled its genome from a new set of Hi-C data and previously published scaffolds. We also compared the reassembled genome sequence with the previously published G. raimondii genomes for gene and genome sequence collinearity. Results: A total of 98.42% of scaffold sequences were clustered successfully; 99.72% of the clustered sequences were ordered, and 99.92% of the ordered sequences were oriented with high quality. Further evaluation by heat map and collinearity analysis showed that the reassembled genome is a significant improvement over the previous one (Nat Genet 44:1098-1103, 2012). Conclusion: The improved G. raimondii genome not only provides a better reference to increase study efficiency but also offers a new way to assemble cotton genomes. Furthermore, the Hi-C data of G. raimondii may be used for 3D structure research or regulatory analysis.
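The three percentages in the Results are nested: each is a share of the subset before it, not of all scaffold sequence. A minimal sketch of how they combine follows. It reads the clustered share as 98.42% (an assumption about where the decimal point sits), and the helper name `nested_fraction` is hypothetical, not something the authors describe.

```python
def nested_fraction(*fractions):
    """Multiply nested shares: each value is a fraction of the previous subset."""
    result = 1.0
    for f in fractions:
        result *= f
    return result


# Values taken from the abstract, expressed as fractions.
clustered = 0.9842  # share of scaffold sequence clustered (assumed reading: 98.42%)
ordered = 0.9972    # share of the clustered sequence that was ordered
oriented = 0.9992   # share of the ordered sequence that was oriented

# Share of all scaffold sequence that ended up clustered, ordered and oriented.
overall = nested_fraction(clustered, ordered, oriented)
print(f"{overall:.2%}")  # prints 98.07%
```

Under that reading, roughly 98% of the scaffold sequence is clustered, ordered and oriented in the reassembly.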
Abstract: Cotton, in the genus Gossypium, comprises five tetraploid (2n = 4x = 52) and 45 diploid (2n = 2x = 26) species, which are believed to have originated from a common ancestor 5-10 million years ago (MYA). Upland cotton (G. hirsutum, AADD, 2n = 4x = 52), which accounts for over 90% of the world's cotton lint production, is thought to have undergone an allopolyploidization event about 1-2 MYA involving both A and D genome species (Wendel and Albert 1992). The progenitor of G. raimondii (DD, 2n = 2x = 26) is considered the contributor of the D subgenome, while ancestors of G. arboreum (AA, 2n = 2x = 26) may have contributed the A subgenome to G. hirsutum (Sunilkumar et al. 2006; Chen et al. 2007).
Funding: Supported in part by the National Natural Science Foundation of China (Grant Nos. 30471104 and 30270806); the Programs for Changjiang Scholars and Innovative Research Team in University and for New Century Excellent Talents of the Ministry of Education (Grant No. NCET-04-0500); the Program for Excellent Talents in Jiangsu Province (Grant No. BK2003414); and the Jiangsu High-Tech Project (Grant No. BG2004305).
Abstract: Microsatellite DNA, or simple sequence repeats (SSRs), can be derived from expressed sequence tags (ESTs). These markers are important for gene mapping as well as marker-assisted selection (MAS). To develop EST-SSRs for cotton gene mapping, we selected and characterized functional markers in Gossypium raimondii from 58906 non-redundant EST sequences obtained from NCBI. Among them, 2620 microsatellite sequences contained 2818 EST-SSRs; these sequences amounted to 4.45% of the non-redundant starting sequence population. This incidence was equivalent to one EST-SSR in every 14.8 kb of G. raimondii genetic material. Among the motifs ranging from 1 to 6 bp, trinucleotide repeats were the most abundant (38.31%), followed by dinucleotide repeats (24.09%) and mononucleotide repeats (23.35%). Among all identified motif types, A/T had the highest frequency (18.67%), followed by AT/TA (14.83%). Among the compound motifs, tandem trinucleotides occurred with the highest frequency (48.65%). In all, we identified 1554 EST-SSR primer pair sequences. Of these, 300 were randomly selected to screen for polymorphisms between the mapping parents G. hirsutum acc. TM-1 and G. barbadense cv. Hai7124, in order to construct linkage groups in cultivated allotetraploid cotton. Among them, 129 (43%) primer pairs were polymorphic. With these EST-SSRs, EST-SSR distributions can be compared among different cotton species and across chromosomal locations.
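As an illustration of how perfect SSRs with 1 to 6 bp motifs can be located in an EST sequence, here is a minimal regular-expression sketch. It is not the tool or the parameters used in the study: the minimum repeat counts in `MIN_REPEATS` are hypothetical, since the abstract does not state the thresholds, and the function names are made up for this example.

```python
import re

# Hypothetical minimum tandem copy numbers per motif length (1-6 bp).
# The abstract does not report the thresholds actually applied.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, copies) for each perfect SSR found in seq."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # Capture one motif, then require at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "ATAT", "AAA"),
            # so each repeat is reported under its shortest motif only.
            if any(motif == motif[:k] * (motif_len // k)
                   for k in range(1, motif_len) if motif_len % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // motif_len))
    return hits


def percent(part, whole):
    """Share of `part` in `whole`, as a percentage."""
    return 100.0 * part / whole


# Toy sequence: an (AT)7 dinucleotide repeat and an (AAG)5 trinucleotide repeat.
example = "GGC" + "AT" * 7 + "CCGT" + "AAG" * 5 + "TTC"
print(find_ssrs(example))  # [(3, 'AT', 7), (21, 'AAG', 5)]

# Percentages reported in the abstract, recomputed from its counts.
print(round(percent(2620, 58906), 2))  # 4.45 (SSR-containing ESTs)
print(round(percent(129, 300)))        # 43 (polymorphic primer pairs)
```

The recomputed values match the abstract: 2620 of 58906 sequences is 4.45%, and 129 of 300 primer pairs is 43%.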
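Abstract: The phosphatidylethanolamine-binding protein (PEBP) gene family is widespread in eukaryotes; in angiosperms it mainly promotes or represses flowering and controls plant architecture. Searches of the genome databases of Gossypium arboreum (A2) and Gossypium raimondii (D5) each identified eight cotton PEBP homologs. All contain four exons and three introns, and the encoded proteins all carry the conserved motifs and key amino acid residues of the PEBP family, indicating that diploid cotton has at least eight PEBP family genes. Phylogenetic analysis placed the eight PEBP genes in three subfamilies: one in the FLOWERING LOCUS T (FT)-like subfamily, five in the TERMINAL FLOWER 1 (TFL1)-like subfamily (three TFL1 and two BFT), and two in the MOTHER OF FT AND TFL1 (MFT)-like subfamily. Real-time quantitative PCR was used to analyze the expression of the eight PEBP genes of upland cotton (Gossypium hirsutum) in roots, stems, leaves, seedling shoot apical meristems, flowers, ovules and 25-d fiber tissue. FT1 was expressed most highly in leaves, followed by fibers, ovules and flowers. MFT1 was expressed in all tissues, most highly in fibers, followed by flowers and leaves, whereas MFT2 was expressed mainly in leaves. TFL1a, TFL1b and TFL1c were all expressed most highly in roots, although TFL1c also showed relatively high expression in leaves, flowers and ovules. BFT1 and BFT2 were expressed most highly in leaves, but in every tissue other than the seedling shoot apical meristem, BFT1 expression was markedly higher than that of BFT2. These results suggest that PEBP family genes may have different functions in cotton growth and development.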
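Abstract: [Objective] The mitogen-activated protein kinase kinase kinase (MAPKKK) family plays important regulatory roles in plant stress responses and development. This study aimed to identify the MAPKKK genes of Gossypium raimondii and analyze their functions. [Method] Using previously identified Arabidopsis MAPKKK protein sequences as seed sequences, members of the G. raimondii MAPKKK gene family were identified in the published G. raimondii whole-genome database by local BLAST together with Pfam and SMART. MEGA5, the GSDS online tool and MapChart were used for phylogenetic tree, gene structure and chromosomal location analyses. Existing upland cotton microarray data were used to profile expression in response to stress and at different stages of fiber development. [Result] A total of 114 G. raimondii MAPKKK family genes were systematically identified and, based on gene structure and phylogenetic analysis, divided into three subfamilies: Raf, ZIK and MEKK. Chromosomal mapping showed that the family is widely distributed across the 13 chromosomes and that gene duplication has occurred. Comparison with the 78 recently published G. raimondii MAPKKK family genes yielded 47 genes with identical sequences. [Conclusion] These results help clarify the evolution and function of the G. raimondii MAPKKK gene family and lay a foundation for subsequent functional studies of MAPKKK genes in cotton and in the genus Gossypium as a whole.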
Funding: Funded by the National Key Research and Development Program of China (2018YFD0100401).
Abstract: Background: DNA methylation is an important epigenetic factor that maintains and regulates gene expression. The mode and level of DNA methylation depend on the roles of DNA methyltransferase and demethylase, and DNA demethylase plays a key role in the process of DNA demethylation. Plant DNA demethylases have all been shown to contain a conserved DNA glycosylase domain. This study identified the cotton DNA demethylase gene family and analyzed it using bioinformatics methods to lay the foundation for further study of cotton demethylase gene function. Results: This study used genomic information from diploid Gossypium raimondii JGI (D), Gossypium arboreum L. CRI (A), Gossypium hirsutum L. JGI (AD1) and Gossypium barbadense L. NAU (AD2), with Arabidopsis thaliana as the reference. Using the DNA demethylase gene sequences of Arabidopsis as the reference, 25 DNA demethylase genes were identified in cotton by BLAST analysis: 4 genes in genome D, 5 genes in genome A, 10 genes in genome AD1, and 6 genes in genome AD2. Gene structure and evolution were analyzed by bioinformatics, and the expression patterns of the DNA demethylase gene family in Gossypium hirsutum L. were analyzed. Phylogenetic tree analysis divided the DNA demethylase gene family of cotton into four subfamilies: REPRESSOR OF SILENCING 1 (ROS1), DEMETER (DME), DEMETER-LIKE 2 (DML2), and DEMETER-LIKE 3 (DML3). DNA demethylase genes within the same species showed higher sequence similarity and were also relatively closely related. Gene structure analysis revealed that the DNA demethylase gene family members of the four subfamilies varied greatly. Among them, the ROS1 and DME subfamilies had more introns and a more complex gene structure. Conserved domain analysis showed that the DNA demethylase family gene members have an endonuclease III (ENDO3c) domain. Conclusion: The genes of the DNA demethylase family are distributed differently in different cotton species, and their gene structures differ considerably. ROS1 genes were highly expressed in cotton under abiotic stress. The expression levels of ROS1 genes were higher during the formation of the cotton ovule. The transcription levels of ROS1 family genes were higher during cotton fiber development.