Abstract: Thirteen cuticular protein (CP) families have been recognized in arthropods. In this study, 250 Anopheles sinensis CP genes were identified and named based on genome and transcriptome sequences. They were classified into 10 families based on motifs and phylogenetic analyses. Of 11 other insect species examined, nine had more than 150 CPs, while Apis mellifera and Tribolium castaneum had fewer than 52. In eight species the CPs accounted for > 1.4% of the total number of genes in the genome, whereas in three species they accounted for < 1%. Phylogenies for each CP family in An. sinensis were constructed and discussed. Each of the 250 CPs had 1 to 8 exons, with 144 CPs (57.6%) having two exons. Intron length ranged from 66 to 3888 bp, with 174 introns (54.0%) being 66 to 100 bp long. Except for two CPs on two contigs, 248 CPs were mapped onto 28 scaffolds, with 136 genes (54.4%) restricted to five scaffolds. A total of 107 CPs were clustered at 27 loci. The CPR family had the conserved motif GSYSLVEPDGTVRTV. The RR-1 subfamily had an additional 21-amino-acid (aa) motif containing the YVADENGF sequence that is common in insects. The RR-2 subfamily had an additional 50-aa motif with two additional regions, RDGDWKG and G-x(3)-VV. A comparison with 115 orthologous An. gambiae CPs suggested purifying selection for all of these genes. This study provides basic information useful for further studies on the biological functions of An. sinensis CPs as well as for comparative genomics of insect CPs.
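The RR-2 region G-x(3)-VV is written in PROSITE-style notation, where x(3) stands for any three residues. As a rough illustration only, and not the authors' pipeline, the sketch below converts such a pattern to a regular expression and reports where it occurs in a protein sequence. The helper names `prosite_to_regex` and `find_motif` and the example sequence are hypothetical; only the two motif strings come from the abstract.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Convert a simple PROSITE-style pattern to a regex.

    Handles only literal residues and x / x(n) wildcards, which is
    enough for the motifs quoted in the abstract.
    """
    parts = []
    for element in pattern.split("-"):
        wildcard = re.fullmatch(r"x(?:\((\d+)\))?", element)
        if wildcard:
            count = wildcard.group(1) or "1"
            parts.append(f".{{{count}}}")  # x(3) becomes .{3}
        else:
            parts.append(re.escape(element))  # literal residues
    return "".join(parts)


def find_motif(sequence: str, pattern: str) -> list[int]:
    """Return 0-based start positions of non-overlapping motif matches."""
    regex = re.compile(prosite_to_regex(pattern))
    return [match.start() for match in regex.finditer(sequence)]


# Hypothetical sequence made up for illustration; not a real An. sinensis CP.
example = "AARDGDWKGTTGAYSVVKK"
for motif in ("RDGDWKG", "G-x(3)-VV"):
    print(motif, find_motif(example, motif))
```

A real motif search over 250 genes would more likely rely on dedicated tools and profile models; the regex form is shown only because it makes the x(3) notation concrete.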