Abstract
Selenoproteins are biosynthesized by incorporating selenocysteine (Sec) into proteins: an in-frame TGA codon in the open reading frame does not act as a stop signal but is translated as selenocysteine. This dual function of TGA causes selenoproteins to be mis-annotated or missed entirely in the sequenced genomes of many species, and available computational tools fail to predict them correctly. We therefore developed a new computational method to identify selenoproteins in the genome of Anopheles gambiae. Based on released genomic information, several programs were written in Perl to identify selenocysteine insertion sequence (SECIS) elements, the coding potential of TGA codons, and cysteine-containing homologs of selenoprotein genes. Our results showed that 11365 genes were terminated with TGA codons, 918 of which contained SECIS elements. A similarity search revealed that 58 genes contained Sec/Cys pairs and similar flanking regions around in-frame TGA codons. Finally, 7 genes were found to fully meet the requirements for selenoproteins, although they have not been annotated as selenoproteins in the NCBI databases. Judging from their basic properties, the newly found selenoproteins in the genome of Anopheles gambiae are possibly related to in vivo oxidation tolerance and protein regulation, and may thereby interfere with the capacity of Anopheles to act as a vector for Plasmodium. This study may also provide a theoretical basis for preventing the transmission of malaria by Anopheles.
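Two of the screening steps named in the abstract, locating in-frame TGA codons and detecting Sec/Cys pairs against a cysteine-containing homolog, can be illustrated with a minimal Python sketch. This is not the authors' Perl code: the function names `inframe_tga_positions` and `sec_cys_pairs`, the assumption that the reading frame starts at offset 0, and the assumption that the two protein sequences are already aligned to equal length are all hypothetical simplifications. SECIS detection is omitted because it depends on RNA secondary structure.

```python
from typing import List


def inframe_tga_positions(orf: str) -> List[int]:
    """Return 0-based codon indices of TGA codons in reading frame 0.

    Assumption: `orf` starts at the first base of the first codon.
    Out-of-frame occurrences of "TGA" are ignored.
    """
    orf = orf.upper()
    return [i // 3 for i in range(0, len(orf) - 2, 3) if orf[i:i + 3] == "TGA"]


def sec_cys_pairs(query: str, homolog: str) -> List[int]:
    """Return alignment columns where the query has Sec (U) and the homolog has Cys (C).

    Assumption: both strings come from a pairwise protein alignment and
    therefore have equal length, with '-' marking gaps.
    """
    if len(query) != len(homolog):
        raise ValueError("aligned sequences must have equal length")
    return [i for i, (q, h) in enumerate(zip(query, homolog)) if q == "U" and h == "C"]


if __name__ == "__main__":
    # Codons: ATG GCT TGA GGT TGC TAA; the TGA is the third codon.
    print(inframe_tga_positions("ATGGCTTGAGGTTGCTAA"))
    # Candidate selenoprotein (U at column 2) against a Cys-containing homolog.
    print(sec_cys_pairs("MKUGA", "MKCGA"))
```

In a pipeline of the kind the abstract outlines, a gene would be retained only if a candidate in-frame TGA also aligns with a cysteine in a homolog and a SECIS element is found; the sketch covers only the first two checks.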
Authors
JIANG Liang1,2, LIU Qiong1, CHEN Ping3, GAO ZhongHong3 & XU HuiBi3
1 College of Life Science, Shenzhen University, Shenzhen 518060, China
2 Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun 130022, China
3 Department of Chemistry, Huazhong University of Science and Technology, Wuhan 430074, China
Funding
Supported by the National Natural Science Foundation of China (Grant Nos. 30370352 and 30570420)