Abstract
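To discover functional genes in Huso dauricus, the muscle-tissue transcriptome was sequenced using next-generation high-throughput sequencing. A total of 14 447 211 200 bp of raw data was obtained and assembled into 55 531 unigenes, ranging from 300 to 32 613 bp in length with a mean length of 941 bp. Using bioinformatics methods, the unigenes were searched for similarity against the non-redundant protein database (Nr) and were further subjected to GO functional annotation and KEGG metabolic pathway analysis. In total, 20 735 unigenes (37.34%) were homologous to known genes in the Nr database; by GO function they fell into 3 major categories (biological process, cellular component, and molecular function) comprising 56 branches; and KEGG metabolic pathway analysis assigned them to 290 classes.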
The muscle-tissue transcriptome of Huso dauricus was sequenced using next-generation high-throughput sequencing in order to fill a gap in gene sequence research on H. dauricus, to preserve its genetic resources, and to lay the groundwork for discovering functional genes and cloning new genes. Sequencing yielded 14 447 211 200 bp of raw data, which were assembled into 55 531 unigenes ranging from 300 bp to 32 613 bp in length, with an average length of 941 bp. The unigenes were searched for similarity against the Nr database using bioinformatics methods, followed by GO functional annotation and KEGG metabolic pathway analysis. Of these unigenes, 20 735 (37.34%) had significant matches. By GO function, the unigenes in the transcriptome library were divided into 3 categories (biological process, cellular component, and molecular function) with 56 branches; with the KEGG database as a reference, they were divided into 290 classes.
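As a quick arithmetic check of the reported annotation rate, the short sketch below recomputes the percentage from the two counts given in the abstract. The variable names are illustrative only and do not come from the paper; the unannotated count is derived here and is not reported in the abstract.

```python
# Counts taken from the abstract; variable names are illustrative.
total_unigenes = 55_531      # assembled unigenes
nr_matched_unigenes = 20_735 # unigenes with significant matches in Nr

# Share of unigenes annotated against Nr, as a percentage.
annotation_rate = nr_matched_unigenes / total_unigenes * 100

# Derived value (not stated in the abstract): unigenes without an Nr match.
unmatched_unigenes = total_unigenes - nr_matched_unigenes

print(f"Nr annotation rate: {annotation_rate:.2f}%")
print(f"Unigenes without an Nr match: {unmatched_unigenes}")
```

The computed rate agrees with the 37.34% stated in the abstract when rounded to two decimal places.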
Source
《水产学报》
CAS
CSCD
Peking University Core Journals (北大核心)
2014, No. 9, pp. 1255-1262 (8 pages)
Journal of Fisheries of China
Funding
Joint collaborative project "Basic Biological Research on Huso dauricus" (2012-2014)
Keywords
Huso dauricus
muscle tissue
gene
transcriptome