Abstract
To obtain genomic information on Manglietia aromatica, its tissues were subjected to transcriptome sequencing on the Illumina HiSeq 4000 next-generation high-throughput sequencing platform. The raw reads were filtered and assembled, and the resulting Unigenes were analyzed with bioinformatics methods. A total of 48123 Unigenes were obtained, with an N50 of 1331 nt and an average length of 960 nt. Searches against the NR, GO, KOG, KEGG and Swiss-Prot databases annotated 37877, 32125, 25525, 30143 and 27988 Unigenes, respectively. In the NR database, the Unigenes matched most often to sequences from Nelumbo nucifera, Macleaya cordata, Vitis vinifera and Elaeis guineensis. In the GO database, 246974 functional annotations were obtained across 3 major categories and 58 subcategories, with biological process being the category with the most annotated genes. In the KOG database, 25525 functional annotations were assigned to 25 categories, of which general function prediction and signal transduction mechanisms contained the most genes. In the KEGG pathway database, 142 pathways were identified across 6 major categories and 19 subcategories; metabolism-related pathways were the most numerous, accounting for 63.6% of the total.
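For readers unfamiliar with the N50 statistic reported in the abstract, the following is a minimal sketch of how N50 and mean length are commonly computed from a list of assembled sequence lengths. It is not the authors' pipeline; the function name `n50` and the toy lengths are illustrative assumptions and are not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together cover at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy lengths (nt), NOT values from the study.
toy_lengths = [2000, 1500, 1000, 500, 300, 200]
toy_n50 = n50(toy_lengths)
toy_mean = sum(toy_lengths) / len(toy_lengths)
print(f"N50 = {toy_n50} nt, mean length = {toy_mean:.1f} nt")
```

Because N50 is weighted by sequence length, it typically exceeds the mean length when an assembly contains many short sequences, which is consistent with the reported N50 (1331 nt) being larger than the average length (960 nt).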
Authors
MIAO Yiming (苗艺明)
SHI Song (石松)
YANG Mei (杨梅)
LIU Shinan (刘世男)
(Forestry College of Guangxi University, Nanning 530004, China; Natural Resources Bureau of Du’an Yao Autonomous County, Du’an 530799, China)
Source
《北华大学学报(自然科学版)》
CAS
2021, Issue 1, pp. 122-127 (6 pages)
Journal of Beihua University (Natural Science)
Funding
Basic Research Ability Improvement Project for Young and Middle-aged Teachers in Guangxi Universities (2020ky01015).
Keywords
Manglietia aromatica
transcriptome
bioinformatics analysis