Abstract
EST sequences represent the transcriptional signals of gene expression in tissues. This study sought to develop a simple and efficient method for large-scale EST analysis: all rice (Oryza sativa) ESTs were downloaded from NCBI and analyzed to obtain important information on gene expression during rice development. Using BLAST alignment, Phrap assembly, and Unix text filtering, more than 30,000 contigs were assembled from the EST sequences. The contigs were then aligned against the NCBI nucleotide database to obtain an annotation for each sequence. A preliminary survey of tissue expression among the contigs found that anther had the highest expression count. This lays an important foundation for further investigation of the regulation of organ-specific gene expression during rice development.
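The tissue-level tally mentioned in the abstract can be illustrated with a minimal sketch. It assumes two lookup tables are already available: one mapping each EST to the contig it was assembled into, and one mapping each EST to its source tissue. The function name `contigs_per_tissue`, the table layout, and all identifiers below are hypothetical. The toy values are made up for illustration and are not data from the study, and the study itself reports using Unix text filtering rather than Python for this kind of step.

```python
from collections import defaultdict


def contigs_per_tissue(est_to_contig, est_to_tissue):
    """Count the distinct contigs that contain at least one EST from each tissue."""
    tissue_contigs = defaultdict(set)
    for est, contig in est_to_contig.items():
        tissue = est_to_tissue.get(est)
        if tissue is not None:  # skip ESTs with no recorded tissue
            tissue_contigs[tissue].add(contig)
    return {tissue: len(contigs) for tissue, contigs in tissue_contigs.items()}


# Toy, made-up identifiers for illustration only (not data from the study).
est_to_contig = {
    "EST1": "Contig1",
    "EST2": "Contig1",
    "EST3": "Contig2",
    "EST4": "Contig3",
}
est_to_tissue = {
    "EST1": "anther",
    "EST2": "leaf",
    "EST3": "anther",
    "EST4": "root",
}

counts = contigs_per_tissue(est_to_contig, est_to_tissue)

# Print tissues from most to fewest contigs, ties broken alphabetically.
for tissue, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
    print(tissue, n)
```

Counting distinct contigs rather than raw ESTs keeps a heavily sequenced library from inflating a tissue's total. Whether the study counted contigs or ESTs per tissue is not stated in the abstract, so this choice is an assumption.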
Source
Chinese Journal of Bioinformatics (《生物信息学》)
2015, Issue 2, pp. 96-102 (7 pages)