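Abstract
Built on the Unix/Linux operating system and a MySQL database, the system uses the phred/phrap, stackpack, and BLAST software to perform large-scale automated analysis of cDNA and EST sequences. It converts sequencing chromatogram files into nucleotide sequences, removes vector contamination and repeat sequences, clusters and assembles the sequences, analyzes alternative splicing, and carries out similarity analysis through database searches. The system can speed up the analysis of large-scale EST sequencing data.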
With the huge and growing number of cDNA sequences obtained since the Human Genome Project, a powerful system for mining these sequences is urgently needed. Based on the Unix/Linux operating system, phred/phrap, stackpack, and BLAST software were used to construct a platform for batch analysis of cDNA and EST sequences, including base calling, vector and repeat sequence removal, sequence clustering, assembly, alternative splicing analysis, and sequence alignment. Our results demonstrate that this platform can accelerate data analysis for large-scale EST sequencing and suggest some useful clues.
Source
《中山大学学报(自然科学版)》 (Journal of Sun Yat-sen University, Natural Science Edition)
CAS
CSCD
Peking University Core Journals (北大核心)
2002, Issue 5, pp. 60-63 (4 pages)
Acta Scientiarum Naturalium Universitatis Sunyatseni
Funding
National Natural Science Foundation of China (39800073)
National Natural Science Foundation of China, Key Program (69935020)