Abstract
The distribution of the nucleotide k-mers contained in transcription factor binding sites (TFBSs) listed in the TRANSFAC database is compared across human and mouse promoter regions. Based on this analysis, a fast, non-alignment-based algorithm is proposed for genome-wide discovery of transcription regulatory k-mer motifs (TRKMs) in human promoter regions. The method, called the distance-based conservative k-mer searching (DCKS) algorithm, is based on the conservation of k-mer pair distance. Applied to the prediction of human transcription regulatory 7-mer motifs, DCKS achieves a sensitivity of 90%, a specificity of 78%, and a correlation coefficient of 0.65.
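The abstract reports three accuracy figures without defining them. As a minimal sketch, assuming the common confusion-matrix definitions and reading "correlation coefficient" as the Matthews correlation coefficient (an assumption, not confirmed by the abstract), they could be computed as below. The function name `prediction_metrics` and the counts are hypothetical and are not taken from the paper. Some motif-prediction studies instead define specificity as TP / (TP + FP), so the paper's figures may follow a different convention.

```python
from math import sqrt


def prediction_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, correlation) from confusion counts.

    Assumed definitions (not stated in the abstract):
      sensitivity = TP / (TP + FN)
      specificity = TN / (TN + FP)
      correlation = Matthews correlation coefficient
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Return 0.0 when the coefficient is undefined (a zero marginal total).
    correlation = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, correlation


# Hypothetical counts chosen only to illustrate the calculation.
sens, spec, cc = prediction_metrics(tp=90, fp=22, tn=78, fn=10)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} correlation={cc:.3f}")
```

With these made-up counts the sensitivity and specificity match the reported 90% and 78%, but the correlation does not reproduce the reported 0.65, since the true counts and class balance are not given in the abstract.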
Source
Progress in Biochemistry and Biophysics (《生物化学与生物物理进展》)
SCIE
CAS
CSCD
PKU Core (北大核心)
2006, No. 11, pp. 1044-1050 (7 pages)
Funding
Supported by the National Natural Science Foundation of China (90403010).
Keywords
transcription regulatory motifs, non-alignment based approach, distance-based conservative k-mer searching algorithm, quadratic discriminant analysis