Abstract
Nucleosomes are the basic units of chromatin structure, and their positioning along the DNA sequence plays a key role in regulating gene expression in eukaryotes. Predicting nucleosome positioning with machine learning methods has become an active research topic in recent years. Using the 6-mer composition of DNA sequences as input parameters, we applied the increment-of-diversity feature selection technique that we proposed to select eight 6-mers as classification features. A support vector machine trained on these features reached an overall accuracy of 98.2% in 10-fold cross-validation. The results indicate that the distinct k-mer composition of nucleosome-positioning sequences and linker sequences is the main factor affecting nucleosome positioning in yeast.
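The abstract names two computational ingredients, 6-mer composition and the increment of diversity, without giving formulas. As a rough illustration only, the sketch below counts overlapping 6-mers and evaluates the commonly cited definition of the measure, D(X) = N ln N - Σ n_i ln n_i and ID(X, Y) = D(X + Y) - D(X) - D(Y). This definition is an assumption here, since the abstract does not state it. The function names are hypothetical, and how the authors ranked the 6-mers to arrive at eight features is not described in the abstract, so no ranking step is shown.

```python
import math
from collections import Counter


def kmer_counts(seq, k=6):
    """Count overlapping k-mers made up only of A, C, G and T."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Skip windows containing ambiguous bases such as N.
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def diversity(counts):
    """D(X) = N ln N - sum_i n_i ln n_i, treating 0 ln 0 as 0."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return total * math.log(total) - sum(
        n * math.log(n) for n in counts if n > 0
    )


def increment_of_diversity(x, y):
    """ID(X, Y) = D(X + Y) - D(X) - D(Y) for equal-length count vectors."""
    merged = [a + b for a, b in zip(x, y)]
    return diversity(merged) - diversity(x) - diversity(y)
```

Under this definition, two sources with identical count profiles give an increment of zero, and the value grows as the profiles diverge, which is what makes it usable as a measure of how differently a k-mer set is distributed between nucleosomal and linker sequences.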
Authors
胡世赛
陈宇翔
张颖
吕军
Shisai Hu; Yuxiang Chen; Ying Zhang; Jun Lv (College of Science, Inner Mongolia University of Technology, Hohhot, Inner Mongolia)
Source
《生物物理学》 (Biophysics)
2018, Issue 1, pp. 1-6 (6 pages)
Funding
Supported by the Natural Science Foundation of Inner Mongolia Autonomous Region (Grant Nos. 2015MS0331 and 2016MS0306).
Keywords
Nucleosome Positioning Sequence
Increment of Diversity
Feature Selection Technology