Abstract
The nucleosome is the basic structural unit of the chromosome. Nucleosome and non-nucleosome sequences were preprocessed into time-series data, and an LSTM (long short-term memory) network was trained iteratively to learn both long-range and short-range features. The resulting LSTM model identified nucleosome sequences with an accuracy of 92.67%. The results indicate that nucleosome and non-nucleosome sequences have distinct features and that nucleosome sequences are highly classifiable. This classifiability makes it possible to discriminate nucleosome sequences from non-nucleosome sequences, which may be of biological value for studying nucleosome positioning and its dynamics, gene transcription regulation, DNA replication and repair, and the function and evolution of DNA sequences.
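The abstract does not specify how sequences were encoded or how the network was configured, so the following is only a minimal sketch of the general idea: a DNA sequence is treated as a time series with one step per base, and an LSTM cell consumes it step by step. The one-hot alphabet, the names `one_hot_series`, `lstm_step`, and `run_lstm`, the hidden size, and the untrained weights are all illustrative assumptions, not the authors' implementation.

```python
import math

# Assumed alphabet for one-hot encoding; the paper's actual encoding is not stated.
ALPHABET = "ACGT"


def one_hot_series(seq):
    """Turn a DNA string into a time series: one 4-dim one-hot vector per base."""
    steps = []
    for base in seq.upper():
        if base not in ALPHABET:
            raise ValueError(f"unexpected base: {base!r}")
        steps.append([1.0 if base == b else 0.0 for b in ALPHABET])
    return steps


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h, c, W, b):
    """One standard LSTM cell update.

    W maps a gate name ("i", "f", "o", "g") to a matrix with one row per
    hidden unit and len(x) + len(h) columns; b maps a gate name to a bias vector.
    """
    z = x + h  # concatenate the current input with the previous hidden state

    def affine(gate):
        return [sum(w * v for w, v in zip(row, z)) + bias
                for row, bias in zip(W[gate], b[gate])]

    i = [sigmoid(v) for v in affine("i")]    # input gate
    f = [sigmoid(v) for v in affine("f")]    # forget gate
    o = [sigmoid(v) for v in affine("o")]    # output gate
    g = [math.tanh(v) for v in affine("g")]  # candidate cell state
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new


def run_lstm(seq, W, b, hidden):
    """Feed an encoded sequence through the cell and return the final hidden state."""
    h = [0.0] * hidden
    c = [0.0] * hidden
    for x in one_hot_series(seq):
        h, c = lstm_step(x, h, c, W, b)
    return h
```

In a classifier of the kind the abstract describes, the final hidden state would typically be passed to a small output layer and the weights fitted on labelled nucleosome and non-nucleosome sequences; in practice a library LSTM layer would replace this hand-written cell.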
Authors
刘建丽
周德良
靳文
LIU Jianli; ZHOU Deliang; JIN Wen (School of Water Resource and Environment Engineering, China University of Geosciences (Beijing), Beijing 100083, China; Beijing Zhongdianyida Technology Co., Ltd., Beijing 100190, China; Inner Mongolia People's Hospital, Hohhot 010017, China)
Source
《佳木斯大学学报(自然科学版)》
CAS
2023, No. 6, pp. 126-129 (4 pages)
Journal of Jiamusi University: Natural Science Edition
Funding
Special Fund for Basic Scientific Research Operating Expenses (53200759777)
National Natural Science Foundation of China (42007289).
Keywords
nucleosome
nucleosome sequence
LSTM
classifiability
nucleosome positioning