7 articles found
1. Accurate Plant MicroRNA Prediction Can Be Achieved Using Sequence Motif Features (cited by: 1)
Authors: Malik Yousef, Jens Allmer, Waleed Khalifa. Journal of Intelligent Learning Systems and Applications, 2016, No. 1, pp. 9-22 (14 pages).
MicroRNAs (miRNAs) are short (~21 nt) nucleotide sequences that are either co-transcribed during the production of mRNA or organized in intergenic regions transcribed by RNA polymerase II. Drosha in animals and DCL1 in plants recognize pre-miRNAs, which set themselves apart by their characteristic stem-loop (hairpin) structure. This structure appears important for their recognition during the maturation process that leads to functional mature miRNAs. A large body of research is available on computational pre-miRNA detection in animals, but less within the plant kingdom. Pre-miRNAs are usually predicted with machine learning approaches, so the pre-miRNAs must first be converted into a set of features that can be calculated; many such features have been described. Here we select a subset of the previously described features and add sequence motifs as new features. The resulting model, which we called MotifmiRNAPred, was tested on known pre-miRNAs listed in miRBase, and its accuracy was compared with existing approaches in the field. With an accuracy of 99.95% for the generalized plant model, it distinguishes itself from previously published results, which reach an average accuracy between 74% and 98%. We believe that our approach is useful for predicting pre-miRNAs in plants without per-species adjustment.
Keywords: MicroRNA Prediction PLANT BIOINFORMATICS Machine Learning sequence motifs
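The abstract above describes turning each pre-miRNA into a numeric feature vector that includes sequence motifs. A minimal sketch of that general idea follows. It assumes fixed-length k-mers as a stand-in for motifs; the function name `motif_features`, the parameter `k`, and the example sequence are illustrative choices, not the feature set or motif discovery procedure used by MotifmiRNAPred.

```python
from itertools import product


def motif_features(seq, k=3, alphabet="ACGU"):
    """Return the relative frequency of every k-mer over `alphabet` in `seq`.

    Assumption: plain k-mer counting stands in for motif features here.
    """
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as well
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(max(len(seq) - k + 1, 0)):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip windows with ambiguous bases such as N
            counts[kmer] += 1
    total = sum(counts.values()) or 1  # avoid division by zero on short input
    return [counts[m] / total for m in kmers]


# Hypothetical input sequence; the vector has 4**k entries in a fixed order.
features = motif_features("UGAGGUAGUAGGUUGUAUAGUU", k=2)
```

A vector like this could be concatenated with structural features before being passed to a classifier; which classifier and which additional features to use is left open here.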
2. MotViz: A Tool for Sequence Motif Prediction in Parallel to Structural Visualization and Analyses
Authors: Muhammad Sulaman Nawaz, Sajid Rashid. Genomics, Proteomics & Bioinformatics (SCIE, CAS, CSCD), 2012, No. 1, pp. 35-43 (9 pages).
Linking similar proteins structurally is a challenging task that may help in finding novel members of a protein family. In this respect, identifying conserved sequences can facilitate understanding and classifying the exact role of proteins. However, the exact role of these conserved elements cannot be elucidated without structural and physicochemical information. In this work, we present a novel desktop application, MotViz, designed for searching and analyzing conserved sequence segments within protein structures. With MotViz, the user can extract a complete list of sequence motifs from loaded 3D structures, annotate the motifs structurally and analyze their physicochemical properties. The conservation value calculated for an individual motif can be visualized graphically. To check its efficiency, predicted motifs from the data sets of 9 protein families were analyzed, and the MotViz algorithm was more efficient than other online motif prediction tools. Furthermore, a database was integrated for storing, retrieving and performing detailed functional annotation studies. In summary, MotViz effectively predicts motifs with high sensitivity and simultaneously visualizes them in 3D structures. Moreover, MotViz is user-friendly, with optimized graphical parameters and better processing speed due to the inclusion of a database at the back end. MotViz is available at http://www.fi-pk.com/motviz.html.
Keywords: MotViz sequence motif structural visualization algorithm BIOINFORMATICS
3. RSMD-repeat searcher and motif detector
Authors: Udayakumar Mani, Vaidhyanathan Mahaganapathy, Sadhana Ravisankar, Sai Mukund Ramakrishnan. The Journal of Biomedical Research (CAS), 2014, No. 5, pp. 416-422 (7 pages).
The functionality of a gene or a protein depends on the codon repeats occurring in it. As a consequence of their vitality in protein function and apparent involvement in causing diseases, an interest in these repeats has developed in recent years. The analysis of genomic and proteomic sequences to identify such repeats requires algorithmic support at the informatics level. Here, we proposed an offline stand-alone toolkit, Repeat Searcher and Motif Detector (RSMD), which uncovers and employs a few novel approaches to identifying sequence repeats and motifs in order to understand their functionality at the sequence level and their disease-causing tendency. The tool offers features such as identifying motifs, repeats and disease-causing repeats. RSMD was designed to provide an easily understandable graphical user interface (GUI), because the tool will be predominantly accessed by biologists and researchers in all platforms of life science. The GUI was developed using the scripting language Perl and its graphical module PerlTK. RSMD covers algorithmic foundations of computational biology by combining theory with practice.
Keywords: motif repeats genomic sequence proteomic sequence computational biology combination algorithm
4. Searching for Non-coding RNAs in Genomic Sequences Using ncRNAscout
Authors: Michael Bao, Miguel Cervantes Cervantes, Ling Zhong, Jason T.L. Wang. Genomics, Proteomics & Bioinformatics (SCIE, CAS, CSCD), 2012, No. 2, pp. 114-121 (8 pages).
Recently, non-coding RNA (ncRNA) genes have been found to serve many important functions in the cell, such as regulation of gene expression at the transcriptional level. Potentially there are more ncRNA molecules yet to be found, and their possible functions are to be revealed. The discovery of ncRNAs is a difficult task because they lack sequence indicators such as the start and stop codons displayed by protein-coding RNAs. Current methods utilize either sequence motifs or structural parameters to detect novel ncRNAs within genomes. Here, we present an ab initio ncRNA finder, named ncRNAscout, that utilizes both sequence motifs and structural parameters. Specifically, our method has three components: (i) a measure of the frequency of a sequence, (ii) a measure of the structural stability of a sequence contained in a t-score, and (iii) a measure of the frequency of certain patterns within a sequence that may indicate the presence of ncRNA. Experimental results show that, given a genome and a set of known ncRNAs, our method is able to accurately identify and locate a significant number of ncRNA sequences in the genome. The ncRNAscout tool is available for download at http://bioinformatics.njit.edu/ncRNAscout.
Keywords: Genome-wide ncRNA discovery sequence motifs Structural parameters
5. Characterizing RNA Pseudouridylation by Convolutional Neural Networks (cited by: 1)
Authors: Xuan He, Sai Zhang, Yanqing Zhang, Zhixin Lei, Tao Jiang, Jianyang Zeng. Genomics, Proteomics & Bioinformatics (SCIE, CAS, CSCD), 2021, No. 5, pp. 815-833 (19 pages).
Pseudouridine (Ψ) is the most prevalent post-transcriptional RNA modification and is widespread in small cellular RNAs and mRNAs. However, the functions, mechanisms, and precise distribution of Ψs (especially in mRNAs) still remain largely unclear. The landscape of Ψs across the transcriptome has not yet been fully delineated. Here, we present a highly effective model based on a convolutional neural network (CNN), called PseudoUridyLation Site Estimator (PULSE), to analyze large-scale profiling data of Ψ sites and characterize the contextual sequence features of pseudouridylation. PULSE, consisting of two alternately stacked convolution and pooling layers followed by a fully connected neural network, can automatically learn the hidden patterns of pseudouridylation from the local sequence information. Extensive validation tests demonstrated that PULSE can outperform other state-of-the-art prediction methods and achieve high prediction accuracy, thus enabling us to further characterize the transcriptome-wide landscape of Ψ sites. We further showed that the prediction results derived from PULSE can provide novel insights into the functional roles of pseudouridylation, such as the regulation of RNA secondary structure, codon usage, translation, and RNA stability, and the connection to single nucleotide variants. The source code and final model for PULSE are available at https://github.com/mlcb-thu/PULSE.
Keywords: Pseudouridylation Convolution neural network sequence motif TRANSLATION RNA stability
6. A Profile of Native Integration Sites Used by φC31 Integrase in the Bovine Genome (cited by: 1)
Authors: Lijuan Qu, Qingwen Ma, Zaiwei Zhou, Haiyan Ma, Ying Huang, Shuzhen Huang, Fanyi Zeng, Yitao Zeng. Journal of Genetics and Genomics (SCIE, CAS, CSCD), 2012, No. 5, pp. 217-224 (8 pages).
The Streptomyces phage φC31 integrase can efficiently target attB-bearing transgenes to endogenous pseudo attP sites within mammalian genomes. To better understand the activity of φC31 integrase in the bovine genome, DNA sequences of 44 integration events were analyzed, and 32 pseudo attP sites were identified. The majority of these sites share a sequence motif that contains inverted repeats and has similarities to the wild-type attP site. Genomic DNA flanking these sites typically contained repetitive sequence elements, such as short and long interspersed repetitive elements. These sequence features indicate that DNA sequence recognition plays an important role in guiding φC31-mediated site-specific integration. In addition, BF27 integration hotspot sites were identified in the bovine genome, which accounted for 13.6% of all isolated integration events and mapped to an intron of the deleted in liver cancer 1 (DLC1) gene. We also found that the pseudo attP sites in the bovine genome had other features in common with those in the human genome. This study represents the first time that the sequence features of pseudo attP sites in the bovine genome were analyzed. We conclude that this site-specific integrase system has great potential for applied modifications of the bovine genome.
Keywords: φC31 integrase Pseudo attP sites sequence motif Repetitive elements Bovine genome
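The abstract above reports that most pseudo attP sites share a motif containing inverted repeats. As a rough illustration of what scanning for an inverted repeat involves, here is a minimal sketch that looks for a short arm followed, within a small gap, by its reverse complement. The function names, the arm length and the gap limit are assumptions made for this example; it uses exact matching only and is not the analysis performed in the study.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq, arm=4, max_gap=6):
    """List (left_start, right_start, left_arm) for exact inverted repeats.

    A hit is a left arm of length `arm` whose reverse complement starts
    between 0 and `max_gap` bases after the left arm ends.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * arm + 1):
        left = seq[i:i + arm]
        target = reverse_complement(left)
        for gap in range(max_gap + 1):
            j = i + arm + gap
            if j + arm > len(seq):
                break  # right arm would run past the end of the sequence
            if seq[j:j + arm] == target:
                hits.append((i, j, left))
    return hits


# Hypothetical toy sequence: GGCA ... TGCC form an inverted repeat.
hits = find_inverted_repeats("GGCAAATGCC", arm=4, max_gap=6)
```

Real recognition sites tolerate mismatches, so a practical scan would likely score approximate matches rather than require exact reverse complements.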
7. SMS 2.0: An Updated Database to Study the Structural Plasticity of Short Peptide Fragments in Non-redundant Proteins (cited by: 1)
Authors: Dheeraj Ravella, Muthukumarasamy Uthaya Kumar, Durairaj Sherlin, Mani Shankar, Marthandan Kirti Vaishnavi, Kanagaraj Sekar. Genomics, Proteomics & Bioinformatics (SCIE, CAS, CSCD), 2012, No. 1, pp. 44-50 (7 pages).
The function of a protein molecule is greatly influenced by its three-dimensional (3D) structure, and therefore structure prediction will help identify its biological function. We have updated Sequence, Motif and Structure (SMS), the database of structurally rigid peptide fragments, by combining amino acid sequences and the corresponding 3D atomic coordinates of non-redundant (25%) and redundant (90%) protein chains available in the Protein Data Bank (PDB). SMS 2.0 provides information pertaining to peptide fragments of length 5-14 residues. The entire dataset is divided into three categories, namely, same sequence motifs having similar, intermediate or dissimilar 3D structures. Further, options are provided to facilitate structural superposition using the program structural alignment of multiple proteins (STAMP), and the popular Java plug-in Jmol is deployed for visualization. In addition, functionalities are provided to search for occurrences of the sequence motifs in other structural and sequence databases such as PDB, Genome Database (GDB), Protein Information Resource (PIR) and Swiss-Prot. The updated database, along with the search engine, is available over the World Wide Web at http://cluster.physics.iisc.ernet.in/sms/.
Keywords: non-redundant protein chains sequence motifs 3D structure structural superposition
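The abstract above describes collecting peptide fragments of length 5-14 residues and grouping identical sequence motifs that occur in different chains. A minimal sketch of that sequence-side bookkeeping is shown here. The function names and the two toy chains are hypothetical, and the step that compares the 3D structures of matching fragments (similar, intermediate or dissimilar) is not covered because it needs atomic coordinates.

```python
from collections import defaultdict


def index_fragments(chains, min_len=5, max_len=14):
    """Map each peptide fragment to the (chain_id, start) positions it occupies."""
    index = defaultdict(list)
    for chain_id, seq in chains.items():
        for length in range(min_len, max_len + 1):
            for start in range(len(seq) - length + 1):
                index[seq[start:start + length]].append((chain_id, start))
    return index


def shared_fragments(index):
    """Keep only fragments that occur in at least two different chains."""
    return {
        frag: hits
        for frag, hits in index.items()
        if len({chain_id for chain_id, _ in hits}) >= 2
    }


# Hypothetical chains that share the segment TAYIAK.
chains = {"chainA": "MKTAYIAKQR", "chainB": "GGTAYIAKLL"}
shared = shared_fragments(index_fragments(chains))
```

Each entry in `shared` lists where an identical fragment occurs, which is the starting point for superposing the corresponding coordinates and classifying their structural similarity.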