Abstract: Objective: To screen for target genes associated with survival in breast cancer (BRCA) and to explore their prognostic value and immune correlations in BRCA using multiple databases. Methods: Microarray expression datasets of BRCA were downloaded from the Gene Expression Omnibus (GEO) database and analyzed to obtain differentially expressed genes (DEGs). Hub genes were obtained by constructing and visualizing the protein-protein interaction network of the DEGs. The key gene was determined using R, STRING, and Cytoscape, and its differential expression was verified using the external dataset The Cancer Genome Atlas (TCGA) and quantitative real-time PCR (qRT-PCR) on BRCA tissues from 37 patients. The prognostic value and immunological correlation of UBE2C in BRCA were explored using R, TIMER, and Gene Set Enrichment Analysis (GSEA). Results: Of 10 hub genes selected from 302 DEGs, UBE2C was identified as the gene associated with BRCA survival. UBE2C expression was upregulated in BRCA, as verified by TCGA and qRT-PCR. Prognostic analysis revealed that UBE2C served as an independent prognostic factor. High expression of UBE2C was associated with decreased immune infiltration levels of B cells, CD4+ T cells, CD8+ T cells, macrophages, and myeloid dendritic cells in BRCA tissue. UBE2C expression in BRCA correlated significantly with expression of the immune checkpoint genes PDCD1, CD274, and CTLA4. UBE2C expression was positively correlated with tumor mutational burden and microsatellite instability. GSEA showed that 786 immune-related gene sets were significantly enriched in association with UBE2C expression. Conclusions: UBE2C expression in BRCA tissues is closely related to the BRCA immune microenvironment and shows predictive value for the survival and prognosis of BRCA patients and for the efficacy of immunotherapy. UBE2C may be a potential immune-related prognostic biomarker for BRCA.
Funding: Supported by the Science and Technology Planning Project of Mudanjiang (G2015d1974) and the Funding Project of Training of Famous Teachers in Mudanjiang Normal University (2014QNGG1805).
Abstract: This study analyzed and predicted the following aspects of the isopentenyl pyrophosphate isomerases (IPIs) of five northern medicinal plants using bioinformatics methods and tools: physical and chemical properties, hydrophobicity/hydrophilicity, transmembrane domains, secondary structure, subcellular localization, and so on. The results showed that there was no notable difference among the physical and chemical properties of the IPIs of the five plants; the IPIs were mainly hydrophilic; subcellular localization placed the IPIs mainly in chloroplasts; serine sites were the most abundant phosphorylation sites; the secondary structures consisted mainly of α-helices and random coils; no signal peptide was present, indicating that IPI is a non-secreted protein; no transmembrane domain was present; and one functional domain was found, i.e., the Nudix Hydrolase Superfamily domain. This study is of significance for research on IPI gene functions, in-depth research on northern medicinal plants, improvement of their efficacy, and the rational development and utilization of medicinal plant resources.
Abstract: Ethylene plays an extensive role in plant growth and development. 1-Aminocyclopropane-1-carboxylate (ACC) oxidase (ACO) is the key enzyme in ethylene biosynthesis. In this study, a 354 base pair (bp) gDNA and a 213 bp cDNA candidate fragment were amplified from pepper with primers derived from the ACO sequence (AJ011109) reported by Ernesto. The putative new gene was analyzed with bioinformatics tools.
Funding: Supported by the Deutsche Forschungsgemeinschaft (SFB 548).
Abstract: AIM: To approach the elusive function of the SLA/LP molecule, we have characterized the genomic organization and the conservation of the major antigenic and functional properties of the SLA/LP molecule in various species. METHODS: By means of computational biology, we have characterized the complete SLA/LP gene, mRNA and deduced protein sequences in human, mouse, zebrafish, fly, and worm. RESULTS: The human SLA/LP gene sequence of approximately 39 kb, which maps to chromosome 4p15.2, is organized in 11 exons, of which 10 or 11 are translated, depending on the splice variant. Homologous molecules were identified in several biological model organisms. The various homologous protein sequences showed a high degree of similarity or homology, notably at those residues that are of functional importance. The only domain of the human protein sequence that lacks significant homology with homologous sequences is the major antigenic epitope recognized by autoantibodies from autoimmune hepatitis (AIH) patients. CONCLUSION: The SLA/LP molecule and its functionally relevant residues have been highly conserved throughout evolution, suggesting an indispensable function of the molecule. The finding that the only non-conserved domain is the dominant antigenic epitope of the human SLA/LP sequence suggests that SLA/LP autoimmunity is autoantigen-driven rather than driven by molecular mimicry.
Abstract: Polyploidy is common among agriculturally important crops. Popular genetic methods and their implementations cannot always be applied to polyploid genetic data. We give an overview of available tools and their limitations in terms of ploidy level and auto- versus allopolyploidy. The main classes of tools are genotype calling, linkage mapping and haplotyping. The usability of the tools is discussed with a focus on their applicability to data sets produced by state-of-the-art technologies. We show that many challenges remain before the toolset for polyploids provides functionality similar to that already available for diploids. Some tools were developed over a decade ago and are now outdated. In addition, we discuss the steps necessary to overcome this shortage in the future.
Abstract: Researchers in bioinformatics, biostatistics and other related fields seek biomarkers for many purposes, including risk assessment, disease diagnosis and prognosis, which can be formulated as patient classification problems. In this paper, a new method that uses a tree regression to improve a logistic classification model is introduced for biomarker data analysis. The numerical results show that the linear logistic model can be significantly improved by a tree regression on the residuals. Although this research discusses the classification of binary responses, the idea is easy to extend to the classification of multinomial responses.
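The idea of fitting a tree to the residuals of a linear logistic model can be illustrated with a minimal sketch. The abstract does not specify the procedure, so everything here is an assumption made for illustration: a one-feature logistic model fitted by gradient descent, residuals defined as the label minus the fitted probability, a depth-1 regression tree (a stump), and a combined score clipped to [0, 1]. All function names and the toy data are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit p = sigmoid(b0 + b1*x) by batch gradient descent on the log-loss."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(b0 + b1 * x) - y
            g0 += err
            g1 += err * x
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1

def fit_stump(xs, residuals):
    """Depth-1 regression tree: choose the split with the smallest squared error."""
    best = None
    order = sorted(set(xs))
    for lo, hi in zip(order, order[1:]):
        t = (lo + hi) / 2
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - ml) ** 2 for r in left) + sum((r - mr) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    _, t, ml, mr = best
    return t, ml, mr

def stump_predict(stump, x):
    t, ml, mr = stump
    return ml if x <= t else mr

def demo():
    # Toy data (assumed): the label depends on |x|, which a linear logit cannot capture.
    xs = [i * 0.5 for i in range(-6, 7)]
    ys = [1 if abs(x) > 1 else 0 for x in xs]
    b0, b1 = fit_logistic(xs, ys)
    probs = [sigmoid(b0 + b1 * x) for x in xs]
    residuals = [y - p for y, p in zip(ys, probs)]
    stump = fit_stump(xs, residuals)
    # Combined score: logistic probability plus tree correction, clipped to [0, 1].
    combined = [min(1.0, max(0.0, p + stump_predict(stump, x)))
                for x, p in zip(xs, probs)]
    sse_logit = sum((y - p) ** 2 for y, p in zip(ys, probs))
    sse_comb = sum((y - c) ** 2 for y, c in zip(ys, combined))
    return sse_logit, sse_comb
```

Calling `demo()` returns the squared error of the logistic model alone and of the logistic-plus-stump score. A single split can only correct one side of this toy pattern; a deeper tree would be needed to capture both.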
Abstract: Research on the discovery and development of new treatments for cutaneous leishmaniasis has been declared a priority. Using bioinformatics approaches, this study aimed to identify antileishmanial activity in drugs currently used as anti-inflammatory and wound-healing agents; the anti-Leishmania activity was then validated by in vitro and in vivo assays. In silico analysis identified 153 compounds, of which 87 were selected by data mining of the DrugBank database, and 22 and 44 were detected by PASS (http://pass.cribi.unipd.it) and BLAST (http://blast.ncbi.nlm.nih.gov/) alignment, respectively. The majority of the identified drugs are used as skin protectors, anti-acne agents, anti-ulcerative agents (wound healers) or anti-inflammatory agents, and few of them had specific antileishmanial activity. Antileishmanial efficacy was validated in vitro in 12 of 23 tested compounds and in all seven compounds evaluated in in vivo assays. Notably, this is the first report of antileishmanial activity for adapalene. In conclusion, bioinformatics tools can not only help to reduce the time and cost of the drug discovery process but may also, by combining different biological properties, increase the chance that candidates identified in silico have validated antileishmanial activity.
Funding: Supported by the Distinguished Young Scientists Project of Beijing (CIT&TCD201304096) and the Academic Degrees and Graduate Education Reform and Development Program of Beijing University of Agriculture (5056516002\016).
Abstract: [Objective] This study was conducted to clone and analyze the ERECTA-LIKE1 gene of Zea mays by PCR and bioinformatics methods and to construct the plant expression vector pCambia3301-zmERECTA-LIKE1. [Method] The zmERECTA-LIKE1 (zmERL1) gene was obtained using RT-PCR, and its physical and chemical properties were analyzed by bioinformatics methods, including domains, transmembrane regions, potential N-glycosylation sites, and phosphorylation sites. [Result] Bioinformatics results showed that the zmERL1 gene was 2,169 bp long and encoded a protein of 722 amino acids with 11 potential N-glycosylation sites and 42 kinase-specific phosphorylation sites. According to CDD2.23 and TMHMM Server v. 2.0, the protein contained leucine-rich repeats, a PKC domain and a transmembrane region. The theoretical pI and molecular weight of the zmERL1-encoded protein, calculated with the Compute pI/Mw tool, were 6.20 and 79,184.8, respectively. Furthermore, we constructed the plant expression vector pCambia3301-zmERECTA-LIKE1 by subcloning the zmERL1 gene into pCambia3301 in place of GUS. [Conclusion] The results provide a theoretical basis for the application of the zmERL1 gene in future studies.
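The reported gene length and protein length can be cross-checked with simple codon arithmetic. The sketch below assumes that the 2,169 bp figure is the coding sequence including its stop codon, which the abstract does not state explicitly; the function names are hypothetical.

```python
def protein_length(cds_bp, includes_stop=True):
    """Number of amino acids encoded by a coding sequence of cds_bp base pairs."""
    if cds_bp % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = cds_bp // 3
    # The stop codon terminates translation and contributes no residue.
    return codons - 1 if includes_stop else codons

def mean_residue_mass(molecular_weight, n_residues):
    """Average mass per residue, a rough plausibility check on a predicted weight."""
    return molecular_weight / n_residues
```

Under that assumption, `protein_length(2169)` agrees with the 722 amino acids reported, and `mean_residue_mass(79184.8, 722)` is close to the commonly quoted average of roughly 110 per residue, so the reported figures are mutually consistent.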
Funding: Supported by the National High Technology Research and Development Program of China (Grant No. 2011AA100403), the National Natural Science Foundation of China (Grant No. 30930071), the National Special Fund for Scientific Research in Public Benefits (Grant No. 200903046), the Specially-appointed Professor for Lotus Scholars Program of Hunan Province (Grant No. 080648), and the Doctoral Fund Priority Development Area (Grant No. 20114306130001).
Abstract: Polyploids are organisms with three or more complete chromosome sets. Polyploidization is widespread in plants and animals, and is an important mechanism of speciation. Genome sequencing and related molecular systematics and bioinformatics studies on plants and animals in recent years support the view that species have been shaped by whole genome duplication during evolution. The stability of polyploids depends on rapid genome recombination and changes in gene expression after formation. The formation of polyploids and subsequent diploidization are important aspects of long-term evolution. Polyploids can be formed in various ways. Among them, hybrid organisms formed by distant hybridization can produce unreduced gametes and thus generate offspring with doubled chromosomes, which is a fast, efficient method of polyploidization. The formation of fertile polyploids has not only promoted the exchange of genetic material among species and enriched species diversity, but has also laid the foundation for polyploid breeding. The study of polyploids has both important theoretical significance and valuable applications. The production and application of polyploid breeding have brought remarkable economic and social benefits.
Funding: Supported by the National Science and Technology Major Project of China (2010ZX09102-305), the National High-tech R&D Program of China (863 Program, 2012AA020307), the Introduction of Innovative R&D Team Program of Guangdong Province (2009010058), and the National Natural Science Foundation of China (81173470).
Abstract: Chemomics is an interdisciplinary study using approaches from chemoinformatics, bioinformatics, synthetic chemistry, and other related disciplines. Biological systems make natural products from endogenous small molecules (natural product building blocks) through a sequence of enzyme-catalyzed reactions. For each reaction, the natural product building blocks may contribute a group of atoms to the target natural product. We describe this group of atoms as a chemoyl. A chemome is the complete set of chemoyls in an organism. Chemomics studies chemomes and the principles of natural product synthesis and evolution. Driven by survival and reproductive demands, biological systems have developed effective protocols to synthesize natural products in order to respond to environmental changes; this results in biological and chemical diversity. In recent years, it has been realized that one of the bottlenecks in drug discovery is the lack of chemical resources for drug screening. Chemomics may solve this problem by revealing the rules governing the creation of chemical diversity in biological systems, and by developing biomimetic synthesis approaches to make quasi-natural product libraries for drug screening. This treatise introduces chemomics and outlines its contents and potential applications in the field of drug innovation.
Abstract: In motif discovery, especially the discovery of transcription factor DNA binding sites, an overly long input sequence tends to return non-informative motifs rather than biologically functional ones. This paper presents theoretical analyses and computational experiments that suggest length limits for the input sequence. When the sequence length exceeds a certain critical point, the probability of discovering the motif decreases sharply. The work not only offers an explanation for unsatisfactory results in existing motif discovery problems, namely that the input sequence may be too long and exceed this critical point, but also provides an estimate of the input sequence length that should be accepted to obtain more meaningful and reliable results in motif discovery.
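One way to see why long inputs hurt is to count how many chance matches a motif is expected to have in random sequence. The sketch below is not the paper's analysis, which the abstract does not detail; it assumes independent, uniformly distributed bases and a motif of width `w` matched with at most `d` mismatches, and its function names are hypothetical.

```python
from math import comb

def p_chance_match(w, d):
    """Probability that a random w-mer lies within d mismatches of a fixed motif,
    assuming independent bases with probability 1/4 each."""
    return sum(comb(w, k) * 0.75 ** k * 0.25 ** (w - k) for k in range(d + 1))

def expected_chance_hits(length, w, d):
    """Expected number of chance matches among the length - w + 1 windows
    of a random sequence."""
    windows = max(length - w + 1, 0)
    return windows * p_chance_match(w, d)
```

Because the expected number of chance hits grows linearly with sequence length, comparing `expected_chance_hits(length, w, d)` across a range of lengths shows where spurious occurrences begin to outnumber the handful of true sites, which is the intuition behind a critical input length.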
Abstract: Proteomics allows the large-scale study of protein expression either in whole organisms or in purified organelles. In particular, mass spectrometry (MS) analysis of gel-separated proteins produces data not only for protein identification, but for protein structure, location, and processing as well. An in-depth analysis was performed on MS data from etiolated hypocotyl cell wall proteomics of Arabidopsis thaliana. These analyses show that highly homologous members of multigene families can be differentiated. Two lectins presenting 93% amino acid identity were identified using peptide mass fingerprinting. Although the identification of structural proteins such as extensins or hydroxyproline/proline-rich proteins (H/PRPs) is arduous, different types of MS spectra were exploited to identify and characterize an H/PRP. Maturation events in a couple of cell wall proteins (CWPs) were analyzed using site mapping. N-glycosylation of CWPs as well as the hydroxylation or oxidation of amino acids were also explored, adding information to improve our understanding of CWP structure/function relationships. A bioinformatic tool was developed to locate, by means of MS, the N-terminus of mature secreted proteins and N-glycosylation sites.
Abstract: Although high-quality multiple sequence alignment is an essential task in bioinformatics, it has become a major challenge due to the explosive growth in the amount of molecular data. The most time- and space-consuming phase is the distance matrix computation. This paper addresses this issue by proposing a vectorized parallel method that performs the huge number of similarity comparisons faster and in less space. Performance tests on real biological datasets using core-iT show superior results in terms of time and space.
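The abstract does not describe its vectorized method, so the following is only a loosely related sketch of the general idea of comparing many positions in one operation. It assumes equal-length DNA sequences and Hamming distance, and uses bit-parallel integer operations in pure Python as a stand-in for hardware vector instructions. All names are hypothetical.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}

def pack(seq):
    """Pack a DNA string into one integer, 2 bits per base."""
    value = 0
    for base in seq:
        value = (value << 2) | CODE[base]
    return value

def hamming_packed(a, b, n):
    """Count differing bases between two packed sequences of n bases (n >= 1)."""
    low_bits = int("01" * n, 2)   # selects the low bit of every 2-bit pair
    x = a ^ b                     # a pair is nonzero exactly where the bases differ
    # Fold each pair's high bit onto its low bit, then count the set low bits.
    return bin((x | (x >> 1)) & low_bits).count("1")

def distance_matrix(seqs):
    """Symmetric Hamming distance matrix for equal-length DNA sequences."""
    n = len(seqs[0])
    if any(len(s) != n for s in seqs):
        raise ValueError("all sequences must have the same length")
    packed = [pack(s) for s in seqs]   # encode once, reuse for every comparison
    size = len(seqs)
    matrix = [[0] * size for _ in range(size)]
    for i in range(size):
        for j in range(i + 1, size):
            d = hamming_packed(packed[i], packed[j], n)
            matrix[i][j] = matrix[j][i] = d
    return matrix
```

Encoding each sequence once and comparing whole sequences with a single XOR also keeps memory use low, since each base occupies 2 bits rather than a full character. Only the upper triangle is computed because the matrix is symmetric.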