675 articles found
Prediction of Lung Cancer Stage Using Tumor Gene Expression Data
1
Author: Yadi Gu. Journal of Cancer Therapy, 2024, No. 8, pp. 287-302 (16 pages)
Lung cancer remains a significant global health challenge and identifying lung cancer at an early stage is essential for enhancing patient outcomes. The study focuses on developing and optimizing gene expression-based models for classifying cancer types using machine learning techniques. By applying Log2 normalization to gene expression data and conducting Wilcoxon rank sum tests, the researchers employed various classifiers and Incremental Feature Selection (IFS) strategies. The study culminated in two optimized models using the XGBoost classifier, comprising 10 and 74 genes respectively. The 10-gene model, due to its simplicity, is proposed for easier clinical implementation, whereas the 74-gene model exhibited superior performance in terms of Specificity, AUC (Area Under the Curve), and Precision. These models were evaluated based on their sensitivity, AUC, and specificity, aiming to achieve high sensitivity and AUC while maintaining reasonable specificity.
Keywords: Lung Cancer Detection; Stage Prediction; Gene Expression Data; XGBoost; Machine Learning
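The pre-processing and gene-ranking steps named in the abstract above (Log2 normalization followed by a Wilcoxon rank sum test per gene) can be sketched roughly as follows. This is a minimal illustration, not the study's pipeline: the pseudocount, the normal approximation without tie correction, the helper names and the toy values are all assumptions.

```python
import math

def log2_normalize(values, pseudocount=1.0):
    """Log2-transform expression values; the pseudocount (assumed) avoids log2(0)."""
    return [math.log2(v + pseudocount) for v in values]

def rank_sum_statistic(group_a, group_b):
    """Sum of the ranks of group_a in the pooled sample; ties get average ranks."""
    pooled = sorted(group_a + group_b)
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        i = j
    return sum(rank_of[v] for v in group_a)

def rank_sum_z(group_a, group_b):
    """Normal-approximation z-score of the rank sum (no tie correction)."""
    n_a, n_b = len(group_a), len(group_b)
    w = rank_sum_statistic(group_a, group_b)
    mean = n_a * (n_a + n_b + 1) / 2
    sd = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    return (w - mean) / sd

def rank_genes(expression_by_gene):
    """Order genes by |z| between two sample groups, most separated first."""
    scores = {
        gene: abs(rank_sum_z(log2_normalize(a), log2_normalize(b)))
        for gene, (a, b) in expression_by_gene.items()
    }
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical toy data: (group 1 samples, group 2 samples) per gene.
toy = {
    "GENE_A": ([1, 2, 3], [40, 50, 60]),
    "GENE_B": ([5, 6, 7], [6, 5, 8]),
}
print(rank_genes(toy))
```

A ranked list of this kind is the usual input to incremental feature selection, where genes are added to a classifier one at a time in rank order.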
A Novel Soft Clustering Approach for Gene Expression Data
2
Authors: E. Kavitha, R. Tamilarasan, Arunadevi Baladhandapani, M. K. Jayanthi Kannan. Computer Systems Science & Engineering (SCIE, EI), 2022, No. 12, pp. 871-886 (16 pages)
Gene expression data represents a condition matrix where each row represents a gene and each column shows a condition. Microarrays are used to detect gene expression in the lab for thousands of genes at a time. Genes encode proteins, which in turn dictate cell function. The production of messenger RNA and its processing are the two main stages involved in gene expression. The complexity of biological networks, together with the volume of data containing imprecision and outliers, increases the challenges in dealing with them. Clustering methods are hence essential to identify the patterns present in massive gene data. Many techniques involve hierarchical, partitioning, grid-based, density-based, model-based and soft clustering approaches for dealing with gene expression data. Understanding gene regulation and other useful information from this data is possible only through effective clustering algorithms. Though many methods are discussed in the literature, we concentrate on providing a soft clustering approach for analyzing gene expression data. The population elements are grouped based on the fuzziness principle and a degree of membership is assigned to all the elements. An improved Fuzzy clustering by Local Approximation of Memberships (FLAME) is proposed in this work, which overcomes the limitations of the other approaches in dealing with non-linear relationships and provides better segregation of biological functions.
Keywords: reinforcement; membership; centroid; threshold; statistics; bioinformatics; gene expression data
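To make the "degree of membership" idea in the abstract above concrete, here is a small sketch of soft assignment. It uses the standard fuzzy c-means membership formula, which is swapped in for illustration and is not the improved FLAME procedure the paper proposes; the fuzzifier `m = 2` and the toy coordinates are assumptions.

```python
import math

def fuzzy_memberships(point, centroids, m=2.0):
    """Fuzzy c-means membership of one point in each cluster.

    u_i = 1 / sum_j (d_i / d_j) ** (2 / (m - 1)), where d_i is the distance
    from the point to centroid i. The memberships sum to 1.
    """
    dists = [math.dist(point, c) for c in centroids]
    # A point that coincides with a centroid belongs fully to that cluster.
    if any(d == 0 for d in dists):
        hits = [1.0 if d == 0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** power for d_j in dists) for d_i in dists]

# Hypothetical two-condition expression profile and two cluster centroids.
print(fuzzy_memberships((0.0, 0.0), [(1.0, 0.0), (3.0, 0.0)]))
```

Unlike hard clustering, every gene keeps a graded membership in every cluster, so a gene involved in several biological functions is not forced into a single group.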
Deep Learning Enabled Microarray Gene Expression Classification for Data Science Applications
3
Authors: Areej A. Malibari, Reem M. Alshehri, Fahd N. Al-Wesabi, Noha Negm, Mesfer Al Duhayyim, Anwer Mustafa Hilal, Ishfaq Yaseen, Abdelwahed Motwakel. Computers, Materials & Continua (SCIE, EI), 2022, No. 11, pp. 4277-4290 (14 pages)
In bioinformatics applications, examination of microarray data has received significant interest for diagnosing diseases. Microarray gene expression data can be defined by a massive search space that poses a primary challenge in the appropriate selection of genes. Microarray data classification incorporates multiple disciplines such as bioinformatics, machine learning (ML), data science, and pattern classification. This paper designs an optimal deep neural network based microarray gene expression classification (ODNN-MGEC) model for bioinformatics applications. The proposed ODNN-MGEC technique performs a data normalization process to bring the data onto a uniform scale. In addition, an improved fruit fly optimization (IFFO) based feature selection technique is used to reduce the high dimensionality of the biomedical data. Moreover, a deep neural network (DNN) model is applied for the classification of microarray gene expression data, and the hyperparameter tuning of the DNN model is carried out using the Symbiotic Organisms Search (SOS) algorithm. The utilization of the IFFO and SOS algorithms paves the way for accomplishing maximum gene expression classification outcomes. To examine the improved outcomes of the ODNN-MGEC technique, a wide-ranging experimental analysis is made against benchmark datasets. The extensive comparison study with recent approaches demonstrates the enhanced outcomes of the ODNN-MGEC technique in terms of different measures.
Keywords: bioinformatics; data science; microarray; gene expression; data classification; deep learning; metaheuristics
Analysis of Gene Expression Profiles of Rice Mutant SLR1 Based on Microarray Data
4
Authors: Weihua LIU, Yue CHEN, Lingxian WANG, Ge HUANG, Qian ZOU, Zhenhua ZHU, Mingliang DING. Asian Agricultural Research, 2019, No. 1, pp. 54-55, 59 (3 pages)
Gibberellins are an important class of plant hormones. They play an important regulatory role in all stages of growth and development of higher plants. The use of mutants to study gibberellin metabolism and signal transduction pathways is currently a research hotspot. Taking Affymetrix chip data for rice as an example, this article uses bioinformatics methods to study the rice SLR1 mutant and mine genes differentially expressed relative to the wild type, thus exploring the expression regulation network of genes related to the gibberellin signaling pathway.
Keywords: gibberellin; gene chip; data mining
Incorporating heterogeneous biological data sources in clustering gene expression data
5
Authors: Gang-Guo Li, Zheng-Zhi Wang. Health, 2009, No. 1, pp. 17-23 (7 pages)
In this paper, a similarity measure between genes with protein-protein interactions is proposed. The ChIP-chip data are converted into the same form as gene expression data, with Pearson correlation as the similarity measure. On the basis of the similarity measures for protein-protein interaction data and ChIP-chip data, a combined dissimilarity measure is defined. The combined distance measure is introduced into the K-means method, which can be considered an improved K-means method. The improved K-means method and three other clustering methods are evaluated on a real dataset. Performance of these methods is assessed by a prediction accuracy analysis using known gene annotations. Our results show that the improved K-means method outperforms the other clustering methods. The performance of the improved K-means method is also tested by varying the tuning coefficients of the combined dissimilarity measure. The results show that it is very helpful and meaningful to incorporate heterogeneous data sources in clustering gene expression data, and that the coefficients for genome-wide or completed data sources should be given larger values when constructing the combined dissimilarity measure.
Keywords: Statistical Analysis; Similarity/Dissimilarity Measure; Gene Expression Data; Clustering; Data Fusion
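The abstract above does not give the formula for the combined dissimilarity measure, so the following is only a plausible sketch of the idea: turn each data source into a dissimilarity and take a weighted average, with the weights standing in for the paper's tuning coefficients. The `1 - similarity` conversion, the weighted-average form and all names are assumptions.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def combined_dissimilarity(expr_a, expr_b, chip_a, chip_b, ppi_similarity,
                           weights=(1.0, 1.0, 1.0)):
    """Weighted average of three per-source dissimilarities (assumed form).

    expr_*: expression profiles; chip_*: ChIP-chip profiles in the same form;
    ppi_similarity: a precomputed similarity in [0, 1] from interaction data.
    """
    w_expr, w_chip, w_ppi = weights
    total = (
        w_expr * (1 - pearson(expr_a, expr_b))
        + w_chip * (1 - pearson(chip_a, chip_b))
        + w_ppi * (1 - ppi_similarity)
    )
    return total / (w_expr + w_chip + w_ppi)

# Two hypothetical genes: anti-correlated expression, identical binding profile.
print(combined_dissimilarity([1, 2, 3], [3, 2, 1], [1, 2, 3], [1, 2, 3], 1.0))
```

A distance of this kind can replace the Euclidean distance in the assignment step of K-means; raising a weight makes that data source dominate the clustering.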
Genome-Wide Identification of Genes Responsive to ABA and Cold/Salt Stresses in Gossypium hirsutum by Data-Mining and Expression Pattern Analysis
6
Authors: ZHU Long-fu, HE Xin, YUAN Dao-jun, XU Lian, XU Li, TU Li-li, SHEN Guo-xin, ZHANG Hong, ZHANG Xian-long. Agricultural Sciences in China (CAS, CSCD), 2011, No. 4, pp. 499-508 (10 pages)
To make better use of the nucleic acid resources of Gossypium hirsutum, a data-mining method was used to identify putative genes responsive to various abiotic stresses in G. hirsutum. Based on a compiled database of genes involved in abiotic stress response in Arabidopsis thaliana and the comprehensive analysis tool GENEVESTIGATOR v3, 826 genes significantly up-regulated or down-regulated in roots or leaves during salt or cold treatment in Arabidopsis were identified. From comparison with these 826 annotated Arabidopsis genes, 38 homologous expressed sequence tags (ESTs) from G. hirsutum were selected randomly and their expression patterns were studied using a quantitative real-time reverse transcription-polymerase chain reaction method. Among these 38 ESTs, about 55% of the genes (21 of 38) differed in their response to ABA between cotton and Arabidopsis, whereas 70% of genes had similar responses to cold and salt treatments; some of them, which have not been characterized in Arabidopsis, are now being investigated in gene function studies. According to these results, this approach of analyzing ESTs appears effective for large-scale identification of cotton genes involved in abiotic stress and might be adopted to determine gene functions in various biological processes in cotton.
Keywords: cold stress; salt stress; data-mining; gene; Gossypium hirsutum
Challenges Analyzing RNA-Seq Gene Expression Data
7
Authors: Liliana López-Kleine, Cristian González-Prieto. Open Journal of Statistics, 2016, No. 4, pp. 628-636 (9 pages)
The analysis of messenger ribonucleic acid data obtained through sequencing techniques (RNA-sequencing) is very challenging. Once technical difficulties have been sorted out, an important choice has to be made during pre-processing. Two different paths can be chosen: transform RNA-sequencing count data to a continuous variable, or continue to work with count data. For each data type, analysis tools have been developed and seem appropriate at first sight, but a deeper analysis of data distribution and structure is worth discussing. In this review, open questions regarding the nature of RNA-sequencing data are discussed and highlighted, indicating important future research topics in statistics that should be addressed for a better analysis of already available and newly appearing gene expression data. Moreover, a comparative analysis of RNA-seq count data and transformed data is presented. This comparison indicates that transforming RNA-seq count data seems appropriate, at least for differential expression detection.
Keywords: RNA-Seq Analysis; Count Data; Preprocessing; Differential Expression; Gene Co-Expression Network
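As an illustration of the first path described in the abstract above (turning counts into a continuous variable), one widely used transformation is log2 counts per million. The review does not say which transformation it compares, so this choice, the pseudocount of 1 and the toy counts are assumptions.

```python
import math

def log_cpm(counts, pseudocount=1.0):
    """Convert one sample's raw read counts to log2(counts per million + pseudocount)."""
    library_size = sum(counts)
    if library_size == 0:
        raise ValueError("sample has no reads")
    return [math.log2(c / library_size * 1e6 + pseudocount) for c in counts]

# Hypothetical read counts for three genes in a single sample.
print(log_cpm([0, 10, 10]))
```

Scaling by library size makes samples with different sequencing depths comparable, and the log compresses the long right tail of count data so that methods designed for continuous data behave more reasonably.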
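Application of Gene Ontology in Biological Data Integration (Cited by: 8)
8
Authors: 夏燕, 张忠平, 曹顺良, 朱扬勇, 李亦学. 《计算机工程》 (Computer Engineering; EI, CAS, CSCD, PKU Core), 2005, No. 2, pp. 57-58, 76 (3 pages)
Efficient integration of heterogeneous data has important theoretical value and practical significance today, as biological data grow explosively and biological databases become ever more complex. Based on BioDW, an integrated bioinformatics data warehouse platform, this paper uses the unified Gene Ontology semantic model to establish semantic links between heterogeneous databases. This effectively solves the problem of integrating heterogeneous biological data at the level of concepts and relationships, enables intelligent multiple, compound and cross retrieval of biological data, and lays a solid foundation for further research in bioinformatics.
Keywords: biology; integration problem; practice; retrieval; data integration; hierarchy; relationship; heterogeneous database; semantic model; data warehouse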
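DENGENE: A High-Accuracy Density-Based Clustering Algorithm for Gene Expression Data (Cited by: 1)
9
Authors: 孙亮, 赵芳, 王永吉. 《计算机应用研究》 (Application Research of Computers; CSCD, PKU Core), 2007, No. 4, pp. 58-61 (4 pages)
Based on the characteristics of gene expression data, a high-accuracy density-based clustering algorithm, DENGENE, is proposed. By defining a consistency check and introducing peak points to improve the search direction, DENGENE handles gene expression data better. To evaluate the algorithm's performance, two widely used test datasets, both Saccharomyces cerevisiae (brewer's yeast) gene expression datasets, were selected for testing. Experimental results show that, compared with five model-based algorithms, the CAST algorithm, K-means clustering and others, DENGENE achieves significant improvements in noise filtering and clustering accuracy.
Keywords: gene expression data; cluster analysis; density-based clustering; consistency check; peak point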
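GEO (Gene Expression Omnibus): A High-Throughput Gene Expression Database (Cited by: 9)
10
Authors: 刘华, 马文丽, 郑文岭. 《中国生物化学与分子生物学报》 (Chinese Journal of Biochemistry and Molecular Biology; CAS, CSCD, PKU Core), 2007, No. 3, pp. 236-244 (9 pages)
The GEO (Gene Expression Omnibus) database covers a broad range of high-throughput experimental data, including single-channel and dual-channel microarray-based measurements of mRNA abundance, and experimental data on genomic DNA and protein molecules. Data from non-array-based high-throughput functional genomics and proteomics technologies are also archived, such as serial analysis of gene expression (SAGE) and protein identification technologies. To date, the GEO database contains data covering 10000 hybridization experiments and SAGE libraries from 30 different organisms. This paper outlines querying and browsing of the GEO database, data download and formats, data analysis, and storage and updating; it focuses on the use of controlled vocabularies in the GEO data browser and describes data mining in GEO and the prospects for applying GEO in molecular biology. GEO is publicly accessible at http://www.ncbi.nlm.nih.gov/projects/geo/.
Keywords: gene expression; database; controlled vocabulary; data mining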
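Application of GeneSifter in Data Mining of Gene Expression Profile Microarrays (Cited by: 5)
11
Authors: 廖之君, 马文丽, 梁爽, 郑文岭. 《医学信息(西安上半月)》 (Medical Information), 2007, No. 11, pp. 1882-1887 (6 pages)
This paper introduces GeneSifter, a microarray data analysis tool that is fast, intuitive and convenient, and is particularly suited to data mining of gene expression profiles. Microarray data are generally uploaded as formatted text files; depending on the experimental purpose and design, there are four upload wizards in total. Data analysis is entered via Pairwise or Projects under the Analysis item of the control panel, and requires setting parameters such as filtering, thresholds and statistical analysis. Pairwise yields results including a list of significantly differentially expressed genes, a gene ontology report and a KEGG pathway report. Projects is more powerful and yields six kinds of results, including clustering. GeneSifter's distinctive one-stop, one-click setup provides up-to-date links to 11 databases for the genes of interest. GeneSifter is suitable for biological researchers engaged in microarray data mining.
Keywords: GeneSifter software; data mining; gene ontology terms; KEGG pathway; clustering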
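Parallel Design and Optimization of the Gene Panel Pipeline (Cited by: 1)
12
Authors: 王元戎, 曾平, 臧大伟, 谭光明, 孙凝晖. 《计算机学报》 (Chinese Journal of Computers; EI, CSCD, PKU Core), 2019, No. 11, pp. 2429-2446 (18 pages)
With the rapid development of second-generation sequencing technology, the cost of gene sequencing has dropped quickly, leading to explosive growth of genomic data, and genomic data analysis tools are increasingly unable to meet the demands of analysis at this scale. On the one hand, most genomic data analysis tools still execute serially and cannot effectively exploit multi-core architectures to improve performance, which seriously wastes computing resources. On the other hand, owing to limitations in their early design and development, the underlying algorithm libraries on which the analysis tools depend cannot offer both high performance and a friendly user interface. Gene Panel is a current mainstream genomic data analysis pipeline for cancer detection, and is itself composed of multiple genomic data analysis tools. Targeting the Gene Panel pipeline, this paper: (1) designs and implements a new parallel Gene Panel genomic data analysis pipeline that, through the two main approaches of data parallelism and task parallelism combined with load balancing and other optimizations, effectively improves resource utilization on multi-core platforms and achieves an overall speedup of 4 to 7 times; (2) designs and implements HCC, a high-performance underlying library for genomic data analysis with a friendly interface. Because the algorithmic characteristics are similar, the optimization methods in this paper also apply to sequencing pipelines other than Gene Panel.
Keywords: big data; Gene Panel; parallel optimization; load balancing; underlying library optimization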
Identify the signature genes for diagnose of uveal melanoma by weight gene co-expression network analysis (Cited by: 10)
13
Authors: Kai Shi, Zhi-Tong Bing, Gui-Qun Cao, Ling Guo, Ya-Na Cao, Hai-Ou Jiang, Mei-Xia Zhang. International Journal of Ophthalmology (English edition) (SCIE, CAS), 2015, No. 2, pp. 269-274 (6 pages)
AIM: To identify and understand the relationship between co-expression patterns and clinical traits in uveal melanoma, weighted gene co-expression network analysis (WGCNA) is applied to investigate gene expression levels and patient clinical features. Uveal melanoma is the most common primary eye tumor in adults. Although many studies have identified important genes and pathways relevant to the progression of uveal melanoma, the relationship between co-expression and clinical traits at the systems level is still unclear. We employ WGCNA to investigate the relationship between molecular data and phenotype in this study. METHODS: Gene expression profiles of uveal melanoma and patient clinical traits were collected from the Gene Expression Omnibus (GEO) database. Gene co-expression is calculated with WGCNA, an R software package. The package is used to analyze the correlation between pairs of gene expression levels. The functions of the genes were annotated by gene ontology (GO). RESULTS: In this study, we identified four co-expression modules significantly correlated with clinical traits. Module blue positively correlates with radiotherapy treatment. Module purple positively correlates with tumor location (sclera) and negatively correlates with patient age. Module red positively correlates with sclera and negatively correlates with tumor thickness. Module black positively correlates with the largest tumor diameter (LTD). Additionally, we identified the hub gene (top connectivity with other genes) in each module. The hub genes RPS15A, PTGDS, CD53 and MSI2 might play a vital role in the progression of uveal melanoma. CONCLUSION: From WGCNA and hub gene calculation, we identified that RPS15A, PTGDS, CD53 and MSI2 might be targets or diagnostic markers for uveal melanoma.
Keywords: weighted gene co-expression network analysis; microarray data; gene ontology
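The two ideas the abstract above relies on, a weighted co-expression network and a hub gene with top connectivity, can be sketched in a few lines. In an unsigned WGCNA-style network the adjacency between two genes is the absolute Pearson correlation raised to a soft-threshold power, and a gene's connectivity is the sum of its adjacencies. The power `beta = 6`, the gene labels and the toy profiles are assumptions, and the study itself used the WGCNA R package rather than code like this.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def adjacency(profiles, beta=6):
    """Unsigned soft-threshold adjacency: a_ij = |cor(x_i, x_j)| ** beta."""
    genes = list(profiles)
    return {
        g: {h: abs(pearson(profiles[g], profiles[h])) ** beta
            for h in genes if h != g}
        for g in genes
    }

def hub_gene(profiles, beta=6):
    """Gene with the highest connectivity (sum of adjacencies to all others)."""
    adj = adjacency(profiles, beta)
    connectivity = {g: sum(row.values()) for g, row in adj.items()}
    return max(connectivity, key=connectivity.get)

# Hypothetical expression profiles over four samples.
profiles = {
    "A": [1, 2, 3, 4],
    "B": [2, 4, 6, 9],
    "C": [1, 0, 1, 0],
}
print(hub_gene(profiles))
```

Raising the correlation to a power suppresses weak correlations while keeping the network weighted, which is why no hard cut-off on correlation is needed.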
A Survey on Acute Leukemia Expression Data Classification Using Ensembles
14
Authors: Abdel Nasser H. Zaied, Ehab Rushdy, Mona Gamal. Computer Systems Science & Engineering (SCIE, EI), 2023, No. 11, pp. 1349-1364 (16 pages)
Acute leukemia is an aggressive disease that has high mortality rates worldwide. The error rate can be as high as 40% when classifying acute leukemia into its subtypes. So, there is an urgent need to support hematologists during the classification process. More than two decades ago, researchers used microarray gene expression data to classify cancer and adopted acute leukemia as a test case. The high classification accuracy they achieved confirmed that it is possible to classify cancer subtypes using microarray gene expression data. Ensemble machine learning is an effective method that combines individual classifiers to classify new samples. Ensemble classifiers are recognized as powerful algorithms with numerous advantages over traditional classifiers. Over the past few decades, researchers have focused a great deal of attention on ensemble classifiers in a wide variety of fields, including but not limited to disease diagnosis, finance, bioinformatics, healthcare, manufacturing, and geography. This paper reviews the recent ensemble classifier approaches utilized for acute leukemia gene expression data classification. Moreover, a framework for classifying acute leukemia gene expression data is proposed. The pairwise correlation gene selection method and the Rotation Forest of Bayesian Networks are both used in this framework. Experimental outcomes show that the classification accuracy achieved by the acute leukemia ensemble classifiers constructed according to the suggested framework is good compared to the classification accuracy achieved in other studies.
Keywords: leukemia; classification; ensemble; rotation forest; pairwise correlation; Bayesian networks; gene expression data; microarray; gene selection
Data Mining Based on Principal Component Analysis Application to the Nitric Oxide Response in Escherichia coli
15
Authors: AiLing Teh, Donovan Layton, Daniel R. Hyduke, Laura R. Jarboe, Derrick K. Rollins Sd. Journal of Statistical Science and Application, 2014, No. 1, pp. 1-18 (18 pages)
This work evaluates a recently developed multivariate statistical method based on the creation of pseudo or latent variables using principal component analysis (PCA). The application is the data mining of gene expression data to find a small subset of the most important genes in a set of thousands or tens of thousands of genes from a relatively small number of experimental runs. The method was previously developed and evaluated on artificially generated data and real data sets. Its evaluations consisted of its ability to rank the genes against known truth in simulated data studies and to identify known important genes in real data studies. The purpose of the work described here is to identify a ranked set of genes in an experimental study and then, for a few of the most highly ranked unverified genes, experimentally verify their importance. This method was evaluated using the transcriptional response of Escherichia coli to treatment with four distinct inhibitory compounds: nitric oxide, S-nitrosoglutathione, serine hydroxamate and potassium cyanide. Our analysis identified genes previously recognized in the response to these compounds and also identified new genes. Three of these new genes, ycbR, yJhA and yahN, were found to significantly (p-values < 0.002) affect the sensitivity of E. coli to nitric oxide-mediated growth inhibition. Given that the three genes were not highly ranked in the selected ranked set (RS), these results support strong sensitivity in the ability of the method to successfully identify genes related to challenge by NO and GSNO. This ability to identify genes related to the response to an inhibitory compound is important for engineering tolerance to inhibitory metabolic products, such as biofuels, and utilization of cheap sugar streams, such as biomass-derived sugars or hydrolysate.
Keywords: data mining; principal component analysis (PCA); gene expression data analysis
Modeling of gene regulatory networks: A review
16
Authors: Nedumparambathmarath Vijesh, Swarup Kumar Chakrabarti, Janardanan Sreekumar. Journal of Biomedical Science and Engineering, 2013, No. 2, pp. 223-231 (9 pages)
Gene regulatory networks play an important role in the molecular mechanisms underlying biological processes. Modeling of these networks is an important challenge to be addressed in the post-genomic era. Several methods have been proposed for estimating gene networks from gene expression data. Computational methods for developing network models and analyzing their functionality have proved to be valuable tools in bioinformatics applications. In this paper we review the different methods for reconstructing gene regulatory networks.
Keywords: Gene Network; Gene Expression Data; Gene Regulation
The application of hidden markov model in building genetic regulatory network
17
Authors: Rui-Rui Ji, Ding Liu, Wen Zhang. Journal of Biomedical Science and Engineering, 2010, No. 6, pp. 633-637 (5 pages)
The research focus in the post-genomic era is moving from sequence to function. Building a genetic regulatory network (GRN) can help in understanding the regulatory mechanisms between genes and the function of organisms. Probabilistic GRNs have received more attention recently. This paper discusses the Hidden Markov Model (HMM) approach as a tool to build a GRN. Different genes with similar expression levels are considered as different states when training the HMM. The probable regulatory genes of target genes can be found through the resulting state transition matrix, and the determinate regulatory functions can be predicted using a nonlinear regression algorithm. The experiments on artificial and real-life datasets show the effectiveness of HMMs in building GRNs.
Keywords: Genetic Regulatory Network; Hidden Markov Model; State Transition; Gene Expression Data
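The state transition matrix mentioned in the abstract above can be illustrated with a much simpler stand-in than full HMM training: discretize an expression time series into states and estimate transition probabilities by counting. This is a first-order Markov chain estimate, not the paper's method; the thresholds, the three-level state coding and the toy series are assumptions.

```python
def discretize(series, thresholds=(0.33, 0.66)):
    """Map each expression value to a state index: the number of thresholds it exceeds."""
    return [sum(value > t for t in thresholds) for value in series]

def transition_matrix(states, n_states):
    """Row-normalized counts of transitions between consecutive states."""
    counts = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # A state that is never left gets an all-zero row instead of a division by zero.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix

# Hypothetical normalized expression time series for one gene.
series = [0.1, 0.5, 0.9, 0.8, 0.2, 0.4]
states = discretize(series)
print(states)
print(transition_matrix(states, n_states=3))
```

In a real HMM the states are hidden and the matrix is typically fitted with an algorithm such as Baum-Welch, but the resulting matrix is read the same way: entry (i, j) is the estimated probability of moving from state i to state j.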
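Bioinformatics Analysis of Vascular Samples Identifies Potential Key Genes Associated with Moyamoya Disease
18
Authors: 刘洋, 杨俊华, 吴俊, 王硕. 《中国卒中杂志》 (Chinese Journal of Stroke; PKU Core), 2024, No. 4, pp. 431-439 (9 pages)
Objective: To identify and analyze, using bioinformatics, differentially expressed genes (DEGs) in vascular samples from patients with moyamoya disease, with the aim of exploring the potential pathogenesis of the disease. Methods: Cerebral vascular samples from patients with moyamoya disease and patients with internal carotid artery aneurysm were studied. The R package limma (linear models for microarray data) was used to analyze the GSE141025 dataset in the Gene Expression Omnibus (GEO). The dataset covers one middle cerebral artery sample and one superficial temporal artery sample from each of 4 moyamoya patients and 4 internal carotid artery aneurysm patients, 16 samples in total. Twelve samples (middle cerebral artery and superficial temporal artery from moyamoya patients, and superficial temporal artery from aneurysm patients) were selected for DEG screening. Gene ontology (GO) enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis of the screened DEGs were performed with the R functional enrichment package clusterProfiler. A protein-protein interaction (PPI) network was built with the STRING database, and the network visualization software Cytoscape was used to visualize the protein network and screen hub genes. Results: 138 DEGs were identified between the middle cerebral artery and superficial temporal artery samples of moyamoya patients, comprising 18 up-regulated and 120 down-regulated genes. GO enrichment analysis showed that these DEGs were significantly enriched in terms such as extracellular matrix, receptor ligand activity and growth factor activity, which may be related to the vascular lesions and neuroprotective mechanisms associated with moyamoya disease. KEGG pathway analysis suggested that the DEGs were mainly enriched in the tyrosine metabolism pathway. PPI network analysis screened 9 hub genes: periostin (POSTN), brain derived neurotrophic factor (BDNF), platelet derived growth factor receptor alpha (PDGFRA), Thy-1 cell surface antigen (THY1), collagen type XV alpha 1 chain (COL15A1), fibroblast growth factor 7 (FGF7), lumican (LUM), laminin subunit alpha 2 (LAMA2) and reelin (RELN). In addition, this study is the first to find that the up-regulated gene delta like canonical Notch ligand 4 (DLL4) may play an important role in moyamoya disease, possibly related to its pathological angiogenesis. Conclusion: Dysregulated expression of extracellular matrix components, growth factors and their receptors may be involved in the pathogenesis of moyamoya disease. The hub genes screened by DEG analysis (POSTN, BDNF, PDGFRA, THY1, COL15A1, FGF7, LUM, LAMA2, RELN) and DLL4 may play a role in the pathological development of moyamoya disease.
Keywords: moyamoya disease; bioinformatics analysis; gene expression data; hub gene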
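Effect of HPRT1 Gene Expression on Overall Survival of Patients with Lung Adenocarcinoma
19
Authors: 杨红秀, 朱中山. 《昆明医科大学学报》 (Journal of Kunming Medical University; CAS), 2024, No. 8, pp. 17-23 (7 pages)
Objective: To explore the expression characteristics of the HPRT1 gene in lung adenocarcinoma and its effects on overall survival, functional activation and immune infiltration. Methods: TCGA lung adenocarcinoma data and lung adenocarcinoma data from several GEO datasets were mined to compare and validate the relationship between HPRT1 expression and overall survival (OS). clusterProfiler was used to analyze the functional enrichment of genes up-regulated in the HPRT1 high-expression group. The TIMER and CIBERSORT algorithms were used to estimate the infiltration levels of different immune cells in lung adenocarcinoma and to compare the degree of infiltration between the HPRT1 high- and low-expression groups. Results: HPRT1 expression was significantly up-regulated in lung adenocarcinoma tissue, and TCGA data showed that patients with high HPRT1 expression had worse OS (P < 0.01). Validation in 2 GEO datasets likewise showed that high HPRT1 expression was associated with worse OS (GSE13213, P < 0.01; GSE67639, P < 0.001). Differential analysis showed that 683 genes were significantly up-regulated in high-expression patients, and that these up-regulated genes were mainly enriched in cancer-related signaling pathways such as p53 and the cell cycle. Immune cell infiltration analysis with TIMER and CIBERSORT found that B cell and CD4 T cell levels were both lower in the high-expression, poor-prognosis group (P < 0.05). Conclusion: The higher the HPRT1 expression, the worse the overall survival of lung adenocarcinoma patients. In high-expression patients the p53 signaling pathway is up-regulated and infiltration of B cells and CD4 T cells is significantly reduced.
Keywords: lung adenocarcinoma; HPRT1 gene; overall survival; data mining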
Integration of genome scale data for identifying new players in colorectal cancer
20
Authors: Viktorija Sokolova, Elisabetta Crippa, Manuela Gariboldi. World Journal of Gastroenterology (SCIE, CAS), 2016, No. 2, pp. 534-545 (12 pages)
Colorectal cancers (CRCs) display a wide variety of genomic aberrations that may be either causally linked to their development and progression, or might serve as biomarkers for their presence. Recent advances in rapid high-throughput genetic and genomic analysis have helped to identify a plethora of alterations that can potentially serve as new cancer biomarkers, and thus help to improve CRC diagnosis, prognosis, and treatment. Each distinct data type (copy number variations, gene and microRNA expression, CpG island methylation) provides an investigator with a different, partially independent, and complementary view of the entire genome. However, elucidation of gene function will require more information than can be provided by analyzing a single type of data. The integration of knowledge obtained from different sources is becoming increasingly essential for obtaining an interdisciplinary view of large amounts of information, and also for cross-validating experimental results. The integration of numerous types of genetic and genomic data derived from public sources, via the use of ad-hoc bioinformatics tools and statistical methods, facilitates the discovery and validation of novel, informative biomarkers. This combinatory approach will also enable researchers to more accurately and comprehensively understand the associations between different biological pathways, mechanisms, and phenomena, and gain new insights into the etiology of CRC.
Keywords: Colorectal cancer; Copy number variations; Gene expression; miRNA expression; Methylome; Data integration