Journal literature search: 9 articles found
1. A Novel Soft Clustering Approach for Gene Expression Data
Authors: E. Kavitha, R. Tamilarasan, Arunadevi Baladhandapani, M. K. Jayanthi Kannan
Source: Computer Systems Science & Engineering (SCIE, EI), 2022, No. 12, pp. 871-886 (16 pages)
Gene expression data represent a condition matrix in which each row represents a gene and each column a condition. Microarrays are used to detect the expression of thousands of genes at a time in the lab. Genes encode proteins, which in turn dictate cell function. The production of messenger RNA and its processing are the two main stages of gene expression. The complexity of biological networks, combined with the volume of data containing imprecision and outliers, increases the challenges in dealing with them. Clustering methods are hence essential to identify the patterns present in massive gene data. Many techniques involve hierarchical, partitioning, grid-based, density-based, model-based and soft clustering approaches for dealing with gene expression data. Understanding gene regulation and extracting other useful information from these data is possible only through effective clustering algorithms. Though many methods are discussed in the literature, we concentrate on providing a soft clustering approach for analyzing gene expression data. The population elements are grouped based on the fuzziness principle, and a degree of membership is assigned to all the elements. An improved Fuzzy clustering by Local Approximation of Memberships (FLAME) is proposed in this work, which overcomes the limitations of the other approaches in dealing with non-linear relationships and provides better segregation of biological functions.
Keywords: reinforcement; membership; centroid; threshold; statistics; bioinformatics; gene expression data
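The abstract of entry 1 says each element receives a degree of membership rather than a hard cluster label. As a minimal sketch of that idea, the snippet below computes membership degrees with the standard fuzzy c-means formula. This is an illustrative stand-in, not the improved FLAME procedure of the paper, and the function name `fuzzy_memberships`, the fuzzifier `m`, and the sample points are all assumptions.

```python
import math

def fuzzy_memberships(point, centroids, m=2.0):
    """Membership degree of `point` in each cluster (fuzzy c-means formula).

    Illustrative only: the paper proposes an improved FLAME, which is not
    reproduced here. `m` > 1 is the fuzzifier; degrees sum to 1.
    """
    dists = [math.dist(point, c) for c in centroids]
    # A point lying exactly on a centroid belongs fully to that cluster.
    for k, d in enumerate(dists):
        if d == 0.0:
            return [1.0 if j == k else 0.0 for j in range(len(centroids))]
    exponent = 2.0 / (m - 1.0)
    # u_k = 1 / sum_j (d_k / d_j)^(2 / (m - 1))
    return [1.0 / sum((d_k / d_j) ** exponent for d_j in dists) for d_k in dists]

# A gene profile closer to the first centroid gets a higher degree there.
degrees = fuzzy_memberships((1.0, 0.0), [(0.0, 0.0), (4.0, 0.0)])
```

Unlike a hard assignment, the returned list keeps the information that a profile sits between clusters, which is the property soft clustering relies on.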
2. Gene Expression Data Classification Using Consensus Independent Component Analysis (cited by: 7)
Authors: Chun-Hou Zheng, De-Shuang Huang, Xiang-Zhen Kong, Xing-Ming Zhao
Source: Genomics, Proteomics & Bioinformatics (SCIE, CAS, CSCD), 2008, No. 2, pp. 74-82 (9 pages)
We propose a new method for tumor classification from gene expression data, which consists of three main steps. First, the original DNA microarray gene expression data are modeled by independent component analysis (ICA). Second, the most discriminant eigenassays extracted by ICA are selected by the sequential floating forward selection technique. Finally, a support vector machine is used to classify the modeled data. To show the validity of the proposed method, we applied it to classify three DNA microarray datasets involving various human normal and tumor tissue samples. The experimental results show that the method is efficient and feasible.
Keywords: independent component analysis; feature selection; support vector machine; gene expression data
3. Mining and Integrating Reliable Decision Rules for Imbalanced Cancer Gene Expression Data Sets (cited by: 4)
Authors: Hualong Yu (1), Jun Ni (2), Yuanyuan Dan (3), Sen Xu (4)
Affiliations: 1. School of Computer Science and Engineering, Jiangsu University of Science and Technology, Zhenjiang 212003, China; 2. Department of Radiology, Carver College of Medicine, The University of Iowa, Iowa City, IA 52242, USA; 3. School of Biology and Chemical Engineering, Jiangsu University of Science and Technology, Zhenjiang 212003, China; 4. School of Information Engineering, Yancheng Institute of Technology, Yancheng 224051, China
Source: Tsinghua Science and Technology (SCIE, EI, CAS), 2012, No. 6, pp. 666-673 (8 pages)
There have been many skewed cancer gene expression datasets in the post-genomic era. Extracting differentially expressed genes or constructing decision rules from these skewed datasets with traditional algorithms seriously underestimates the performance of the minority class, leading to inaccurate diagnosis in clinical trials. This paper presents a skewed gene selection algorithm that introduces a weighted metric into the gene selection procedure. The extracted genes are paired as decision rules to distinguish both classes, and these decision rules are then integrated into an ensemble learning framework by majority voting to recognize test examples, thus avoiding tedious data normalization and classifier construction. Mining and integrating a few reliable decision rules gave higher or at least comparable classification performance relative to many traditional class imbalance learning algorithms on four benchmark imbalanced cancer gene expression datasets.
Keywords: cancer gene expression data; class imbalance; paired differential expression genes; decision rule; ensemble learning; majority voting
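The abstract of entry 3 describes gene pairs used as decision rules and combined by majority voting. The sketch below illustrates one way such an ensemble could work. The rule form (predict class 1 when gene `i` is expressed above gene `j` in the same sample) is an assumption made for illustration, since the abstract does not spell it out, and the weighted gene selection step is not shown. The names `rule_vote` and `majority_vote` are hypothetical.

```python
def rule_vote(sample, rule):
    """Vote of one paired-gene rule on one sample.

    Assumed rule form (not stated in the abstract): predict class 1 when
    gene i is expressed above gene j within the same sample, else class 0.
    """
    i, j = rule
    return 1 if sample[i] > sample[j] else 0

def majority_vote(sample, rules):
    """Combine the votes of all rules; ties fall back to class 0."""
    votes = sum(rule_vote(sample, r) for r in rules)
    return 1 if 2 * votes > len(rules) else 0

# Expression values of four genes in one test sample (made-up numbers).
sample = [5.0, 1.0, 3.0, 4.0]
rules = [(0, 1), (2, 3), (0, 2)]  # hypothetical selected gene pairs
predicted = majority_vote(sample, rules)
```

Under this assumed rule form, each rule only compares two values inside the same sample, so no cross-sample normalization is needed, which fits the abstract's remark about avoiding data normalization.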
4. Outlier Analysis for Gene Expression Data (cited by: 3)
Authors: Chao Yan, Guo-Liang Chen, Yi-Fei Shen
Source: Journal of Computer Science & Technology (SCIE, EI, CSCD), 2004, No. 1, pp. 13-21 (9 pages)
The rapid development of technologies that generate arrays of gene data enables a global view of the transcription levels of hundreds of thousands of genes simultaneously. The outlier detection problem for gene data is important but comes with the difficulty of high dimensionality. The sparsity of data in high-dimensional space makes each point a relatively good outlier under traditional distance-based definitions. Thus, finding outliers in high-dimensional data is more complex. In this paper, some basic outlier analysis algorithms are discussed and a new genetic algorithm is presented. This algorithm finds the best dimension projections based on a revised cell-based algorithm and gives explanations for the solutions. It can solve the outlier detection problem for gene expression data and for other high-dimensional data as well.
Keywords: gene expression data; outlier analysis; cell-based algorithm; genetic algorithm
5. A Survey on Acute Leukemia Expression Data Classification Using Ensembles
Authors: Abdel Nasser H. Zaied, Ehab Rushdy, Mona Gamal
Source: Computer Systems Science & Engineering (SCIE, EI), 2023, No. 11, pp. 1349-1364 (16 pages)
Acute leukemia is an aggressive disease that has high mortality rates worldwide. The error rate can be as high as 40% when classifying acute leukemia into its subtypes. So, there is an urgent need to support hematologists during the classification process. More than two decades ago, researchers used microarray gene expression data to classify cancer and adopted acute leukemia as a test case. The high classification accuracy they achieved confirmed that it is possible to classify cancer subtypes using microarray gene expression data. Ensemble machine learning is an effective method that combines individual classifiers to classify new samples. Ensemble classifiers are recognized as powerful algorithms with numerous advantages over traditional classifiers. Over the past few decades, researchers have focused a great deal of attention on ensemble classifiers in a wide variety of fields, including but not limited to disease diagnosis, finance, bioinformatics, healthcare, manufacturing, and geography. This paper reviews the recent ensemble classifier approaches utilized for acute leukemia gene expression data classification. Moreover, a framework for classifying acute leukemia gene expression data is proposed. The pairwise correlation gene selection method and the Rotation Forest of Bayesian Networks are both used in this framework. Experimental outcomes show that the classification accuracy achieved by the acute leukemia ensemble classifiers constructed according to the suggested framework is good compared to the classification accuracy achieved in other studies.
Keywords: leukemia; classification; ensemble; rotation forest; pairwise correlation; Bayesian networks; gene expression data; microarray; gene selection
6. PCA-FA: Applying Supervised Learning to Analyze Gene Expression Data
Authors: 翁时锋 (Weng Shifeng), 张长水 (Zhang Changshui), 张学工 (Zhang Xuegong)
Source: Tsinghua Science and Technology (SCIE, EI, CAS), 2004, No. 4, pp. 428-434 (7 pages)
In previous gene expression data analyses, supervised learning has mainly focused on the classification of attribute data, such as different experimental conditions, different known classes of the same tumor, and sex. However, supervised learning classification is not suitable for interval-scaled attributes, such as the age and survival outcome of cancer patients. To address this problem, this paper proposes a new method that combines two well-known methods: principal component analysis (PCA) and Fisher analysis (FA). The method, PCA-FA, realizes supervised learning with two types of attributes (nominal attributes and interval-scaled attributes). The fuzzy FA is introduced to model the interval-scaled attributes. In this paper, an approximate linear relationship between gene expression data of lung adenocarcinoma patients and survival outcome is successfully revealed by PCA-FA.
Keywords: supervised learning; gene expression data; principal component analysis; Fisher analysis
7. Deep Learning Enabled Microarray Gene Expression Classification for Data Science Applications
Authors: Areej A. Malibari, Reem M. Alshehri, Fahd N. Al-Wesabi, Noha Negm, Mesfer Al Duhayyim, Anwer Mustafa Hilal, Ishfaq Yaseen, Abdelwahed Motwakel
Source: Computers, Materials & Continua (SCIE, EI), 2022, No. 11, pp. 4277-4290 (14 pages)
In bioinformatics applications, the examination of microarray data has received significant interest for diagnosing diseases. Microarray gene expression data are characterized by a massive search space, which poses a primary challenge in the appropriate selection of genes. Microarray data classification incorporates multiple disciplines such as bioinformatics, machine learning (ML), data science, and pattern classification. This paper designs an optimal deep neural network based microarray gene expression classification (ODNN-MGEC) model for bioinformatics applications. The proposed ODNN-MGEC technique performs a data normalization process to bring the data onto a uniform scale. In addition, an improved fruit fly optimization (IFFO) based feature selection technique is used to reduce the high dimensionality of the biomedical data. Moreover, a deep neural network (DNN) model is applied for the classification of microarray gene expression data, and the hyperparameter tuning of the DNN model is carried out using the Symbiotic Organisms Search (SOS) algorithm. The utilization of the IFFO and SOS algorithms paves the way for accomplishing maximum gene expression classification outcomes. To examine the improved outcomes of the ODNN-MGEC technique, a wide-ranging experimental analysis is carried out on benchmark datasets. The extensive comparison study with recent approaches demonstrates the enhanced outcomes of the ODNN-MGEC technique in terms of different measures.
Keywords: bioinformatics; data science; microarray; gene expression; data classification; deep learning; metaheuristics
8. Constrained query of order-preserving submatrix in gene expression data (cited by: 2)
Authors: Tao JIANG, Zhanhuai LI, Xuequn SHANG, Bolin CHEN, Weibang LI, Zhilei YIN
Source: Frontiers of Computer Science (SCIE, EI, CSCD), 2016, No. 6, pp. 1052-1066 (15 pages)
The order-preserving submatrix (OPSM) has become important in modelling biologically meaningful subspace clusters, capturing the general tendency of gene expressions across a subset of conditions. With the advance of microarray and analysis techniques, large volumes of gene expression datasets and OPSM mining results are produced. An OPSM query can efficiently retrieve relevant OPSMs from the huge amount of OPSM datasets. However, improving OPSM query relevancy remains a difficult task in real-life exploratory data analysis. First, it is hard to capture subjective interestingness aspects, e.g., the analyst's expectation given her/his domain knowledge. Second, when these expectations can be declaratively specified, it is still challenging to use them during the computational process of OPSM queries. To the best of our knowledge, existing methods mainly focus on batch OPSM mining, while few works involve OPSM queries. To solve the above problems, the paper proposes two constrained OPSM query methods, which exploit user-defined constraints to search for relevant results from two kinds of indices introduced. In this paper, extensive experiments are conducted on real datasets, and the experimental results demonstrate that the multi-dimension index (cIndex) and enumerating sequence index (esIndex) based queries perform better than brute-force search.
Keywords: gene expression data; OPSM; constrained query; brute-force search; feature sequence; cIndex
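Entry 8 compares index-based OPSM queries against brute-force search. To make the baseline concrete, the sketch below checks the usual OPSM condition (a set of rows whose values strictly increase along one ordering of a column subset) and scans every row for a queried column order. It assumes that common definition and does not reproduce the paper's cIndex or esIndex. The names `is_opsm` and `brute_force_query` and the toy matrix are hypothetical.

```python
def is_opsm(matrix, rows, col_order):
    """True if every listed row strictly increases along `col_order`.

    `matrix` is a list of gene rows over condition columns; `col_order`
    is a sequence of column indices in the claimed expression order.
    """
    return all(
        all(matrix[r][a] < matrix[r][b] for a, b in zip(col_order, col_order[1:]))
        for r in rows
    )

def brute_force_query(matrix, col_order):
    """Baseline query: scan all rows and keep those supporting `col_order`."""
    return [r for r in range(len(matrix)) if is_opsm(matrix, [r], col_order)]

# Toy expression matrix: 3 genes (rows) x 3 conditions (columns).
matrix = [
    [1, 5, 3],
    [2, 9, 4],
    [7, 1, 3],
]
supporting_rows = brute_force_query(matrix, [0, 2, 1])
```

The scan touches every row for every query, which is the cost an index over the data is meant to avoid.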
9. Network-Based Predictions and Simulations by Biological State Space Models: Search for Drug Mode of Action
Authors: Rui Yamaguchi, Seiya Imoto, Satoru Miyano
Source: Journal of Computer Science & Technology (SCIE, EI, CSCD), 2010, No. 1, pp. 131-153 (23 pages)
Since time-course microarray data are short but contain a large number of genes, most statistical models should be extended so that they can handle such statistically irregular situations. We introduce biological state space models that are established as suitable computational models for constructing gene networks from microarray gene expression data. This chapter elucidates the theory and methodology of our biological state space models together with some representative analyses, including the discovery of drug mode of action. Through the applications we show the whole strategy of biological state space model analysis, involving experimental design of time-course data, model building, and analysis of the estimated networks.
Keywords: gene networks; state space models; time-course gene expression data