Abstract: The extension of the Minimum Spanning Tree (MST) problem is NP-hard, and no polynomial-time algorithm is known for it. This paper introduces a fast optimization method for the MST problem, the Gradient Gene Algorithm. Compared with other evolutionary algorithms for the MST problem, it has three advantages: it is simple and easy to implement, it is efficient and accurate, and it generalizes to other combinatorial optimization problems.
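For contrast with the NP-hard extension, the classical unconstrained MST can be solved in polynomial time. The sketch below uses Kruskal's algorithm with a simple union-find; it is not the Gradient Gene Algorithm, and the function name `kruskal` and the small example graph are illustrative assumptions. The abstract does not say which extension it targets, so no extra constraint is modelled here.

```python
from typing import List, Tuple

Edge = Tuple[int, int, int]  # (weight, u, v); hypothetical edge format


def kruskal(n: int, edges: List[Edge]) -> List[Edge]:
    """Return the edges of a minimum spanning tree of a connected graph."""
    parent = list(range(n))  # union-find forest, one root per component

    def find(x: int) -> int:
        # Follow parent links to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree: List[Edge] = []
    for w, u, v in sorted(edges):      # cheapest edges first
        ru, rv = find(u), find(v)
        if ru != rv:                   # edge joins two components: keep it
            parent[ru] = rv
            tree.append((w, u, v))
    return tree


# Hypothetical 4-vertex example graph.
edges = [(1, 0, 1), (4, 0, 2), (2, 1, 2), (5, 1, 3), (3, 2, 3)]
mst = kruskal(4, edges)
print(mst, sum(w for w, _, _ in mst))
```

An evolutionary method for a constrained variant could use such a baseline as a lower bound on tree weight, since adding constraints cannot make the optimal tree cheaper.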
Abstract: The job shop scheduling problem has been studied for decades and is known to be NP-hard. The flexible job shop scheduling problem generalizes the classical job shop scheduling problem by allowing an operation to be processed on any one machine out of a set of machines. The task is to assign each operation to a machine and to sequence the operations on each machine so that the maximal completion time of all operations is minimized. A genetic algorithm is used to solve the flexible job shop scheduling problem. A novel gene coding method tailored to the job shop problem is introduced; it is intuitive and needs no repair process to validate the gene. Computer simulations are carried out, and the results show the effectiveness of the proposed algorithm.
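To make the objective concrete, here is a minimal sketch of how a chromosome for the flexible job shop problem could be decoded into a maximal completion time (makespan). The abstract does not describe the authors' coding, so this uses an assumed and commonly used scheme: a job-repetition sequence plus a machine assignment table, under which every chromosome is feasible without repair. The instance data and all names (`makespan`, `jobs`, `sequence`, `assignment`) are hypothetical.

```python
from typing import Dict, List

# Hypothetical instance: jobs[j][k] maps machine id -> processing time
# for the k-th operation of job j (only eligible machines are listed).
Jobs = List[List[Dict[int, int]]]


def makespan(jobs: Jobs, sequence: List[int], assignment: List[List[int]]) -> int:
    """Decode a chromosome and return the maximal completion time.

    sequence: job ids; the i-th occurrence of job j stands for its i-th
              operation, so job precedence holds for any ordering.
    assignment[j][k]: machine chosen for operation k of job j.
    """
    next_op = [0] * len(jobs)            # next unscheduled operation per job
    job_ready = [0] * len(jobs)          # finish time of each job's last operation
    machine_ready: Dict[int, int] = {}   # time at which each machine becomes free
    for j in sequence:
        k = next_op[j]
        m = assignment[j][k]
        start = max(job_ready[j], machine_ready.get(m, 0))
        end = start + jobs[j][k][m]
        job_ready[j] = end
        machine_ready[m] = end
        next_op[j] = k + 1
    return max(job_ready)


jobs = [
    [{0: 3, 1: 5}, {1: 2}],   # job 0: two operations
    [{0: 4}, {0: 6, 1: 3}],   # job 1: two operations
]
assignment = [[0, 1], [0, 1]]
print(makespan(jobs, [0, 1, 0, 1], assignment))  # job 0 goes first on machine 0
print(makespan(jobs, [1, 0, 1, 0], assignment))  # job 1 goes first on machine 0
```

A genetic algorithm would use this decoded value as the fitness to minimize, with crossover and mutation acting on `sequence` and `assignment`.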
Funding: Supported by the National Natural Science Foundation of China (70071042, 60073043, and 60133010)
Abstract: Based on an analysis of previous genetic algorithms (GAs) for the TSP, a novel method called Ge-GA is proposed. It combines a gene pool with a GA so as to direct the evolution of the whole population. The core of Ge-GA is the construction of the gene pool and its application to the GA. Unlike standard GAs, Ge-GA aims to enhance both exploration and exploitation by combining global search with local search. On the one hand, a local search operator called Ge-LocalSearch is proposed to improve solution quality; on the other hand, a modified Inver-Over operator called Ge InverOver serves as a global search mechanism to expand the solution space beyond local minima. Both operators are based on the gene pool. The algorithm is applied to 11 well-known traveling salesman problems with 70 to 1577 cities. The experimental results indicate that Ge-GA is highly robust for the TSP. For each test instance, the average solution quality, found in acceptable time, stays within 0.001% of the optimum.
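Inversion-based operators such as Inver-Over build on one elementary move: reversing a segment of the tour. The sketch below shows only that move and the tour-length objective; it is not the Ge-GA operators, and the gene pool is not modelled. The names `tour_length` and `invert_segment` and the four-city example are illustrative assumptions.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def tour_length(cities: List[Point], tour: List[int]) -> float:
    """Length of the closed tour that visits the cities in the given order."""
    return sum(
        math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def invert_segment(tour: List[int], i: int, j: int) -> List[int]:
    """Return a copy of the tour with positions i..j (inclusive) reversed."""
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]


# Hypothetical instance: the four corners of a unit square.
cities = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
crossed = [0, 2, 1, 3]                 # tour whose edges cross on the diagonals
fixed = invert_segment(crossed, 1, 2)  # reversing positions 1..2 uncrosses it
print(tour_length(cities, crossed), tour_length(cities, fixed))
```

In this example the inversion replaces the two diagonal edges with two sides of the square, which is why the tour becomes shorter.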
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 60433020, 60175024 and 60773095), the European Commission under grant No. TH/Asia Link/010 (111084), the Key Science-Technology Project of the National Education Ministry of China (Grant No. 02090), and the Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, Jilin University, P. R. China
Abstract: In the post-genomic era, reconstructing gene regulatory networks from microarray gene expression data is very important for understanding the underlying biological system, and it has been a challenging task in bioinformatics. The Bayesian network model has been used to reconstruct gene regulatory networks because of its advantages, but how to determine the network structure and parameters still needs to be explored. This paper proposes a two-stage structure learning algorithm that integrates an immune evolution algorithm to build a Bayesian network. The new algorithm is evaluated on both simulated data and yeast cell cycle data. The experimental results indicate that the proposed algorithm can find many of the known regulatory relationships reported in the literature and can predict unknown ones with high validity and accuracy.
Abstract: In systems biology, system identification, which infers regulatory networks in genetic systems and metabolic pathways from experimentally observed time-course data, is one of the most active research topics. An efficient numerical optimization algorithm that can estimate more than 100 real-coded parameters is needed for this purpose. A new real-coded genetic algorithm (RCGA), the combination of AREX (adaptive real-coded ensemble crossover) with JGG (just generation gap), has been applied to the inference of genetic interactions involving more than 100 interaction-related parameters from experimentally observed time-course data. Compared with a conventional RCGA, the combination of UNDX (unimodal normal distribution crossover) with MGG (minimal generation gap), the new algorithm is superior: it improves early convergence in the first stage of the search and suppresses evolutionary stagnation in the last stage.
Funding: Project supported by the National Basic Research Program (973) of China (No. 2002CB312200) and the Center for Bioinformatics Program Grant of Harvard Center of Neurodegeneration and Repair, Harvard Medical School, Harvard University, Boston, USA
Abstract: In microarray-based cancer classification, gene selection is an important issue owing to the large number of variables, the small number of samples, and the non-linearity of the data. It is difficult to obtain satisfactory results with conventional linear statistical methods. Recursive feature elimination based on support vector machines (SVM-RFE) is an effective algorithm for gene selection and cancer classification, integrating both into a consistent framework. In this paper, we propose a new method for selecting the parameters of this algorithm when it is implemented with Gaussian kernel SVMs: a genetic algorithm searches for a pair of optimal parameters, as a better alternative to the common practice of selecting the apparently best parameters. Fast implementation issues for this method are also discussed for pragmatic reasons. The proposed method was tested on two representative datasets, hereditary breast cancer and acute leukaemia. The experimental results indicate that the proposed method performs well in selecting genes and achieves high classification accuracy with these genes.
Funding: Supported by a grant from the National Natural Science Foundation of China (Grant No. 61373057) and a grant from the Zhejiang Provincial Natural Science Foundation of China (Grant No. Y1110763)
Abstract: Objective: Identification of colorectal cancer (CRC) metastasis genes is one of the most important issues in CRC research. To mine CRC metastasis-associated genes, an integrated analysis of microarray data is presented, combined with evidence from comparative genomic hybridization (CGH) data. Methods: Gene expression profile data of CRC samples were obtained from the Gene Expression Omnibus (GEO) website. The 15 important chromosomal aberration sites detected by CGH technology were used for integrated genomic and transcriptomic analysis. Significance Analysis of Microarrays (SAM) was used to detect significantly differentially expressed genes across the whole genome. The overlapping genes in the corresponding chromosomal aberration regions were selected and analyzed with the Database for Annotation, Visualization and Integrated Discovery (DAVID). Finally, the SVM-T-RFE gene selection algorithm was applied to identify metastasis-associated genes in CRC. Results: A minimum gene set was obtained with the minimum number of genes (14) and the highest classification accuracy (100%) in both the PRI and META datasets. A fraction of the selected genes are associated with CRC or its metastasis. Conclusions: Our results demonstrate that integrated analysis is an effective strategy for mining cancer-associated genes.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 50975033, Grant No. 60875046) and the Program of Education Office of Liaoning Province, China (Grant No. LT2010074)
Abstract: Bottom-up adaptive conceptual design of complex products, which must relate structure to function, is a complicated problem. Reliable theory is needed to determine whether a structure meets the design requirements. Requirements generally change over time: new requirements arise, and some initial requirements no longer apply. The number of product requirements, the length of the gene expressing them, the structure of the product, and the correlation matrix all vary with the individualized requirements of customers. By studying the calculation mechanisms of this dynamic variation, approaches of gene expression and variable-length gene expression are proposed. Based on the diversity of structure selection in conceptual design and the mutual relations between structure and function and between structure and structure, the structure-function and structure-structure correlation matrices are defined. By maximizing the sum of the elements of the correlation matrix, mathematical models of multi-objective optimization for structure design under variable requirements are provided. An improved genetic algorithm, called the segment genetic algorithm, is proposed based on the optimum-preserving simple genetic algorithm. The multi-objective optimization models are solved by the segment genetic algorithm and a hybrid genetic algorithm. An example of the conceptual design of a washing machine shows that the proposed method can produce an optimized structure design that fits variable requirements. In addition, the proposed approach provides good Pareto-optimal solutions and effectively handles individualized customer requirements for product structures.
Abstract: Various gene signatures of chemosensitivity in breast cancer have been discovered. One previous study employed the t-test to find a signature of 31 probe sets (27 genes) from a group of patients who received weekly preoperative chemotherapy. Based on this signature, a 30-probe set diagonal linear discriminant analysis (DLDA-30) classifier of pathologic complete response (pCR) was constructed. In this study, we sought to uncover a signature that is much smaller than the 31 probe sets and yet has enhanced predictive performance. A signature of this nature could tell us which genes are essential in response prediction. Genetic algorithms (GAs) and sparse logistic regression (SLR) were employed to identify two such small signatures. The first had 13 probe sets (10 genes) selected from the 31 probe sets and was used to build an SLR predictor of pCR (SLR-13); the second had 14 probe sets (14 genes) selected from the genes involved in the Notch signaling pathway and was used to develop another SLR predictor of pCR (SLR-Notch-14). SLR-13 and SLR-Notch-14 had a higher accuracy and a higher positive predictive value than DLDA-30, with much lower P values, suggesting that our two signatures had their own discriminative power with high statistical significance. The SLR prediction model also suggested a dual role of the gene RUNX1 in promoting residual disease (RD) or pCR in breast cancer. Our results demonstrate that multivariable techniques such as GAs and SLR are effective in finding significant genes for chemosensitivity prediction. They have the advantage of revealing interacting genes, which might be missed by single-variable techniques such as the t-test.
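The two metrics used to compare the predictors, accuracy and positive predictive value (PPV), follow directly from a confusion matrix. The sketch below states the standard definitions; the counts are hypothetical and are not taken from the study, and the function names are illustrative.

```python
def accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Fraction of all patients whose response was predicted correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def positive_predictive_value(tp: int, fp: int) -> float:
    """Fraction of predicted pCR cases that were true pCR cases."""
    return tp / (tp + fp)


# Hypothetical counts for one classifier (not data from the study):
# tp = predicted pCR and observed pCR, fp = predicted pCR but observed RD,
# tn = predicted RD and observed RD,  fn = predicted RD but observed pCR.
tp, fp, tn, fn = 8, 2, 35, 5
print(accuracy(tp, fp, tn, fn), positive_predictive_value(tp, fp))
```

Because pCR is usually the minority class, PPV can differ sharply from accuracy, which is one reason for reporting both.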