Funding: Foundation of China (No. 52067013); the Key Natural Science Fund Project of Gansu Provincial Department of Science and Technology (No. 21JR7RA280); the Tianyou Innovation Team Science Foundation of Intelligent Power Supply and State Perception for Rail Transit (No. TY202010); the Natural Science Foundation of Gansu Province (No. 20JR5RA395).
Abstract: To address the many types of composite power quality disturbances, their strongly correlated features, and the resulting high recognition error rate, a method for identifying composite power quality disturbances based on the multi-resolution S-transform and a decision tree is proposed. First, following the IEEE standard, signal models of seven single power quality disturbances and 17 combined power quality disturbances are given, and disturbance waveform samples are generated in batches. Then, to improve recognition accuracy, an adjustment factor is introduced so that multi-resolution S-transform time-frequency analysis yields a controllable time-frequency resolution. On this basis, five time-frequency domain features are extracted that quantitatively reflect the characteristics of the analyzed power quality disturbance signal; this is fewer features than traditional S-transform-based methods use. Next, three classifiers (K-nearest neighbor, support vector machine, and decision tree) are used to identify the composite power quality disturbances. Simulation results show that the classification accuracy of the decision tree algorithm is higher than that of K-nearest neighbor and the support vector machine. Finally, the proposed method is compared with other commonly used recognition algorithms. Experimental results show that the proposed method is effective in terms of detection accuracy, especially for combined PQ disturbances.
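The role of an adjustment factor in the S-transform can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the code below assumes the common generalized form in which the Gaussian window's standard deviation is scaled to lam/|f|; the function name s_transform and the parameter lam are illustrative, not taken from the paper. A larger lam widens the time window and sharpens frequency resolution, while a smaller lam does the opposite.

```python
import numpy as np

def s_transform(x, lam=1.0):
    """Discrete S-transform of a real signal, computed via the FFT.

    lam is an assumed adjustment factor: the Gaussian window's standard
    deviation is lam / |f|, so lam = 1 gives the standard S-transform.
    Returns a complex array of shape (n // 2 + 1, n): rows are frequency
    bins 0..n/2, columns are time samples.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.fft(x)
    # Integer frequency offsets in FFT order: 0, 1, ..., -2, -1.
    m = np.fft.fftfreq(n, d=1.0 / n)
    half = n // 2
    S = np.zeros((half + 1, n), dtype=complex)
    S[0, :] = x.mean()  # zero-frequency row is the signal mean
    for k in range(1, half + 1):
        # Fourier transform of the Gaussian window for frequency bin k.
        gauss = np.exp(-2.0 * np.pi**2 * lam**2 * m**2 / k**2)
        # Shift the spectrum by k bins, weight it, and go back to time.
        S[k, :] = np.fft.ifft(np.roll(spectrum, -k) * gauss)
    return S
```

Features such as the maximum amplitude per frequency row or per time column are typically computed from the magnitude np.abs(S); which five features the paper uses is not stated in the abstract.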
Abstract: A recommender system is an approach used in e-commerce to give users a smoother experience. Sequential pattern mining is a data mining technique that identifies co-occurrence relationships while taking the order of transactions into account. This work presents an implementation of sequential pattern mining for recommender systems in the e-commerce domain. It uses the systolic tree algorithm to mine frequent patterns and yield feasible rules for the recommender system. The objective of feature selection is to pick a feature subset with the least similarity among features and the highest relevance to the target class. This reduces the dimensionality of the feature vector by eliminating redundant, irrelevant, or noisy data. This work presents a new hybrid recommender system based on optimized feature selection and the systolic tree. Features were extracted using Term Frequency-Inverse Document Frequency (TF-IDF), and feature selection used River Formation Dynamics (RFD) and the Particle Swarm Optimization (PSO) algorithm. The systolic tree is used for pattern mining, and the recommendations are based on the mined patterns. The proposed methods were evaluated on the MovieLens dataset, and the experimental outcomes confirmed the efficiency of the techniques. RFD feature selection combined with systolic tree frequent pattern mining and collaborative filtering achieved a precision of 0.89.
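The TF-IDF feature extraction step can be sketched as follows. This is a minimal illustration that assumes the plain weighting tf = count / document length and idf = ln(N / df); the abstract does not say which TF-IDF variant the authors use, and the function name tf_idf is hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Compute TF-IDF weights for a list of tokenized documents.

    docs is a list of token lists. Returns one {term: weight} dict per
    document, using tf = count / len(doc) and idf = ln(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    weights = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights
```

A term that occurs in every document receives weight 0 under this variant, which is one reason such terms are natural candidates for removal during feature selection.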
Funding: supported by the Natural Science Foundation of Shandong Province, China (Z2005D04).
Abstract: A subtractive cDNA library was developed to study genes associated with bud dormancy release in tree peonies. To identify genes that are highly expressed in buds released from dormancy, 588 clones were examined by differential screening. Of these, 185 clones were selected for sequencing. A total of 37 unique sequences were obtained, of which only 31 have matches in the NCBI database or the Arabidopsis thaliana protein database. Semi-quantitative RT-PCR was used to further confirm the expression profiles of 12 transcripts identified within the subtractive cDNA library. Gene ontology analyses indicated that many of the genes identified have unknown or hypothetical functions, while other genes are speculated to play different molecular roles. In our study, genes involved in bud dormancy release were growth-related or stress-responsive, and low-temperature-induced ribosomal proteins may also play a role in bud dormancy release. Our results provide information for further understanding of the molecular mechanism of bud dormancy release in tree peonies.
Funding: Supported by the National Natural Science Foundation of China (No. 61401407).
Abstract: Because tag collisions result in failed transmissions, tag anti-collision is a vital issue in radio frequency identification (RFID) systems. However, the reductions in communication time and gains in throughput achieved so far are very limited. To solve these problems, this paper presents a novel tag anti-collision scheme, the adaptive hybrid search tree (AHST), which combines two algorithms: adaptive binary-tree disassembly (ABD) and the combination query tree (CQT). ABD has superior tag identification speed, and CQT has the best performance in system throughput and search timeslots. Theoretical analysis and numerical simulations show that the proposed algorithm combines the advantages of both algorithms, improves system throughput, and dramatically reduces the number of search timeslots.
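For context, the following is a minimal sketch of the basic query tree protocol that tree-based anti-collision schemes build on. It is not AHST, ABD, or CQT, whose details the abstract does not give. The function name query_tree is hypothetical, and tag IDs are assumed to be unique bit strings of equal length.

```python
from collections import deque

def query_tree(tag_ids):
    """Simulate the basic query tree anti-collision protocol.

    The reader broadcasts a bit-string prefix; every tag whose ID starts
    with that prefix replies. A collision splits the prefix into two
    longer ones. Returns the identified IDs and per-type slot counts.
    Assumes unique tag IDs (duplicates would never be resolved).
    """
    pending = deque([""])  # the empty prefix queries all tags
    identified = []
    slots = {"collision": 0, "idle": 0, "success": 0}
    while pending:
        prefix = pending.popleft()
        responders = [t for t in tag_ids if t.startswith(prefix)]
        if len(responders) > 1:
            # Collision: retry with the prefix extended by one bit.
            slots["collision"] += 1
            pending.append(prefix + "0")
            pending.append(prefix + "1")
        elif len(responders) == 1:
            slots["success"] += 1
            identified.append(responders[0])
        else:
            slots["idle"] += 1
    return identified, slots
```

System throughput is commonly measured as successful slots divided by total slots. Improved schemes aim to raise it by cutting the collision and idle slots that this baseline spends.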
Funding: The National Natural Science Foundation of China (No. 51708110).
Abstract: To address the poor generalization ability of the back-propagation (BP) neural network in model updating hybrid tests, the AdaBoost regression tree algorithm is introduced into the model updating procedure. During the learning phase, a regression tree is selected as the weak regression model to be trained, and multiple trained weak regression models are then integrated into a strong regression model. Finally, the training results are generated through voting by all the selected regression models. A 2-DOF nonlinear structure was numerically simulated using the online AdaBoost regression tree algorithm, with the BP neural network algorithm as a contrast. The results show that the prediction accuracy of the online AdaBoost regression tree algorithm is 48.3% higher than that of the BP neural network algorithm, which verifies that the online AdaBoost regression tree algorithm generalizes better than the BP neural network algorithm. Furthermore, it effectively eliminates the influence of weight initialization and improves the prediction accuracy of the restoring force in hybrid tests.
Abstract: Existing stereo matching methods cannot guarantee both computational accuracy and efficiency in disparity estimation for large-scale or multi-view images. The hybrid tree method obtains a disparity estimate quickly but with relatively low accuracy, while PatchMatch gives high-precision disparity values at relatively high computational cost. In this work, we propose Hybrid Tree Guided PatchMatch, which calculates disparity quickly and accurately. First, an initial disparity map is estimated using hybrid tree cost aggregation and is used to constrain the label search range of PatchMatch. Furthermore, a reliable normal search range is calculated for each current normal vector defined on the initial disparity map to refine PatchMatch. Finally, an effective quantizing acceleration strategy is designed to decrease the computational cost of matching over continuous disparity. Experimental results demonstrate that disparity estimation based on our algorithm performs better on binocular image benchmarks such as Middlebury and KITTI. We also provide disparity estimation results for multi-view stereo in real scenes.
Abstract: Handwritten character recognition is one of the most important methods in applications such as searching historical manuscripts, forensic search, bank check reading, mail sorting, and transcription of books and handwritten notes. Common difficulties in character recognition arise from different writing styles, orientation angles, size variation (in length and height), etc. This study presents a classification model for character recognition that uses a hybrid classifier combining a holoentropy-enabled decision tree (HDT) and a deep neural network (DNN). In feature extraction, local gradient features, which include the histogram-oriented Gabor feature and the grid level feature, and grey level co-occurrence matrix (GLCM) features are extracted. The extracted features are then concatenated to encode shape, color, texture, local and statistical information, and are applied to the hybrid classifier to recognize the characters in the image. In the experimental analysis, a recognition accuracy of 96% is achieved. This suggests that the proposed model provides more accurate character recognition than the techniques reported in the literature.
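As an illustration of the GLCM features mentioned above, the sketch below counts grey level co-occurrences for a single pixel offset and derives the contrast statistic from them. The offset, the choice of contrast as the example feature, and the function names glcm and contrast are assumptions for illustration; the abstract does not specify which GLCM statistics the study uses.

```python
def glcm(image, levels, dr=0, dc=1):
    """Grey level co-occurrence counts for one pixel offset.

    image is a list of rows of integer grey levels in range(levels).
    counts[i][j] is how often a pixel of level i has a neighbour of
    level j at offset (dr, dc); the default is the pixel to the right.
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    return counts

def contrast(counts):
    """GLCM contrast: the (i - j)^2-weighted mean over normalized counts."""
    total = sum(map(sum, counts))
    levels = len(counts)
    weighted = sum(
        counts[i][j] * (i - j) ** 2
        for i in range(levels)
        for j in range(levels)
    )
    return weighted / total
```

Smooth regions concentrate counts on the diagonal and give low contrast, while sharp stroke edges push counts off the diagonal and raise it.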
Funding: supported by the National Natural Science Foundation of China (U1204608, 61472370, 61672469, 61822701).
Abstract: Ribonucleic acid (RNA) hybridization is widely used in popular RNA simulation software in bioinformatics. However, because of the exponential computational complexity of combinatorial problems, it is challenging to decide within an acceptable time whether a specific RNA hybridization is effective. We introduce a machine learning based technique to address this problem. The machine learning (ML) models tested in the training phase include algorithms based on the boosted tree (BT), random forest (RF), decision tree (DT) and logistic regression (LR), and the corresponding models are obtained. Given the RNA molecular coding training and testing sets, the trained machine learning models are applied to predict the classification of RNA hybridization results. The experimental results show that, under the strong constraint condition, the optimal predictive accuracies are 96.2%, 96.6%, 96.0% and 69.8% for the RF, BT, DT and LR-based approaches, respectively, compared with traditional representative methods. Furthermore, the average computational efficiencies of the RF, BT, DT and LR-based approaches are 208679, 269756, 184333 and 187458 times higher than that of the existing approach, respectively. Given an RNA design, the BT-based approach demonstrates high computational efficiency and better predictive accuracy in determining the biological effectiveness of molecular hybridization.
Funding: supported by the Fundamental Research Funds for the Central Universities (2016083).
Abstract: A useful life prediction method based on the integration of the stochastic hybrid automata (SHA) model and the framework of the dynamic fault tree (DFT) is proposed. The SHA model can incorporate the orbit environment, work modes, system configuration, dynamic probabilities and degradation of components, as well as spacecraft dynamics and kinematics. By introducing the DFT framework, the system is divided into several layers, and the problem of state combination explosion is overcome. An improved dynamic reliability model (DRM) based on the Nelson hypothesis is investigated to remedy the defect of the cumulative failure probability (CFP), and is used to address the failure probability of components in the SHA model. Finally, a Monte Carlo simulation is conducted on two satellites that are deployed with the same multi-gyro subsystem but run in different orbits. The results show that the predicted useful life of the attitude control system (ACS), taking into account abrupt failure, degradation, and running environment, differs considerably between the two satellites.
Abstract: A new multicast routing algorithm based on the hybrid genetic algorithm (HGA) is proposed. A coding pattern based on the number of routing paths is used. A fitness function that is easy to compute and makes the algorithm converge quickly is proposed. A new approach for defining the HGA's parameters is provided. The simulation shows that the approach can greatly increase the convergence ratio, and that the fitting values of the parameters of this algorithm differ from those of the original algorithms. In the experiment, the optimal mutation probability equals 0.50 for HGA but 0.07 for SGA. It is concluded that the population size has a significant influence on the HGA's convergence ratio when its mutation probability is larger. The algorithm with a small population size has a high average convergence rate. The population size has little influence on HGA at the lower mutation probability.
Funding: supported partly by the National Basic Research Program of China ("973" Program) (No. 2014CB046200).
Abstract: A hybrid Cartesian structured grid method is proposed for solving unsteady problems with moving boundaries. The near-body region is discretized using body-fitted structured grids, while the remaining computational domain is tessellated with generated Cartesian grids. As the body moves, the structured grids move with it, and the outer boundaries of the inner grids are used to generate new holes in the outer adaptive Cartesian grid to facilitate data communication. By using the alternating digital tree (ADT) algorithm, the computational time of hole-cutting and donor cell identification can be reduced significantly. A compressible solver for unsteady flow problems is developed. A cell-centered, second-order accurate finite volume method is employed for spatial discretization, and an implicit dual-time stepping lower-upper symmetric Gauss-Seidel (LU-SGS) approach is employed for temporal discretization. Geometry-based adaptation is applied during unsteady simulation time steps when the boundary moves, and the flow solution is interpolated from the old Cartesian grid to the new one using an inverse distance weighting interpolation formula. Both laminar and turbulent unsteady cases are tested to demonstrate the accuracy and efficiency of the proposed method. Then, a 2-D store separation problem is simulated. The results show that the hybrid Cartesian grid method can handle unsteady flow problems involving large-scale moving boundaries.
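The inverse distance weighting step used to transfer the solution between grids can be sketched as below. This is a generic form of the formula, assuming weights of 1 / d^p with p = 2; the exponent and the choice of donor cells in the paper are not stated in the abstract, and the function name idw_interpolate is hypothetical.

```python
import math

def idw_interpolate(points, values, query, power=2.0):
    """Inverse distance weighting interpolation at one query location.

    points are donor coordinates (tuples), values are the solution values
    stored there. Each donor is weighted by 1 / distance**power. If the
    query coincides with a donor, that donor's value is returned directly.
    """
    numerator = 0.0
    denominator = 0.0
    for point, value in zip(points, values):
        distance = math.dist(point, query)
        if distance == 0.0:
            return value  # avoid division by zero at a donor location
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator
```

For example, with donors at (0, 0) and (2, 0) holding values 1.0 and 3.0, the midpoint (1, 0) receives 2.0, and a query closer to the first donor is pulled toward 1.0.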
Funding: Science and Technology Project of Guangzhou, No. 2002Z3-E4016 No. B30101, China.
Abstract: AIM: To construct tree models for classification of diffuse large B-cell lymphomas (DLBCL) by chromosome copy numbers, to compare them with cDNA microarray classification, and to explore models of multi-gene, multi-step and multi-pathway processes of DLBCL tumorigenesis. METHODS: Maximum-weight branching and distance-based models were constructed from the comparative genomic hybridization (CGH) data of 123 DLBCL samples using the established methods and software of Desper et al. A maximum likelihood tree model was also used to analyze the data. By comparison with results reported in the literature, the value of tree models in the classification of DLBCL was elucidated. RESULTS: Both the branching and the distance-based trees classified DLBCL into three groups. We combined the classification methods of the two models and classified DLBCL into three categories according to their characteristics. The first group was marked by +Xq, +Xp, -17p and +13q; the second group by +3q, +18q and +18p; and the third group by -6q and +6p. This chromosomal classification was consistent with cDNA classification. It indicated that -6q and +3q were two main events in the tumorigenesis of lymphoma. CONCLUSION: Tree models of lymphoma established from CGH data can be used in the classification of DLBCL. These models can suggest multi-gene, multi-step and multi-pathway processes of tumorigenesis. Two pathways, -6q preceding +6q and +3q preceding +18q, may be important in understanding tumorigenesis of DLBCL. The pathway -6q preceding +6q may have a close relationship with the tumorigenesis of non-GCB DLBCL.