Abstract
Clustering analysis is one of the key tools for knowledge discovery in genome databases. With the arrival of the post-genomic era, however, the enormous volume of genomic data poses severe challenges to existing clustering algorithms. This paper discusses clustering-related issues in genome research by analyzing the content of knowledge discovery in genome databases, the application of clustering analysis to gene data, and the characteristics of genome databases: massive scale, high dimensionality, heterogeneity, and redundancy. It argues that an effective clustering algorithm for genome data analysis must address the problems raised by all of these characteristics.
Source
《淮海工学院学报(自然科学版)》
CAS
2002, No. 3, pp. 20-23 (4 pages)
Journal of Huaihai Institute of Technology:Natural Sciences Edition