Abstract
Bayesian networks have recently become a prominent research direction in data mining. This paper examines the structure and construction steps of Bayesian networks, focusing on the basic Bayesian methods for learning a network's structure and probability distributions from prior knowledge and sample data. It also analyzes the characteristics of Bayesian network learning and discusses where Bayesian networks are applicable. Compared with other data mining approaches, Bayesian networks can combine prior knowledge with observed data, which is especially useful when data is scarce or expensive to obtain. They can also discover causal relationships among data and are well suited to incomplete data sets, which other models find difficult to do. Their disadvantages are the high computational cost, the difficulty of determining reasonable prior densities and appropriate parameters and structures, and the lack of principles for judging whether a practical problem actually satisfies the assumptions that Bayesian networks require.
Source
Journal of Tsinghua University (Science and Technology) (《清华大学学报(自然科学版)》)
2001, No. 1, pp. 49-52 (4 pages)
Indexed in: EI, CAS, CSCD, Peking University Core Journals (北大核心)
Funding
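National Natural Science Foundation of China (Grant No. 79990580)
Fund of the School of Information Engineering, Tsinghua University (清华大学信息工程学院)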