Abstract
Objective To analyze data from liver cancer patients with Bayesian networks (BNT) and explore the interrelationships among the factors that affect the prognosis of liver cancer. Methods The network structure was learned according to the minimal description length (MDL) criterion. Network parameters were obtained by maximum likelihood estimation (MLE) for the complete data and by the expectation maximization (EM) algorithm for data containing missing values. The EM estimates were compared with the MLE estimates from the complete data to assess the ability of BNT to learn from data with missing values. Results From liver cancer data on 1441 patients, a BNT model with 49 nodes and 62 directed arcs was constructed, and the parameters of each node were obtained. The directed arcs reflect interactions or influences among the factors affecting liver cancer prognosis, and the network parameters reflect their strength. The indicators that directly affect the prognosis and staging of liver cancer were analyzed, and the stage and prognosis of liver cancer were inferred from the network parameters. Conclusion The BNT model has a strong ability to handle missing data. Applying BNT to liver cancer patient data revealed multi-level, multiple causal relationships among the many factors affecting prognosis, and described the strength of influence among these factors quantitatively in terms of probability.
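The parameter-learning step described in the abstract (MLE on complete data, EM when values are missing) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a hypothetical two-node network A -> B with binary variables, where A is sometimes missing and B is always observed. The function names `mle_cpt` and `em_cpt`, the starting values, and the toy counts are illustrative assumptions only.

```python
from collections import Counter


def mle_cpt(rows):
    """MLE of P(A=1) and P(B=1 | A=a) from complete (a, b) pairs."""
    n = len(rows)
    count_a = Counter(a for a, _ in rows)
    count_ab = Counter(rows)
    p_a1 = count_a[1] / n
    p_b1 = {a: count_ab[(a, 1)] / count_a[a] for a in (0, 1)}
    return p_a1, p_b1


def em_cpt(rows, iters=200):
    """EM estimate of the same parameters when A may be None (missing)."""
    # Arbitrary asymmetric starting guess (an assumption of this sketch).
    p_a1, p_b1 = 0.5, {0: 0.4, 1: 0.6}
    for _ in range(iters):
        # E-step: accumulate expected counts under the current parameters.
        n_a = {0: 0.0, 1: 0.0}
        n_ab1 = {0: 0.0, 1: 0.0}
        for a, b in rows:
            if a is None:
                # Posterior P(A=v | B=b) by Bayes' rule.
                w = {}
                for v in (0, 1):
                    pa = p_a1 if v == 1 else 1.0 - p_a1
                    pb = p_b1[v] if b == 1 else 1.0 - p_b1[v]
                    w[v] = pa * pb
                z = w[0] + w[1]
                w = {v: w[v] / z for v in (0, 1)}
            else:
                # Observed value: all weight on the observed state.
                w = {a: 1.0, 1 - a: 0.0}
            for v in (0, 1):
                n_a[v] += w[v]
                if b == 1:
                    n_ab1[v] += w[v]
        # M-step: re-estimate parameters from the expected counts.
        p_a1 = n_a[1] / len(rows)
        p_b1 = {v: n_ab1[v] / n_a[v] for v in (0, 1)}
    return p_a1, p_b1


# Toy complete data set (hypothetical counts, not from the study).
complete = [(1, 1)] * 30 + [(1, 0)] * 10 + [(0, 1)] * 12 + [(0, 0)] * 48

# Hide A in every fifth record to mimic missing values.
incomplete = [(None, b) if i % 5 == 0 else (a, b)
              for i, (a, b) in enumerate(complete)]

mle_estimate = mle_cpt(complete)
em_estimate = em_cpt(incomplete)
```

Comparing `em_estimate` with `mle_estimate` mirrors, on a toy scale, the comparison the abstract uses to judge how well the network copes with missing values. When no values are missing, the E-step reduces to ordinary counting and EM returns the MLE.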
Source
《中国卫生统计》
CSCD
Peking University Core Journals (北大核心)
2008, Issue 1, pp. 10-14 (5 pages)
Chinese Journal of Health Statistics
Funding
National Natural Science Foundation of China (30671821)
Shanghai Natural Science Foundation (04ZR14049)
Keywords
Primary liver cancer
Bayesian networks
Structure learning
Parameter learning