Abstract
This paper proposes a Bayesian network model for retrieving information from XML documents. The model supports structured (content-and-structure) queries over XML documents. Reflecting the characteristics of XML retrieval, the topology and conditional probabilities of the network are learned from statistics of the terms, elements, and structural units in the XML document collection. Combined with probability functions, the model's probabilistic inference process estimates the relevance of an XML document to a given structured query. Experiments on the INEX test collection demonstrate the validity and reliability of the approach.
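The abstract does not spell out the network structure or the probability functions, so the following is only a minimal sketch of the general idea, assuming a two-layer network (term nodes feeding element nodes) and a weighted-sum probability function with normalized term frequencies as weights. The toy documents, the `PRIOR` value, and the names `element_weights` and `score` are illustrative assumptions, not the paper's method.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Toy collection (assumed data, for illustration only).
DOCS = {
    "d1": "<article><title>bayesian network retrieval</title>"
          "<sec>xml retrieval with structured query</sec></article>",
    "d2": "<article><title>image compression</title>"
          "<sec>wavelet coding of images</sec></article>",
}

PRIOR = 0.1  # assumed probability that a term outside the query is relevant


def element_weights(xml_text):
    """Yield (tag, {term: weight}) for each leaf element.

    Weights are normalized term frequencies, so they sum to 1 per element.
    """
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        if len(elem) == 0 and elem.text:
            counts = Counter(elem.text.lower().split())
            total = sum(counts.values())
            yield elem.tag, {t: c / total for t, c in counts.items()}


def score(xml_text, query_terms, target_tag):
    """Estimate relevance of a document to a structured query.

    For each element whose tag matches the structural constraint, compute
    P(element relevant | query) = sum_t w(t, e) * P(t relevant | query),
    where query terms are instantiated as relevant (probability 1) and all
    other terms keep the prior. The document score is the best element score.
    """
    best = 0.0
    for tag, weights in element_weights(xml_text):
        if tag != target_tag:
            continue
        p = sum(w * (1.0 if t in query_terms else PRIOR)
                for t, w in weights.items())
        best = max(best, p)
    return best


# Structured query: terms "xml" and "retrieval" inside a <sec> element.
query = {"xml", "retrieval"}
ranking = sorted(DOCS, key=lambda d: score(DOCS[d], query, "sec"), reverse=True)
print(ranking)
```

In this sketch the structural part of the query is reduced to a tag filter; a fuller model would presumably propagate probabilities through nested structural units rather than scoring leaf elements only.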
Source
Journal of Computer Applications (《计算机应用》)
CSCD
Peking University Core Journals (北大核心)
2009, No. 10, pp. 2791-2795 (5 pages)
Keywords
Bayesian network
XML information retrieval
structured query
probability function