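In recent years, research on conditional probability models has advanced considerably. For sequence labeling problems, conditional models have gradually begun to replace generative models, and they are applied widely, including in image recognition, natural language processing, and intrusion detection. The conditional random field (CRF) model is a representative conditional model and currently one of the most studied. It avoids the shortcomings of generative models and overcomes the label bias problem of the earlier maximum entropy models, which is why it has been widely adopted. In applied studies with CRFs, it was found that using the CRF model alone did not give the best practical results, so improvements were made in each application. This work focuses on word segmentation of military documents, military named entity recognition, and intrusion detection; in each case, the improvements raised system performance beyond that of the plain model.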
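To address the diversity of entity names in the supply chain domain and the insufficient extraction of feature information when building a knowledge graph for the fresh egg supply chain, a named entity recognition method based on the BERT-CRF model (Bidirectional encoder representations from transformers-conditional random field) is proposed. The method uses the BIO (Begin, Internal, Other) tagging scheme for sequence labeling, takes character vectors and position vectors as input, extracts global features of the input sequence with a pretrained BERT model, and adds a CRF layer at the end of the model to introduce hard constraints, yielding a model framework suited to named entity recognition in the fresh egg supply chain domain. The proposed model was compared with three other named entity recognition models on a self-built dataset containing 12810 text corpus entries in 5 major categories and 21 subcategories. The proposed model reached a precision, recall, and F1 score of 91.82%, 90.44%, and 91.01%, respectively, outperforming the other three models. Finally, the model was tested on a self-built recipe dataset from the food domain, and the results indicate that it has some generalization ability.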
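The abstract does not say how the CRF layer enforces its hard constraints, so the following is only a minimal sketch of one common reading: during decoding, transitions that violate the BIO scheme (an inside tag that does not continue an entity of the same type) are forbidden. The tag names (`B-ORG`, `I-ORG`), the toy emission scores, and the function names `allowed` and `constrained_viterbi` are all hypothetical; a real BERT-CRF would use BERT outputs as emission scores and add learned transition scores, which are omitted here.

```python
import math

NEG_INF = -math.inf

def allowed(prev, curr):
    """BIO hard constraint: I-X may only follow B-X or I-X."""
    if curr.startswith("I-"):
        entity = curr[2:]
        return prev in ("B-" + entity, "I-" + entity)
    return True

def constrained_viterbi(emissions, tags):
    """Return the highest-scoring tag path that respects the BIO constraint.

    emissions: one dict per token mapping tag -> score (toy values here).
    """
    # A sequence may not start with an I- tag.
    score = {t: (NEG_INF if t.startswith("I-") else emissions[0][t]) for t in tags}
    back = []
    for em in emissions[1:]:
        new_score, pointers = {}, {}
        for curr in tags:
            best_prev, best = None, NEG_INF
            for prev in tags:
                if allowed(prev, curr) and score[prev] > best:
                    best_prev, best = prev, score[prev]
            new_score[curr] = best + em[curr] if best_prev is not None else NEG_INF
            pointers[curr] = best_prev
        score = new_score
        back.append(pointers)
    # Backtrack from the best final tag.
    path = [max(tags, key=lambda t: score[t])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path

# Hypothetical three-token example; the per-token argmax would start with I-ORG.
tags = ["O", "B-ORG", "I-ORG"]
emissions = [
    {"O": 0.1, "B-ORG": 0.2, "I-ORG": 2.0},
    {"O": 0.5, "B-ORG": 0.1, "I-ORG": 1.0},
    {"O": 1.0, "B-ORG": 0.2, "I-ORG": 0.3},
]
path = constrained_viterbi(emissions, tags)
print(path)  # ['B-ORG', 'I-ORG', 'O']
```

Without the constraint, picking the top-scoring tag per token would give the invalid sequence `I-ORG, I-ORG, O`; masking the forbidden transitions forces the entity to open with `B-ORG`.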
The intrinsic relationship between the Markov random field (MRF) and the Markov chain random field (MCRF) is discussed. MCRF is a special MRF for dealing with high-order interactions of sparse data. It consists of a single spatial Markov chain (SMC) that can move through the whole space. The theoretical backbone of MCRF is the conditional independence assumption, which sidesteps the need to know multi-point joint probabilities. This so-called naive Bayes assumption should not be taken lightly and should be checked whenever possible, because it is mathematically difficult to prove. Rather than getting trapped in proving this independence, an appropriate potential function from MRF theory is chosen instead. The MCRF formulas are derived and the joint probability of the MRF is expressed through a localization approach, so that complicated parameter estimation algorithms and iterative procedures can be avoided. The MCRF model is then applied to lithofacies identification in a region and compared with triplex Markov chain (TMC) simulation. Analyses show that the MCRF model does not cause the underestimation problem and better reflects the geological sedimentation process.
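To make the role of the conditional independence assumption concrete, here is the generic naive Bayes factorization it implies. The symbols are assumptions for illustration and not the paper's notation: $x_0$ is the class at the unsampled location and $x_1, \dots, x_m$ are the classes at its nearest known neighbors.

```latex
% If x_1, ..., x_m are conditionally independent given x_0, Bayes' rule gives
\[
  p(x_0 \mid x_1, \dots, x_m)
  = \frac{p(x_0) \prod_{i=1}^{m} p(x_i \mid x_0)}
         {\sum_{x_0'} p(x_0') \prod_{i=1}^{m} p(x_i \mid x_0')}
\]
```

Only the prior and the two-point conditionals $p(x_i \mid x_0)$ are needed, so the multi-point joint probability never has to be estimated directly.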
Funding: Project (2011ZX05002-005-006) supported by the National Science and Technology Major Research Program during the Twelfth Five-Year Plan of China