Funding: Supported by the Science and Technology Project of State Grid Corporation (Research and Application of Intelligent Energy Meter Quality Analysis and Evaluation Technology Based on Full Chain Data).
Abstract: With the application of artificial intelligence technology in the power industry, the knowledge graph is expected to play a key role in power grid dispatch, intelligent maintenance, and customer service response. Knowledge graphs are usually constructed on the basis of entity recognition. Specifically, once entity attributes and relationships have been mined, domain knowledge graphs can be constructed through knowledge fusion. In this work, the entities and characteristics of power entity recognition are analyzed, the mechanism of entity recognition is clarified, and entity recognition techniques are analyzed in the context of the power domain. Power entity recognition based on the conditional random fields (CRF) and bidirectional long short-term memory (BLSTM) models is investigated, and the two methods are compared. The results indicate that the CRF model, with an accuracy of 83%, identifies power entities better than the BLSTM. The CRF approach can thus be applied to entity extraction for knowledge graph construction in the power field.
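As a rough illustration of how a linear-chain CRF tagger chooses entity labels, the sketch below runs Viterbi decoding over per-token scores. The abstract above gives no model details, so the label set, the emission scores, and the transition scores are all hypothetical; in a trained CRF they would come from learned feature weights.

```python
from typing import Dict, List, Tuple


def viterbi(emissions: List[Dict[str, float]],
            transitions: Dict[Tuple[str, str], float],
            labels: List[str]) -> List[str]:
    """Return the highest-scoring label sequence for a linear-chain model."""
    if not emissions:
        return []
    # best[t][y]: best score of any path that ends in label y at position t
    best = [{y: emissions[0].get(y, 0.0) for y in labels}]
    back: List[Dict[str, str]] = []
    for t in range(1, len(emissions)):
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for y in labels:
            prev, score = max(
                ((p, best[-1][p] + transitions.get((p, y), 0.0)) for p in labels),
                key=lambda item: item[1],
            )
            scores[y] = score + emissions[t].get(y, 0.0)
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)
    # Trace the best path backwards from the best final label.
    path = [max(labels, key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Hypothetical scores for the tokens: "the", "main", "transformer", "tripped"
labels = ["O", "B-EQUIP", "I-EQUIP"]
emissions = [
    {"O": 2.0},
    {"O": 0.5, "B-EQUIP": 1.0, "I-EQUIP": 1.2},
    {"B-EQUIP": 1.5, "I-EQUIP": 2.0},
    {"O": 2.0},
]
transitions = {
    ("O", "I-EQUIP"): -10.0,       # an entity cannot start with an I- tag
    ("B-EQUIP", "I-EQUIP"): 1.0,   # B- followed by I- is encouraged
}
decoded = viterbi(emissions, transitions, labels)
print(decoded)
```

Note that "main" scores slightly higher as I-EQUIP than as B-EQUIP when looked at alone, but the transition penalty steers the decoder to B-EQUIP. This sequence-level consistency is the usual argument for CRF decoding over per-token classification.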
Funding: Supported by the National Natural Science Foundation of China (No. 60302021).
Abstract: Named entity recognition is a fundamental task in biomedical data mining. In this letter, a named entity recognition system based on CRFs (Conditional Random Fields) for biomedical texts is presented. The system makes extensive use of a diverse set of features, including local features, full-text features, and external resource features. All features incorporated in the system are described in detail, and the impact of different feature sets on system performance is evaluated. To improve performance, post-processing modules are used to handle abbreviations, cascaded named entities, and boundary errors. The evaluation shows that feature selection has an important impact on system performance and that the post-processing contributes substantially to the improved results.
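To make "local features" concrete, here is a minimal sketch of a per-token feature function of the kind commonly fed to a CRF tagger. The abstract does not list the system's actual features, so every feature name below is an assumption chosen for illustration; full-text and external resource features are not covered.

```python
from typing import Dict, List


def token_features(tokens: List[str], i: int) -> Dict[str, object]:
    """Build illustrative local features for the token at position i."""
    word = tokens[i]
    return {
        "lower": word.lower(),
        "suffix3": word[-3:].lower(),
        "has_digit": any(c.isdigit() for c in word),
        "has_hyphen": "-" in word,
        "is_upper": word.isupper(),
        # Neighboring words give the model a small context window.
        "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }


sentence = ["IL-2", "gene", "expression"]
features = [token_features(sentence, i) for i in range(len(sentence))]
for token, feats in zip(sentence, features):
    print(token, feats)
```

Orthographic cues such as digits, hyphens, and capitalization are typical for biomedical names like gene and protein mentions, which is why they appear in this sketch.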
Funding: Sponsored by the Basic Research Development Program of China (Grant No. 2013CB03554) and the Fundamental Research Funds for Universities, Central South University (Grant No. 2017zzts394).
Abstract: Natural language processing has made great progress recently, and controlling robots with spoken natural language has become feasible. Given the reliability problem of this kind of control, a confirmation step for natural language instructions should be included before the robot carries them out autonomously. A prototype dialog system was designed for this purpose, which raised a standardization problem for natural and understandable language interaction. The application background is remotely navigating a mobile robot inside a building with spoken Chinese. A place name is an important navigation element in instructions and can be expressed with different lexical terms in spoken language, so this paper proposes a model for substituting the different alternatives of a place name with a standard one (called standardization). First, a CRF (Conditional Random Fields) model is trained to label the term that needs to be standardized; then a trained word embedding model represents lexical terms as numerical vectors. Similarity between lexical terms is defined in the vector space and used to find the term most similar to the one picked out for standardization. Experiments show that the proposed method works well and that the dialog system's confirmation responses are natural and understandable.
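The standardization step can be sketched as a nearest-neighbor lookup in the embedding space. The abstract says only that a similarity is defined in the vector space; cosine similarity is assumed here, and the three-dimensional vectors and place names are made-up stand-ins for real word embeddings.

```python
import math
from typing import Dict, List


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def standardize(term_vec: List[float],
                standard_vecs: Dict[str, List[float]]) -> str:
    """Return the standard place name whose vector is closest to term_vec."""
    return max(standard_vecs, key=lambda name: cosine(term_vec, standard_vecs[name]))


# Hypothetical embeddings of the standard place names.
standard_places = {
    "meeting room": [0.9, 0.1, 0.0],
    "laboratory": [0.1, 0.9, 0.2],
    "restroom": [0.0, 0.2, 0.9],
}
# Hypothetical embedding of a spoken variant such as "conference room".
spoken_variant = [0.8, 0.2, 0.1]
chosen = standardize(spoken_variant, standard_places)
print(chosen)
```

In the pipeline described above, the CRF would first pick out which span of the instruction is the place name, and only that span would be looked up this way.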
Funding: Supported by the National Natural Science Foundation of China (Grant No. 52078086); the Program of Distinguished Young Scholars, Natural Science Foundation of Chongqing, China (Grant No. cstc2020jcyj-jq0087); and the State Education Ministry and the Fundamental Research Funds for the Central Universities (Grant No. 2019 CDJSK 04 XK23).
Abstract: The rockhead profile is an important part of a geological profile and can significantly affect geotechnical engineering practice, so a useful method is needed to infer the rockhead profile from site investigation results. The conditional random field (CRF), a general method for representing the spatial distribution of geo-material properties based on field measurements, is improved in this paper to simulate rockhead profiles. In geotechnical engineering practice, measurements are generally limited by budget and time, so the estimate of the mean value carries some uncertainty. Because Bayesian theory can effectively combine measurements with prior information to deal with uncertainty, the CRF is implemented within a Bayesian framework in this study. More importantly, the simulation procedure is obtained as an analytical solution, avoiding time-consuming sampling. The results show that the proposed method provides a reasonable estimate of the rockhead depth at various locations when checked against measurement data, and as a result the subjectivity in determining the prior mean can be minimized. Finally, both the measurement data and the choice of hyper-parameters affect the simulated rockhead profiles, although the influence of the latter is less significant than that of the former.
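For orientation, the textbook way a random field is conditioned on measurements analytically is Gaussian conditioning. The abstract does not give the paper's equations, so the following is only the standard result under an assumed Gaussian field, with the symbols chosen here for illustration: $z_m$ are the measured rockhead depths, $u$ indexes unsampled locations, and $\mu$, $\Sigma$ are the prior mean and covariance. The paper's Bayesian treatment of the uncertain prior mean is not reproduced.

```latex
\begin{aligned}
\mu_{u \mid m}    &= \mu_u + \Sigma_{um}\,\Sigma_{mm}^{-1}\,(z_m - \mu_m), \\
\Sigma_{u \mid m} &= \Sigma_{uu} - \Sigma_{um}\,\Sigma_{mm}^{-1}\,\Sigma_{mu}.
\end{aligned}
```

Because both expressions are closed-form, the conditional field at unsampled locations can be evaluated directly, with no sampling loop.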
Abstract: In dense pedestrian tracking, frequent object occlusions and close distances between objects make it difficult to estimate object trajectories accurately. In this study, a conditional random field tracking model is established using a visual long short-term memory network in three-dimensional (3D) space, with motion estimation performed jointly on object trajectory segments. Object visual field information is added to the long short-term memory network to improve the accuracy of motion-related object pair selection and motion estimation. To address the uncertainty in the length and interval of trajectory segments, a multimode long short-term memory network is proposed for object motion estimation. Tracking performance is evaluated on the PETS2009 dataset. The experimental results show that the proposed method outperforms tracking methods based on independent motion estimation.
Funding: Supported by the National Basic Research Priorities Programme (No. 2013CB329502), the National High Technology Research and Development Programme of China (No. 2012AA011003), the Natural Science Basic Research Plan in Shanxi Province of China (No. 2014JQ2-6036), and the Science and Technology R&D Program of Baoji City (No. 203020013, 2013R2-2).
Abstract: This paper presents a new method for refining image annotation by integrating probabilistic latent semantic analysis (PLSA) with a conditional random field (CRF). First, a PLSA model with asymmetric modalities is constructed to predict a candidate set of annotations with confidence scores; the semantic relationships among the candidate annotations are then modeled with a conditional random field. In the CRF, the confidence scores generated by the PLSA model and the Flickr distance between pairs of candidate annotations serve as local evidence and contextual potentials, respectively. The novelty of the method lies mainly in two aspects: exploiting PLSA to predict a candidate set of annotations with confidence scores, and using the CRF to further explore the semantic context among candidate annotations for precise image annotation. To demonstrate the effectiveness of the proposed method, an experiment is conducted on the standard Corel dataset, and the results compare favorably with several state-of-the-art approaches.
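Abstract: Traditional methods for named entity recognition extract entity feature information inaccurately and recognize entities inefficiently when building knowledge graphs in the oil and gas domain. To address this, a named entity recognition method based on the BERT-BiLSTM-CRF model is proposed. The method first uses the BERT (bidirectional encoder representations from transformers) pre-trained model to obtain word vectors that carry the semantics of the input sequence. The resulting word vectors are then fed into a bidirectional long short-term memory (BiLSTM) network to further capture contextual features. Finally, based on the labeling rules and sequence decoding ability of conditional random fields (CRF), the maximum-probability label sequence is output, forming a named entity recognition framework for the oil and gas domain. The BERT-BiLSTM-CRF model was compared with two other named entity recognition models (BiLSTM-CRF and BiLSTM-Attention-CRF) on a self-built dataset containing more than 30,000 text corpus entries and 4 entity types. The experimental results show that the precision (P), recall (R), and F1 score of the BERT-BiLSTM-CRF model reach 91.3%, 94.5%, and 92.9%, respectively, and that it recognizes entities better than the other two models.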
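The three reported figures can be checked against each other, since F1 is the harmonic mean of precision and recall. The short sketch below does the arithmetic with the values from the abstract; the function name is only illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract above.
f1 = f1_score(0.913, 0.945)
print(f"{f1:.3f}")  # prints 0.929, matching the reported 92.9%
```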
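Abstract: Conditional random fields (CRFs) can be used to solve various text analysis problems, such as sequence labeling, Chinese word segmentation, named entity recognition, and relation extraction between entities in natural language processing (NLP). Traditional CRFs running on a single node face a series of challenges when processing large-scale text. On the one hand, personal computers hit processing bottlenecks and cannot cope; on the other hand, servers execute inefficiently. Upgrading server hardware to raise computing power ultimately cannot solve the problem at its root for large-scale text analysis tasks. Therefore, following the "divide and conquer" idea, a distributed CRF that runs in a cluster environment, SparkCRF, was designed and implemented on the Apache Spark big data processing framework. Experiments show that SparkCRF offers efficient computation and good scalability in text analysis tasks, with accuracy at the same level as the traditional single-node CRF++.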
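Abstract: Chord recognition is the basis of musical mode analysis and automatic annotation, and it plays a very important role in analyzing the structure and melody of music. Combining music theory and signal processing knowledge, a chord recognition method based on the MPCP (Mel Pitch Class Profile) feature and the CRFs (Conditional Random Fields) model is proposed. The short-time Fourier transform (STFT) is applied to the music signal for time-frequency transformation, a new MPCP feature is defined, and CRFs are finally used to recognize chords. Experimental results show that the proposed method outperforms other methods in recognition rate and has some potential.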
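Abstract: By studying the lexical characteristics of Vietnamese, the basic features of Vietnamese are incorporated into conditional random fields (CRFs), and a Vietnamese word segmentation method based on CRFs and an ambiguity model is proposed. A corpus of 25,981 Vietnamese word segmentation entries was obtained through machine annotation followed by manual proofreading and used as the CRF training corpus. Overlapping ambiguity is widely distributed in Vietnamese sentences. To overcome its influence, 5,377 ambiguous fragments were extracted from the training corpus using dictionary-based forward and backward matching algorithms, and an ambiguity model was trained with a maximum entropy model and integrated into the segmentation model. The training corpus was split evenly into 10 parts for cross-validation, and segmentation accuracy reached 96.55%. Compared with the existing Vietnamese word segmentation tool VnTokenizer, the experimental results show that the method improves the precision, recall, and F-score of Vietnamese word segmentation.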
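The dictionary-based extraction of ambiguous fragments can be sketched as follows: segment the same syllable sequence with forward and with backward maximum matching, and flag it when the two results disagree. This is a minimal sketch of that general idea, not the paper's implementation; the two-entry lexicon and the `max_len` parameter are assumptions for illustration.

```python
from typing import List, Set


def forward_max_match(units: List[str], lexicon: Set[str], max_len: int = 4) -> List[str]:
    """Greedy left-to-right segmentation, preferring the longest lexicon word."""
    words, i = [], 0
    while i < len(units):
        for size in range(min(max_len, len(units) - i), 0, -1):
            candidate = " ".join(units[i:i + size])
            # A single syllable is always accepted as a fallback.
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words


def backward_max_match(units: List[str], lexicon: Set[str], max_len: int = 4) -> List[str]:
    """Greedy right-to-left segmentation, preferring the longest lexicon word."""
    words, j = [], len(units)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            candidate = " ".join(units[j - size:j])
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                j -= size
                break
    return words[::-1]


# Toy lexicon: "học sinh" (student) and "sinh học" (biology) overlap on "sinh".
lexicon = {"học sinh", "sinh học"}
syllables = ["học", "sinh", "học", "sinh", "học"]
fmm = forward_max_match(syllables, lexicon)
bmm = backward_max_match(syllables, lexicon)
print(fmm)
print(bmm)
print("ambiguous:", fmm != bmm)
```

Fragments where the two passes disagree are the kind of overlapping-ambiguity cases that a separate model, such as the maximum entropy ambiguity model mentioned above, would then be trained to resolve.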