MicroRNAs (miRNAs) exert an enormous influence on cell differentiation, biological development and the onset of diseases. Because identifying potential miRNA-disease associations (MDAs) through biological experiments usually requires considerable time and money, a growing number of researchers are developing computational methods to predict MDAs. High accuracy is critical for prediction. To date, many algorithms have been proposed to infer novel MDAs; however, they may still have some drawbacks. In this paper, a logistic weighted profile-based bi-random walk method (LWBRW) is designed to infer potential MDAs based on known MDAs. In this method, three networks (i.e., a miRNA functional similarity network, a disease semantic similarity network and a known MDA network) are constructed first. In building the miRNA network and the disease network, the Gaussian interaction profile (GIP) kernel is computed to increase the kernel similarities, and the logistic function is used to extract valuable information and protect known MDAs. Next, the known MDA matrix is preprocessed by the weighted K-nearest known neighbours (WKNKN) method to reduce the number of false negatives. Then, the LWBRW method is applied to infer novel MDAs by bi-randomly walking on the miRNA network and the disease network. Finally, the predictive ability of the LWBRW method is confirmed by an average AUC of 0.9393 (0.0061) in 5-fold cross-validation (CV) and an AUC of 0.9763 in leave-one-out cross-validation (LOOCV). In addition, case studies also show the outstanding ability of the LWBRW method to explore potential MDAs.
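The abstract names the GIP kernel but does not give its formula. As a minimal sketch, assuming the commonly used definition K(i, j) = exp(-γ‖IP(i) - IP(j)‖²) with γ equal to a bandwidth parameter divided by the mean squared norm of the interaction profiles, the similarity between rows of a known MDA matrix could be computed as below. The function name `gip_kernel`, the parameter `gamma_prime` and the toy matrix are illustrative assumptions, not taken from the paper.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of `profiles`.

    `profiles` is a list of equal-length 0/1 lists (one interaction profile
    per miRNA, or per disease if the matrix is transposed).
    `gamma_prime` is an assumed bandwidth parameter (hypothetical default).
    """
    n = len(profiles)
    squared_norms = [sum(v * v for v in row) for row in profiles]
    mean_squared_norm = sum(squared_norms) / n
    if mean_squared_norm == 0:
        raise ValueError("all interaction profiles are empty")
    # Normalise the bandwidth by the average profile norm.
    gamma = gamma_prime / mean_squared_norm

    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dist_sq = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * dist_sq)
    return kernel


# Toy association matrix: 3 miRNAs (rows) x 3 diseases (columns).
toy_mda = [
    [1, 0, 1],
    [1, 0, 1],
    [0, 1, 0],
]
similarity = gip_kernel(toy_mda)
```

Rows with identical profiles receive similarity 1, and the similarity decays as the profiles diverge. How the paper combines this kernel with the functional and semantic similarities and with the logistic function is not specified in the abstract and is not modelled here.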
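To address the inaccurate extraction of entity feature information and the low recognition efficiency of traditional named entity recognition methods in building knowledge graphs for the oil and gas domain, a named entity recognition method based on the BERT-BiLSTM-CRF model is proposed. The method first uses the BERT (bidirectional encoder representations from transformers) pre-trained model to obtain semantic word vectors for the input sequence; it then feeds the trained word vectors into a bi-directional long short-term memory (BiLSTM) network to further capture contextual features; finally, it outputs the most probable label sequence using the labeling rules and sequence decoding capability of conditional random fields (CRF), forming a named entity recognition framework for the oil and gas domain. The BERT-BiLSTM-CRF model was compared with two other named entity recognition models (BiLSTM-CRF and BiLSTM-Attention-CRF) on a self-built dataset containing more than 30,000 text corpus entries and 4 entity types. The experimental results show that the precision (P), recall (R) and F1 score of the BERT-BiLSTM-CRF model reach 91.3%, 94.5% and 92.9%, respectively, and that its entity recognition performance is better than that of the other two models.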
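The three reported metrics are linked by the standard definition of the F1 score as the harmonic mean of precision and recall. The short check below, with a helper name `f1_score` chosen here for illustration, confirms that the reported values are mutually consistent.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both given as fractions)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall reported in the abstract.
f1 = f1_score(0.913, 0.945)
print(f"{f1:.1%}")  # prints 92.9%
```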
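To enable intelligent prediction of the valve inlet water temperature of the converter valve cooling system in voltage sourced converter-high voltage direct current (VSC-HVDC) transmission, this paper proposes a prediction model for the valve inlet water temperature that combines random forest (RF) with a bi-directional long short-term memory (BiLSTM) network, and uses it as the basis for evaluating the cooling capacity of the valve cooling system in a VSC-HVDC converter station. First, the RF algorithm is used to analyze the importance of a high-dimensional feature set composed of the monitored variables of the valve cooling system; the important features that affect the valve inlet water temperature are selected and combined with the historical valve inlet water temperature to form the input feature vector. Then, the feature vector is fed into the BiLSTM prediction model, which is trained to predict the valve inlet water temperature accurately and to evaluate the cooling capacity quantitatively. Finally, the proposed method is analyzed using a VSC-HVDC converter station of the Guangdong power grid as a case study, verifying that the proposed RF-BiLSTM hybrid model achieves higher prediction accuracy than the BiLSTM model, the RF model, the support vector machine (SVM) model and the auto-regressive and moving average (ARMA) model, and that it enables quantitative evaluation of the cooling capacity. The results show that the cooling margin of this converter station reaches 98%, indicating over-cooling and wasted energy, which is consistent with the on-site operating conditions of the station and verifies the effectiveness and accuracy of the proposed method.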
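The step that joins the selected features with the historical inlet temperature can be pictured as a sliding-window construction. The sketch below is only one plausible reading: the abstract does not state the window length or the layout of the input vector, so `make_windows`, `lookback` and the toy values are all assumptions made for illustration.

```python
def make_windows(selected_features, inlet_temp, lookback):
    """Build (input window, target) pairs for a sequence model.

    selected_features: one list of feature values per time step
                       (assumed to be the variables kept after RF selection).
    inlet_temp:        valve inlet water temperature per time step.
    lookback:          number of past time steps in each window (assumed).
    """
    if len(selected_features) != len(inlet_temp):
        raise ValueError("features and temperatures must have equal length")
    inputs, targets = [], []
    for t in range(lookback, len(inlet_temp)):
        # Each step in the window holds the selected features plus the
        # historical inlet temperature at that step.
        window = [
            selected_features[k] + [inlet_temp[k]]
            for k in range(t - lookback, t)
        ]
        inputs.append(window)
        targets.append(inlet_temp[t])
    return inputs, targets


# Toy series: one selected feature and four time steps.
toy_features = [[1.0], [2.0], [3.0], [4.0]]
toy_temps = [10.0, 11.0, 12.0, 13.0]
windows, next_temps = make_windows(toy_features, toy_temps, lookback=2)
```

Each window would then be one training sample for the BiLSTM, with the temperature at the following time step as its target.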
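To address named entity recognition for power operation and inspection, one of the key steps in building a power operation and inspection knowledge graph, this work studies the problem with a hidden Markov-conditional random fields-bi-directional long short-term memory (HCB) model based on Stacking multi-model fusion. The HCB model has two layers: the first layer uses a hidden Markov model (HMM), conditional random fields (CRF) and a bi-directional long short-term memory (Bi-LSTM) network for training and prediction, and the resulting predictions are then fed into a second-layer CRF model for training; the final named entities are obtained from this two-layer training and prediction. The results show that, for named entity recognition in power operation and inspection, the HCB model clearly outperforms single models and other fusion models in precision, recall and F1 score. The HCB model can therefore effectively solve the named entity recognition problem in power operation and inspection.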
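The two-layer arrangement can be illustrated by the hand-off between the layers: the tags predicted by the first-layer models become the per-token features seen by the second-layer CRF. The sketch below shows only that hand-off, under the assumption that each base model emits one tag per token; the function name, the feature keys and the BIO-style tags are hypothetical, and no actual HMM, CRF or Bi-LSTM is trained here.

```python
def build_meta_features(base_outputs):
    """Turn first-layer tag sequences into per-token feature dicts.

    base_outputs maps a base-model name to its predicted tag sequence for
    one sentence; all sequences must have the same length.
    """
    names = sorted(base_outputs)
    lengths = {len(base_outputs[name]) for name in names}
    if len(lengths) != 1:
        raise ValueError("base models must tag the same number of tokens")
    n_tokens = lengths.pop()
    return [
        {f"{name}_tag": base_outputs[name][i] for name in names}
        for i in range(n_tokens)
    ]


# Hypothetical first-layer predictions for a two-token sentence.
meta_features = build_meta_features({
    "hmm": ["B-EQ", "O"],
    "crf": ["B-EQ", "I-EQ"],
    "bilstm": ["B-EQ", "I-EQ"],
})
```

A second-layer sequence model trained on such feature dicts can learn which base model to trust where the first-layer predictions disagree, as on the second token above.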
Funding: This work was supported by the National Natural Science Foundation of China under Grant Nos. 61902215, 61872220 and 61701279.