Abstract
To support the implementation of configuration risk management at nuclear power plants, this study applies semantic recognition to the unstructured text of operator logs, automatically extracting risk configuration parameters such as equipment and its status. After text preprocessing and feature engineering, an attention-based deep learning model was developed to encode the text and perform inference, enabling entity localization and status recognition. After preliminary training on 3,500 manually annotated records, the semantic recognition model reaches an accuracy of up to 83%. It effectively recognizes abbreviated equipment names, single equipment items with their status, and multiple equipment items with their statuses in log text, and produces standardized output for each.
Source
Nuclear Science and Technology (《核科学与技术》), 2024, Issue 1, pp. 27-35 (9 pages)