Abstract
Mechanical assembly often requires workers to read and understand large volumes of assembly process text, which is time-consuming. Moreover, because the authors of the process text and the assemblers differ in ability, assemblers may misread the text, leading to parts being installed incorrectly or omitted. A mechanical assembly matrix stores the assembly entity relationships among parts in matrix form and expresses them directly and effectively; it is easy for workers to understand and for computers to process, and can significantly improve assembly efficiency. Natural language processing, as a tool for enabling computers to understand human language, can play a key role in generating assembly matrices from assembly text. This paper applies natural language processing to assembly text. First, the text is preprocessed by sentence splitting, word segmentation, and part-of-speech tagging, with a corpus of mechanical assembly nouns used to improve the accuracy of segmentation and tagging for part names. Next, a subject-predicate-object triplet is generated for each sentence by two methods, syntactic dependency analysis and syntactic template matching, and the assembly noun corpus is matched against the triplets to identify part names. Triplets whose subject and object are both assembly parts are then extracted as assembly relations and post-processed by removing redundant words and aligning entities. Finally, an empty matrix is created according to the number of parts, the assembly relations are filled into the contact matrix, and the contact-connection matrix of the assembly relations is generated according to part type.
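The last two steps of the pipeline described above (keeping only triplets whose subject and object are both parts, then filling the matrices) can be illustrated with a minimal sketch. It is not the authors' implementation. The alias table `PART_ALIASES`, the fastener set `FASTENERS`, the symmetric matrix layout, and the rule that a relation involving a fastener counts as a connection are all assumptions made for illustration; the abstract states only that part type decides the contact-connection matrix.

```python
from typing import Dict, List, Optional, Tuple

Triplet = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical stand-in for the mechanical assembly noun corpus:
# maps surface forms to one canonical part name (entity alignment).
PART_ALIASES: Dict[str, str] = {
    "bearing": "bearing",
    "the bearing": "bearing",
    "shaft": "shaft",
    "main shaft": "shaft",
    "end cover": "end cover",
    "bolt": "bolt",
}

# Hypothetical part types treated as fasteners (assumed rule, see lead-in).
FASTENERS = {"bolt"}


def align(name: str) -> Optional[str]:
    """Return the canonical part name, or None if the phrase is not a known part."""
    return PART_ALIASES.get(name.strip().lower())


def extract_relations(triplets: List[Triplet]) -> List[Tuple[str, str]]:
    """Keep only triplets whose subject and object both resolve to parts."""
    relations = []
    for subj, _pred, obj in triplets:
        a, b = align(subj), align(obj)
        if a is not None and b is not None and a != b:
            relations.append((a, b))
    return relations


def build_matrices(parts: List[str], relations: List[Tuple[str, str]]):
    """Start from empty n x n matrices and fill in each assembly relation."""
    index = {part: i for i, part in enumerate(parts)}
    n = len(parts)
    contact = [[0] * n for _ in range(n)]
    connection = [[0] * n for _ in range(n)]
    for a, b in relations:
        i, j = index[a], index[b]
        contact[i][j] = contact[j][i] = 1  # assumed symmetric
        if a in FASTENERS or b in FASTENERS:
            connection[i][j] = connection[j][i] = 1
    return contact, connection


# Made-up triplets, as a parser might emit them for three sentences.
triplets = [
    ("the bearing", "is pressed onto", "main shaft"),
    ("bolt", "fastens", "end cover"),
    ("worker", "installs", "bearing"),  # subject is not a part, so it is dropped
]
parts = ["shaft", "bearing", "end cover", "bolt"]
relations = extract_relations(triplets)
contact, connection = build_matrices(parts, relations)
for row in contact:
    print(row)
```

In this toy input the third triplet is discarded because "worker" is not in the part lexicon, which mirrors the filtering step in the abstract. A real system would obtain the triplets from dependency parsing or template matching rather than from a hand-written list.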
Authors
YIN Yudong (尹昱东); WANG Baojian (王保建); LI Kejia (李珂嘉); WANG Ziping (王紫平); LIU Jie (刘洁)
School of Mechanical Engineering, Xi'an Jiaotong University, Xi'an 710049, China
Source
Computer Measurement & Control (《计算机测量与控制》)
2024, Issue 6, pp. 198-205, 219 (9 pages in total)
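Funding
Natural Science Basic Research Program of Shaanxi Province (2021M-169)
Natural Science Basic Research Program of Shaanxi Province (2023-JC-YB-477)
2022 Xi'an Jiaotong University Special Research Project on Teaching Reform in Undergraduate Experimental Practice and Innovation and Entrepreneurship Education (22SJZX10)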
Keywords
assembly process text
entity relationship
natural language processing
part-of-speech tagging
triplet
assembly relation matrix