Abstract: Text event mining, an indispensable part of text mining, has attracted extensive attention from researchers. This paper proposes UKGE-MS, a method for modeling event knowledge graphs based on neighborhood mutual information and sparse representation. UKGE-MS improves the ability of existing text mining techniques to understand and discover high-dimensional unlabeled information, and it addresses a weakness of traditional unsupervised feature selection methods, which select features only from a global perspective and ignore the local connections among samples. First, to account for the local information of samples when evaluating feature correlation, a feature clustering algorithm based on average neighborhood mutual information is proposed, yielding feature clusters with a certain degree of event correlation. Second, an unsupervised feature selection method based on the high-order correlation of multi-dimensional statistical data is designed by combining the dimensionality-reduction advantage of the locally linear embedding algorithm with the feature selection ability of sparse representation, which enhances the generalization ability of the selected features. Finally, the event knowledge graph is constructed by means of sparse representation and the l1 norm. Extensive experiments are carried out on five real datasets and on synthetic datasets, and UKGE-MS is compared with five corresponding algorithms. The results show that UKGE-MS outperforms traditional methods in event clustering and feature selection, and has some advantages over other methods in text event recognition and discovery.
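As a rough illustration of the quantity behind the feature clustering step, the following Python sketch computes ordinary mutual information between two discrete feature columns. It is not the average neighborhood mutual information used by UKGE-MS, which the abstract does not define; the function name `mutual_information` and the toy inputs are assumptions made only for this example.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plain mutual information I(X;Y) in bits for two discrete sequences.

    Illustrative only: this is the standard plug-in estimate, not the
    neighborhood-based variant described in the abstract.
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    count_x = Counter(xs)            # marginal counts of X
    count_y = Counter(ys)            # marginal counts of Y
    count_xy = Counter(zip(xs, ys))  # joint counts of (X, Y)
    mi = 0.0
    for (x, y), c in count_xy.items():
        p_joint = c / n
        # p(x, y) / (p(x) * p(y)) simplifies to c * n / (count_x * count_y)
        mi += p_joint * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


if __name__ == "__main__":
    feature_a = [0, 0, 1, 1]
    print(mutual_information(feature_a, [0, 0, 1, 1]))  # identical columns
    print(mutual_information(feature_a, [0, 1, 0, 1]))  # independent columns
```

Features whose pairwise score is high could then be grouped into the same cluster; how UKGE-MS actually forms its clusters is not specified in the abstract.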
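Abstract: To meet the need for improving the flow-field quality of the 2.4 m transonic wind tunnel at the China Aerodynamics Research and Development Center, an efficient wind tunnel flow-field control model is needed as a verification platform for controller design. Because an accurate aerodynamic model is difficult to build, and because the 2.4 m transonic wind tunnel has accumulated a large amount of test operation data over long-term operation, data-driven modeling was chosen as the preferred approach. On the hardware side, a flow-field control simulation system based on reflective memory technology was built to acquire data collected on site. The modeling follows system identification theory: the whole system is treated as a "black box", and the field data are used to determine the system parameters and the mapping between inputs and outputs. A non-linear auto-regressive moving average model with exogenous inputs (NARMAX) is adopted as the data model of the wind tunnel system, and the mutual information method, the curve fitting method, and the false nearest neighbor method are applied to determine three model parameters: the sampling interval, the time delay, and the order, respectively. The fitting performance of three methods, least-squares linear regression, BP neural network, and least-squares support vector machine (LS_SVM), was compared, and the least-squares support vector machine was selected as the final fitting method. To improve simulation accuracy, the run is divided according to the operating characteristics of the wind tunnel into three stages, pressurization, start-up, and regulation, and a sub-model is built for each stage. Because the wind tunnel is a multi-input multi-output system with large delays and orders, a data compression method based on information entropy is used to reduce the size of the sub-models. Finally, the sub-models of the stages are fused by weighting in a multi-model fusion approach to build the model of the whole wind tunnel system. The total pressure in the settling chamber and the static pressure in the plenum chamber are each obtained from the model, and the test-section Mach number is then obtained from the Mach number formula. Simulation results show that, for test conditions within the operating envelope, the prediction accuracy reaches 0.1% for total pressure and essentially 0.001 for Mach number, which meets the research objective. This work systematically establishes a flow-field control model of a blowdown wind tunnel, and the model lays a foundation for the next stage of controller design based on modern control theory.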
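The abstract does not state which Mach number formula is used for the final step. A common choice, assumed here, is the isentropic relation between total pressure p0 and static pressure p: M = sqrt( 2/(gamma-1) * ((p0/p)^((gamma-1)/gamma) - 1) ). The Python sketch below applies that relation; the function name `mach_from_pressures`, the default gamma = 1.4 for air, and the sample pressures are assumptions for illustration, not values from the paper.

```python
import math


def mach_from_pressures(p_total, p_static, gamma=1.4):
    """Isentropic Mach number from total and static pressure.

    Both pressures must use the same units. gamma is the ratio of
    specific heats (1.4 is the usual value for air; assumed here).
    """
    if p_static <= 0:
        raise ValueError("static pressure must be positive")
    if p_total < p_static:
        raise ValueError("total pressure cannot be below static pressure")
    ratio = p_total / p_static
    exponent = (gamma - 1.0) / gamma
    return math.sqrt(2.0 / (gamma - 1.0) * (ratio ** exponent - 1.0))


if __name__ == "__main__":
    # Hypothetical pressures in kPa, chosen only to exercise the function.
    p0, p = 150.0, 100.0
    m = mach_from_pressures(p0, p)
    # Effect of a 0.1% total-pressure error on the computed Mach number.
    m_perturbed = mach_from_pressures(p0 * 1.001, p)
    print(m, m_perturbed - m)
```

Perturbing the total pressure as in the usage lines gives a feel for how a 0.1% pressure error propagates into the Mach number, which is the kind of accuracy figure the abstract reports.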
Funding: This study was funded by the International Science and Technology Cooperation Program of the Science and Technology Department of Shaanxi Province, China (No. 2021KW-16) and the Science and Technology Project in Xi’an (No. 2019218114GXRC017CG018-GXYD17.11). The work was also supported by the special fund construction project of Key Disciplines in Ordinary Colleges and Universities in Shaanxi Province. The authors would like to thank the anonymous reviewers for their helpful comments and suggestions.