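Abstract: Gas chromatographic retention indices of cycloalkanes, cycloalkenes, ketones, amines, ethers, esters, and other organic compounds on the stationary phases squalane and SE-30 were collected, and modeling methods for quantitative structure-retention index relationships were compared using model population analysis based on Monte Carlo sampling (MCS MPA). The compounds on each of the two stationary phases were characterized by different molecular descriptors, which were selected using statistical methods and a genetic algorithm. The modeling methods were multivariate linear regression (MLR), support vector machine (SVM) regression, and radial basis function artificial neural networks (RBF ANN). The resulting models were used to predict the gas chromatographic retention indices of independent external test samples. The results show that, for the data studied, SVM regression gave better models than MLR and RBF ANN.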
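The general idea of comparing models over many Monte Carlo train/test splits can be sketched as follows. This is only an illustrative sketch, not the authors' MCS MPA procedure: the data are synthetic, the single descriptor is hypothetical, and two toy models (a one-descriptor least-squares line and a mean-only baseline) stand in for the MLR, SVM, and RBF ANN models named in the abstract.

```python
import random
import statistics

def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x for a single (hypothetical) descriptor."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    return lambda x: a + b * x

def fit_mean(xs, ys):
    """Baseline model that ignores the descriptor and predicts the training mean."""
    m = statistics.fmean(ys)
    return lambda x: m

def rmse(model, xs, ys):
    """Root-mean-square prediction error of a fitted model on (xs, ys)."""
    return (sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)) ** 0.5

def mc_compare(xs, ys, fitters, n_runs=200, train_frac=0.7, seed=0):
    """Collect the test RMSE of each model over many random train/test splits."""
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    n_train = int(train_frac * len(xs))
    errors = {name: [] for name in fitters}
    for _ in range(n_runs):
        rng.shuffle(idx)
        train, test = idx[:n_train], idx[n_train:]
        for name, fit in fitters.items():
            model = fit([xs[i] for i in train], [ys[i] for i in train])
            errors[name].append(
                rmse(model, [xs[i] for i in test], [ys[i] for i in test])
            )
    return errors

# Synthetic stand-in data (assumption): retention index roughly linear in one descriptor.
data_rng = random.Random(1)
xs = [float(i) for i in range(40)]
ys = [600.0 + 25.0 * x + data_rng.gauss(0, 10) for x in xs]

errors = mc_compare(xs, ys, {"linear": fit_line, "mean_only": fit_mean})
summary = {
    name: (statistics.fmean(e), statistics.stdev(e)) for name, e in errors.items()
}
for name, (mean_err, sd_err) in summary.items():
    print(f"{name}: mean RMSE = {mean_err:.1f}, sd = {sd_err:.1f}")
```

Comparing the whole distribution of test errors (mean and spread over the sampled splits), rather than a single split, is what makes the comparison less dependent on one particular partition of the data.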
Abstract: Cluster data characterize the dynamic relationships among different research objects within a cluster and are widely used in economics, social science, medicine, and other fields. Classical cluster analysis methods are commonly used to characterize the similarity between samples and then to cluster samples or indicators, but clustering of the subgroups of cluster data has received little study. This paper establishes a factor analysis model for cluster data, generates cluster data whose groups differ from one another using the principal component method, and applies K-means clustering to the groups of cluster data. In a random simulation, cluster data are generated with the principal component method of the factor analysis model, and the simulation demonstrates the effectiveness of the clustering method. In a case study, groups of cluster data are clustered and the clustering is evaluated with the silhouette coefficient. The evaluation indicates that the K-means algorithm clusters the subgroups of cluster data well.
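The combination of K-means clustering and silhouette evaluation mentioned in the abstract can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the paper's implementation: each subgroup is assumed to be summarized by a hypothetical two-number feature vector, the values in `groups` are made up for illustration, and a deterministic farthest-first initialization is swapped in for whatever initialization the authors used.

```python
def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5

def kmeans(points, k, n_iter=50):
    """Plain K-means with farthest-first initialization (an assumption of this sketch)."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    labels = None
    for _ in range(n_iter):
        new_labels = [
            min(range(k), key=lambda c: dist(p, centers[c])) for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient; requires at least two clusters."""
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    scores = []
    for i, p in enumerate(points):
        own = [dist(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        if not own:  # singleton cluster scores 0 by convention
            scores.append(0.0)
            continue
        a = sum(own) / len(own)  # mean distance within the point's own cluster
        b = min(                 # mean distance to the nearest other cluster
            sum(dist(p, q) for q, lab in zip(points, labels) if lab == c)
            / labels.count(c)
            for c in clusters if c != labels[i]
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Hypothetical feature vectors, one per subgroup (made-up values).
groups = [(0.1, 0.2), (0.0, 0.4), (0.3, 0.1), (0.2, 0.3),
          (5.0, 5.1), (5.2, 4.9), (4.8, 5.3), (5.1, 5.0)]

labels = kmeans(groups, k=2)
score = silhouette(groups, labels)
print(labels, round(score, 3))
```

A silhouette coefficient close to 1 indicates that subgroups sit much nearer to their own cluster than to the nearest other cluster, which is the sense in which the abstract describes the clustering as good.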