With the spread of smartphones, a growing number of handsets can estimate the calories a user burns through daily activity. Such fitness and health applications rely mainly on the activity-state data the phone records each day to compute energy expenditure. Classifying and analyzing this motion data effectively nevertheless remains a challenge. This study classifies and analyzes the motion data of experimental participants in order to improve the accuracy of data processing and classification. The work has three parts: 1) Data preprocessing: after data cleaning and standardization, time-domain and frequency-domain features are extracted, a hierarchical clustering algorithm is applied to the participants' motion data, and a dendrogram is generated to show the hierarchical relationships among data points. 2) Classification model evaluation: a Random Forest classifier is trained and tested on motion data from 10 participants. The overall accuracy is 65%; category 8 is classified best, while categories 2 and 3 are classified poorly. 3) Analysis of data differences: the data are consolidated and a multivariate analysis of variance (MANOVA) is used to test for significant differences in sensor data among participants. The results show no significant differences. In addition, a correlation analysis computes the correlation coefficients between the sensor data and participant characteristics (age, height, weight) and plots the correlation matrix. The proposed classification and analysis methods identify the characteristics of the participants' motion data and yield recommendations for further optimizing the model and the data processing to improve classification accuracy.
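The preprocessing step described in the abstract above extracts time-domain and frequency-domain features from sensor signals, but the abstract does not say which features are used. As a minimal sketch, assuming a single-axis signal window sampled at a fixed rate, the following computes a mean, a standard deviation, and a dominant frequency with a naive DFT. The helper name `extract_features`, the 50 Hz sampling rate, and the synthetic signal are all illustrative assumptions, not the authors' pipeline.

```python
import cmath
import math


def extract_features(window, sample_rate_hz):
    """Return simple time- and frequency-domain features for one signal window.

    Hypothetical helper: the feature set is an assumption for illustration.
    """
    n = len(window)
    mean = sum(window) / n
    # Population standard deviation of the window (time-domain feature).
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)

    # Naive O(n^2) DFT of the mean-removed signal, bins 1..n/2 (up to Nyquist).
    centered = [x - mean for x in window]
    best_bin, best_mag = 0, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            centered[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        if abs(coeff) > best_mag:
            best_bin, best_mag = k, abs(coeff)

    # Convert the strongest bin index to a frequency in Hz.
    dominant_hz = best_bin * sample_rate_hz / n
    return {"mean": mean, "std": std, "dominant_hz": dominant_hz}


# Synthetic 2 Hz sine wave with a constant offset, sampled at 50 Hz for 2 s.
SAMPLE_RATE_HZ = 50
signal = [1.0 + math.sin(2 * math.pi * 2.0 * t / SAMPLE_RATE_HZ) for t in range(100)]
features = extract_features(signal, SAMPLE_RATE_HZ)
```

Feature vectors of this kind, one per window, are the sort of input that a hierarchical clustering step or a Random Forest classifier would then consume. A production pipeline would normally use an FFT routine rather than the quadratic-time loop shown here.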
This paper studies the mathematical description of the tangent cone and the normal cone to the graph of the normal cone mapping of a symmetric cone. First, the definitions of the tangent cone and the normal cone are introduced based on the properties of symmetric cones. Then, by constructing an appropriate mathematical model, exact formulas for the tangent and normal cones are derived, which is of significance for solving symmetric cone programming problems.
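The abstract above does not state which definitions it adopts. For orientation, the block below records the standard tangent cone and regular normal cone from variational analysis, together with the graph of the normal cone mapping of a symmetric cone. The symbols $C$, $\bar{x}$, and $K$ are notation assumed here, and the paper may instead work with the limiting normal cone.

```latex
% Assumed setting: C is a closed set in a finite-dimensional Euclidean space
% and \bar{x} \in C. Tangent (contingent) cone:
\[
T_C(\bar{x}) = \bigl\{ d : \exists\, t_k \downarrow 0,\ d_k \to d
  \ \text{with}\ \bar{x} + t_k d_k \in C \ \text{for all } k \bigr\}
\]
% Regular normal cone, the polar of the tangent cone:
\[
\widehat{N}_C(\bar{x}) = \bigl\{ v : \langle v, d \rangle \le 0
  \ \text{for all } d \in T_C(\bar{x}) \bigr\}
\]
% For a symmetric cone K (closed, convex, self-dual) and x \in K:
\[
N_K(x) = \bigl\{ v : \langle v, y - x \rangle \le 0 \ \text{for all } y \in K \bigr\}
       = \bigl\{ v \in -K : \langle v, x \rangle = 0 \bigr\}
\]
% Graph of the normal cone mapping, a complementarity set:
\[
\operatorname{gph} N_K = \bigl\{ (x, v) : x \in K,\ -v \in K,\ \langle x, v \rangle = 0 \bigr\}
\]
```

The set $\operatorname{gph} N_K$ is non-convex in general, which is why describing its tangent and normal cones calls for a separate analysis.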
This paper analyzes nonlinear stochastic optimization problems with probabilistic constraints described by continuously differentiable non-convex functions. To this end, the tangent and normal cones to the level sets of the underlying probability function are described. On this basis, the definition of p-efficient points is introduced and first- and second-order optimality conditions for the problem are formulated. Based on the level set of the probability function generated by the p-efficient points, a dual algorithm is constructed using a modified Carroll function.
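The abstract above gives no formulas. As a sketch of the usual setting in the literature on probabilistic constraints, a problem of this type and the standard notion of a p-efficient point can be written as follows. The symbols $f$, $g$, $Z$, $D$, and $F_Z$ are notation assumed here, and the paper's own formulation may differ.

```latex
% Assumed problem form: f and g are continuously differentiable, Z is a random
% vector in R^m, p \in (0,1) is the required probability level, D is a closed set.
\[
\min_{x \in D} \ f(x) \quad \text{subject to} \quad \mathbb{P}\bigl[\, g(x) \ge Z \,\bigr] \ge p
\]
% With the distribution function F_Z(z) = P[Z \le z], the constraint reads
% F_Z(g(x)) \ge p, i.e. g(x) lies in the p-level set of F_Z:
\[
\mathcal{Z}_p = \bigl\{ z \in \mathbb{R}^m : F_Z(z) \ge p \bigr\}
\]
% A point v is p-efficient if it belongs to the level set and no point
% that is componentwise smaller and different from v also belongs to it:
\[
F_Z(v) \ge p
\quad \text{and} \quad
\nexists\, z \le v,\ z \ne v \ \text{with}\ F_Z(z) \ge p
\]
```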
This paper takes 2022 second-hand housing price data for Chengdu as its research object and builds a random forest model and an XGBoost model to predict second-hand housing prices. The dataset is first cleaned and visualized, and dummy variables are constructed. A heat map is then drawn and the entropy method is used for feature selection, and the important features are chosen to train the models. Next, prediction models based on random forest and XGBoost are each developed using grid search, and their predictive accuracy is measured with three key indicators: the coefficient of determination, mean squared error, and mean absolute error. Model comparison and result analysis show that the optimized XGBoost model predicts second-hand housing prices well, with an accuracy of 90.3%.
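The three evaluation indicators named in the abstract above have standard definitions, which can be sketched in plain Python as follows. The function name `regression_metrics` and the toy price values are illustrative assumptions, not data or code from the paper.

```python
def regression_metrics(y_true, y_pred):
    """Compute R^2, mean squared error, and mean absolute error.

    Hypothetical helper using the standard textbook definitions.
    """
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]

    mse = sum(r ** 2 for r in residuals) / n
    mae = sum(abs(r) for r in residuals) / n

    # R^2 = 1 - SS_res / SS_tot, where SS_tot is measured around the mean target.
    mean_true = sum(y_true) / n
    ss_res = sum(r ** 2 for r in residuals)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot

    return {"r2": r2, "mse": mse, "mae": mae}


# Toy example: four made-up target prices and model predictions.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.0, 7.5, 9.0]
metrics = regression_metrics(y_true, y_pred)
# metrics -> r2 = 0.975, mse = 0.125, mae = 0.25
```

In a grid search, metrics like these would be evaluated on held-out data for each candidate hyperparameter setting, and the setting with the best score would be kept.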