Abstract: With the spread of smartphones, more and more phones can estimate the calories a user burns through daily activity. Such fitness and health applications rely mainly on the activity-state data that the phone records each day to calculate energy expenditure. Classifying and analyzing these motion data effectively, however, remains a challenge. This study classifies and analyzes the motion data of experimental participants in order to improve the accuracy of data processing and classification. The work has three parts: 1) Data preprocessing: after data cleaning and standardization, time-domain and frequency-domain features are extracted, and a hierarchical clustering algorithm is applied to group the participants' motion data, producing a dendrogram that shows the hierarchical relationships among data points. 2) Classification model evaluation: a random forest classifier was trained and tested on motion data from 10 participants. Overall accuracy was 65%; category 8 was classified best, while categories 2 and 3 were classified poorly. 3) Analysis of differences: the data were consolidated and a multivariate analysis of variance (MANOVA) was used to test for significant differences in sensor data among participants. The results showed no significant differences in sensor data between participants. In addition, correlation analysis was used to compute correlation coefficients between the sensor data and participant characteristics (age, height, weight), and a correlation matrix was plotted. The proposed classification and analysis methods effectively identified the characteristics of the participants' motion data, and the study offers recommendations for further optimizing the model and data processing to improve classification accuracy.
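The MANOVA step in part 3 can be illustrated with Wilks' lambda, one of the standard MANOVA test statistics. The sketch below is a minimal pure-Python illustration, not the authors' implementation: the function names and the toy data are hypothetical, and the abstract does not say which MANOVA statistic was used. Each group stands for one participant and each row for one observation of several sensor features. Values of lambda near 1 indicate little difference between group mean vectors; values near 0 indicate strong differences.

```python
def mean_vector(rows):
    """Column-wise mean of a list of equal-length observation rows."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def scatter(rows, center):
    """Sum of outer products of deviations from `center` (a p x p matrix)."""
    p = len(center)
    s = [[0.0] * p for _ in range(p)]
    for r in rows:
        d = [r[j] - center[j] for j in range(p)]
        for a in range(p):
            for b in range(p):
                s[a][b] += d[a] * d[b]
    return s


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    result = 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        if abs(m[pivot][c]) < 1e-12:
            return 0.0
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result  # a row swap flips the sign
        result *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= f * m[c][k]
    return result


def wilks_lambda(groups):
    """Wilks' lambda = det(W) / det(T), with W the pooled within-group
    scatter and T the total scatter around the grand mean.
    Assumes T is non-singular (otherwise this divides by zero)."""
    all_rows = [r for g in groups for r in g]
    p = len(all_rows[0])
    total = scatter(all_rows, mean_vector(all_rows))
    within = [[0.0] * p for _ in range(p)]
    for g in groups:
        s = scatter(g, mean_vector(g))
        for a in range(p):
            for b in range(p):
                within[a][b] += s[a][b]
    return det(within) / det(total)


if __name__ == "__main__":
    # Hypothetical toy data: two participants, two sensor features each.
    participant_a = [[1.0, 2.0], [2.0, 1.0], [3.0, 3.5]]
    participant_b = [[1.2, 2.1], [2.1, 0.9], [2.9, 3.4]]
    print(wilks_lambda([participant_a, participant_b]))
```

Turning lambda into a p-value requires an F or chi-square approximation, which this sketch omits; a statistics package would normally handle that step.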
Abstract: Rivers are important systems that provide water to meet human needs. However, excessive human use over the years has degraded river water quality, causing health problems from contaminated water. This study applies two statistical techniques, multiple linear regression and MANOVA, to assess the health impacts of pollution in the Cauvery river stretch at Srirangapatna. Multiple linear regression shows that health impact level is 60.8% dependent on the water quality parameters BOD, COD, TDS, TC and FC. The t-statistics and their associated two-tailed p-values indicate that COD and TDS produce health impacts, in contrast to BOD, TC and FC, when the effects are considered together across all six sampling stations in Srirangapatna. Further, the Pearson correlation matrix shows highly significant positive correlations among parameters across all stations, indicating possible common sources of origin that might be anthropogenic. Graphs plotted for individual parameters across all stations reveal that COD and TDS values are significant at all sampling stations, though they are higher at the impact stations, causing health impacts.
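If the 60.8% figure is a coefficient of determination (an assumption; the abstract does not name the statistic), it would come from a calculation like the one sketched below. This is a minimal pure-Python illustration of ordinary least squares with an intercept. The helper names and the numbers in the demo are hypothetical and are not data from the study.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with
    partial pivoting. Assumes `a` is square and non-singular."""
    n = len(b)
    m = [a[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                for k in range(c, n + 1):
                    m[r][k] -= f * m[c][k]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(x_rows, y):
    """Least-squares coefficients [intercept, b1, b2, ...] obtained from
    the normal equations (X^T X) beta = X^T y."""
    design = [[1.0] + list(r) for r in x_rows]
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)


def r_squared(x_rows, y, coef):
    """Share of the variance in y explained by the fitted model."""
    pred = [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in x_rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical predictors (for example, two water quality parameters)
    # and a hypothetical response (health impact level).
    x = [(1.0, 2.0), (2.0, 1.0), (3.0, 5.0), (4.0, 3.0), (5.0, 8.0)]
    y = [2.3, 4.1, 4.9, 7.2, 7.4]
    coef = fit_ols(x, y)
    print(coef, r_squared(x, y, coef))
```

An R² of 0.608 under this reading would mean the five parameters together account for 60.8% of the variation in the response, leaving the remainder to factors outside the model.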
Funding: the National Natural Science Foundation of China (10761010, 10771185); the Mathematics Tianyuan Youth Foundation of China
Abstract: This article presents a statistic for testing sphericity in a GMANOVA-MANOVA model with normal error. The null distribution of this statistic is shown to be beta, and its non-null distribution is given in series form in terms of beta distributions.
Abstract: In this groundwater quality study, multivariate statistical techniques, namely Hierarchical Cluster Analysis (HCA), Principal Component Analysis (PCA), Factor Analysis (FA) and Multivariate Analysis of Variance (MANOVA), were employed to evaluate the principal factors and mechanisms governing spatial variation and to assess source apportionment in the Lawspet area of Puducherry, India. PCA/FA identified a first factor that reflects the anthropogenic impact on groundwater quality; this dominant factor explained 82.79% of the total variance. The other four factors represent geogenic and hardness components. The first factor scores show high loadings for EC, TDS, Na⁺ and Cl⁻ (anthropogenic) in the south-east and south-west parts of the study area, whereas the other factor scores show high loadings for HCO₃⁻, Mg²⁺, Ca²⁺ and TH (hardness and geogenic) in the north-west and south-west parts. K⁺ and SO₄²⁻ (geogenic) are dominant toward the south-east. Further, MANOVA showed that there are significant differences between groundwater quality parameters. The spatial distribution maps of the water quality parameters proved a powerful and practical visual tool for defining, interpreting and distinguishing the anthropogenic, hardness and geogenic factors in the study area. The study also indicates that multivariate statistical methods can assess groundwater both qualitatively and spatially, which is an effective step toward groundwater quality management.
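A figure such as "explained 82.79% of the total variance" is the leading eigenvalue of the covariance (or correlation) matrix divided by the sum of all eigenvalues, which equals the matrix trace. The following is a minimal pure-Python sketch under stated assumptions: it uses the covariance matrix of raw data and plain power iteration, whereas the study's own preprocessing and software are not given in the abstract. All names and the demo values are hypothetical.

```python
def covariance_matrix(rows):
    """Sample covariance matrix (divisor n - 1) of the observation rows."""
    n = len(rows)
    p = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[sum((r[a] - means[a]) * (r[b] - means[b]) for r in rows) / (n - 1)
             for b in range(p)] for a in range(p)]


def leading_eigen(matrix, iterations=500):
    """Largest eigenvalue and its eigenvector by power iteration.
    Assumes the all-ones start vector is not orthogonal to the
    leading eigenvector."""
    p = len(matrix)
    v = [1.0 / p ** 0.5] * p
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # Rayleigh quotient v^T M v for the unit vector v.
    eigenvalue = sum(v[i] * sum(matrix[i][j] * v[j] for j in range(p))
                     for i in range(p))
    return eigenvalue, v


def first_component_share(rows):
    """Fraction of total variance carried by the first principal component."""
    cov = covariance_matrix(rows)
    eigenvalue, _ = leading_eigen(cov)
    return eigenvalue / sum(cov[i][i] for i in range(len(cov)))


if __name__ == "__main__":
    # Hypothetical measurements of three parameters at five wells.
    samples = [
        (1.0, 2.1, 0.5),
        (2.0, 3.9, 0.7),
        (3.0, 6.2, 0.4),
        (4.0, 8.1, 0.6),
        (5.0, 9.8, 0.5),
    ]
    print(first_component_share(samples))
```

Water quality variables are usually measured in different units, so in practice the columns would typically be standardized first, which is equivalent to running the same calculation on the correlation matrix.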
Abstract: Metallic elements have various origins, both natural and anthropogenic: geochemical, marine and atmospheric sources, the last resulting from the fallout of emitted pollutants or raised dust transported by water and air currents. Marine, brackish and fresh continental waters may therefore have high metal concentrations. In addition, some essential metals become toxic above certain concentrations in aquatic environments. The aquatic ecosystems of the Cotonou channel and Lake Nokoué receive pollutant loads from the cities of Cotonou and Abomey-Calavi and the municipality of So Ava. The aim of this study is to analyze waters from eighteen (18) stations identified in the two ecosystems (nine per ecosystem). The concentrations of magnesium (Mg), calcium (Ca), vanadium (V), chromium (Cr), manganese (Mn), iron (Fe), cobalt (Co), nickel (Ni), copper (Cu), zinc (Zn), arsenic (As), selenium (Se), cadmium (Cd), beryllium (Be), aluminum (Al), strontium (Sr), molybdenum (Mo), silver (Ag), tin (Sn), barium (Ba), platinum (Pt), mercury (Hg), thallium (Tl), lead (Pb), thorium (Th) and uranium (U) were measured by inductively coupled plasma mass spectrometry (ICP-MS) after acid digestion of the water samples. The results indicate an unequal distribution of metals across the two ecosystems, with atypical concentrations at some stations of the lake and the channel. Magnesium, calcium and manganese have very high values in Lake Nokoué, respectively at the Ganvié market station GAN_M (2990 ± 105 mg/L), the Ganvié center station GAN_C (4991 ± 177 mg/L) and the mid-lake station MLak4 (10662 ± 17.03 μg/L). Iron and aluminum, on the other hand, have very high concentrations in the Cotonou channel at the Agbato station AGB (5236 ± 103 and 8289 ± 519 μg/L, respectively), as does strontium at the estuary station EST (6118 ± 68 μg/L). The concentrations were compared with those of well and borehole waters in the sixth neighborhood of Cotonou. Statistical methods such as MANOVA made it possible to classify the waters and metals of the ecosystems studied relative to groundwater and well water. Hierarchical clustering on principal components, carried out with the R packages "FactoMineR" and "factoextra", was used to identify similarities between stations based on metal concentration. In general, we conclude that most of the metals have an anthropogenic source, except strontium and the major elements (Ca and Mg), which could originate from marine waters and geochemical sources respectively.
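Grouping stations by the similarity of their metal profiles is an agglomerative clustering problem. The sketch below shows the basic idea in pure Python with Euclidean distance and average linkage. It is an illustration only and does not reproduce the FactoMineR procedure named in the abstract, which clusters on principal component scores. The function names and the station vectors are hypothetical.

```python
def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def average_linkage(c1, c2, points):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(euclidean(points[i], points[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerative(points, n_clusters):
    """Merge the two closest clusters repeatedly until `n_clusters` remain.
    Returns clusters as sorted lists of point indices."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a], clusters[b], points)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


if __name__ == "__main__":
    # Hypothetical stations described by two scaled metal concentrations.
    stations = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0), (0.5, 0.5)]
    print(agglomerative(stations, 2))
```

Recording the distance at each merge, rather than stopping at a fixed number of clusters, would give the information needed to draw a dendrogram.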