Various uncertainties arising during the acquisition of geoscience data may result in anomalous data instances (i.e., outliers) that do not conform to the expected pattern of regular data instances. With sparse multivariate data obtained from geotechnical site investigation, it is impossible to identify outliers with certainty, because outliers distort the statistics of geotechnical parameters and data sparsity introduces statistical uncertainty. This paper develops a probabilistic outlier detection method for sparse multivariate data obtained from geotechnical site investigation. The proposed approach quantifies the outlying probability of each data instance based on the Mahalanobis distance and identifies as outliers those data instances with outlying probabilities greater than 0.5. It tackles the distortion of statistics estimated from a dataset containing outliers with a re-sampling technique and accounts rationally for the statistical uncertainty through Bayesian machine learning. Moreover, the proposed approach also suggests an exclusive method to determine the outlying components of each outlier. The proposed approach is illustrated and verified using simulated and real-life datasets. The results show that the proposed approach properly identifies outliers among sparse multivariate data, and their corresponding outlying components, in a probabilistic manner. It can significantly reduce the masking effect (i.e., missing some actual outliers due to the distortion of statistics by the outliers and statistical uncertainty). It is also found that outliers among sparse multivariate data instances significantly affect the construction of the multivariate distribution of geotechnical parameters for uncertainty quantification. This emphasizes the necessity of a data cleaning process (e.g., outlier detection) for uncertainty quantification based on geoscience data.
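The idea of an outlying probability built from Mahalanobis distances and re-sampling can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's method: it uses a plain bootstrap in place of the Bayesian treatment, handles two variables only, and takes the 97.5% chi-square quantile as the distance cutoff. The function names, the cutoff level and the demo data are all hypothetical.

```python
import math
import random


def mean_and_cov(points):
    """Sample mean and 2x2 sample covariance of a list of (x, y) pairs."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis_sq(point, mean, cov):
    """Squared Mahalanobis distance of a 2-D point; None if cov is singular."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    if det <= 1e-12:
        return None
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    # d^2 = [dx dy] * inverse(cov) * [dx dy]^T, written out for the 2x2 case.
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


def outlying_probabilities(points, n_resamples=500, level=0.975, seed=0):
    """Fraction of bootstrap resamples in which each point exceeds the cutoff."""
    rng = random.Random(seed)
    # For 2 degrees of freedom the chi-square quantile is -2 * ln(1 - level).
    cutoff = -2.0 * math.log(1.0 - level)
    exceed = [0] * len(points)
    used = 0
    for _ in range(n_resamples):
        # Statistics come from a resample, so an outlier is sometimes left out
        # and cannot distort them; distances are still computed for all points.
        sample = [rng.choice(points) for _ in points]
        mean, cov = mean_and_cov(sample)
        dists = [mahalanobis_sq(p, mean, cov) for p in points]
        if dists[0] is None:  # degenerate resample, skip it
            continue
        used += 1
        for i, d in enumerate(dists):
            if d > cutoff:
                exceed[i] += 1
    return [e / used for e in exceed]


# Hypothetical sparse dataset: eleven regular instances and one gross outlier.
demo_points = [(1.0, 2.1), (1.5, 2.9), (2.0, 4.2), (2.5, 4.8), (3.0, 6.1),
               (3.5, 7.2), (4.0, 7.9), (4.5, 9.1), (5.0, 9.8), (5.5, 11.2),
               (6.0, 11.9), (3.0, 15.0)]
demo_probs = outlying_probabilities(demo_points)
flagged = [p for p, prob in zip(demo_points, demo_probs) if prob > 0.5]
```

Thresholding the probabilities at 0.5 mirrors the decision rule stated in the abstract. Everything else in the sketch is an illustrative choice.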
A multivariate statistical analysis was performed on multi-element soil geochemical data from the Koda Hill-Bulenga gold prospects in the Wa-Lawra gold belt, northwest Ghana. The objective of the study was to define the relationships between gold and other trace elements in order to determine possible pathfinder elements for gold from the soil geochemical data. The study focused on seven elements, namely Au, Fe, Pb, Mn, Ag, As and Cu. Factor analysis and hierarchical cluster analysis were performed on the analyzed samples. Factor analysis explained 79.093% of the total variance of the data through three factors. Gold loaded on factor 3, in association with copper, iron, lead and manganese; this factor accounted for 20.903% of the total variance. In the hierarchical clustering, gold was also observed to cluster with lead, copper, arsenic and silver. There was a further indication that gold concentrations were lower than those of its associated elements. It can be inferred from the results that the occurrence of gold and its associated elements can be linked to both primary dispersion from underlying rocks and secondary processes such as lateritization. The data show that Fe and Mn are strongly associated with gold and that, alongside Pb, Ag, As and Cu, these elements can be used as pathfinders for gold in the area, with ferruginous zones as targets.
A novel study using LC-MS (liquid chromatography tandem mass spectrometry) coupled with multivariate data analysis and bioactivity evaluation was established to discriminate between the aqueous extract and the vinegar extract of Shixiao San. Batches of these two kinds of samples were analyzed, and the datasets of sample codes, tR-m/z pairs and ion intensities were processed with principal component analysis (PCA). The score plot showed a clear separation of the aqueous and vinegar groups, and the chemical markers contributing most to the differentiation were screened out on the loading plot. The chemical markers were identified by comparing their mass fragments and retention times with those of reference compounds and/or known compounds published in the literature. Based on the proposed strategy, quercetin-3-O-neohesperidoside, isorhamnetin-3-O-neohesperidoside, kaempferol-3-O-neohesperidoside, isorhamnetin-3-O-rutinoside and isorhamnetin-3-O-(2G-α-L-rhamnosyl)-rutinoside were found to be representative markers distinguishing the vinegar extract from the aqueous extract. The anti-hyperlipidemic activities of the two processed extracts of Shixiao San were examined through serum levels of lipids, lipoproteins and blood antioxidant enzymes in a rat hyperlipidemia model; the vinegar extract, which exerted strong lipid-lowering and antioxidative effects, was superior to the aqueous extract. Therefore, boiling with vinegar was predicted to be the best processing procedure for the anti-hyperlipidemic effect of Shixiao San. Furthermore, combining the changes in metabolic profiling with the bioactivity evaluation, the five representative markers may be related to the observed anti-hyperlipidemic effect.
In this paper, a new approach for visualizing multivariate categorical data is presented. The approach uses a graph to represent multivariate categorical data and draws the graph in such a way that patterns, trends and relationships within the data can be identified. A mathematical model for the graph layout problem is derived, and a spectral graph drawing algorithm for visualizing multivariate categorical data is proposed. The experiments show that the drawings produced by the algorithm capture the structure of multivariate categorical data well and that the algorithm is fast.
Natural soil-forming factors such as landforms, parent materials or biota lead to high variability in soil properties. However, there is not enough research quantifying which environmental factors are most relevant to predicting soil properties at the catchment scale in semi-arid areas. Thus, this research aims to investigate the ability of multivariate statistical analyses to distinguish which soil properties follow a clear spatial pattern conditioned by specific environmental characteristics in a semi-arid region of Iran. To achieve this goal, we digitized parent materials and landforms from recent orthophotography. We also extracted ten topographical attributes and five remote sensing variables from a digital elevation model (DEM) and the Landsat Enhanced Thematic Mapper (ETM), respectively. These factors were contrasted for 334 soil samples (depth of 0–30 cm). Cluster analysis and soil maps reveal that Cluster 1 comprises limestones, massive limestones and mixed deposits of conglomerates with low soil organic carbon (SOC) and clay contents, and that Cluster 2 is composed of soils that originated from Quaternary and early Quaternary parent materials such as terraces, alluvial fans, lake deposits, and marls or conglomerates, which register the highest SOC content and the lowest sand and silt contents. Further, it is confirmed that soils with the highest SOC and clay contents are located in wetlands, lagoons, alluvial fans and piedmonts, while soils with the lowest SOC and clay contents are located in dissected alluvial fans, eroded hills, rock outcrops and steep hills. The results of principal component analysis using the remote sensing data and topographical attributes identify five main components, which explain 73.3% of the total variability of soil properties. Environmental factors such as hillslope morphology and all of the remote sensing variables can largely explain SOC variability, but no significant correlation is found for soil texture and calcium carbonate equivalent contents. Therefore, we conclude that SOC can be considered the best-predicted soil property in semi-arid regions.
There has been significant advancement in the application of statistical tools in plant pathology during the past four decades. These tools include multivariate analysis of disease dynamics involving principal component analysis, cluster analysis, factor analysis, pattern analysis, discriminant analysis, multivariate analysis of variance, correspondence analysis, canonical correlation analysis, redundancy analysis, genetic diversity analysis, and stability analysis, which involves joint regression, additive main effects and multiplicative interactions, and genotype-by-environment interaction biplot analysis. Advanced statistical tools, such as non-parametric analysis of disease association, meta-analysis, Bayesian analysis, and decision theory, take an important place in the analysis of disease dynamics. Disease forecasting methods based on simulation models for plant diseases have great potential in practical disease control strategies. Common mathematical tools such as monomolecular, exponential, logistic, Gompertz and linked differential equations take an important place in growth curve analysis of disease epidemics. Box and whisker plots have been suggested as a highly informative means of displaying a range of numerical data. The probable applications of recent advanced linear and non-linear mixed models, such as the linear mixed model, generalized linear model, and generalized linear mixed models, have been presented. The most recent technologies such as micro-array analysis, though cost effective, provide estimates of gene expression for thousands of genes simultaneously and need attention from molecular biologists. Some of these advanced tools can be applied well in different branches of rice research, including crop improvement, crop production, crop protection, social sciences and agricultural engineering. Rice research scientists should take full advantage of these new opportunities by adopting the new, high-potential advanced technologies when planning experimental designs, data collection, analysis and interpretation of their research data sets.
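The growth curve models named above can be written down compactly. The sketch below uses the common textbook parameterizations for disease intensity expressed as a proportion between 0 and 1. The abstract itself gives no formulas, so the function names and the parameters `y0` (initial intensity) and `r` (rate) are assumptions made for illustration.

```python
import math


def exponential(t, y0, r):
    """Exponential growth: dy/dt = r * y (unbounded, early-epidemic phase)."""
    return y0 * math.exp(r * t)


def monomolecular(t, y0, r):
    """Monomolecular growth: dy/dt = r * (1 - y)."""
    return 1.0 - (1.0 - y0) * math.exp(-r * t)


def logistic(t, y0, r):
    """Logistic growth: dy/dt = r * y * (1 - y)."""
    return 1.0 / (1.0 + ((1.0 - y0) / y0) * math.exp(-r * t))


def gompertz(t, y0, r):
    """Gompertz growth: dy/dt = r * y * (-ln y)."""
    return math.exp(math.log(y0) * math.exp(-r * t))


# Hypothetical epidemic: 1% initial intensity, rate 0.2 per day, 60 days.
days = range(0, 61, 10)
logistic_curve = [logistic(t, 0.01, 0.2) for t in days]
gompertz_curve = [gompertz(t, 0.01, 0.2) for t in days]
```

All four curves start at `y0`. The monomolecular, logistic and Gompertz curves approach 1 as time increases, which is why they suit bounded disease progress, whereas the exponential curve does not.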
Objective speech quality is difficult to measure without the input reference speech. Mapping methods using data mining are investigated and designed to improve the output-based speech quality assessment algorithm. The degraded speech is first separated into three classes (unvoiced, voiced and silence); then the consistency measurement between the degraded speech signal and the pre-trained reference model for each class is calculated and mapped to an objective speech quality score using data mining. A fuzzy Gaussian mixture model (GMM) is used to generate the artificial reference model trained on perceptual linear predictive (PLP) features. The mean opinion score (MOS) mapping methods, including multivariate non-linear regression (MNLR), fuzzy neural network (FNN) and support vector regression (SVR), are designed and compared with the standard ITU-T P.563 method. Experimental results show that the assessment methods with data mining perform better than ITU-T P.563. Moreover, FNN and SVR are more efficient than MNLR, and FNN performs best, with a 14.50% increase in the correlation coefficient and a 32.76% decrease in the root-mean-square MOS error.
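The two figures of merit quoted above, the correlation coefficient and the root-mean-square MOS error, compare predicted scores against subjective ones. A minimal sketch of both is given here, assuming the correlation meant is the Pearson coefficient; the score lists are hypothetical values, not data from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rmse(x, y):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


# Hypothetical subjective MOS values and objective predictions (1-5 scale).
subjective_mos = [1.8, 2.5, 3.1, 3.9, 4.4]
predicted_mos = [2.0, 2.4, 3.3, 3.6, 4.5]
corr = pearson(subjective_mos, predicted_mos)
err = rmse(subjective_mos, predicted_mos)
```

A better mapping method raises `corr` and lowers `err`, which is the direction of the improvements reported for FNN.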
Designing relevant animal models to investigate the neurobiological basis of human mental disorders is an important challenge. The need to develop new tests and improve traditional ones has recently been emphasized. The authors propose a multivariate test approach, the multivariate concentric square field™ (MCSF) test. To measure and evaluate variation in behavioral traits, we put forward a statistical procedure with the working title “trend analysis”. Low doses of the benzodiazepine agonist diazepam (DZP; 1.0, 1.5, or 2.0 mg/kg) were used to explore the use of the trend analysis in combination with multivariate data analysis for assessing MCSF performance in rats. The commonly used elevated plus maze (EPM) test was used for comparison. The trend analysis comparing the vehicle and DZP1.5 groups revealed significantly higher general activity and risk-taking behavior in the DZP1.5 rats relative to vehicle rats. This finding was supported by multivariate data analysis procedures. It is concluded that the trend analysis, together with multivariate data analysis procedures, makes it possible to extract information about, and illustrate, effects obtained in the MCSF test. Diazepam at doses that produced no apparent increase in open arm activity in the EPM was effective in altering behavior in the MCSF test. The MCSF test, multivariate data analysis and the proposed trend analysis may be useful alternatives to behavioral test batteries and traditionally used tests for understanding the mechanisms underlying various mental states. Finally, ethological reasoning and multivariate measures that enable behavioral profiling of animals may be a useful complementary methodology when phenotyping animals in behavioral neuroscience.
Bootstrap methods are considered in statistical process control because they can deal with unknown distributions and are easy to compute on a personal computer. In this study we propose the use of a bootstrap-t multivariate control technique on the minimax control chart. The technique takes care of correlated variables as well as the distributional assumptions required for the operation of the minimax control chart. The bootstrap-t technique provides the mean $\bar{\theta}_B = \frac{1}{B}\sum_{i=1}^{B}\hat{\theta}_i$ of all the bootstrap estimators, where $\hat{\theta}_i$ is the estimate from the $i$th bootstrap sample and $B$ is the number of bootstrap samples. The proposed bootstrap-t minimax statistic was computed from the values obtained from the bootstrap estimation. This method was used to determine the position of the four control limits of the minimax control chart. The bootstrap-t approach introduced to the minimax multivariate control chart helps to detect shifts in the mean vector of a multivariate process, and it overcomes the computational complexity of obtaining the distribution of multivariate data.
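The averaging step described above, the mean of the estimates obtained from B bootstrap samples, can be sketched in a few lines. This covers only that step, not the bootstrap-t minimax statistic or the control limits, which the abstract does not specify. The function names, the choice of estimator and the measurements are hypothetical.

```python
import random
import statistics


def bootstrap_estimates(data, estimator, n_boot=1000, seed=0):
    """Apply `estimator` to n_boot resamples drawn with replacement from data."""
    rng = random.Random(seed)
    n = len(data)
    return [estimator(rng.choices(data, k=n)) for _ in range(n_boot)]


def bootstrap_mean(estimates):
    """Mean of the bootstrap estimates: (1 / B) * sum of theta_i."""
    return sum(estimates) / len(estimates)


# Hypothetical process measurements; the estimator here is the sample mean.
measurements = [4.1, 3.8, 4.4, 4.0, 3.9, 4.3, 4.2, 3.7, 4.0, 4.1]
estimates = bootstrap_estimates(measurements, statistics.mean)
theta_bar = bootstrap_mean(estimates)
```

Any statistic that maps a sample to a number can be passed as `estimator`; the sample mean is used here only to keep the example short.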
Nonorthogonal Multiple Access (NOMA) is incorporated into wireless network systems to achieve better connectivity, spectral and energy efficiency, and higher data transfer rates, and to obtain a high quality of service (QoS). In order to improve throughput and minimize latency, a Multivariate Renkonen Regressive Weighted Preference Bootstrap Aggregation based Nonorthogonal Multiple Access (MRRWPBA-NOMA) technique is introduced for network communication. In the downlink transmission, each mobile device's resources and characteristics, such as energy, bandwidth, and trust, are measured. Then, Weighted Preference Bootstrap Aggregation is applied to recognize the resource-efficient mobile devices for resource-aware data transmission by constructing different weak hypotheses, i.e., Multivariate Renkonen Regression functions. Based on the classification, resource- and trust-aware devices are selected for transmission. The proposed MRRWPBA-NOMA technique and existing methods are simulated and compared on metrics such as data delivery ratio, throughput, latency, packet loss rate, energy efficiency, and signaling overhead. The simulation results indicate that the proposed MRRWPBA-NOMA outperforms the conventional methods.
This article presents a new data fusion approach for reasonably predicting the dynamic serviceability reliability of a long-span bridge girder. First, a multivariate Bayesian dynamic linear model (MBDLM) that considers the dynamic correlation among multiple variables is provided to predict dynamic extreme deflections. Second, with the proposed MBDLM, the dynamic correlation coefficients between any two performance functions can be predicted. Finally, based on the MBDLM and the Gaussian copula technique, a new data fusion method is given to predict the serviceability reliability of the long-span bridge girder, and monitored extreme deflection data from an actual bridge are used to illustrate the feasibility and application of the proposed method.
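To improve the accuracy of multivariate prediction for solar arrays and to address the coupling between periodic fluctuation and growth trends in solar array telemetry parameters, a multivariate prediction algorithm for solar arrays based on an STL-Prophet-Informer model is proposed. The algorithm first applies the seasonal and trend decomposition procedure based on loess (STL) to decompose multiple solar array parameters into trend, seasonal and residual components. It then uses Prophet, which performs well on trending data, to predict the trend component, and the Informer model to predict the seasonal and residual components. Finally, the component predictions are summed to obtain the overall predicted values of the solar array parameters. In a case study using actual telemetry data from the solar array of a satellite, all error metrics of the proposed algorithm are markedly lower than those of a standalone Informer model, an LSTM model and other models. Applying the combined prediction model to multivariate solar array parameter prediction can improve prediction accuracy and enhance the autonomous operation performance of satellites.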
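Transit smart (IC) cards record residents' trips and reflect their origin-destination (OD) information. However, the OD flow data recorded by smart cards are large in scale, so directly visualizing their spatial distribution easily causes visual clutter; moreover, multivariate data come in many types, which makes them even harder to combine with flow data for comparative analysis. First, to address the visual occlusion that arises when the spatial distribution of large-scale OD data is visualized directly, a flow clustering method based on orthogonal non-negative matrix factorization (ONMF) is proposed. The method clusters the origin-destination data before visualization, which reduces unnecessary occlusion. Second, to address the difficulty of jointly comparing the many types of multivariate spatiotemporal data, a multivariate time-series view for bus stops is designed. This visualization encodes the flow volume of a bus stop together with four types of multivariate data (air quality, air temperature, relative humidity and rainfall) on the same time series, which improves the space utilization of the view and supports comparative analysis. Third, to assist user exploration and analysis, an interactive visual analytics system based on OD flows and multivariate data is developed, with a variety of interactions designed to improve exploration efficiency. Finally, the clustering method is evaluated in terms of clustering quality and running time on a Singapore transit smart card dataset. The results show that, with clustering quality measured by the silhouette coefficient, the proposed method improves on the original method by 0.028 and on K-means clustering by 0.253; its running time is 254 s shorter than that of ONMFS (ONMF through Subspace exploration), a method with relatively good clustering quality. Case studies and a comparison of system functions verify the effectiveness of the system.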
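The silhouette coefficient used above to compare clusterings can be computed directly from pairwise distances. The sketch below is a plain implementation of the standard definition with Euclidean distance. It does not reproduce the ONMF clustering itself, and the function name and sample points are hypothetical.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette over all points; expects at least two clusters."""
    n = len(points)
    clusters = set(labels)
    total = 0.0
    for i in range(n):
        # Collect distances from point i to every other point, per cluster.
        dist_by_cluster = {c: [] for c in clusters}
        for j in range(n):
            if i != j:
                dist_by_cluster[labels[j]].append(math.dist(points[i], points[j]))
        own = dist_by_cluster[labels[i]]
        if not own:  # a singleton cluster contributes a silhouette of 0
            continue
        a = sum(own) / len(own)  # mean distance within the point's own cluster
        b = min(sum(d) / len(d)  # mean distance to the nearest other cluster
                for c, d in dist_by_cluster.items() if c != labels[i])
        if max(a, b) > 0.0:
            total += (b - a) / max(a, b)
    return total / n


# Hypothetical stops forming two well-separated groups.
stops = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_score(stops, [0, 0, 1, 1])  # groups match the geometry
poor = silhouette_score(stops, [0, 1, 0, 1])  # groups cut across the geometry
```

Scores lie between -1 and 1, with higher values indicating tighter, better separated clusters, so differences such as the 0.028 and 0.253 reported above are read on that scale.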
Funding for the geotechnical outlier detection study: Supported by the National Key R&D Program of China (Project No. 2016YFC0800200), the NRF-NSFC 3rd Joint Research Grant (Earth Science) (Project No. 41861144022), the National Natural Science Foundation of China (Project Nos. 51679174 and 51779189), and the Shenzhen Key Technology R&D Program (Project No. 20170324). The financial support is gratefully acknowledged.
Funding for the Shixiao San LC-MS study: Natural Science Foundation of China (T11036061/T0108).
Funding for the multivariate categorical data visualization study: Supported by the National Natural Science Foundation of China (601133010).
Funding for the semi-arid soil properties study: Financial support from Isfahan University of Technology (IUT).
Funding for the speech quality assessment study: Projects (61001188, 1161140319) supported by the National Natural Science Foundation of China; Project (2012ZX03001034) supported by the National Science and Technology Major Project; Project (YETP1202) supported by the Beijing Higher Education Young Elite Teacher Project, China.
文摘Objective speech quality is difficult to be measured without the input reference speech.Mapping methods using data mining are investigated and designed to improve the output-based speech quality assessment algorithm.The degraded speech is firstly separated into three classes(unvoiced,voiced and silence),and then the consistency measurement between the degraded speech signal and the pre-trained reference model for each class is calculated and mapped to an objective speech quality score using data mining.Fuzzy Gaussian mixture model(GMM)is used to generate the artificial reference model trained on perceptual linear predictive(PLP)features.The mean opinion score(MOS)mapping methods including multivariate non-linear regression(MNLR),fuzzy neural network(FNN)and support vector regression(SVR)are designed and compared with the standard ITU-T P.563 method.Experimental results show that the assessment methods with data mining perform better than ITU-T P.563.Moreover,FNN and SVR are more efficient than MNLR,and FNN performs best with 14.50% increase in the correlation coefficient and 32.76% decrease in the root-mean-square MOS error.
文摘Designing relevant animal models in order to investigate the neurobiological basis for human mental disorders is an important challenge. The need for new tests to be developed and traditional tests to be improved has recently been em-phasized. The authors propose a multivariate test approach, the multivariate concentric square fieldTM (MCSF) test. To measure and evaluate variation in the behavioral traits, we here put forward a statistical procedure of which the working title is “trend analysis”. Low doses of the benzodiazepine agonist diazepam (DZP;1.0, 1.5, or 2.0 mg/kg) were used for exploring the use of the trend analysis in combination with multivariate data analysis for assessment of MCSF per-formance in rats. The commonly used elevated plus maze (EPM) test was used for comparison. The trend analysis comparing vehicle and the DZP1.5 groups revealed significantly higher general activity and risk-taking behavior in the DZP1.5 rats relative to vehicle rats. This finding was supported by multivariate data analysis procedures. It is concluded that the trend analysis together with multivariate data analysis procedures offers possibilities to extract information and illustrates effects obtained in the MCSF test. Diazepam in doses that have no apparent increase in open arm activity in the EPM was effective to alter the behavior in the MCSF test. The MCSF test and the use of multivariate data analysis and the proposed trend analysis may be useful alternatives to behavioral test batteries and traditionally used tests for the understanding of mechanisms underlying various mental states. Finally, the impact of an ethological reasoning and multivariate measures enabling behavioral profiling of animals may be a useful complementary methodology when phenotyping animals in behavioral neuroscience.
Abstract: Bootstrap methods are considered in the application of statistical process control because they can deal with unknown distributions and are easy to compute on a personal computer. In this study we propose using a bootstrap-t multivariate control technique with the minimax control chart. The technique handles correlated variables and relaxes the distributional assumptions that the minimax control chart otherwise requires. The bootstrap-t technique provides the mean θB of all the bootstrap estimators, θB = (1/B) Σ_{i=1}^{B} θi, where θi is the estimate from the ith bootstrap sample and B is the number of bootstrap samples. The proposed bootstrap-t minimax statistic was computed from the values obtained by bootstrap estimation. This method was used to determine the positions of the four control limits of the minimax control chart. Applied to the minimax multivariate control chart, the bootstrap-t approach helps to detect shifts in the mean vector of a multivariate process and avoids the computational complexity of obtaining the distribution of multivariate data.
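The averaging step described in this abstract (the mean of B bootstrap estimates) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' bootstrap-t minimax procedure: the function name `bootstrap_mean_of_estimates`, the sample values, and the choice of the sample mean as the estimator are all hypothetical.

```python
import random
import statistics


def bootstrap_mean_of_estimates(sample, estimator, n_boot=1000, seed=0):
    """Return (theta_bar, estimates).

    estimates[i] is the estimator applied to the i-th bootstrap resample,
    and theta_bar is the plain average of those B = n_boot estimates.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    estimates = []
    for _ in range(n_boot):
        # Draw n observations with replacement from the original sample.
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(resample))
    theta_bar = sum(estimates) / n_boot
    return theta_bar, estimates


# Hypothetical measurements of one process variable (illustrative only).
data = [9.8, 10.1, 10.4, 9.9, 10.0, 10.3, 9.7, 10.2]
theta_bar, estimates = bootstrap_mean_of_estimates(data, statistics.mean, n_boot=500)
```

Any statistic can be passed as `estimator`; the control-limit construction itself is not reproduced here because the abstract does not specify it.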
Funding: Taif University Researchers Supporting Project number (TURSP-2020/36), Taif University, Taif, Saudi Arabia; funded by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2022R97), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Abstract: Nonorthogonal Multiple Access (NOMA) is incorporated into wireless network systems to achieve better connectivity, higher spectral and energy efficiency, a higher data transfer rate, and high quality of service (QoS). To improve throughput and minimize latency, a Multivariate Renkonen Regressive Weighted Preference Bootstrap Aggregation based Nonorthogonal Multiple Access (MRRWPBA-NOMA) technique is introduced for network communication. In downlink transmission, the resources and characteristics of each mobile device, such as energy, bandwidth and trust, are measured. Weighted Preference Bootstrap Aggregation is then applied to identify resource-efficient mobile devices for resource-aware data transmission by constructing different weak hypotheses, i.e., Multivariate Renkonen Regression functions. Based on the classification, resource- and trust-aware devices are selected for transmission. The proposed MRRWPBA-NOMA technique and existing methods are simulated and compared on metrics including data delivery ratio, throughput, latency, packet loss rate, energy efficiency and signaling overhead. The simulation results indicate that the proposed MRRWPBA-NOMA outperforms the conventional methods.
Funding: This work was supported by the Natural Science Foundation of Gansu Province of China (20JR10RA625, 20JR10RA623), the National Key Research and Development Project of China (Project No. 2019YFC1511005), the Fundamental Research Funds for the Central Universities (Grant No. lzujbky-2020-55), and the National Natural Science Foundation of China (Grant No. 51608243).
Abstract: This article presents a new data fusion approach for reasonably predicting the dynamic serviceability reliability of a long-span bridge girder. First, a multivariate Bayesian dynamic linear model (MBDLM) that considers the dynamic correlation among multiple variables is provided to predict dynamic extreme deflections. Second, with the proposed MBDLM, the dynamic correlation coefficients between any two performance functions can be predicted. Finally, based on the MBDLM and the Gaussian copula technique, a new data fusion method is given to predict the serviceability reliability of the long-span bridge girder, and monitored extreme deflection data from an actual bridge are used to illustrate the feasibility and application of the proposed method.
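Abstract: To improve the accuracy of multivariate prediction for solar arrays and to address the coupling between periodic fluctuation and growth trends in solar array telemetry parameters, a multivariate prediction algorithm for solar arrays based on an STL-Prophet-Informer model is proposed. The algorithm first applies the seasonal and trend decomposition procedure based on loess (STL) to decompose multiple solar array parameters into trend, seasonal and residual components. It then uses Prophet, which performs well on trend-dominated data, to predict the trend component, and the Informer model to predict the seasonal and residual components. Finally, the component predictions are summed to obtain the overall predicted values of the solar array parameters. In a case study using actual telemetry data from the solar array of a satellite, all error metrics of the proposed algorithm are markedly lower than those of a single Informer model, an LSTM model and other baselines. Applying the combined prediction model to multivariate solar array parameter prediction can improve prediction accuracy and enhance the autonomous operation performance of the satellite.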
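The decompose, forecast per component, then sum pipeline described in this abstract can be sketched in a few lines. The sketch below is only an illustration of the structure and uses deliberately simple stand-ins, all of which are assumptions: a centered moving average replaces STL's loess trend, linear extrapolation replaces Prophet, repeating the seasonal pattern replaces Informer, and the residual forecast is set to zero. The function names and the synthetic series are hypothetical.

```python
def moving_average_trend(series, period):
    """Crude trend estimate: centered moving average (stand-in for STL's loess)."""
    half = period // 2
    n = len(series)
    trend = []
    for t in range(n):
        window = series[max(0, t - half):min(n, t + half + 1)]
        trend.append(sum(window) / len(window))
    return trend


def decompose(series, period):
    """Additive split: series[t] = trend[t] + seasonal[t] + residual[t]."""
    trend = moving_average_trend(series, period)
    detrended = [y - tr for y, tr in zip(series, trend)]
    # Average the detrended values at each phase of the cycle.
    phase_means = [
        sum(detrended[p::period]) / len(detrended[p::period]) for p in range(period)
    ]
    seasonal = [phase_means[t % period] for t in range(len(series))]
    residual = [y - tr - s for y, tr, s in zip(series, trend, seasonal)]
    return trend, seasonal, residual


def forecast(series, period, horizon):
    """Forecast each component separately, then add the forecasts together."""
    trend, seasonal, _ = decompose(series, period)
    n = len(series)
    # Stand-in for Prophet: extend the trend with its average recent slope.
    slope = (trend[-1] - trend[-period - 1]) / period
    trend_fc = [trend[-1] + slope * (h + 1) for h in range(horizon)]
    # Stand-in for Informer: repeat the seasonal pattern; residual forecast is 0.
    seasonal_fc = [seasonal[(n + h) % period] for h in range(horizon)]
    residual_fc = [0.0] * horizon
    return [a + b + c for a, b, c in zip(trend_fc, seasonal_fc, residual_fc)]


# Synthetic series: linear growth plus a period-4 cycle (illustrative only).
series = [0.5 * t + [0.0, 2.0, 0.0, -2.0][t % 4] for t in range(24)]
trend, seasonal, residual = decompose(series, 4)
fc = forecast(series, 4, 8)
```

Because the split is additive, the three components always sum back to the original series, which is what makes summing the per-component forecasts a natural final step.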
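Abstract: Transit smart (IC) cards record residents' trips and reflect their origin-destination (OD) information. However, the OD flow data recorded by smart cards are large in scale, so visualizing their spatial distribution directly tends to cause visual clutter; moreover, multivariate data come in many types and are even harder to combine with flow data for comparative analysis. First, to address the visual occlusion that arises when the spatial distribution of large-scale OD data is visualized directly, a flow clustering method based on orthogonal non-negative matrix factorization (ONMF) is proposed. The method clusters the OD data before visualization, which reduces unnecessary occlusion. Second, to address the difficulty of jointly comparing the many types of multivariate spatiotemporal data, a multivariate time-series view for bus stops is designed. This visualization encodes the flow volume at bus stops together with four kinds of multivariate data (air quality, air temperature, relative humidity and rainfall) on the same time axis, which improves the space utilization of the view and supports comparative analysis. Third, to assist user exploration and analysis, an interactive visual analytics system based on OD flows and multivariate data is developed, with multiple interactions designed to improve exploration efficiency. Finally, the clustering method is evaluated in terms of clustering quality and running time on a Singapore transit smart card dataset. The results show that, measured by the silhouette coefficient, the proposed method improves on the original method by 0.028 and on K-means clustering by 0.253; its running time is 254 s shorter than that of ONMFS (ONMF through Subspace exploration), a method with relatively good clustering quality. Case studies and a comparison of system functions verify the effectiveness of the system.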
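The silhouette coefficient used above to compare clustering quality can be computed as in the following sketch. It assumes Euclidean distance and the standard definition s = (b - a) / max(a, b) averaged over all points; the abstract does not state which distance the authors used, and the function name and toy points are hypothetical.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points, using Euclidean distance."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, point in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(point, points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(point, points[j]) for j in members) / len(members)
            for lab, members in clusters.items()
            if lab != labels[i]
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Two well-separated toy groups (illustrative only).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good = silhouette_score(points, [0, 0, 0, 1, 1, 1])
bad = silhouette_score(points, [0, 1, 0, 1, 0, 1])
```

Values closer to 1 indicate compact, well-separated clusters, so a labeling that matches the two groups scores higher than one that mixes them.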