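To improve the detection of anomalous records in floating car data and the ability to analyze detection performance under different passenger-carrying states, a floating car data anomaly detection algorithm based on S-DTA-IIForest (Summation & Difference Third Order Average & Improvement-Isolation Forest) is proposed. A two-dimensional SDTA feature vector is constructed from the sum of two adjacent terms (S) and the third-order summation average difference (DTA). An improved isolation forest algorithm, IIForest, with differential cumulative updating and dynamic discrimination is proposed. A stopping-threshold parameter is set to avoid the problem that, when a new sample's anomaly score exceeds the stopping threshold, only the samples are updated and the isolation forest model is not. A discrimination-degree parameter is designed for each binary tree, and a tree stops growing once its discrimination degree falls within the stopping interval, which improves convergence. Model accuracy is compared using the area under the ROC (Receiver Operating Characteristic) curve, AUC (Area Under ROC Curve), and the F1-score, and a case study is carried out on Xuefu Avenue in the central urban area of Chongqing. The results show that the proposed S-DTA-IIForest combined algorithm achieves an AUC of 86.63% and an F1-score of 0.89. Its AUC is 32.4% higher than that of the traditional isolation forest IForest (Isolation Forest) and its running efficiency is 1.29% higher, so it converges faster and is more accurate. Under passenger-carrying conditions, the model's AUC and F1-score are 7.7% and 10.8% higher, respectively, than under non-carrying conditions, so the combined algorithm detects anomalies in passenger-carrying data more accurately. The anomaly rate of non-carrying data is 71.4% higher than that of passenger-carrying data.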
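The two evaluation metrics named in this abstract, AUC and F1-score, can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the function names `roc_auc` and `f1_score`, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration. AUC is computed here through its rank interpretation (the probability that a randomly chosen anomaly receives a higher score than a randomly chosen normal record), which equals the area under the ROC curve.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC via pairwise comparison: fraction of (anomaly, normal) pairs
    in which the anomaly has the higher score; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one anomaly and one normal sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def f1_score(labels: Sequence[int], predictions: Sequence[int]) -> float:
    """F1 = 2*TP / (2*TP + FP + FN), with label 1 meaning 'anomaly'."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    return 2 * tp / (2 * tp + fp + fn)


if __name__ == "__main__":
    # Toy data (hypothetical): 1 = anomalous record, 0 = normal record.
    labels = [0, 0, 1, 0, 1, 0, 1, 0]
    scores = [0.10, 0.35, 0.80, 0.20, 0.40, 0.55, 0.90, 0.30]
    threshold = 0.5  # assumed decision threshold on the anomaly score
    predictions = [1 if s >= threshold else 0 for s in scores]
    print(f"AUC = {roc_auc(labels, scores):.4f}")
    print(f"F1  = {f1_score(labels, predictions):.4f}")
```

AUC depends only on how the scores rank the records, whereas F1 depends on the chosen threshold, which is one reason the two are usually reported together.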
This work presents a comprehensive fourth-order predictive modeling (PM) methodology that uses the MaxEnt principle to incorporate fourth-order moments (means, covariances, skewness, kurtosis) of model parameters, computed and measured model responses, as well as fourth (and higher) order sensitivities of computed model responses to model parameters. This new methodology is designated by the acronym 4th-BERRU-PM, which stands for “fourth-order best-estimate results with reduced uncertainties.” The results predicted by the 4th-BERRU-PM incorporate, as particular cases, the results previously predicted by the second-order predictive modeling methodology 2nd-BERRU-PM, and vastly generalize the results produced by extant data assimilation and data adjustment procedures.
This work presents a comprehensive second-order predictive modeling (PM) methodology designated by the acronym 2nd-BERRU-PMD. The attribute “2nd” indicates that this methodology incorporates second-order uncertainties (means and covariances) and second-order sensitivities of computed model responses to model parameters. The acronym BERRU stands for “Best-Estimate Results with Reduced Uncertainties” and the last letter (“D”) in the acronym indicates “deterministic,” referring to the deterministic inclusion of the computational model responses. The 2nd-BERRU-PMD methodology is fundamentally based on the maximum entropy (MaxEnt) principle. This principle stands in contradistinction to the principle underlying the extant data assimilation and/or adjustment procedures, which minimize, in a least-square sense, a subjective user-defined functional meant to represent the discrepancies between measured and computed model responses. It is shown that the 2nd-BERRU-PMD methodology generalizes and extends current data assimilation and/or data adjustment procedures while overcoming the fundamental limitations of these procedures. In the accompanying work (Part II), the alternative framework for developing the “second-order MaxEnt predictive modeling methodology” is presented by incorporating the computed model responses probabilistically (as opposed to “deterministically”).
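As a brief illustration of what a second-order MaxEnt construction yields, the following is the standard textbook result rather than a formula taken from this paper. The symbols $\mathbf{z}$, $\boldsymbol{\mu}$, $\mathbf{C}$, and $N$ are assumed notation: when only the mean vector and a positive-definite covariance matrix of an $N$-dimensional quantity are known, the entropy-maximizing density on $\mathbb{R}^N$ is the multivariate Gaussian.

```latex
\begin{aligned}
&\text{maximize}\quad S[p] = -\int p(\mathbf{z})\,\ln p(\mathbf{z})\,d\mathbf{z} \\
&\text{subject to}\quad \int p(\mathbf{z})\,d\mathbf{z} = 1,\qquad
  \int \mathbf{z}\,p(\mathbf{z})\,d\mathbf{z} = \boldsymbol{\mu},\qquad
  \int (\mathbf{z}-\boldsymbol{\mu})(\mathbf{z}-\boldsymbol{\mu})^{\mathsf T}\,p(\mathbf{z})\,d\mathbf{z} = \mathbf{C} \\
&\Longrightarrow\quad
  p(\mathbf{z}) = (2\pi)^{-N/2}\,(\det\mathbf{C})^{-1/2}
  \exp\!\left[-\tfrac{1}{2}\,(\mathbf{z}-\boldsymbol{\mu})^{\mathsf T}\mathbf{C}^{-1}(\mathbf{z}-\boldsymbol{\mu})\right]
\end{aligned}
```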
This work presents a comprehensive second-order predictive modeling (PM) methodology based on the maximum entropy (MaxEnt) principle for obtaining best-estimate mean values and correlations for model responses and parameters. This methodology is designated by the acronym 2nd-BERRU-PMP, where the attribute “2nd” indicates that this methodology incorporates second-order uncertainties (means and covariances) and second (and higher) order sensitivities of computed model responses to model parameters. The acronym BERRU stands for “Best-Estimate Results with Reduced Uncertainties” and the last letter (“P”) in the acronym indicates “probabilistic,” referring to the MaxEnt probabilistic inclusion of the computational model responses. This is in contradistinction to the 2nd-BERRU-PMD methodology, which deterministically combines the computed model responses with the experimental information, as presented in the accompanying work (Part I). Although both the 2nd-BERRU-PMP and the 2nd-BERRU-PMD methodologies yield expressions that include second (and higher) order sensitivities of responses to model parameters, the respective expressions for the predicted responses, for the calibrated predicted parameters, and for their predicted uncertainties (covariances) are not identical to each other. Nevertheless, the results predicted by both the 2nd-BERRU-PMP and the 2nd-BERRU-PMD methodologies encompass, as particular cases, the results produced by the extant data assimilation and data adjustment procedures, which rely on the minimization, in a least-square sense, of a user-defined functional meant to represent the discrepancies between measured and computed model responses.
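For orientation, one common form of the user-defined least-squares functional that data assimilation and data adjustment procedures minimize can be written as below. This is a generic illustration, not an expression quoted from the paper, and all symbols are assumed notation: $\boldsymbol{\alpha}$ denotes the model parameters with nominal values $\boldsymbol{\alpha}^0$ and covariance $\mathbf{C}_\alpha$, $\mathbf{r}(\boldsymbol{\alpha})$ the computed responses, and $\mathbf{r}^m$ the measured responses with covariance $\mathbf{C}_m$.

```latex
Q(\boldsymbol{\alpha}) =
  \left[\boldsymbol{\alpha}-\boldsymbol{\alpha}^{0}\right]^{\mathsf T}
  \mathbf{C}_{\alpha}^{-1}
  \left[\boldsymbol{\alpha}-\boldsymbol{\alpha}^{0}\right]
  +
  \left[\mathbf{r}(\boldsymbol{\alpha})-\mathbf{r}^{m}\right]^{\mathsf T}
  \mathbf{C}_{m}^{-1}
  \left[\mathbf{r}(\boldsymbol{\alpha})-\mathbf{r}^{m}\right],
\qquad
\boldsymbol{\alpha}^{\mathrm{be}} = \arg\min_{\boldsymbol{\alpha}} Q(\boldsymbol{\alpha})
```

The choice of the weighting matrices and of the terms included in $Q$ is left to the user, which is the subjectivity the abstract contrasts with the MaxEnt construction.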
This work (in two parts) will present a novel predictive modeling methodology aimed at obtaining “best-estimate results with reduced uncertainties” for the first four moments (mean values, covariance, skewness and kurtosis) of the optimally predicted distribution of model results and calibrated model parameters, by combining fourth-order experimental and computational information, including fourth (and higher) order sensitivities of computed model responses to model parameters. Underlying the construction of this fourth-order predictive modeling methodology is the “maximum entropy principle,” which is initially used to obtain a novel closed-form expression of the (moments-constrained) fourth-order Maximum Entropy (MaxEnt) probability distribution constructed from the first four moments (means, covariances, skewness, kurtosis), which are assumed to be known, of an otherwise unknown distribution of a high-dimensional multivariate uncertain quantity of interest. This fourth-order MaxEnt distribution provides optimal compatibility with the available information while simultaneously ensuring minimal spurious information content, yielding an estimate of a probability density with the highest uncertainty among all densities satisfying the known moment constraints. Since this novel generic fourth-order MaxEnt distribution is of interest in its own right for applications in addition to predictive modeling, its construction is presented separately, in this first part of a two-part work. The fourth-order predictive modeling methodology that will be constructed by particularizing this generic fourth-order MaxEnt distribution will be presented in the accompanying work (Part 2).
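As a sketch of the general shape such a moments-constrained distribution takes, maximizing entropy subject to constraints on moments up to fourth order leads, by the usual Lagrange-multiplier argument, to an exponential of a quartic polynomial. The form below is this generic result only, not the closed-form expression derived in the paper. The multipliers $a_i$, $b_{ij}$, $c_{ijk}$, $d_{ijkl}$ and the factorial prefactors are assumed notation, with the multipliers fixed by requiring that the density reproduce the known moments.

```latex
p(\mathbf{x}) = \frac{1}{Z}\exp\!\Bigg[-\Bigg(
  \sum_{i} a_i x_i
  + \frac{1}{2}\sum_{i,j} b_{ij}\,x_i x_j
  + \frac{1}{6}\sum_{i,j,k} c_{ijk}\,x_i x_j x_k
  + \frac{1}{24}\sum_{i,j,k,l} d_{ijkl}\,x_i x_j x_k x_l
\Bigg)\Bigg],
\qquad
Z = \int \exp\!\big[-(\cdots)\big]\,d\mathbf{x}
```

For $Z$ to be finite on an unbounded domain, the quartic term must dominate and remain positive as $\lVert\mathbf{x}\rVert \to \infty$. Setting $c_{ijk} = d_{ijkl} = 0$ recovers the Gaussian second-order case.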