Funding: Supported by the Natural Science Foundation of Henan Province (Grant No. 222300420417) and the Kaifeng Science and Technology Project (Grant No. 2103004).
Abstract: The receiver operating characteristic (ROC) curve and the area under the curve (AUC) are often used to illustrate the diagnostic ability of binary classifiers. However, both ROC and AUC focus on high accuracy in theory, which may not be effective for practical applications. In addition, it is difficult to judge which classifier is better when the ROC curves intersect and the AUC values are equal. Decision curve analysis (DCA) methods improve on ROC by incorporating both accuracy and consequences. However, like ROC, DCA requires a quantitative indicator to determine objectively which classifier is better when DCA curves intersect. A DCA-based statistical indicator named maximum net benefit (MNB) is constructed for evaluating clinical treatment regimens, rather than accuracy alone as in ROC and AUC. The construction of MNB, a simple and effective statistical indicator, is given theoretically. Moreover, it is proved theoretically that MNB can still discriminate effectively when the AUC values are equal. Furthermore, the feasibility and effectiveness of the proposed MNB are verified by gene selection and classifier performance comparison on actual data.
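The abstract does not spell out how MNB is constructed. As a rough illustration only, the sketch below uses the standard decision-curve net benefit, NB(pt) = TP/n - (FP/n) * pt / (1 - pt), and assumes that MNB is the maximum of this quantity over a grid of threshold probabilities pt. The function names, the toy labels and risks, and the threshold grid are hypothetical and are not taken from the paper.

```python
def net_benefit(y_true, y_prob, threshold):
    """Standard DCA net benefit when every case with risk >= threshold is treated."""
    n = len(y_true)
    # True positives: treated cases that really have the condition.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    # False positives: treated cases that do not have the condition.
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - fp / n * threshold / (1 - threshold)


def max_net_benefit(y_true, y_prob, thresholds):
    """Assumed reading of MNB: the largest net benefit over the threshold grid.

    Returns a (net benefit, threshold) pair; this is an illustrative
    assumption, not the paper's definition.
    """
    return max((net_benefit(y_true, y_prob, t), t) for t in thresholds)


# Hypothetical toy data: true labels and predicted risks for eight cases.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_prob = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.7, 0.1]
thresholds = [i / 10 for i in range(1, 10)]  # 0.1, 0.2, ..., 0.9

mnb, best_threshold = max_net_benefit(y_true, y_prob, thresholds)
print(f"max net benefit {mnb:.3f} at threshold {best_threshold:.1f}")
```

Under this reading, two classifiers with equal AUC can still be ranked by comparing their `max_net_benefit` values on the same data and threshold grid.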
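Abstract: MapReduce is currently one of the most popular parallel systems for big data analysis. Many enterprises have built their own MapReduce clusters to provide computing services to users. Users can submit MapReduce jobs with deadline requirements to a cluster, and if a job is completed on time, the enterprise earns a certain benefit. For this scenario, the paper is the first to pose the maximum benefit problem in MapReduce clusters. To solve the problem effectively, it first proposes a sequence-based task scheduling strategy (the SEQ strategy) and proves that the SEQ strategy has advantages when handling jobs with deadline constraints. Building on the SEQ strategy, the paper proposes the Scheduling Algorithm for Maximum Benefit (the AMB algorithm), which quickly determines which jobs can be accepted and gives an effective execution plan so as to maximize benefit. In addition, for certain abnormal situations that arise in practice (such as node failures), the paper designs an effective timeout-handling strategy, which further improves the practicality of the algorithm. Finally, extensive experiments verify the effectiveness of the proposed algorithm.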