Collaborative cross-edge analytics is a new computing paradigm in which Internet of Things (IoT) data analytics is performed across multiple geographically dispersed edge clouds. Existing work on collaborative cross-edge analytics mostly focuses on reducing either analytics response time or wide-area network (WAN) traffic volume. In this work, we empirically demonstrate that reducing either analytics response time or network traffic volume does not necessarily minimize the WAN traffic cost, due to the price heterogeneity of WAN links. To explicitly leverage the price heterogeneity for WAN cost minimization, we propose to schedule analytic tasks based on both price and bandwidth heterogeneities. Unfortunately, the problem of WAN cost minimization under a performance constraint is shown to be non-deterministic polynomial (NP)-hard and thus computationally intractable for large inputs. To address this challenge, we propose price- and performance-aware geo-distributed analytics (PPGA), an efficient task scheduling heuristic that improves the cost-efficiency of IoT data analytic jobs across edge datacenters. We implement PPGA based on Apache Spark and conduct extensive experiments on Amazon EC2 to verify the efficacy of PPGA.
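The claim that minimizing WAN traffic volume need not minimize WAN cost can be illustrated with a toy calculation. The sketch below is not PPGA and uses no numbers from the paper: the three sites, their data sizes, and the per-site egress prices are hypothetical, and the performance constraint is ignored. It assumes a single aggregation step in which every remote site ships its data to one destination site and pays its own egress price.

```python
# Hypothetical toy inputs (assumptions for illustration, not from the paper).
DATA_GB = {"A": 10, "B": 6, "C": 5}             # input data held at each edge site
PRICE_CENTS_PER_GB = {"A": 2, "B": 9, "C": 12}  # WAN egress price of each site


def wan_volume(dest):
    """GB shipped over the WAN when all data is aggregated at `dest`."""
    return sum(gb for site, gb in DATA_GB.items() if site != dest)


def wan_cost(dest):
    """WAN cost in cents when each remote site pays its own egress price."""
    return sum(gb * PRICE_CENTS_PER_GB[site]
               for site, gb in DATA_GB.items() if site != dest)


# Pick the destination that minimizes each objective separately.
min_volume_site = min(DATA_GB, key=wan_volume)
min_cost_site = min(DATA_GB, key=wan_cost)

for site in DATA_GB:
    print(site, wan_volume(site), "GB,", wan_cost(site), "cents")
print("min volume:", min_volume_site, "| min cost:", min_cost_site)
```

With these assumed numbers, aggregating at A moves the least data (11 GB) but costs 114 cents, whereas aggregating at C moves 16 GB for 74 cents, because the site with the most data also has the cheapest egress link. A scheduler that also has to respect a response-time bound would additionally need link bandwidths, which this sketch leaves out.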
Funding: This work was supported in part by the National Natural Science Foundation of China under Grant No. 61802449 and the Guangdong Natural Science Funds under Grant No. 2021A1515011912.