Connected vehicles for safety and traffic efficiency applications require device-to-device connections supporting one-to-many and many-to-many communication, precise absolute and relative positioning, and distributed computing. Currently, the 5.9 GHz Dedicated Short Range Communications (DSRC) and 4G Long-Term Evolution (LTE) are available for connected vehicle services, but both have shown limitations in reliability or latency in large-scale field operational tests and deployments. This paper proposes a device-to-device (D2D) connectivity framework based on a publish-subscribe architecture using the Message Queue Telemetry Transport (MQTT) protocol. With the publish-subscribe communication paradigm, mobile road users can exchange data and information with moderate latency and high reliability, which has the potential to support many Vehicle-to-Everything (V2X) applications, including vehicle-to-vehicle (V2V), vehicle-to-roadside-infrastructure (V2I), and vehicle-to-bicycle (V2B). The D2D data exchanges also facilitate computing for absolute and relative precise real-time kinematic (RTK) positioning. Vehicular experiments were conducted to evaluate the performance of the proposed publish-subscribe MQTT protocol in terms of latency and reliability. The latency of data exchanges is measured by one-trip time (OTT) and the reliability by the packet loss rate (PLR). Our results show that the latency of GNSS raw data exchanges between vehicles over 4G cellular networks at a rate of 10 Hz and a data rate of 10 kbps is less than 300 ms, while the reliability is over 96%. Vehicular positioning experiments have also shown that vehicles can exchange raw GNSS data and complete moving-base RTK positioning with a positioning availability of 98%.
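As a rough illustration of the exchange the abstract above describes, the following is a minimal publish-subscribe sketch with the paho-mqtt client library; the broker address, topic name, payload format, and the assumption of synchronized clocks for the one-trip-time measurement are illustrative, not the paper's actual implementation.

```python
# Minimal MQTT publish-subscribe sketch for vehicle-to-vehicle data exchange.
# Broker address, topic, payload format, and the 10 Hz rate are illustrative
# assumptions, not the setup used in the paper.
import json
import time
import paho.mqtt.client as mqtt

BROKER = "broker.example.com"   # hypothetical broker reachable over the 4G link
TOPIC = "v2x/vehicle_a/gnss_raw"

def on_message(client, userdata, msg):
    payload = json.loads(msg.payload)
    # One-trip time, assuming both vehicles share a synchronized clock (e.g., GNSS time).
    ott_ms = (time.time() - payload["t_sent"]) * 1000.0
    print(f"received {len(msg.payload)} bytes, OTT = {ott_ms:.1f} ms")

# Subscriber side (e.g., vehicle B). Client() shown in paho-mqtt 1.x style;
# paho-mqtt 2.x additionally takes a CallbackAPIVersion argument.
sub = mqtt.Client()
sub.on_message = on_message
sub.connect(BROKER, 1883)
sub.subscribe(TOPIC, qos=0)
sub.loop_start()

# Publisher side (e.g., vehicle A) pushing raw GNSS observations at 10 Hz.
pub = mqtt.Client()
pub.connect(BROKER, 1883)
for _ in range(100):
    msg = {"t_sent": time.time(), "gnss_raw": "<raw observation bytes>"}
    pub.publish(TOPIC, json.dumps(msg), qos=0)
    time.sleep(0.1)
```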
Research data infrastructures form the cornerstone in both cyber and physical spaces, driving the progression of the data-intensive scientific research paradigm. This opinion paper presents an overview of global research data infrastructure, drawing insights from national roadmaps and strategic documents related to research data infrastructure. It emphasizes the pivotal role of research data infrastructures by delineating four new missions aimed at positioning them at the core of the current scientific research and communication ecosystem. The four new missions of research data infrastructures are: (1) as a pioneer, to transcend disciplinary borders and address complex, cutting-edge scientific and social challenges with problem- and data-oriented insights; (2) as an architect, to establish a digital, intelligent, flexible research and knowledge services environment; (3) as a platform, to foster high-end academic communication; (4) as a coordinator, to balance scientific openness with ethical needs.
A significant obstacle in intelligent transportation systems (ITS) is the capacity to predict traffic flow. Recent advancements in deep neural networks have enabled the development of models that represent traffic flow accurately. However, accurately predicting traffic flow at the individual road level is extremely difficult due to the complex interplay of spatial and temporal factors. This paper proposes a technique for predicting short-term traffic flow using an architecture that combines convolutional bidirectional long short-term memory (Conv-BiLSTM) with attention mechanisms. Prior studies neglected to include data pertaining to factors such as holidays, weather conditions, and vehicle types, which are interconnected and significantly impact the accuracy of forecast outcomes. In addition, this research incorporates recurring monthly periodic pattern data that significantly enhances the accuracy of forecast outcomes. The experimental findings demonstrate a performance improvement of 21.68% when incorporating the vehicle type feature.
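To make the architecture named above concrete, here is a small Keras sketch of a Conv-BiLSTM model with an attention mechanism for short-term traffic flow forecasting; the window length, feature count, and layer sizes are assumptions for illustration, not the configuration reported in the paper.

```python
# Sketch of a Conv-BiLSTM model with attention for short-term traffic flow forecasting.
# Window length, feature count, and layer sizes are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras import layers, Model

TIMESTEPS = 12     # e.g., 12 past intervals
N_FEATURES = 6     # flow plus auxiliary features (holiday, weather, vehicle type, ...)

inputs = layers.Input(shape=(TIMESTEPS, N_FEATURES))
x = layers.Conv1D(filters=64, kernel_size=3, padding="same", activation="relu")(inputs)
x = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(x)
# Self-attention over the BiLSTM outputs, then temporal pooling.
attn = layers.Attention()([x, x])
x = layers.GlobalAveragePooling1D()(attn)
outputs = layers.Dense(1)(x)   # predicted flow for the next interval

model = Model(inputs, outputs)
model.compile(optimizer="adam", loss="mse")
model.summary()
```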
Artificial intelligence (AI) relies on data and algorithms. State-of-the-art (SOTA) AI algorithms have been developed to improve the performance of AI-oriented structures. However, model-centric approaches are limited by the absence of high-quality data. Data-centric AI is an emerging approach for solving machine learning (ML) problems. It is a collection of various data manipulation techniques that allow ML practitioners to systematically improve the quality of the data used in an ML pipeline. However, data-centric AI approaches are not well documented, and researchers have conducted various experiments without a clear set of guidelines. This survey highlights six major data-centric AI aspects that researchers are already using, intentionally or unintentionally, to improve the quality of AI systems: big data quality assessment, data preprocessing, transfer learning, semi-supervised learning, machine learning operations (MLOps), and the effect of adding more data. In addition, it highlights recent data-centric techniques adopted by ML practitioners. We address how adding data might harm datasets and how HoloClean can be used to restore and clean them. Finally, we discuss the causes of technical debt in AI. Technical debt builds up when software design and implementation decisions run into, or outright collide with, business goals and timelines. This survey lays the groundwork for future data-centric AI discussions by summarizing various data-centric approaches.
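As a small example of one of the data-centric levers listed above (semi-supervised learning), the sketch below keeps the model fixed and pulls unlabeled data into training via self-training with scikit-learn; the dataset, labeled fraction, and threshold are arbitrary toy values, not drawn from the survey.

```python
# Semi-supervised learning as a data-centric lever: improve the system by using
# more (unlabeled) data rather than a bigger model. Toy dataset and split sizes.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Pretend only 10% of the training labels are available; the rest are marked -1 (unlabeled).
rng = np.random.default_rng(0)
y_semi = y_train.copy()
y_semi[rng.random(len(y_semi)) > 0.10] = -1

base = LogisticRegression(max_iter=1000)
clf = SelfTrainingClassifier(base, threshold=0.9).fit(X_train, y_semi)
print("accuracy with self-training on unlabeled data:", clf.score(X_test, y_test))
```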
The basic function of the Internet is to deliver data (what) to serve the needs of all applications. IP names the attachment points (where) to facilitate ubiquitous interconnectivity as the current way to deliver data. The fundamental mismatch between data delivery and naming attachment points leads to many challenges, e.g., mapping from data names to IP addresses, handling the dynamics of the underlying topology, scaling up data distribution, and securing communication. Information-centric networking (ICN) is proposed to shift the focus of the communication paradigm from where to what, by making named data the first-class citizen in the network. The basic consensus of ICN is to name data independently of its container (space dimension) and session (time dimension), which breaks the limitation of the point-to-point IP semantic. It scales up data distribution by utilizing available resources, and facilitates communication over diverse connectivity and heterogeneous networks. However, there are only a few consensuses on the detailed design of ICN, and quite a few different ICN architectures have been proposed. This paper reveals the rationales of ICN from the perspective of the Internet's evolution, surveys different design choices, and discusses two debatable topics in ICN, i.e., self-certifying versus hierarchical names, and edge versus pervasive caching. We hope this survey helps clarify some misunderstandings of ICN and achieve more consensus.
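The naming debate mentioned above can be illustrated with a toy contrast between the two styles; the URI-like layout of the hierarchical name and the hash construction of the self-certifying name are illustrative conventions, not any specific ICN architecture's wire format.

```python
# Toy contrast between hierarchical and self-certifying names.
import hashlib

content = b"example sensor reading: 21.5 C"

# Hierarchical name: human-readable and aggregatable by prefix, but binding the
# name to the data requires a signature chain or other trust infrastructure.
hierarchical_name = "/university/building-7/room-42/temperature/2024-05-01T12:00"

# Self-certifying name: derived from the content itself, so any holder of the data
# can verify the name-to-data binding without a third party.
self_certifying_name = hashlib.sha256(content).hexdigest()

def verify(name: str, data: bytes) -> bool:
    """Check a self-certifying name against received data."""
    return hashlib.sha256(data).hexdigest() == name

print(hierarchical_name)
print(self_certifying_name, verify(self_certifying_name, content))
```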
Automation has arrived in the low voltage grid domain. In the next few years, the secondary substation—at the boundary between the medium and low voltage grids—will thus be upgraded to enable novel functions. In this paper, we present various smart grid applications running on such intelligent secondary substations (iSSN), including their interaction with each other. We integrate energy consumption and production data, as well as forecasts, sensed from the smart buildings' energy management systems (BEMSs) into the operation of the low voltage grid. A suitable framework for those modular applications includes features to initiate their installation, update, and removal from the remote operator site, so that no on-site staff are required for such typical recurring maintenance tasks.
The Internet of Things (IoT) has been growing over the past few years due to its flexibility and ease of use in real-time applications. The IoT's foremost task is ensuring that there is proper communication among different types of applications and devices, and the application layer protocols fulfill this necessity. However, as the number of applications grows, it is necessary to modify or enhance the application layer protocols according to specific IoT applications, allowing specific issues to be addressed, such as dynamic adaptation to network conditions and interoperability. Recently, several IoT application layer protocols have been enhanced and modified according to application requirements. However, no existing survey articles focus on these protocols. In this article, we survey traditional and recent advances in IoT application layer protocols, as well as relevant real-time applications and their adapted application layer protocols for improving performance. As changing the nature of the protocols for each application is unrealistic, machine learning offers a means of making protocols intelligent and able to adapt dynamically. In this context, we focus on providing open challenges to drive IoT application layer protocols in such a direction.
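The kind of learned, dynamic adaptation the article argues for could look, in miniature, like the epsilon-greedy sketch below, which picks an MQTT QoS level from observed reward; the reward model and simulated environment are invented for illustration and do not come from the survey.

```python
# Toy epsilon-greedy bandit choosing an MQTT QoS level from observed reward
# (a made-up score trading delivery success against latency). Illustrative only.
import random

QOS_LEVELS = [0, 1, 2]
counts = {q: 0 for q in QOS_LEVELS}
values = {q: 0.0 for q in QOS_LEVELS}   # running mean reward per QoS level
EPSILON = 0.1

def observe_network(qos: int) -> float:
    """Stand-in for a real measurement: higher QoS delivers more reliably but adds latency."""
    delivered = random.random() < (0.80 + 0.08 * qos)
    latency_penalty = 0.1 * qos
    return (1.0 if delivered else 0.0) - latency_penalty

for step in range(1000):
    if random.random() < EPSILON:
        qos = random.choice(QOS_LEVELS)                  # explore
    else:
        qos = max(QOS_LEVELS, key=lambda q: values[q])   # exploit current best
    reward = observe_network(qos)
    counts[qos] += 1
    values[qos] += (reward - values[qos]) / counts[qos]

print("estimated reward per QoS level:", values)
```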
The purpose of this paper (presented online as a keynote lecture at the 25th Annual Indonesian Geotechnical Conference on 10 Nov 2021) is to broadly conceptualize the agenda for data-centric geotechnics, an emerging field that attempts to prepare geotechnical engineering for digital transformation. The agenda must include (1) development of methods that make sense of all real-world data (not selective input data for a physical model), (2) offering insights of significant value to critical real-world decisions for current or future practice (not decisions for an ideal world or decisions of minor concern to geotechnical engineers), and (3) sensitivity to the physical context of geotechnics (not abstract data-driven analysis connected to geotechnics in a peripheral way; i.e., engagement with the knowledge and experience base should be substantial). These three elements are termed “data centricity”, “fit for (and transform) practice”, and “geotechnical context” in the agenda. Given that knowledge of the site is central to any geotechnical engineering project, data-driven site characterization (DDSC) must constitute one key application domain in data-centric geotechnics, although other infrastructure lifecycle phases such as project conceptualization, design, construction, operation, and decommissioning/reuse would benefit from data-informed decision support as well. One part of DDSC that addresses numerical soil data in site investigation reports and soil property databases is pursued under Project DeepGeo. In principle, the source of data can go beyond site investigation, and the type of data can go beyond numbers, covering categorical data, text, audio, images, videos, and expert opinion. The purpose of Project DeepGeo is to produce a 3D stratigraphic map of the subsurface volume below a full-scale project site and to estimate relevant engineering properties at each spatial point based on actual site investigation data and other relevant Big Indirect Data (BID). Uncertainty quantification is necessary, as current real-world data are insufficient, incomplete, and/or not directly relevant enough to construct a deterministic map, and the value of a deterministic map for decision support is debatable. The computational cost of doing this for a 3D true-scale subsurface volume must be reasonable. Ultimately, geotechnical structures need to be part of a completely smart infrastructure that fits the circular economy and need to focus on delivering service to end-users and the community from project conceptualization to decommissioning/reuse, with full integration into the smart city and smart society. Although current geotechnical practice has been very successful in taking “calculated risk” informed by limited data, imperfect theories, prototype testing, and observations, among others, and in exercising judicious caution and engineering judgment, there is no clear pathway forward to leverage big data and digital technologies such as machine learning, BIM, and digital twins to meet more challenging needs such as sustainability and resilience engineering.
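The spatial estimation with uncertainty quantification that DDSC calls for can be sketched, in its simplest form, as a Gaussian process interpolation of a soil property over depth from sparse borehole measurements; the kernel choice and the toy data below are assumptions for illustration, not the actual Project DeepGeo methodology.

```python
# Minimal sketch of spatial property estimation with uncertainty quantification:
# interpolate a soil property (e.g., undrained shear strength) over depth from
# sparse measurements. Toy data and kernel, not Project DeepGeo's method.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Sparse measurements: depth (m) -> property value (kPa), invented numbers.
depth = np.array([[1.0], [3.0], [5.0], [8.0], [12.0]])
value = np.array([18.0, 25.0, 31.0, 45.0, 60.0])

kernel = 1.0 * RBF(length_scale=3.0) + WhiteKernel(noise_level=1.0)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(depth, value)

# Predict on a dense grid; the predictive standard deviation is the uncertainty
# attached to each estimated point.
grid = np.linspace(0.5, 15.0, 30).reshape(-1, 1)
mean, std = gp.predict(grid, return_std=True)
for d, m, s in zip(grid.ravel(), mean, std):
    print(f"depth {d:5.2f} m: estimate {m:6.1f} kPa +/- {s:4.1f}")
```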
The recent evolution of the Internet towards "Information-centric" transfer modes has renewed the interest in exploiting proxies to enhance seamless mobility. In this work, we focus on the case of multiple levels of proxies in ICN architectures, in which content requests from mobile subscribers and the corresponding items are proactively cached at these proxies at different levels. Specifically, we present a multiple-level proactive caching model that selects the appropriate subset of proxies at different levels and supports distributed online decision procedures in terms of the tradeoff between delay and cache cost. We show via extensive simulations a reduction of up to 31.63% in the total cost relative to Full Caching, in which caching in all 1-level neighbor proxies is performed, and up to 84.21% relative to No Caching, in which no caching is used. Moreover, the proposed model outperforms other approaches with a flat cache structure in terms of the total cost.
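The delay-versus-cache-cost tradeoff behind proxy selection can be illustrated with a greedy toy: keep adding a proxy to the caching set while doing so lowers a combined objective. The proxy parameters and weighting below are invented and are not the model or algorithm proposed in the paper.

```python
# Toy greedy selection of a proxy subset under a combined delay + cache-cost objective.
from dataclasses import dataclass

@dataclass
class Proxy:
    name: str
    level: int
    delay_saving: float   # expected delay reduction if the item is cached here
    cache_cost: float     # cost of holding the item at this proxy

BASE_DELAY = 100.0
ALPHA = 1.0               # weight of delay in the combined objective

proxies = [
    Proxy("edge-1", level=1, delay_saving=40.0, cache_cost=30.0),
    Proxy("edge-2", level=1, delay_saving=35.0, cache_cost=32.0),
    Proxy("mid-1",  level=2, delay_saving=20.0, cache_cost=10.0),
    Proxy("core-1", level=3, delay_saving=10.0, cache_cost=4.0),
]

def total_cost(selected):
    delay = BASE_DELAY - sum(p.delay_saving for p in selected)
    return ALPHA * max(delay, 0.0) + sum(p.cache_cost for p in selected)

selected = []
improved = True
while improved:
    improved = False
    candidates = [p for p in proxies if p not in selected]
    best = min(candidates, key=lambda p: total_cost(selected + [p]), default=None)
    if best is not None and total_cost(selected + [best]) < total_cost(selected):
        selected.append(best)
        improved = True

print("cache at:", [p.name for p in selected], "total cost:", total_cost(selected))
```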
The internet's architecture today has a problem routing information based on what the receiver is interested in, without knowing the sender and receiver addresses. The Publish-Subscribe paradigm was developed to address this. In this work, we design, implement, and evaluate a Publish-Subscribe network that uses a destination-driven multicast routing algorithm to select the shortest path in the network. The network consists of routers that perform the routing mechanism, publishers that produce information, and subscribers that consume information, each with its own modules facilitating its function. Every connection in the network is a bidirectional communication link (an undirected graph), with a random seed available in the network. Each router has a topology management module that builds a picture of the network and computes the available paths; it informs the forwarder, which sends the network information to the intended receiver. A record table module records the network information coming from subscribers or publishers via link state advertisements and passes it to the topology manager. In this network, the receiver and sender are not expected to be active at the same time, do not know each other's addresses, and do not use any blocking mechanism in which a client sends requests and a server replies with responses. The Publish-Subscribe network is first designed, then implemented and evaluated using the destination-driven multicast routing algorithm (DDMC) to pick the shortest path in the network and match published information to active subscriptions. The proposed work was evaluated via total bits (1,000,000 bits produced per second), and the throughput was 83.33%.
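The shortest-path selection that the topology manager performs rests on a standard single-source computation; the sketch below is plain Dijkstra over an undirected router graph, not the paper's DDMC implementation, and the topology and link costs are made up.

```python
# Generic shortest-path computation over an undirected router graph (plain Dijkstra).
import heapq

graph = {   # undirected: each edge listed in both directions, weights are link costs
    "R1": {"R2": 1, "R3": 4},
    "R2": {"R1": 1, "R3": 2, "R4": 5},
    "R3": {"R1": 4, "R2": 2, "R4": 1},
    "R4": {"R2": 5, "R3": 1},
}

def shortest_path(src, dst):
    dist = {src: 0}
    prev = {}
    heap = [(0, src)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == dst:
            break
        if d > dist.get(node, float("inf")):
            continue
        for nbr, w in graph[node].items():
            nd = d + w
            if nd < dist.get(nbr, float("inf")):
                dist[nbr], prev[nbr] = nd, node
                heapq.heappush(heap, (nd, nbr))
    path, node = [], dst
    while node != src:
        path.append(node)
        node = prev[node]
    return [src] + path[::-1], dist[dst]

print(shortest_path("R1", "R4"))   # (['R1', 'R2', 'R3', 'R4'], 4)
```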
With the surge of big data applications and the worsening of the memory-wall problem, the memory system, instead of the computing unit, has become the commonly recognized major concern of computing. However, this “memory-centric” common understanding had a humble beginning. More than three decades ago, the memory-bounded speedup model was the first model to recognize memory as the bound of computing, and it provided a general bound on speedup and a computing-memory trade-off formulation. The memory-bounded model was well received even then. It was immediately introduced in several advanced computer architecture and parallel computing textbooks in the 1990s as a must-know for scalable computing. These include Prof. Kai Hwang's book “Scalable Parallel Computing”, in which he introduced the memory-bounded speedup model as Sun-Ni's Law, in parallel with Amdahl's Law and Gustafson's Law. Through the years, the impact of this model has grown far beyond parallel processing and into the fundamentals of computing. In this article, we revisit the memory-bounded speedup model and discuss its progress and impacts in depth, to make a unique contribution to this special issue, to stimulate new solutions for big data applications, and to promote data-centric thinking and rethinking.
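For reference, Sun-Ni's memory-bounded speedup is commonly stated in the textbook form below, where f is the sequential fraction of the work, p is the number of processors (with memory assumed to scale with p), and g(p) describes how the parallelizable work grows as the aggregate memory scales; this is the commonly cited form, not a restatement of the article's own derivation.

```latex
% Memory-bounded speedup (Sun-Ni's Law), textbook form:
% f    : sequential fraction of the original workload
% p    : number of processors (memory assumed to scale with p)
% g(p) : growth of the parallelizable work as memory scales
S_{MB}(p) = \frac{f + (1 - f)\,g(p)}{f + (1 - f)\,\dfrac{g(p)}{p}}
% Special cases: g(p) = 1 recovers Amdahl's Law; g(p) = p recovers Gustafson's Law.
```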
Modern High-Performance Computing (HPC) systems are adding extra layers to the memory and storage hierarchy, named the deep memory and storage hierarchy (DMSH), to increase I/O performance. New hardware technologies, such as NVMe and SSD, have been introduced in burst buffer installations to reduce the pressure on external storage and boost the burstiness of modern I/O systems. The DMSH has demonstrated its strength and potential in practice. However, each layer of the DMSH is an independent heterogeneous system, and data movement among more layers is significantly more complex even without considering heterogeneity. How to efficiently utilize the DMSH is a subject of research facing the HPC community. Further, accessing data with high throughput and low latency is more imperative than ever. Data prefetching is a well-known technique for hiding read latency by requesting data before it is needed, moving it from a high-latency medium (e.g., disk) to a low-latency one (e.g., main memory). However, existing solutions do not consider the new deep memory and storage hierarchy and also suffer from under-utilization of prefetching resources and unnecessary evictions. Additionally, existing approaches implement a client-pull model in which understanding the application's I/O behavior drives prefetching decisions. Moving towards exascale, where machines run multiple applications concurrently accessing files in a workflow, a more data-centric approach resolves challenges such as cache pollution and redundancy. In this paper, we present the design and implementation of Hermes: a new, heterogeneous-aware, multi-tiered, dynamic, and distributed I/O buffering system. Hermes enables, manages, supervises, and, in some sense, extends I/O buffering to fully integrate into the DMSH. We introduce three novel data placement policies to efficiently utilize all layers, and we present three novel techniques to perform memory, metadata, and communication management in hierarchical buffering systems. Additionally, we demonstrate the benefits of a truly hierarchical data prefetcher that adopts a server-push approach to data prefetching. Our evaluation shows that, in addition to automatic data movement through the hierarchy, Hermes can significantly accelerate I/O and outperforms state-of-the-art buffering platforms by more than 2x. Lastly, results show 10%-35% performance gains over existing prefetchers and over 50% when compared to systems with no prefetching.
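The idea of placing buffered data across a deep hierarchy can be sketched with a simple greedy top-down policy: each buffer goes to the fastest tier with free capacity and spills downward otherwise. The tier names, capacities, and the policy itself are illustrative assumptions and are not Hermes's actual placement policies or API.

```python
# Toy sketch of tiered data placement in a deep memory and storage hierarchy.
from dataclasses import dataclass, field

@dataclass
class Tier:
    name: str
    latency_us: float            # nominal access latency, for ordering only
    capacity_mb: float
    used_mb: float = 0.0
    contents: list = field(default_factory=list)

    def fits(self, size_mb):
        return self.used_mb + size_mb <= self.capacity_mb

# Fastest to slowest.
hierarchy = [
    Tier("DRAM",        latency_us=0.1,  capacity_mb=64),
    Tier("NVMe burst",  latency_us=20,   capacity_mb=256),
    Tier("SSD",         latency_us=100,  capacity_mb=1024),
    Tier("Parallel FS", latency_us=5000, capacity_mb=10**6),
]

def place(buffer_id, size_mb):
    """Greedy top-down placement: the first tier that can hold the buffer wins."""
    for tier in hierarchy:
        if tier.fits(size_mb):
            tier.used_mb += size_mb
            tier.contents.append(buffer_id)
            return tier.name
    raise RuntimeError("no tier has capacity left")

for i, size in enumerate([32, 48, 200, 512]):
    print(f"buffer {i} ({size} MB) -> {place(i, size)}")
```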