8 articles found
Hypergraph-Based Data Reduced Scheduling Policy for Data-Intensive Workflow in Clouds
1
Authors: Zhigang Hu, Jia Li, Meiguang Zheng, Xinxin Zhang, Hui Kang, Yong Tao, Jiao Yang. 《国际计算机前沿大会会议论文集》, 2017, No. 2, pp. 80-82 (3 pages)
Data-intensive computing is expected to be the next-generation IT computing paradigm. Data-intensive workflows in clouds are becoming more and more popular. How to schedule data-intensive workflows efficiently has become the key issue. In this paper, first, we build a directed hypergraph model for data-intensive workflows, since hypergraphs can more accurately model communication volume and better represent asymmetric problems, and the cut metric of hypergraphs is well suited for minimizing the total volume of communication. Second, we propose the concept of data supportive ability to help represent data-intensive workflow applications, and we detail the merge operation that takes data supportive ability into account. Third, we present an optimized hypergraph multi-level partitioning algorithm. Finally, we introduce a data-reduced scheduling policy, HEFT-P, for data-intensive workflows. Through simulation, we compare HEFT-P with three typical workflow scheduling policies. The results indicate that HEFT-P achieves data-reduced scheduling and reduces the makespan of executing data-intensive ...
Keywords: data-intensive workflow; directed hypergraph; data reduced scheduling; cloud computing
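The abstract above states that the cut metric of hypergraphs suits minimizing total communication volume. As a rough illustration only, the sketch below computes the standard connectivity-minus-one cut of an undirected hypergraph. It is not the paper's directed model or its HEFT-P policy, and the task names, data sizes, and placements are hypothetical.

```python
from typing import Dict, Hashable, Iterable, List


def connectivity_cut(
    hyperedges: List[Iterable[Hashable]],
    weights: List[float],
    part_of: Dict[Hashable, int],
) -> float:
    """Return the connectivity-1 cut: each hyperedge costs
    weight * (number of parts it spans - 1)."""
    total = 0.0
    for pins, weight in zip(hyperedges, weights):
        spanned_parts = {part_of[v] for v in pins}
        total += weight * (len(spanned_parts) - 1)
    return total


# Hypothetical workflow: one dataset of size 10 is shared by tasks t1, t2, t3;
# another dataset of size 4 is shared by t3 and t4. Each dataset is one hyperedge.
hyperedges = [("t1", "t2", "t3"), ("t3", "t4")]
weights = [10.0, 4.0]

# Two hypothetical placements of tasks onto two machines (parts 0 and 1).
placement_a = {"t1": 0, "t2": 0, "t3": 0, "t4": 1}
placement_b = {"t1": 0, "t2": 1, "t3": 1, "t4": 1}

cut_a = connectivity_cut(hyperedges, weights, placement_a)  # only the size-4 dataset crosses
cut_b = connectivity_cut(hyperedges, weights, placement_b)  # only the size-10 dataset crosses
print(cut_a, cut_b)
```

Under placement_b the size-10 dataset is counted once, because it must reach one additional machine. A plain graph model with pairwise edges t1-t2 and t1-t3 would count it twice, which is the kind of overestimate the hypergraph formulation avoids.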
An Energy-Aware Heuristic Scheduling for Data-Intensive Workflows in Virtualized Datacenters
2
Authors: 肖鹏, 胡志刚, 张艳平. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2013, No. 6, pp. 948-961 (14 pages)
With the development of cloud computing, more and more data-intensive workflows have been deployed on virtualized datacenters. As a result, the energy spent on massive data accessing grows rapidly. In this paper, an energy-aware scheduling algorithm is proposed, which introduces a novel heuristic called Minimal Data-Accessing Energy Path for scheduling data-intensive workflows, aiming to reduce the energy consumption of intensive data accessing. Extensive experiments based on both synthetic and real workloads are conducted to investigate the effectiveness and performance of the proposed scheduling approach. The experimental results show that the proposed heuristic scheduling can significantly reduce the energy consumed in storing and retrieving the intermediate data generated during the execution of data-intensive workflows. In addition, it exhibits better robustness than existing algorithms when cloud systems are subject to I/O-intensive workloads.
Keywords: cloud computing; energy efficient; heuristic scheduling; data-intensive workflow
Methodology and Trends of Linguistic Research in the Era of Big Data
3
Authors: Liu Haitao, Lin Yanni. Contemporary Social Sciences, 2020, No. 4, pp. 87-106 (20 pages)
This paper presents methodology and trends of linguistic research in the era of big data. We begin with a discussion of the role of linguists in the information society and illustrate the opportunities and challenges linguists are currently facing. After highlighting the significance of authentic data to linguistic research, we argue that language is a complex adaptive system driven by humans. Then, from the perspective of philosophy of science, we introduce the research paradigm of quantitative linguistics through several cases. Finally, we discuss how China’s linguistic research will benefit from the data-intensive approach in terms of scientification and internationalization.
Keywords: linguistics; big data; the data-intensive approach; scientific research paradigm
Big Earth Data: a new challenge and opportunity for Digital Earth’s development (cited by: 7)
4
Authors: Huadong Guo, Zhen Liu, Hao Jiang, Changlin Wang, Jie Liu, Dong Liang. International Journal of Digital Earth (SCIE, EI), 2017, No. 1, pp. 1-12 (12 pages)
Digital Earth has seen great progress during the last 19 years. When it entered the era of big data, Digital Earth developed into a new stage, namely one characterized by ‘Big Earth Data’, confronting new challenges and opportunities. In this paper we give an overview of the development of Digital Earth by summarizing research achievements and marking the milestones of Digital Earth’s development. Then, the opportunities and challenges that Big Earth Data faces are discussed. As a data-intensive scientific research approach, Big Earth Data provides a new vision and methodology to Earth sciences, and the paper identifies the advantages of Big Earth Data to scientific research, especially in knowledge discovery and global change research. We believe that Big Earth Data will advance and promote the development of Digital Earth.
Keywords: Digital Earth; Big Earth Data; data-intensive science; knowledge discovery; global change
Task-Aware Flow Scheduling with Heterogeneous Utility Characteristics for Data Center Networks (cited by: 2)
5
Authors: Fang Dong, Xiaolin Guo, Pengcheng Zhou, Dian Shen. Tsinghua Science and Technology (SCIE, EI, CAS, CSCD), 2019, No. 4, pp. 400-411 (12 pages)
With the continuous enrichment of cloud services, an increasing number of applications are being deployed in data centers. These emerging applications are often communication-intensive and data-parallel, and their performance is closely related to the underlying network. With their distributed nature, the applications consist of tasks that involve a collection of parallel flows. Traditional techniques to optimize flow-level metrics are agnostic to task-level requirements, leading to poor application-level performance. In this paper, we address the heterogeneous task-level requirements of applications and propose task-aware flow scheduling. First, we model tasks' sensitivity to their completion time by utilities. Second, on the basis of Nash bargaining theory, we establish a flow scheduling model with heterogeneous utility characteristics, and analyze it using the Lagrange multiplier method and the KKT conditions. Third, we propose two utility-aware bandwidth allocation algorithms with different practical constraints. Finally, we present Tasch, a system that enables tasks to maintain high utilities and guarantees the fairness of utilities. To demonstrate the feasibility of our system, we conduct comprehensive evaluations with real-world traffic traces. Communication stages complete up to 1.4 faster on average, task utilities increase up to 2.26, and the fairness of tasks improves up to 8.66 using Tasch in comparison to per-flow mechanisms.
Keywords: data center networks; coflow; flow scheduling; data-intensive applications
The fourth scientific discovery paradigm for precision medicine and healthcare: Challenges ahead (cited by: 4)
6
Authors: Li Shen, Jinwei Bai, Jiao Wang, Bairong Shen. Precision Clinical Medicine, 2021, No. 2, pp. 80-84 (5 pages)
With the progression of modern information techniques, such as next generation sequencing (NGS), Internet of Everything (IoE) based smart sensors, and artificial intelligence algorithms, data-intensive research and applications are emerging as the fourth paradigm for scientific discovery. However, we face many challenges to the practical application of this paradigm. In this article, 10 challenges to data-intensive discovery and applications in precision medicine and healthcare are summarized, and future perspectives on next generation medicine are discussed.
Keywords: data-intensive scientific discovery; the fourth paradigm; biomedical data diversity; precision medicine and healthcare
Big Earth Data: a comprehensive analysis of visualization analytics issues (cited by: 2)
7
Authors: Patrick Merritt, Haixia Bi, Bradley Davis, Christopher Windmill, Yong Xue. Big Earth Data (EI), 2018, No. 4, pp. 321-350 (30 pages)
Big Earth Data analysis is a complex task requiring the integration of many skills and technologies. This paper provides a comprehensive review of the technology and terminology within the Big Earth Data problem space and presents examples of state-of-the-art projects in each major branch of Big Earth Data research. Current issues within Big Earth Data research are highlighted and potential future solutions identified.
Keywords: Digital Earth; Big Earth Data; data-intensive science; knowledge discovery; global change
Mochi: Composing Data Services for High-Performance Computing Environments
8
Authors: Robert B. Ross, George Amvrosiadis, Philip Carns, Charles D. Cranor, Matthieu Dorier, Kevin Harms, Greg Ganger, Garth Gibson, Samuel K. Gutierrez, Robert Latham, Bob Robey, Dana Robinson, Bradley Settlemyer, Galen Shipman, Shane Snyder, Jerome Soumagne, Qing Zheng. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2020, No. 1, pp. 121-144 (24 pages)
Technology enhancements and the growing breadth of application workflows running on high-performance computing (HPC) platforms drive the development of new data services that provide high performance on these new platforms, provide capable and productive interfaces and abstractions for a variety of applications, and are readily adapted when new technologies are deployed. The Mochi framework enables composition of specialized distributed data services from a collection of connectable modules and subservices. Rather than forcing all applications to use a one-size-fits-all data staging and I/O software configuration, Mochi allows each application to use a data service specialized to its needs and access patterns. This paper introduces the Mochi framework and methodology. The Mochi core components and microservices are described. Examples of the application of the Mochi methodology to the development of four specialized services are detailed. Finally, a performance evaluation of a Mochi core component, a Mochi microservice, and a composed service providing an object model is performed. The paper concludes by positioning Mochi relative to related work in the HPC space and indicating directions for future work.
Keywords: storage and I/O; data-intensive computing; distributed services; high-performance computing