The inter-city linkage heat data provided by Baidu Migration is employed as a characterization of inter-city linkages in order to facilitate the study of the network linkage characteristics and hierarchical structure of the urban agglomeration in the Greater Bay Area through the use of the social network analysis method. This is the inaugural application of big data based on location-based services (LBS) in the study of urban agglomeration network structure, which represents a novel research perspective on this topic. The study reveals that the density of network linkages in the Greater Bay Area urban agglomeration has reached 100%, indicating a mature network-like spatial structure. This structure has given rise to three distinct communities: Shenzhen-Dongguan-Huizhou, Guangzhou-Foshan-Zhaoqing, and Zhuhai-Zhongshan-Jiangmen. Additionally, cities within the Greater Bay Area urban agglomeration play different roles, suggesting that varying development strategies may be necessary to achieve staggered development. The study demonstrates that large datasets represented by LBS can offer novel insights and methodologies for the examination of urban agglomeration network structures, contingent on the appropriate mining and processing of the data.
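The 100% density figure cited above refers to the standard network-density measure: observed links divided by possible links. A minimal illustration (a generic formula with example city names, not the paper's code):

```python
# Density of a directed network: observed links / possible ordered pairs.
# A value of 1.0 means every city pair is directly linked.
def density(nodes, links):
    possible = len(nodes) * (len(nodes) - 1)   # ordered pairs, no self-loops
    return len(links) / possible

cities = ["Guangzhou", "Shenzhen", "Zhuhai"]
links = [(a, b) for a in cities for b in cities if a != b]  # fully connected
d = density(cities, links)
```

With every ordered pair present, `d` evaluates to 1.0, matching the "mature network-like spatial structure" reading.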
Ontology is the conceptual backbone that provides meaning to data on the semantic web. However, an ontology is not a static resource and may evolve over time, which often leaves the meaning of data in an undefined or inconsistent state. It is thus very important to have a method to preserve the data and its meaning when the ontology changes. This paper proposes a general method that solves the problem using data migration. It analyzes some of the issues in the method, including the separation of ontology and data, the migration specification, the migration result, and the migration algorithm. The paper also instantiates the general method in RDF(S) as an example. The RDF(S) example itself is a simple but complete method for migrating RDF data when the RDFS ontology changes.
Data transformation is the core process in migrating a database from a relational database to a NoSQL database such as a column-oriented database. However, there is no standard guideline for data transformation from a relational database to a NoSQL database. A number of schema transformation techniques have been proposed to improve the data transformation process, and they resulted in better query processing time when compared with relational database query processing time. However, these approaches produced redundant tables in the resulting schema, which in turn consume a large amount of unnecessary storage and produce high query processing time due to the redundant column families in the transformed column-oriented database. In this paper, an efficient data transformation technique from a relational database to a column-oriented database is proposed. The proposed schema transformation technique is based on a combination of a denormalization approach, data access patterns, and a multiple-nested schema. In order to validate the proposed work, the technique is implemented by transforming data from a MySQL database to a MongoDB database. A benchmark transformation technique is also performed, against which the query processing time and the storage size are compared. Based on the experimental results, the proposed transformation technique showed significant improvement in terms of query processing time and storage space usage due to the reduced number of column families in the column-oriented database.
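The denormalization idea behind such relational-to-NoSQL transformations can be sketched in a few lines: rows from a child table are embedded into parent documents so that a frequent access pattern needs no join. This is an illustration of the general technique only, not the paper's algorithm; the table and column names are hypothetical:

```python
# Hypothetical relational rows: a parent table and a child table
# linked by a foreign key.
customers = [
    {"id": 1, "name": "Alice"},
    {"id": 2, "name": "Bob"},
]
orders = [
    {"id": 10, "customer_id": 1, "total": 25.0},
    {"id": 11, "customer_id": 1, "total": 40.0},
    {"id": 12, "customer_id": 2, "total": 15.0},
]

def denormalize(customers, orders):
    """Embed each customer's orders as a nested array, mirroring the
    access pattern 'fetch a customer together with all its orders'."""
    by_customer = {}
    for o in orders:
        by_customer.setdefault(o["customer_id"], []).append(
            {"id": o["id"], "total": o["total"]})
    return [
        {"_id": c["id"], "name": c["name"],
         "orders": by_customer.get(c["id"], [])}
        for c in customers
    ]

docs = denormalize(customers, orders)
```

The resulting documents can be inserted into a document or column-family store as-is; the trade-off is duplicated child data versus join-free reads, which is exactly the storage/query-time tension the abstract describes.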
Hybrid memory systems composed of dynamic random access memory (DRAM) and non-volatile memory (NVM) often exploit page migration technologies to take full advantage of the different memory media. Most previous proposals migrate data at a granularity of 4 KB pages, and thus waste memory bandwidth and DRAM resources. In this paper, we propose Mocha, a non-hierarchical architecture that organizes DRAM and NVM in a flat address space physically, but manages them in a cache/memory hierarchy. Since the commercial NVM device, the Intel Optane DC Persistent Memory Module (DCPMM), actually accesses the physical media at a granularity of 256 bytes (an Optane block), we manage the DRAM cache at a 256-byte size to adapt to this feature of Optane. This design not only enables fine-grained data migration and management for the DRAM cache, but also avoids write amplification for the Intel Optane DCPMM. We also create an Indirect Address Cache (IAC) in the Hybrid Memory Controller (HMC) and propose a reverse address mapping table in the DRAM to speed up address translation and cache replacement. Moreover, we exploit a utility-based caching mechanism to filter cold blocks in the NVM, and further improve the efficiency of the DRAM cache. We implement Mocha in an architectural simulator. Experimental results show that Mocha can improve application performance by 8.2% on average (up to 24.6%), and reduce energy consumption by 6.9% and data migration traffic by 25.9% on average, compared with a typical hybrid memory architecture, HSCC.
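Managing a cache at a 256-byte block granularity means splitting each address into an intra-block offset, a set index, and a tag. The sketch below is a toy direct-mapped model under assumed parameters (the cache size and mapping policy are illustrative; Mocha's actual IAC and replacement machinery are more involved):

```python
BLOCK_SIZE = 256          # matches one Optane block
NUM_SETS = 1024           # hypothetical DRAM-cache capacity / BLOCK_SIZE

class BlockCache:
    """Direct-mapped DRAM cache tracked at 256 B granularity."""
    def __init__(self):
        self.tags = [None] * NUM_SETS   # one resident tag per set
        self.hits = 0
        self.misses = 0

    def access(self, addr):
        block = addr // BLOCK_SIZE      # drop the intra-block offset
        index = block % NUM_SETS        # which cache set
        tag = block // NUM_SETS         # identifies the block within the set
        if self.tags[index] == tag:
            self.hits += 1
        else:
            self.misses += 1            # fetch the 256 B block from NVM
            self.tags[index] = tag      # and install it (evicting the old tag)

cache = BlockCache()
cache.access(0x1000)   # cold miss
cache.access(0x10FF)   # same 256 B block -> hit
cache.access(0x1100)   # next block -> miss
```

Because misses move only 256 bytes instead of a 4 KB page, both migration traffic and write amplification drop, which is the core argument of the abstract.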
For a transaction processing system to operate effectively and efficiently in cloud environments, it is important to distribute huge amounts of data while guaranteeing the ACID (atomic, consistent, isolated, and durable) properties. Moreover, database partition and migration tools can help transplant conventional relational database systems to the cloud environment rather than rebuilding a new system. This paper proposes a database distribution management (DBDM) system, which partitions or replicates the data according to the transaction behaviors of the application system. The principal strategy of DBDM is to keep together the data used in a single transaction, thus avoiding the massive transmission of records in join operations. The proposed system has been implemented successfully. Preliminary experiments show that DBDM performs database partition and migration effectively. Also, the DBDM system is modularly designed to adapt to different database management systems (DBMSs) or different partition algorithms.
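The "keep together the data used in a single transaction" strategy can be illustrated with a union-find pass over a transaction log: records that ever co-occur in a transaction end up in the same partition. This is a sketch of the general idea, not the DBDM implementation:

```python
def group_by_transactions(transactions):
    """transactions: list of record-id lists, one list per transaction.
    Returns partitions such that co-accessed records share a partition."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for records in transactions:
        for r in records:
            find(r)                          # register every record
        for r in records[1:]:
            parent[find(records[0])] = find(r)   # merge the whole txn

    groups = {}
    for r in list(parent):
        groups.setdefault(find(r), set()).add(r)
    return list(groups.values())

# Two transactions share record "B", so A, B, C are co-located,
# while D and E form a separate partition.
parts = group_by_transactions([["A", "B"], ["B", "C"], ["D", "E"]])
```

Placing each resulting group on one node means a transaction touches a single partition, avoiding the cross-node record shipping the abstract warns about.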
Vendor lock-in can occur at any layer of the cloud stack: Infrastructure, Platform, and Software-as-a-Service. This paper covers the vendor lock-in issue at the Platform as a Service (PaaS) level, where applications can be created, deployed, and managed without worrying about the underlying infrastructure. These applications and their persisted data on one PaaS provider are not easy to port to another provider. To overcome this issue, we propose a middleware to abstract the database services and make them cloud-agnostic. The middleware supports several SQL and NoSQL data stores that can be hosted and ported among disparate PaaS providers. It provides developers with data portability and data migration among relational and NoSQL-based cloud databases. NoSQL databases are fundamental to supporting Big Data applications, as they can handle an enormous volume of highly variable data while assuring fault tolerance, availability, and scalability. The implementation of the middleware shows that using it alleviates the effort of rewriting application code when changing the backend database system. A working prototype of a migration tool has been developed using this middleware to facilitate database migration (moving existing data from a database on one cloud to a new database, even on a different cloud). Although the middleware adds some overhead compared with native code for the cloud services being used, experimental evaluation on a Twitter (a Big Data application) data set proves this overhead is negligible.
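The cloud-agnostic pattern behind such middleware is a single storage interface with interchangeable backends, plus a migrate helper that copies records across them. All class and method names below are invented for illustration; the paper's middleware targets real SQL/NoSQL cloud services:

```python
class Store:
    """Backend-neutral interface the application codes against."""
    def put(self, key, doc): raise NotImplementedError
    def get(self, key): raise NotImplementedError
    def keys(self): raise NotImplementedError

class DictStore(Store):          # stands in for a NoSQL document store
    def __init__(self): self._d = {}
    def put(self, key, doc): self._d[key] = dict(doc)
    def get(self, key): return self._d[key]
    def keys(self): return list(self._d)

class RowStore(Store):           # stands in for a relational table
    def __init__(self): self._rows = []
    def put(self, key, doc):
        self._rows = [r for r in self._rows if r[0] != key]  # upsert
        self._rows.append((key, dict(doc)))
    def get(self, key):
        return next(d for k, d in self._rows if k == key)
    def keys(self): return [k for k, _ in self._rows]

def migrate(src, dst):
    """Copy every record from one backend to another."""
    for k in src.keys():
        dst.put(k, src.get(k))

old = DictStore(); old.put("u1", {"name": "Ada"})
new = RowStore(); migrate(old, new)
```

Because the application only sees `Store`, swapping the backend (or the hosting cloud) requires no change to application code, which is the lock-in relief the abstract claims.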
Since December 2019, the COVID-19 epidemic has repeatedly hit countries around the world due to various factors such as trade, national policies, and the natural environment. To closely monitor the emergence of new COVID-19 clusters and ensure high prediction accuracy, we develop a new prediction framework for studying the spread of epidemics on networks based on partial differential equations (PDEs), which captures epidemic diffusion along the edges of a network driven by population flow data. In this paper, we focus on the effect of population movement on the spread of COVID-19 in several cities from different geographic regions of China in order to describe the transmission characteristics of COVID-19. Experimental results show that the PDE model obtains relatively good prediction results compared with several typical mathematical models. Furthermore, we study the effectiveness of intervention measures, such as traffic lockdowns and social distancing, which provides a new approach for quantifying the effectiveness of government policies toward controlling COVID-19 via the adaptive parameters of the model. To our knowledge, this work is the first attempt to apply the PDE model on networks with Baidu Migration data for COVID-19 prediction.
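Diffusion along network edges driven by flow data can be sketched with an explicit Euler step: each node gains or loses infection in proportion to an edge flow weight times the level difference with its neighbor, plus a local growth term. This is a generic graph-diffusion toy (made-up parameters), not the paper's calibrated PDE model:

```python
def step(u, edges, flow, growth=0.2, dt=0.1):
    """u: infection level per node; edges: (i, j) pairs with symmetric
    flow weights; returns the state after one explicit Euler step."""
    du = [growth * ui * (1.0 - ui) for ui in u]      # local logistic growth
    for (i, j), w in zip(edges, flow):
        d = w * (u[j] - u[i])                        # diffusion along edge
        du[i] += d
        du[j] -= d
    return [ui + dt * dui for ui, dui in zip(u, du)]

# Three cities in a line; only city 0 starts infected.
u = [0.5, 0.0, 0.0]
edges, flow = [(0, 1), (1, 2)], [0.3, 0.3]
for _ in range(50):
    u = step(u, edges, flow)
```

After a few steps the infection reaches cities 1 and 2 through the flow-weighted edges, mirroring how the model lets population movement carry the epidemic between cities.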
The convergence of next-generation networks and the emergence of new media systems have made media-rich digital libraries popular in application and research. The discovery of media content objects' usage patterns, where the QPop Increment is the characteristic feature under study, is the basis of intelligent data migration scheduling, the key issue for these systems in managing effectively the massive storage facilities in their backbones. In this paper, a clustering algorithm is established on the basis of temporal segmentation of the QPop Increment, so as to improve the mining performance. We employed the standard C-Means algorithm as the clustering kernel, and carried out the experimental mining process with segmented QPop Increments obtained in actual applications. The results indicated that the improved algorithm is more advantageous than the basic one in important indices such as clustering cohesion. The experimental study in this paper is based on a Media Assets Library prototype developed for the advertainment movie production project for the 2008 Olympics, under the support of both the Humanistic Olympics Study Center in Beijing and the China State Administration of Radio, Film and TV.
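The clustering kernel named here, hard c-means (k-means), fits in a short sketch: assign each segmented usage vector to its nearest center, then recompute each center as its cluster mean. The popularity-increment values below are made up for illustration:

```python
import random

def c_means(points, k, iters=20, seed=0):
    """Hard c-means over fixed-length segment vectors."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # assign each segment vector to its nearest center
            i = min(range(k), key=lambda c: sum(
                (a - b) ** 2 for a, b in zip(p, centers[c])))
            clusters[i].append(p)
        for i, cl in enumerate(clusters):
            if cl:   # recompute each center as the cluster mean
                centers[i] = tuple(sum(x) / len(cl) for x in zip(*cl))
    return centers, clusters

# Two obvious groups of 2-segment popularity-increment vectors.
pts = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.15), (5.0, 5.1), (5.2, 4.9)]
centers, clusters = c_means(pts, 2)
```

The paper's contribution is the temporal segmentation that feeds this kernel; segmenting the QPop Increment series first gives the distance computation more discriminative coordinates, which is where the reported cohesion gains come from.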
Developing a comprehensive understanding of inter-city interactions is crucial for regional planning. We therefore examined spatiotemporal patterns of population migration across the Qinghai-Tibet Plateau (QTP) using migration big data from Tencent for the period between 2015 and 2019. We initially used decomposition and breakpoint detection methods to examine the time-series migration data and to identify the two seasons with the strongest and weakest population migration levels, between June 18th and August 18th and between October 8th and February 15th, respectively. Population migration within the former period was 2.03 times that seen in the latter. We then used a variety of network analysis methods to examine population flow directions as well as the importance of each individual city in migration. The two capital cities on the QTP, Lhasa and Xining, form centers for population migration and are also transfer hubs through which migrants from other cities off the plateau enter and leave this region. Data show that these two cities contribute more than 35% of total population migration. The majority of migrants tend to move within the province, particularly during the weakest migration season. We also utilized interactive relationship force and radiation models to examine the interaction strength and the radiating energy of each individual city. Results show that Lhasa and Xining exhibit the strongest interactions with other cities and have the largest radiating energies. Indeed, the radiating energy of the QTP cities correlates with their gross domestic product (GDP) (Pearson correlation coefficient: 0.754 in the weakest migration season, WMS, versus 0.737 in the strongest migration season, SMS), while changes in radiating energy correlate with tourism-related revenue (Pearson correlation coefficient: 0.685). These outcomes suggest that the level of economic development and the level of tourism are the two most important factors driving QTP population migration. The results of this analysis provide critical guidance for addressing the large development differences across the QTP.
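The correlation figures quoted above are Pearson coefficients. For reference, the statistic is small enough to implement directly (the values below are made-up toy data, not the QTP series):

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# A near-linear positive relationship yields a coefficient close to +1.
r = pearson([1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.2, 7.8])
```

A coefficient near 0.75, as reported for radiating energy versus GDP, indicates a strong but not perfectly linear association.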
The two concepts of “liudong renkou (floating population, or FP)” and “renkou liudong (mobility of the floating population, or MOFP)”, along with relevant data based on these two concepts, have long been used extensively in China's research and policy making, playing a central role in Chinese studies of migration. Unlike the concepts of “migrant” and “migration” in the international literature, which are focused on people's spatial mobility, “liudong renkou” and “renkou liudong” are identified and measured by the separation of one's place of residence from one's place of household registration (hukou), an approach inconsistent with relevant international practices. By analyzing various census data and data from the China Migrant Dynamic Survey (CMDS), this article examines the validity and reliability of these two concepts and the data based on them in the international context, revealing that they have become increasingly invalid and unreliable for the purpose of measuring migration events since China's reform and opening up in the late 1970s. The results further demonstrate that these two concepts and the data based on them have become increasingly detached from real migration events and processes. They may become invalid by overestimating the volume of the mobile population, ineffective due to the systematic omission of certain mobile populations (such as urban-urban migrants), or misleading as to the changing direction of migration flows. In addition, data on the floating population cannot be used to calculate migration rates and are not comparable in the international context. The concepts of “liudong renkou” and “renkou liudong” and data based on them may still need to be used in China for a long period of time due to the continuing existence of the hukou system and its roles in the provision of public services, social welfare, and social security. However, we argue that the concepts, measurements, and methods of data collection in research on migration in China should be gradually shifted to and focused on migrations as spatial events; further, transition data, based on an individual's residence five years ago and one year ago, should be gradually adopted as the main data source and included in the short form of future censuses; additionally, migration event data based on population registration and administrative records should be used more fully, so that China's migration research can be conducted on the solid basis of valid and reliable data sources.
With supercomputers developing towards exascale, the number of compute cores increases dramatically, making more complex and larger-scale applications possible. The input/output (I/O) requirements of large-scale applications, workflow applications, and their checkpointing include substantial bandwidth and extremely low latency, posing a serious challenge to high-performance computing (HPC) storage systems. Current hard disk drive (HDD) based underlying storage systems are becoming increasingly unable to meet the requirements of next-generation exascale supercomputers. To rise to the challenge, we propose a hierarchical hybrid storage system, the on-line and near-line file system (ONFS). It leverages dynamic random access memory (DRAM) and solid state drives (SSDs) in compute nodes, and HDDs in storage servers, to build a three-level storage system in a unified namespace. It supports portable operating system interface (POSIX) semantics, and provides high bandwidth, low latency, and huge storage capacity. In this paper, we present the technical details of distributed metadata management, the memory borrow-and-return strategy, data consistency, parallel access control, and the mechanisms guiding downward and upward migration in ONFS. We implement an ONFS prototype on the TH-1A supercomputer, and conduct experiments to test its I/O performance and scalability. The results show that the bandwidths of single-thread and multi-thread 'read'/'write' are 6-fold and 5-fold better than those of HDD-based Lustre, respectively. The I/O bandwidth of data-intensive applications in ONFS can be 6.35 times that in Lustre.
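Upward and downward migration between tiers follows a hot/cold pattern: recently accessed files climb toward fast media, and when a fast tier fills, its coldest files are pushed down. The toy model below (generic promote-on-access with LRU demotion, invented capacities) illustrates the mechanism, not the ONFS policy:

```python
from collections import OrderedDict

CAPACITY = {0: 2, 1: 4}        # hypothetical DRAM/SSD slots; HDD unbounded

class TieredStore:
    """Tier 0 = DRAM, 1 = SSD, 2 = HDD."""
    def __init__(self):
        # per-tier LRU: OrderedDict keeps the least-recently-used first
        self.tiers = {0: OrderedDict(), 1: OrderedDict(), 2: OrderedDict()}

    def tier_of(self, f):
        for t, d in self.tiers.items():
            if f in d:
                return t
        return None

    def access(self, f):
        t = self.tier_of(f)
        if t is None:                      # new file lands on HDD first
            t = 2
            self.tiers[2][f] = True
        self.tiers[t].move_to_end(f)       # mark as recently used
        if t > 0:                          # upward migration on access
            del self.tiers[t][f]
            self.tiers[t - 1][f] = True
            self._evict(t - 1)

    def _evict(self, t):
        # downward migration: push the coldest file to the next tier
        while t in CAPACITY and len(self.tiers[t]) > CAPACITY[t]:
            cold, _ = self.tiers[t].popitem(last=False)
            self.tiers[t + 1][cold] = True
            self._evict(t + 1)

s = TieredStore()
for f in ["a", "b", "c"]:
    s.access(f)      # each file climbs HDD -> SSD on first access
s.access("a")        # "a" promoted SSD -> DRAM
s.access("a")        # already in DRAM; just refreshed
```

Real systems such as ONFS add borrow/return of compute-node memory and consistency control on top of this basic promote/demote skeleton.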
With the development of computational power, there has been an increased focus on data-fitting related seismic inversion techniques for high-fidelity seismic velocity models and images, such as full-waveform inversion and least-squares migration. However, though more advanced than conventional methods, these data-fitting methods can be very expensive in terms of computational cost. Recently, various techniques have been implemented to optimize these data-fitting seismic inversion problems, catering to the industrial need for much improved efficiency. In this study, we propose a general stochastic conjugate gradient method for these data-fitting related inverse problems. We first describe the basic theory of our method and then give synthetic examples. Our numerical experiments illustrate the potential of this method for large-size seismic inversion applications.
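The general shape of a stochastic conjugate-gradient scheme for a data-fitting misfit can be sketched on a toy linear least-squares problem: sample a mini-batch of data rows, form the batch gradient, combine it with the previous search direction (Fletcher-Reeves style, with periodic restarts), and line-search along that direction. The matrix, step rule, and restart period below are illustrative choices standing in for the seismic operators, not the paper's algorithm:

```python
import random

A = [(2.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, -1.0)]   # toy forward-operator rows
b = [2.0, 2.0, 3.0, -1.0]                               # data generated by x* = (1, 2)

def batch_grad(x, idx):
    """Gradient of 0.5*sum_i (a_i . x - b_i)^2 over the sampled rows."""
    g = [0.0, 0.0]
    for i in idx:
        r = A[i][0] * x[0] + A[i][1] * x[1] - b[i]
        g[0] += A[i][0] * r
        g[1] += A[i][1] * r
    return g

def misfit(x):
    return 0.5 * sum((a[0]*x[0] + a[1]*x[1] - bi) ** 2 for a, bi in zip(A, b))

rng = random.Random(0)
x, d, g_old = [0.0, 0.0], [0.0, 0.0], None
for k in range(300):
    idx = rng.sample(range(len(A)), 2)       # mini-batch of two data rows
    g = batch_grad(x, idx)
    if g_old is None or k % 5 == 0:          # periodic restart to -gradient
        d = [-g[0], -g[1]]
    else:                                     # Fletcher-Reeves direction update
        beta = (g[0]**2 + g[1]**2) / (g_old[0]**2 + g_old[1]**2 + 1e-12)
        d = [-g[0] + beta * d[0], -g[1] + beta * d[1]]
    # exact line search for the batch quadratic along d
    Ad = [A[i][0]*d[0] + A[i][1]*d[1] for i in idx]
    alpha = -(g[0]*d[0] + g[1]*d[1]) / (sum(v*v for v in Ad) + 1e-12)
    x = [x[0] + alpha * d[0], x[1] + alpha * d[1]]
    g_old = g
```

The cost saving motivating such methods is that each iteration touches only a mini-batch of the data (here two of four rows; in seismic inversion, a subset of shots) rather than the full misfit.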
Funding: supported by the Universiti Putra Malaysia Grant Scheme (Putra Grant) (GP/2020/9692500).
Funding: supported jointly by the National Key Research and Development Program of China (No. 2022YFB4500303) and the National Natural Science Foundation of China (NSFC) (Grant Nos. 62072198, 61832006, 61825202, and 61929103).
Funding: supported by the Taiwan Ministry of Economic Affairs and the Institute for Information Industry under the project titled "Fundamental Industrial Technology Development Program (1/4)".
Funding: supported by the National Natural Science Foundation of China (Grant Nos. 61672298, 61873326, and 61802155) and the Philosophy Social Science Research Key Project Fund of Jiangsu University (Grant No. 2018SJZDI142).
基金National Natural Science Foundation of China(41590845)Strategic Priority Research Program of the Chinese Academy of Sciences(XDA19040501)+2 种基金Strategic Priority Research Program of the Chinese Academy of Sciences(XDA20040401)National Key Research and Development Program of China(2017YFB0503605)National Key Research and Development Program of China(2017YFC1503003)。
Abstract: Developing a comprehensive understanding of inter-city interactions is crucial for regional planning. We therefore examined spatiotemporal patterns of population migration across the Qinghai-Tibet Plateau (QTP) using migration big data from Tencent for the period between 2015 and 2019. We initially used decomposition and breakpoint detection methods to examine the time-series migration data and to identify the seasons with the strongest and weakest population migration levels, which fall between June 18th and August 18th and between October 8th and February 15th, respectively. Population migration within the former period was 2.03 times that seen in the latter. We then used a variety of network analysis methods to examine population flow directions as well as the importance of each individual city in migration. The two capital cities on the QTP, Lhasa and Xining, form centers for population migration and are also transfer hubs through which migrants from other cities off the plateau enter and leave this region. Data show that these two cities contribute more than 35% of total population migration. The majority of migrants tend to move within their province, particularly during the weakest migration season. We also utilized interaction force and radiation models to examine the interaction strength and the radiating energy of each individual city. Results show that Lhasa and Xining exhibit the strongest interactions with other cities and have the largest radiating energies. Indeed, the radiating energy of QTP cities correlates with their gross domestic product (GDP) (Pearson correlation coefficient: 0.754 in the weakest migration season (WMS) versus 0.737 in the strongest migration season (SMS)), while changes in radiating energy correlate with tourism-related revenue (Pearson correlation coefficient: 0.685). These outcomes suggest that the level of economic development and the level of tourism are the two most important factors driving QTP population migration. The results of this analysis provide critical guidance for addressing the large development disparities across the QTP.
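The correlation analysis described above reduces, at its core, to computing Pearson coefficients between city-level indicators such as radiating energy and GDP. A minimal pure-Python sketch (the city figures below are made-up placeholders, not the study's data):

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical city-level figures: radiating energy vs. GDP
energy = [10.0, 7.5, 3.2, 2.1, 1.4]
gdp = [1600.0, 1200.0, 500.0, 340.0, 260.0]
r = pearson(energy, gdp)  # coefficient near 1 indicates strong positive correlation
```

In practice one would compute such a coefficient per season, as the abstract reports separate WMS and SMS values.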
Funding: National Natural Science Foundation of China, No. 41971180 and No. 41971168; Natural Science Foundation of Fujian Province, No. 2021J01145.
Abstract: The two concepts of "liudong renkou" (floating population, or FP) and "renkou liudong" (mobility of the floating population, or MOFP), along with relevant data based on these two concepts, have long been used extensively in China's research and policy making, playing a central role in Chinese studies of migration. Unlike the concepts of "migrant" and "migration" in the international literature, which are focused on people's spatial mobility, "liudong renkou" and "renkou liudong" are identified and measured by the separation of one's place of residence from one's place of household registration (hukou), an approach inconsistent with relevant international practices. By analyzing various census data and data from the China Migrant Dynamic Survey (CMDS), this article examines the validity and reliability of these two concepts and the data based on them in the international context, revealing that they have become increasingly invalid and unreliable for the purpose of measuring migration events since China's reform and opening up in the late 1970s. The results further demonstrate that these two concepts and the data based on them have become increasingly detached from real migration events and processes. They may become invalid by overestimating the volume of the mobile population, ineffective due to systematic omission of certain mobile populations (such as urban-urban migrants), or misleading as to the changing direction of migration flows. In addition, data on the floating population cannot be used to calculate migration rates and are not comparable in the international context. The concepts of "liudong renkou" and "renkou liudong" and the data based on them may still need to be used in China for a long period of time, owing to the continuing existence of the hukou system and its roles in the provision of public services, social welfare, and social security. However, we argue that concepts, measurements, and methods of data collection in Chinese migration research should gradually shift to and focus on migrations as spatial events. Further, transition data, based on an individual's residence five years ago and one year ago, should be gradually adopted as the main data source and included in the short form of future censuses; additionally, migration event data based on population registration and administrative records should be used more fully, so that China's migration research can be conducted on the solid basis of valid and reliable data sources.
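To make concrete why transition data support the rate calculations that floating-population data cannot: given records pairing each respondent's current residence with their residence five years ago, a crude in-migration rate is simply the share of respondents whose two locations differ. A toy sketch with hypothetical records (not CMDS or census data):

```python
def migration_rate(records):
    """Share of respondents whose residence at the earlier reference
    date differed from their current residence (a crude transition-data
    migration rate)."""
    movers = sum(1 for current, previous in records if current != previous)
    return movers / len(records)

# Hypothetical (current_city, city_five_years_ago) survey records
sample = [
    ("Guangzhou", "Guangzhou"),
    ("Shenzhen", "Changsha"),   # mover
    ("Shenzhen", "Shenzhen"),
    ("Foshan", "Wuhan"),        # mover
]
rate = migration_rate(sample)  # 2 of 4 records moved -> 0.5
```

Hukou-based floating-population counts cannot be plugged into such a formula, because the denominator and the "event" being counted do not correspond to a residence change over a defined interval.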
Funding: Project supported by the National Key Research and Development Program of China (No. 2016YFB0200402).
Abstract: With supercomputers developing towards the exascale, the number of compute cores increases dramatically, making more complex and larger-scale applications possible. The input/output (I/O) requirements of large-scale applications, workflow applications, and their checkpointing include substantial bandwidth and extremely low latency, posing a serious challenge to high-performance computing (HPC) storage systems. Current hard disk drive (HDD) based storage systems are increasingly unable to meet the requirements of next-generation exascale supercomputers. To rise to the challenge, we propose a hierarchical hybrid storage system, the on-line and near-line file system (ONFS). It leverages dynamic random access memory (DRAM) and solid state drives (SSDs) in compute nodes, and HDDs in storage servers, to build a three-level storage system in a unified namespace. It supports portable operating system interface (POSIX) semantics, and provides high bandwidth, low latency, and huge storage capacity. In this paper, we present the technical details of distributed metadata management, the memory borrow-and-return strategy, data consistency, parallel access control, and the mechanisms guiding downward and upward migration in ONFS. We implement an ONFS prototype on the TH-1A supercomputer and conduct experiments to test its I/O performance and scalability. The results show that the single-thread and multi-thread 'read'/'write' bandwidths are 6-fold and 5-fold those of HDD-based Lustre, respectively. The I/O bandwidth of data-intensive applications in ONFS can be 6.35 times that in Lustre.
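To illustrate the general idea of downward migration in a tiered store (a toy sketch only; the actual ONFS policies are described in the paper), the following two-level store demotes least-recently-used items from a small fast tier, standing in for DRAM, to a slower tier, standing in for SSD/HDD, whenever the fast tier overflows:

```python
from collections import OrderedDict

class TwoTierStore:
    """Toy two-level store: a capacity-limited fast tier demotes its
    least-recently-used items to a slow tier. Illustrative only; not
    the ONFS migration algorithm."""

    def __init__(self, fast_capacity):
        self.fast = OrderedDict()  # name -> data, kept in LRU order
        self.slow = {}
        self.capacity = fast_capacity

    def write(self, name, data):
        if name in self.slow:
            del self.slow[name]  # rewrite pulls the file back up (upward migration)
        self.fast[name] = data
        self.fast.move_to_end(name)  # mark as most recently used
        while len(self.fast) > self.capacity:
            victim, blob = self.fast.popitem(last=False)  # demote the LRU item
            self.slow[victim] = blob

    def read(self, name):
        if name in self.fast:
            self.fast.move_to_end(name)  # refresh recency on hit
            return self.fast[name]
        return self.slow[name]  # cold read served from the slow tier

store = TwoTierStore(fast_capacity=2)
store.write("a", b"1")
store.write("b", b"2")
store.write("c", b"3")  # fast tier full: "a" migrates down
```

A real system would of course trigger migration on capacity watermarks and access heat rather than on a strict item count, and would promote hot files on read as well.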
Funding: Partially supported by the National Natural Science Foundation of China (No. 41230318).
Abstract: With the development of computational power, there has been an increased focus on data-fitting seismic inversion techniques for high-fidelity seismic velocity models and images, such as full-waveform inversion and least-squares migration. However, though more advanced than conventional methods, these data-fitting methods can be very expensive in terms of computational cost. Recently, various techniques for optimizing these data-fitting seismic inversion problems have been implemented to meet the industrial need for much-improved efficiency. In this study, we propose a general stochastic conjugate gradient method for these data-fitting inverse problems. We first present the basic theory of our method and then give synthetic examples. Our numerical experiments illustrate the potential of this method for large-scale seismic inversion applications.
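As a rough illustration of the general idea (a sketch under simplifying assumptions, not the authors' algorithm), the following toy applies Fletcher-Reeves-style conjugate directions to mini-batch gradients of a small linear least-squares misfit, the kind of data-fitting objective these inversion methods minimize:

```python
import random

def stochastic_cg(rows, rhs, x, batch=2, iters=500, step=0.08, seed=0):
    """Toy mini-batch conjugate-gradient iteration for min ||Ax - b||^2.
    Uses a capped Fletcher-Reeves weight on subsampled gradients with a
    periodic restart; illustrative only, not the authors' method."""
    rng = random.Random(seed)
    n = len(x)
    d = [0.0] * n          # current search direction
    g_prev_sq = 0.0        # squared norm of previous gradient
    for t in range(iters):
        idx = rng.sample(range(len(rows)), batch)  # random data subset
        # Gradient of the subsampled misfit: 2 * A_s^T (A_s x - b_s)
        g = [0.0] * n
        for i in idx:
            residual = sum(a * xi for a, xi in zip(rows[i], x)) - rhs[i]
            for j, a in enumerate(rows[i]):
                g[j] += 2.0 * a * residual
        g_sq = sum(gi * gi for gi in g)
        # Fletcher-Reeves weight, capped at 1 and restarted every n steps
        beta = 0.0 if (t % n == 0 or g_prev_sq == 0.0) else min(1.0, g_sq / g_prev_sq)
        d = [beta * di - gi for gi, di in zip(g, d)]
        x = [xi + step * di for xi, di in zip(x, d)]
        g_prev_sq = g_sq
    return x

# Small consistent system with exact solution x = (1, 2)
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
b = [1.0, 2.0, 3.0, 4.0]
x = stochastic_cg(A, b, [0.0, 0.0])
```

In the seismic setting, each "row" would correspond to a shot or shot subset, so each iteration touches only a fraction of the data, which is the source of the cost savings the study targets.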