Funding: Supported in part by the National Basic Research Program (973 Program, No. 2015CB352400), NSFC under grant U1401258, and U.S. NSF under grant CCF-1016966.
Abstract: This paper describes the fundamentals of cloud computing and current key big-data technologies. We categorize big-data processing as batch-based, stream-based, graph-based, DAG-based, interactive-based, or visual-based according to the processing technique. We highlight the strengths and weaknesses of various big-data cloud processing techniques in order to help the big-data community select the appropriate processing technique. We also present big-data research challenges and future directions with respect to transportation management systems.
Funding: This study was supported by grants from the National Natural Science Foundation of China (81930065), the Natural Science Foundation of Guangdong Province (2014A030312015), the Science and Technology Program of Guangdong (2019B020227002), and the Science and Technology Program of Guangzhou (201904020046, 201803040019, and 201704020228).
Abstract: There is a lack of high-quality, large-scale, real-world evidence from patients with metastatic colorectal cancer (mCRC), especially in China. It remains unclear whether efforts to improve the quality of care for mCRC would improve patient survival outcomes in real-world practice. On the basis of an intelligent big-data platform, we established a large-scale retrospective cohort of mCRC patients. We investigated the temporal changes in the systemic and local treatment (resection, ablation, or radiation to liver, lung, or extrahepatic and/or extrapulmonary metastases) patterns of mCRC, and whether these changes were associated with improved overall survival (OS) over time. Between July 2012 and December 2018, 3403 eligible patients were included in this research. The median OS was 42.8 months (95% confidence interval (CI), 40.7–46.6) for the entire cohort, 25.6 months (95% CI, 24.7–26.9) for those treated with systemic therapy only, and not reached (95% CI, 78.6 months–not reached) for those receiving local therapy. The utility rate of local therapy increased continuously from 37.9% in 2012–2014 to 46.9% in 2017–2018. A dramatic increase in the utility rate of either cetuximab or bevacizumab was observed since 2017 (39.9%, 43.2%, and 60.3% in 2012–2014, 2015–2016, and 2017–2018, respectively). Compared with 2012–2014, the OS of the entire population significantly improved in 2015–2016 (hazard ratio (HR) = 0.87 (95% CI, 0.78–0.99); P = 0.034), but not for patients receiving systemic therapy only (HR = 0.99 (95% CI, 0.86–1.14); P = 0.889), whereas an improved OS was found in 2015–2018 for both the entire population (HR = 0.75 (95% CI, 0.70–0.81); P < 0.001) and for patients receiving systemic therapy only (HR = 0.83 (95% CI, 0.77–0.91); P < 0.001). In summary, the quality of care for mCRC, as indicated by the utility rate of targeted and local therapies, has been continuously improving over time in this study cohort, which is associated with continuously improving survival outcomes for these patients.
Funding: Supported by the Ministry of Education Humanities and Social Science Project (19YJAZH060), "Study on Agglomeration Pattern, Development Quality and Spatial Optimization of Urban Leisure Industry in Guangdong-Hong Kong-Macao Greater Bay Area", and by the Guangdong Philosophical and Social Sciences Project (GD20SQ21).
Abstract: Tourism destination images, in terms of the gaps between the projected and perceived images, are of great significance in the development of destinations. Additionally, big data remains under-utilized in tourism studies despite the boom in big-data applications and the increasing volume of electronic User Generated Content (UGC). Aiming to take advantage of tourism UGC to fully understand the destination image gap between official promotion materials and tourist perception of Sanya City in China, this study employed a big-data analysis technique, the Tourism Sentiment Evaluation (TSE) model, and proposed a new analysis framework integrating the "cognitive-affective" model with the gap analysis of projected and perceived destination image to explore the destination image gap of Sanya. It is found that Sanya's perceived destination image is overall consistent with its official positioning; however, image gaps also exist between the two in terms of the impact of festival events and tourists' attitudes towards core scenic spots, amongst others. This study's findings are discussed in light of their methodological, theoretical, and practical implications for destination positioning, marketing, and management.
Abstract: Data governance is a subject that is becoming increasingly important in business and government. Good data governance improves interactions between employees of one or more organizations. Data quality represents a great challenge because the cost of non-quality can be very high. Ensuring data quality therefore becomes an absolute necessity within an organization. To improve data quality in a Big-Data source, our purpose in this paper is to add semantics to data and help users recognize the Big-Data schema. The originality of this approach lies in the semantic aspect it offers. It detects issues in data and proposes a data schema by applying semantic data profiling.
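As an illustration of the general idea only, semantic data profiling can be sketched as guessing a semantic type for each column from its values and flagging the values that do not fit. The abstract does not describe the authors' actual method; the type names, regular expressions, threshold, and function names below are all assumptions made for this sketch.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical semantic types and patterns, chosen only for illustration.
PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "date_iso": re.compile(r"^\d{4}-\d{2}-\d{2}$"),
    "integer": re.compile(r"^-?\d+$"),
}


def profile_column(values: List[str], threshold: float = 0.8) -> Tuple[str, List[str]]:
    """Guess a semantic type for one column and list the values that violate it.

    The type whose pattern matches the largest share of non-empty values wins,
    provided that share reaches `threshold`; otherwise the column is plain text.
    """
    non_empty = [v for v in values if v.strip()]
    best_type, best_ratio = None, 0.0
    if non_empty:
        for name, pattern in PATTERNS.items():
            ratio = sum(bool(pattern.match(v)) for v in non_empty) / len(non_empty)
            if ratio > best_ratio:
                best_type, best_ratio = name, ratio
    if best_type is None or best_ratio < threshold:
        return "text", []
    # Every value that fails the winning pattern (empty ones included) is an issue.
    issues = [v for v in values if not PATTERNS[best_type].match(v)]
    return best_type, issues


def propose_schema(table: Dict[str, List[str]]) -> Dict[str, Tuple[str, List[str]]]:
    """Map each column name to (proposed semantic type, offending values)."""
    return {column: profile_column(values) for column, values in table.items()}


# Toy table invented for this example.
table = {
    "contact": ["a@example.com", "b@example.org", "not-an-email",
                "c@example.net", "d@example.com"],
    "signup": ["2020-01-05", "2020-02-11", "2020-03-09"],
    "note": ["hello", "42", "n/a"],
}
schema = propose_schema(table)
for column, (semantic_type, issues) in schema.items():
    print(column, semantic_type, issues)
```

A real profiler would draw on far richer semantic knowledge (dictionaries, ontologies, value distributions) than three regular expressions; the sketch only shows how a proposed schema and a list of detected issues can come out of the same pass.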
Abstract: With the rapid development of the internet, internet of things, mobile internet, and cloud computing, the amount of data in circulation has grown rapidly. More social information has contributed to the growth of big data, and data has become a core asset. Big data is challenging in terms of effective storage, efficient computation and analysis, and deep data mining. In this paper, we discuss the significance of big data as well as key technologies and problems in big-data analytics. We also discuss the future prospects of big-data analytics.
Funding: Supported in part by the U.S. National Science Foundation (Nos. ECCS 1128209, CNS 1138963, CNS 1065444, and CCF 1028167).
Abstract: The buzzword big-data refers to large-scale distributed data processing applications that operate on exceptionally large amounts of data. Google's MapReduce and Apache's Hadoop, its open-source implementation, are the de facto software systems for big-data applications. One observation about the MapReduce framework is that it generates a large amount of intermediate data. This abundant information is thrown away after the tasks finish, because MapReduce is unable to utilize it. In this paper, we propose Dache, a data-aware cache framework for big-data applications. In Dache, tasks submit their intermediate results to the cache manager. A task queries the cache manager before executing the actual computing work. A novel cache description scheme and a cache request and reply protocol are designed. We implement Dache by extending Hadoop. Testbed experiment results demonstrate that Dache significantly improves the completion time of MapReduce jobs.
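The query-before-compute pattern described in this abstract can be sketched in a few lines. This is a single-process illustration under stated assumptions, not Dache's implementation: the `CacheManager` class, the `(input_id, operation)` key, and the `run_task` helper are hypothetical names, and the abstract does not specify the actual cache description scheme or the request and reply protocol.

```python
from collections import Counter
from typing import Callable, Dict, Hashable, List, Tuple


class CacheManager:
    """Hypothetical in-memory stand-in for a cache manager (not Dache's API)."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[Hashable, str], object] = {}
        self.hits = 0
        self.misses = 0

    def query(self, input_id: Hashable, operation: str) -> Tuple[bool, object]:
        """Return (found, result) for a given input split and operation."""
        key = (input_id, operation)
        if key in self._store:
            self.hits += 1
            return True, self._store[key]
        self.misses += 1
        return False, None

    def submit(self, input_id: Hashable, operation: str, result: object) -> None:
        """Store an intermediate result so later tasks can reuse it."""
        self._store[(input_id, operation)] = result


def run_task(cache: CacheManager, input_id: Hashable, operation: str,
             data: List[str], compute: Callable[[List[str]], object]) -> object:
    """Query the cache first; compute and submit only on a miss."""
    found, cached = cache.query(input_id, operation)
    if found:
        return cached
    result = compute(data)
    cache.submit(input_id, operation, result)
    return result


def word_count(lines: List[str]) -> Counter:
    """Map-style work: count the words in one input split."""
    return Counter(word for line in lines for word in line.split())


cache = CacheManager()
split = ["big data big cache", "data cache"]  # toy input split
first = run_task(cache, "split-0", "word_count", split, word_count)
second = run_task(cache, "split-0", "word_count", split, word_count)
print(first == second, cache.hits, cache.misses)
```

The second call is served from the cache, so the word count runs only once. Keying on both the input split and the operation reflects the need for a cached result to be reused only by a task doing the same work on the same data; how Dache actually encodes that is left to the paper.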