A factor analysis was applied to soil geochemical data to define anomalies related to buried Pb-Zn mineralization. A favorable main factor with a strong association of the elements Zn, Cu and Pb, related to mineralization, was selected for interpretation. The median + 2 MAD (median absolute deviation) method of exploratory data analysis (EDA) and C-A (concentration-area) fractal modeling were then applied to the Mahalanobis distance, as defined by Zn, Cu and Pb from the factor analysis, to set the thresholds for defining multi-element anomalies. As a result, the median + 2 MAD method identified the Pb-Zn mineralization more successfully than the C-A fractal model. The soil anomaly identified by the median + 2 MAD method on the Mahalanobis distances defined by three principal elements (Zn, Cu and Pb) rather than thirteen elements (Co, Zn, Cu, V, Mo, Ni, Cr, Mn, Pb, Ba, Sr, Zr and Ti) was the more favorable reflection of the ore body. The identified soil geochemical anomalies were compared with the in situ economic Pb-Zn ore bodies for validation. The results showed that the median + 2 MAD approach is capable of mapping both strong and weak geochemical anomalies related to buried Pb-Zn mineralization, which is therefore useful at the reconnaissance drilling stage.
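The thresholding step described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' workflow: the sample values are synthetic, the helper names `mahalanobis_distances` and `median_plus_2mad` are hypothetical, the MAD is used unscaled (some EDA variants multiply it by 1.4826), and no log or other transform is applied to the concentrations.

```python
import numpy as np


def mahalanobis_distances(X):
    """Mahalanobis distance of each row of X from the sample mean."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    # Pseudo-inverse keeps the sketch usable if the covariance is near-singular.
    cov_inv = np.linalg.pinv(np.cov(X, rowvar=False))
    # Quadratic form (x_i - mean)^T S^-1 (x_i - mean) for every row i.
    d2 = np.einsum("ij,jk,ik->i", centered, cov_inv, centered)
    return np.sqrt(np.maximum(d2, 0.0))


def median_plus_2mad(values):
    """Threshold = median + 2 * (unscaled) median absolute deviation."""
    values = np.asarray(values, dtype=float)
    med = np.median(values)
    mad = np.median(np.abs(values - med))
    return med + 2.0 * mad


# Synthetic Zn, Cu, Pb concentrations (assumed values, for illustration only):
# 200 background samples plus 5 enriched samples appended at the end.
rng = np.random.default_rng(0)
background = rng.normal([80.0, 30.0, 25.0], [10.0, 5.0, 4.0], size=(200, 3))
enriched = rng.normal([160.0, 60.0, 70.0], [10.0, 5.0, 4.0], size=(5, 3))
samples = np.vstack([background, enriched])

d = mahalanobis_distances(samples)
threshold = median_plus_2mad(d)
is_anomaly = d > threshold  # boolean mask of samples flagged as anomalous
```

Because both the median and the MAD are robust to a small number of extreme values, the threshold is set mostly by the background population, which is one reason this kind of rule can flag weak anomalies as well as strong ones.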
Identifying the subcellular localization of proteins is particularly helpful in the functional annotation of gene products. In this study, we use Machine Learning and Exploratory Data Analysis (EDA) techniques to examine and characterize amino acid sequences of human proteins localized in nine cellular compartments. A dataset of 3,749 protein sequences representing human proteins was extracted from the SWISS-PROT database. Feature vectors were created to capture specific amino acid sequence characteristics. Relative to a Support Vector Machine, a Multi-layer Perceptron, and a Naive Bayes classifier, the C4.5 Decision Tree algorithm was the most consistent performer across all nine compartments in reliably predicting the subcellular localization of proteins based on their amino acid sequences (average Precision = 0.88; average Sensitivity = 0.86). Furthermore, EDA graphics characterized essential features of proteins in each compartment. As examples, proteins localized to the plasma membrane had higher proportions of hydrophobic amino acids; cytoplasmic proteins had higher proportions of neutral amino acids; and mitochondrial proteins had higher proportions of neutral amino acids and lower proportions of polar amino acids. These data showed that the C4.5 classifier and EDA tools can be effective for characterizing and predicting the subcellular localization of human proteins based on their amino acid sequences.
Urban resilience assesses a city’s ability to withstand unknown risks. Existing assessments of urban resilience are not comprehensive and lack consideration of population resilience. This study investigated 110 prefecture-level cities in the Yangtze River Economic Belt (YREB) as study areas. We calculated the YREB’s level of urban resilience based on the aspects of “economy-society-population-ecology-infrastructure”, which ensured that the comprehensive evaluation of urban resilience is complete and sufficient. The spatio-temporal evolution of urban resilience was analyzed using exploratory spatial data analysis. Geodetectors were used to investigate the impact of several indicators, focusing on economic, social, population, ecological, and infrastructure factors, on urban resilience. The results showed that the urban resilience of the YREB maintained a slow upward trend from 2005 to 2018, and the average urban resilience of the YREB rose from 0.2442 to 0.2560. The resilience gap between cities in the study region increased initially and then decreased. The dominant factors in the spatial differentiation of urban resilience were economic, followed by population factors. Urban resilience has been clarified and an evaluation index system has been constructed, which can provide an effective reference for the evaluation of urban resilience among countries around the world. Based on this, factors that optimize urban resilience are configured, and regional and national sustainable development can be promoted.
Exploratory data analysis plays a major role in obtaining insights from data. Over the last two decades, researchers have proposed several visual data exploration tools that can assist with each step of the analysis process. Nevertheless, in recent years, data analysis requirements have changed significantly. With constantly increasing size and types of data to be analyzed, scalability and analysis duration are now among the primary concerns of researchers. Moreover, in order to minimize the analysis cost, businesses are in need of data analysis tools that can be used with limited analytical knowledge. To address these challenges, traditional data exploration tools have evolved within the last few years. In this paper, with an in-depth analysis of an industrial tabular dataset, we identify a set of additional exploratory requirements for large datasets. Later, we present a comprehensive survey of the recent advancements in the emerging field of exploratory data analysis. We investigate 50 academic and non-academic visual data exploration tools with respect to their utility in the six fundamental steps of the exploratory data analysis process. We also examine the extent to which these modern data exploration tools fulfill the additional requirements for analyzing large datasets. Finally, we identify and present a set of research opportunities in the field of visual exploratory data analysis.
A significant Geographic Information Science (GIS) issue is closely related to spatial autocorrelation, a burning question in the phase of information extraction from the statistical analysis of georeferenced data. At present, spatial autocorrelation presents two types of measures: continuous and discrete. Is it possible to use Moran’s I and the Moran scatterplot with continuous data? Is it possible to use the same methodology with discrete data? A particular and cumbersome problem is the choice of the spatial-neighborhood matrix (W) for point data. This paper addresses these issues by introducing the concept of covariogram contiguity, where each weight is based on the variogram model for that particular dataset: (1) the variogram, whose range equals the distance with the highest Moran’s I value, defines the weights for points separated by less than the estimated range, and (2) weights equal zero for points widely separated from the variogram range considered. After the W matrix is computed, the Moran location scatterplot is created in an iterative process. In accordance with various lag distances, Moran’s I is presented as a good search factor for the optimal neighborhood area. Uncertainty/transition regions are also emphasized. At the same time, a new Exploratory Spatial Data Analysis (ESDA) tool is developed, the Moran variance scatterplot, since the conventional Moran scatterplot is not sensitive to neighbor variance. This computer-mapping framework allows the study of spatial patterns, outliers, changeover areas, and trends in an ESDA process. All these tools were implemented in a free web e-Learning program for quantitative geographers called SAKWeb# (or, in the near future, myGeooffice.org).
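The idea of scanning lag distances for the highest Moran's I, mentioned in the abstract above, can be illustrated with a small sketch. Note the substitution: the paper derives weights from a fitted variogram model, whereas this example uses plain binary distance-band weights (1 within the lag distance, 0 beyond it) as a simpler stand-in. The function names and the toy transect data are assumptions made for illustration.

```python
import numpy as np


def distance_band_weights(coords, max_dist):
    """Binary W matrix: 1 for distinct points within max_dist, else 0."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    W = (dist <= max_dist).astype(float)
    np.fill_diagonal(W, 0.0)  # a point is not its own neighbor
    return W


def morans_i(values, W):
    """Global Moran's I = (n / S0) * (z' W z) / (z' z), z = deviations from mean."""
    x = np.asarray(values, dtype=float)
    z = x - x.mean()
    return (x.size / W.sum()) * (z @ W @ z) / (z @ z)


def best_lag(coords, values, lags):
    """Return the lag distance with the highest Moran's I, plus all scores."""
    scores = {lag: morans_i(values, distance_band_weights(coords, lag))
              for lag in lags}
    return max(scores, key=scores.get), scores


# Toy data: six points on a transect with a steady trend in the attribute.
coords = [(i, 0.0) for i in range(6)]
trend = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
lag, scores = best_lag(coords, trend, lags=[1.0, 2.0, 3.0])
```

On this toy transect the shortest lag gives the strongest positive autocorrelation, so it would be selected as the neighborhood distance; with real point data the weights inside that range would then come from the variogram model rather than being set to 1.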
Churn prediction is a common task for machine learning applications in business. In this paper, this task is adapted to address the low completion rate of massive open online courses (only 5% of all students finish their course). The approach is presented on the course “Methods and algorithms of the graph theory”, held on the national platform of online education in Russia. This paper includes all the steps needed to build an intelligent system that predicts students who are active during the course but not likely to finish it. The first part consists of constructing the right sample for prediction, EDA, and choosing the most appropriate week of the course to make predictions on. The second part is about choosing the right metric and building models. In addition, an approach using ensembles such as stacking is proposed to increase the accuracy of predictions. As a result, a general approach to building a churn prediction model for an online course is reviewed. This approach can be used to make the process of online education adaptive and intelligent for an individual student.
This paper examines the visualization of symbolic data and considers the challenges arising from its complex structure. Symbolic data is usually aggregated from large data sets and used to hide entry-specific details and to transform huge amounts of data (like big data) into analyzable quantities. It is also used to offer an overview in places where general trends are more important than individual details. Symbolic data comes in many forms, such as intervals, histograms, categories and modal multi-valued objects. Symbolic data can also be considered as a distribution. Currently, the de facto visualization approach for symbolic data is zoomstars, which has many limitations. The biggest limitation is that the default distributions (histograms) are not supported in 2D, as an additional dimension is required. This paper proposes several new improvements for zoomstars which would enable it to visualize histograms in 2D by using a quantile or an equivalent interval approach. In addition, several improvements for categorical and modal variables are proposed for a clearer indication of presented categories. Recommendations for different approaches to zoomstars are offered depending on the data type and the desired goal. Furthermore, an alternative approach that allows visualizing the whole data set in a comprehensive table-like graph, called shape encoding, is proposed. These visualizations and their usefulness are verified with three symbolic data sets in the exploratory data mining phase to identify trends, similar objects and important features, and to detect outliers and discrepancies in the data.
The aim of this work is to describe and compare three exploratory chemometrical tools, principal components analysis, independent components analysis and common components analysis, the last one being a modification of the multi-block statistical method known as common components and specific weights analysis. The three methods were applied to a set of data to show the differences and similarities of the results obtained, highlighting their complementarity.
How do people talk about COVID-19 online? To address this question, we offer an unsupervised framework that allows us to examine Twitter framings of the pandemic. Our approach employs a network-based exploration of social media data to identify, categorize, and understand communication patterns about the novel coronavirus on Twitter. The simplest structure that emerges from our analysis is the distinction between the internal/personal, external/global, and generic threat framings of the pandemic. This structure replicates in different Twitter samples and is validated using the variation of information measure, reflecting the significance and stability of our findings. Such an exploratory study is useful for understanding the contours of the natural, non-random structure in this online space. We contend that this understanding of structure is necessary to address a host of causal, supervised, and related questions downstream.
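The variation of information (VI) measure named in the abstract above compares two clusterings of the same items; lower values mean the groupings agree more closely. The sketch below shows the standard definition, VI = H(A|B) + H(B|A), using natural logarithms. The function name and the toy label lists are illustrative assumptions; how the authors applied the measure to their Twitter samples is not described in the abstract.

```python
import math
from collections import Counter


def variation_of_information(labels_a, labels_b):
    """VI between two clusterings of the same items (natural-log units)."""
    n = len(labels_a)
    if n == 0 or n != len(labels_b):
        raise ValueError("label lists must be non-empty and of equal length")
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    vi = 0.0
    for (a, b), n_ab in joint.items():
        p_ab = n_ab / n
        p_a = count_a[a] / n
        p_b = count_b[b] / n
        # Each term contributes to H(B|A) + H(A|B).
        vi -= p_ab * (math.log(p_ab / p_a) + math.log(p_ab / p_b))
    return vi


# The same partition under different label names gives VI = 0.
same = variation_of_information([0, 0, 1, 1], ["x", "x", "y", "y"])
# Two statistically independent two-way splits give VI = 2 * ln(2).
independent = variation_of_information([0, 0, 1, 1], [0, 1, 0, 1])
```

Because VI depends only on how items are grouped and not on the label names, it suits checking whether a structure found in one sample reappears in another.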
Based on the connotation and structure of science and technology (S&T) resources and relevant data for more than 286 cities at prefecture level and above during 2001-2010, the S&T resource allocation efficiency of different cities in different periods was calculated using a modified Data Envelopment Analysis (DEA) method, which uncovers the distributional differences and patterns of change in S&T resource allocation efficiency across time and space. On that basis, this paper analyzes and discusses the spatial distribution pattern and evolution trend of S&T resource allocation efficiency in different cities by means of Exploratory Spatial Data Analysis (ESDA). The results show that: (1) the average S&T resource allocation efficiency in cities at prefecture level and above has always stayed at low levels, with repeated fluctuations between high and low and a decreasing trend year by year. In addition, the gap between the East and the West is widening. (2) The asymmetrical distribution of S&T resource allocation efficiency presents a spatial pattern that decreases successively from Eastern China through Central China to Western China. The cities whose S&T resource allocation efficiency is at the higher and high levels take on a cluster distribution, which fits well with the 23 forming urban agglomerations in China. (3) The coupling degree between S&T resource allocation efficiency and economic environment shows a certain positive correlation, but the two are not completely the same. The differentiation of S&T resource allocation efficiency is common in regional development; its existence and evolution are directly or indirectly influenced by, and reflect, many elements, such as geographical location and the natural endowment and environment of S&T resources. (4) From the perspective of the evolution of spatial structure, the S&T resource allocation efficiency of cities at prefecture level and above shows a notable spatial autocorrelation, which is positive in every period. The spatial distribution of S&T resource allocation efficiency in neighboring cities appears to be similar in groups and tends to escalate stepwise. Meanwhile, the overall differentiation across geographical space has a diminishing tendency. (5) Viewed from the LISA agglomeration maps of S&T resource allocation efficiency in different periods, the four agglomeration types have changed differently in spatial location and in the range of spatial agglomeration, and the continuity of S&T resource allocation efficiency in geographical space is gradually increasing.
In 2007, China surpassed the USA to become the largest carbon emitter in the world. China has promised a 60%–65% reduction in carbon emissions per unit GDP by 2030, compared to the baseline of 2005. Therefore, it is important to obtain accurate dynamic information on the spatial and temporal patterns of carbon emissions and carbon footprints to support formulating effective national carbon emission reduction policies. This study attempts to build a carbon emission panel data model that simulates carbon emissions in China from 2000–2013 using nighttime lighting data and carbon emission statistics data. By applying the Exploratory Spatial-Temporal Data Analysis (ESTDA) framework, this study conducted an analysis of the spatial patterns and dynamic spatial-temporal interactions of carbon footprints from 2001–2013. The improved Tapio decoupling model was adopted to investigate the levels of coupling or decoupling between the carbon emission load and economic growth in 336 prefecture-level units. The results show that, firstly, high accuracy was achieved by the model in simulating carbon emissions. Secondly, the total carbon footprints and carbon deficits across China increased with average annual growth rates of 4.82% and 5.72%, respectively. The overall carbon footprints and carbon deficits were larger in the North than in the South. There were extremely significant spatial autocorrelation features in the carbon footprints of prefecture-level units. Thirdly, the relative lengths of the Local Indicators of Spatial Association (LISA) time paths were longer in the North than in the South, and they increased from the coastal to the central and western regions. Lastly, the overall decoupling index was mainly of the weak decoupling type, but the number of cities with this weak decoupling continued to decrease. The unsustainable development trend of China’s economic growth and carbon emission load will continue for some time.
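For readers unfamiliar with decoupling indices, the sketch below shows the classic Tapio elasticity (percentage change in emissions divided by percentage change in GDP) and its usual eight-state classification with the 0.8 and 1.2 cut-offs. It is an assumption-laden illustration: the abstract refers to an "improved" Tapio model whose details are not given here, so the thresholds, function names and sample figures are the textbook form rather than the study's method.

```python
def tapio_elasticity(c0, c1, g0, g1):
    """Return (elasticity, relative change in emissions, relative change in GDP)."""
    dc = (c1 - c0) / c0  # relative change in carbon emissions
    dg = (g1 - g0) / g0  # relative change in GDP
    if dg == 0:
        raise ValueError("GDP change is zero; elasticity is undefined")
    return dc / dg, dc, dg


def tapio_state(c0, c1, g0, g1):
    """Classify a period with the classic Tapio thresholds (0.8 and 1.2)."""
    e, dc, dg = tapio_elasticity(c0, c1, g0, g1)
    if dg > 0:  # growing economy
        if dc < 0:
            return "strong decoupling"
        if e < 0.8:
            return "weak decoupling"
        if e <= 1.2:
            return "expansive coupling"
        return "expansive negative decoupling"
    # shrinking economy
    if dc > 0:
        return "strong negative decoupling"
    if e < 0.8:
        return "weak negative decoupling"
    if e <= 1.2:
        return "recessive coupling"
    return "recessive decoupling"


# Hypothetical figures: emissions grow 4% while GDP grows 10%.
state = tapio_state(c0=100.0, c1=104.0, g0=100.0, g1=110.0)
```

In this hypothetical case emissions still rise, but more slowly than the economy, which is what the "weak decoupling" label describes.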
It is urgent and important to explore the dynamic evolution of comprehensive transportation green efficiency (CTGE) in the context of green development. We constructed a social development index that reflects the social benefits of transportation services, and incorporated it into the comprehensive transportation efficiency evaluation framework as an expected output. Based on the panel data of 30 regions in China from 2003-2018, the CTGE in China was measured using the slacks-based measure-data envelopment analysis (SBM-DEA) model. Further, the dynamic evolution trends of CTGE were determined using the spatial Markov model and the exploratory spatio-temporal data analysis (ESTDA) technique from a spatio-temporal perspective. The results showed that the CTGE shows a U-shaped change trend but with an overall low level and significant regional differences. The state transition of CTGE has a strong spatial dependence, and there exists the phenomenon of “club convergence”. Neighbourhood background has a significant impact on the CTGE transition types, and the spatial spillover effect is pronounced. The CTGE has an obvious positive correlation and spatial agglomeration characteristics. The geometric characteristics of the LISA time path show that the evolution process of local spatial structure and local spatial dependence of China’s CTGE is stable, but the integration of spatial evolution is weak. The spatio-temporal transition results of LISA indicate that the CTGE has obvious transfer inertness and has certain path-dependence and spatial locking characteristics, which will become the major difficulty in improving the CTGE.
The aim of this paper is to study the spatial-temporal differentiation of industrial eco-efficiency in China. Using methods based on the data envelopment analysis (DEA) model and exploratory spatial data analysis (ESDA) and data from 1985, 1995, 2005, and 2008 of 30 provinces in China, the spatial-temporal pattern changes in industrial eco-efficiency are discussed. The results show that: first, the patterns of industrial eco-efficiency are dominated by clustering of relatively low efficiency provinces; second, spatial relationships between the industrial eco-efficiencies of different provinces changed slightly throughout the period and the provinces persistently exhibit spatial concentration of relatively low industrial eco-efficiency; finally, there is an obvious trend in the polarization of industrial eco-efficiency, i.e., the higher level spatial units are concentrated in eastern China, and the lower level spatial units are mainly in western and central China.
Influenced by globalization, rural transition in developed Western countries has experienced processes of productivism, post-productivism, and multifunctional development. By contrast, rural transition in most developing countries has been accompanied by rapid urbanization, which has become a core topic in geography research. As the world’s largest developing country, China has undergone profound development since the reform and opening-up. Moreover, rural spaces in some eastern coastal areas have entered the stage of reconstruction after decades of industrialization and urbanization. This paper takes Suzhou as the case area and measures the process of rural transition from 1990 to 2015 by constructing an index system. It then analyzes the characteristics of space-time evolution using exploratory spatial data analysis (ESDA) methods to reveal the influence of economic and social development on rural transition. The results show that rural transition, which generally entails the weakening of rurality and enhancing of urbanity on a macro scale, tends to be heterogeneous across different regions on a micro scale. This paper argues that multifunctionality will be the main future trend of rural transition in rapidly urbanizing areas. The experience in Suzhou could provide an example for establishing policies on sustainable development in rural spaces and achieving urban-rural co-governance.
Funding (urban resilience study of the Yangtze River Economic Belt): I would like to thank the National Natural Science Foundation of China (Grant No. 42061041) for the funding.
文摘Urban resilience assesses a city’s ability to withstand unknown risks.Scholars are not comprehensive in assessing urban resilience,and they lack consideration of population resilience.This study investigated 110 prefecturelevel cities in the Yangtze River Economic Belt(YREB)as study areas.We calculated the YREB’s level of urban resilience based on the aspects of“economy-society-population-ecology-infrastructure”,which ensured that the comprehensive evaluation of urban resilience is complete and sufficient.The spatio-temporal evolution of urban resilience was analyzed using exploratory spatial data.Geodetectors were used to investigate the impact of several indicators,focusing on economic,social,population,ecological,and infrastructure factors,on urban resilience.The results showed that the urban resilience of the YREB has maintained a slow upward trend from 2005 to 2018,and the average urban resilience of the YREB has risen from 0.2442 to 0.2560.The resilience gap between cities in the study region increased initially and then decreased.The dominant factor in the spatial differentiation of urban resilience was the economic factors,followed by the population factors.Urban resilience has been clarified and an evaluation index system is constructed,which can provide an effective reference for the evaluation of urban resilience among countries around the world.Based on this,factors that optimize urban resilience are configured,and the regional and national sustainable development can be promoted.
文摘Exploratory data analysis plays a major role in obtaining insights from data.Over the last two decades,researchers have proposed several visual data exploration tools that can assist with each step of the analysis process.Nevertheless,in recent years,data analysis requirements have changed significantly.With constantly increasing size and types of data to be analyzed,scalability and analysis duration are now among the primary concerns of researchers.Moreover,in order to minimize the analysis cost,businesses are in need of data analysis tools that can be used with limited analytical knowledge.To address these challenges,traditional data exploration tools have evolved within the last few years.In this paper,with an in-depth analysis of an industrial tabular dataset,we identify a set of additional exploratory requirements for large datasets.Later,we present a comprehensive survey of the recent advancements in the emerging field of exploratory data analysis.We investigate 50 academic and non-academic visual data exploration tools with respect to their utility in the six fundamental steps of the exploratory data analysis process.We also examine the extent to which these modern data exploration tools fulfill the additional requirements for analyzing large datasets.Finally,we identify and present a set of research opportunities in the field of visual exploratory data analysis.
文摘A significant Geographic Information Science(GIS)issue is closely related to spatial autocorrelation,a burning question in the phase of information extraction from the statistical analysis of georeferenced data.At present,spatial autocorrelation presents two types of measures:continuous and discrete.Is it possible to use Moran’s I and the Moran scatterplot with continuous data?Is it possible to use the same methodology with discrete data?A particular and cumbersome problem is the choice of the spatial-neighborhood matrix(W)for points data.This paper addresses these issues by introducing the concept of covariogram contiguity,where each weight is based on the variogram model for that particular dataset:(1)the variogram,whose range equals the distance with the highest Moran I value,defines the weights for points separated by less than the estimated range and(2)weights equal zero for points widely separated from the variogram range considered.After the W matrix is computed,the Moran location scatterplot is created in an iterative process.In accordance with various lag distances,Moran’s I is presented as a good search factor for the optimal neighborhood area.Uncertainty/transition regions are also emphasized.At the same time,a new Exploratory Spatial Data Analysis(ESDA)tool is developed,the Moran variance scatterplot,since the conventional Moran scatterplot is not sensitive to neighbor variance.This computer-mapping framework allows the study of spatial patterns,outliers,changeover areas,and trends in an ESDA process.All these tools were implemented in a free web e-Learning program for quantitative geographers called SAKWeb#(or,in the near future,myGeooffice.org).
文摘Churn prediction is a common task for machine learning applications in business.In this paper,this task is adapted for solving problem of low efficiency of massive open online courses(only 5%of all the students finish their course).The approach is presented on course“Methods and algorithms of the graph theory”held on national platform of online education in Russia.This paper includes all the steps to build an intelligent system to predict students who are active during the course,but not likely to finish it.The first part consists of constructing the right sample for prediction,EDA and choosing the most appropriate week of the course to make predictions on.The second part is about choosing the right metric and building models.Also,approach with using ensembles like stacking is proposed to increase the accuracy of predictions.As a result,a general approach to build a churn prediction model for online course is reviewed.This approach can be used for making the process of online education adaptive and intelligent for a separate student.
文摘This paper examines the visualization of symbolic data and considers the challenges rising from its complex structure.Symbolic data is usually aggregated from large data sets and used to hide entry specific details and to transform huge amounts of data(like big data)into analyzable quantities.It is also used to offer an overview in places where general trends are more important than individual details.Symbolic data comes in many forms like intervals,histograms,categories and modal multi-valued objects.Symbolic data can also be considered as a distribution.Currently,the de facto visualization approach for symbolic data is zoomstars which has many limitations.The biggest limitation is that the default distributions(histograms)are not supported in 2D as additional dimension is required.This paper proposes several new improvements for zoomstars which would enable it to visualize histograms in 2D by using a quantile or an equivalent interval approach.In addition,several improvements for categorical and modal variables are proposed for a clearer indication of presented categories.Recommendations for different approaches to zoomstars are offered depending on the data type and the desired goal.Furthermore,an alternative approach that allows visualizing the whole data set in comprehensive table-like graph,called shape encoding,is proposed.These visualizations and their usefulness are verified with three symbolic data sets in exploratory data mining phase to identify trends,similar objects and important features,detecting outliers and discrepancies in the data.
Abstract: The aim of this work is to describe and compare three exploratory chemometric tools: principal components analysis, independent components analysis and common components analysis, the last being a modification of the multi-block statistical method known as common components and specific weights analysis. The three methods were applied to a set of data to show the differences and similarities of the results obtained, highlighting their complementarity.
Abstract: How do people talk about COVID-19 online? To address this question, we offer an unsupervised framework that allows us to examine Twitter framings of the pandemic. Our approach employs a network-based exploration of social media data to identify, categorize, and understand communication patterns about the novel coronavirus on Twitter. The simplest structure that emerges from our analysis is the distinction between the internal/personal, external/global, and generic threat framings of the pandemic. This structure replicates in different Twitter samples and is validated using the variation of information measure, reflecting the significance and stability of our findings. Such an exploratory study is useful for understanding the contours of the natural, non-random structure in this online space. We contend that this understanding of structure is necessary to address a host of causal, supervised, and related questions downstream.
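The variation of information mentioned in this abstract compares two partitions of the same items: VI = H(A|B) + H(B|A), which is zero when the partitions agree up to relabeling. The sketch below is a minimal illustration of that definition only. The function name and the toy label lists are assumptions, and the abstract does not say how the authors computed the measure on their Twitter samples.

```python
from collections import Counter
from math import log


def variation_of_information(labels_a, labels_b):
    """Return VI = H(A|B) + H(B|A) in nats for two labelings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same items")
    n = len(labels_a)
    if n == 0:
        raise ValueError("labelings must not be empty")

    count_a = Counter(labels_a)                  # cluster sizes in partition A
    count_b = Counter(labels_b)                  # cluster sizes in partition B
    count_ab = Counter(zip(labels_a, labels_b))  # sizes of pairwise overlaps

    vi = 0.0
    for (a, b), n_ab in count_ab.items():
        p_ab = n_ab / n
        p_a = count_a[a] / n
        p_b = count_b[b] / n
        # Each overlap contributes to both conditional entropies.
        vi -= p_ab * (log(p_ab / p_a) + log(p_ab / p_b))
    return vi


# Hypothetical toy labelings: the same split with swapped cluster names.
same_split = variation_of_information([0, 0, 1, 1], [1, 1, 0, 0])
# Hypothetical toy labelings: two splits that share no information.
unrelated = variation_of_information([0, 0, 1, 1], [0, 1, 0, 1])
```

A value near zero across samples would indicate that the clustering structure is stable, which is the sense in which the abstract uses the measure for validation.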
Funding: Key Projects of Philosophy of the Social Science funded by the Ministry of Education, No. 11JD039; National Key Public Bidding Project for Soft Science Research Plan, No. 2012GXS1D002; National Natural Science Foundation of China, No. 41001083
Abstract: Based on the connotation and structure of science and technology (S&T) resources and relevant data for more than 286 cities at prefecture level and above during 2001-2010, this paper uses a modified Data Envelopment Analysis (DEA) method to measure the S&T resource allocation efficiency of different cities in different periods, which uncovers the distributional differences and patterns of change in S&T resource allocation efficiency across time and space. On that basis, the paper analyzes the spatial distribution pattern and evolution trend of S&T resource allocation efficiency in different cities by means of Exploratory Spatial Data Analysis (ESDA). The results show that: (1) The average S&T resource allocation efficiency of cities at prefecture level and above has always stayed at a low level and has fluctuated repeatedly between high and low, with a decreasing trend year by year. In addition, the gap between the East and the West is widening. (2) The asymmetrical distribution of S&T resource allocation efficiency presents a spatial pattern that decreases successively from Eastern China through Central China to Western China. The cities whose S&T resource allocation efficiency is at the higher and high levels show a clustered distribution, which fits well with the 23 emerging urban agglomerations in China. (3) The coupling between S&T resource allocation efficiency and the economic environment shows a certain positive correlation, but the two are not completely aligned. The differentiation of S&T resource allocation efficiency is common in regional development; its existence and evolution are directly or indirectly influenced by, and reflect, many factors, such as geographical location and the natural endowment and environment of S&T resources. (4) From the perspective of the evolution of spatial structure, the S&T resource allocation efficiency of cities at prefecture level and above shows notable spatial autocorrelation, which is positive in every period. The spatial distribution of S&T resource allocation efficiency in neighboring cities tends to be similar within groups, and this similarity tends to increase stepwise. Meanwhile, the overall differentiation across geographical space has a diminishing tendency. (5) The LISA agglomeration maps of S&T resource allocation efficiency in different periods show that the four agglomeration types have changed differently in spatial location and in the range of spatial agglomeration, and that the continuity of S&T resource allocation efficiency in geographical space is gradually increasing.
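Point (4) of the abstract above reports positive spatial autocorrelation. The abstract does not name the statistic used, but global Moran's I is a common choice in ESDA, so the following is a minimal sketch under that assumption. The function name, the binary adjacency weights and the four-unit chain are hypothetical illustrations, not data from the paper.

```python
def morans_i(values, weights):
    """Global Moran's I for observations `values` and an n x n weight matrix."""
    n = len(values)
    if n == 0 or len(weights) != n or any(len(row) != n for row in weights):
        raise ValueError("weights must be an n x n matrix matching values")

    mean = sum(values) / n
    dev = [v - mean for v in values]           # deviations from the mean
    s0 = sum(sum(row) for row in weights)      # sum of all spatial weights
    denom = sum(d * d for d in dev)            # total squared deviation
    if s0 == 0 or denom == 0:
        raise ValueError("need non-zero weights and non-constant values")

    # Weighted cross-products of deviations between every pair of units.
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    return (n / s0) * (num / denom)


# Hypothetical example: four units in a row, each adjacent to its neighbours.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
smooth = morans_i([1.0, 2.0, 3.0, 4.0], chain)         # similar neighbours
alternating = morans_i([1.0, -1.0, 1.0, -1.0], chain)  # dissimilar neighbours
```

Positive values indicate that neighboring units have similar efficiency and negative values indicate the opposite. The LISA maps in point (5) decompose a statistic of this kind into per-unit contributions.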
Funding: National Natural Science Foundation of China Youth Science Foundation Project, No. 41701170; National Natural Science Foundation of China, No. 41661025, No. 42071216; Fundamental Research Funds for the Central Universities, No. 18LZUJBWZY068.
Abstract: In 2007, China surpassed the USA to become the largest carbon emitter in the world. China has promised a 60%–65% reduction in carbon emissions per unit of GDP by 2030, compared to the 2005 baseline. It is therefore important to obtain accurate dynamic information on the spatial and temporal patterns of carbon emissions and carbon footprints to support the formulation of effective national carbon emission reduction policies. This study attempts to build a carbon emission panel data model that simulates carbon emissions in China from 2000–2013 using nighttime lighting data and carbon emission statistics. By applying the Exploratory Spatial-Temporal Data Analysis (ESTDA) framework, the study analyzes the spatial patterns and dynamic spatial-temporal interactions of carbon footprints from 2001–2013. The improved Tapio decoupling model was adopted to investigate the levels of coupling or decoupling between the carbon emission load and economic growth in 336 prefecture-level units. The results show that, firstly, the model achieved high accuracy in simulating carbon emissions. Secondly, the total carbon footprints and carbon deficits across China increased with average annual growth rates of 4.82% and 5.72%, respectively. The overall carbon footprints and carbon deficits were larger in the North than in the South. The carbon footprints of prefecture-level units showed extremely significant spatial autocorrelation. Thirdly, the relative lengths of the Local Indicators of Spatial Association (LISA) time paths were longer in the North than in the South, and they increased from the coastal to the central and western regions. Lastly, the overall decoupling index was mainly of the weak decoupling type, but the number of cities with weak decoupling continued to decrease. The unsustainable development trend of China’s economic growth and carbon emission load will continue for some time.
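The decoupling types named in this abstract come from comparing the growth rate of emissions with the growth rate of GDP. The sketch below illustrates the standard Tapio classification, which uses elasticity thresholds of 0.8 and 1.2. It is not the "improved" variant used in the paper, whose modifications the abstract does not describe, and the function name and input figures are hypothetical.

```python
def tapio_decoupling(c_start, c_end, gdp_start, gdp_end):
    """Return (elasticity, state) using the standard Tapio decoupling classes."""
    dc = (c_end - c_start) / c_start          # relative change in emissions
    dg = (gdp_end - gdp_start) / gdp_start    # relative change in GDP
    if dg == 0:
        raise ValueError("elasticity is undefined when GDP does not change")
    e = dc / dg                               # decoupling elasticity

    if dg > 0:                                # growing economy
        if dc < 0:
            state = "strong decoupling"
        elif e < 0.8:
            state = "weak decoupling"
        elif e <= 1.2:
            state = "expansive coupling"
        else:
            state = "expansive negative decoupling"
    else:                                     # shrinking economy
        if dc > 0:
            state = "strong negative decoupling"
        elif e < 0.8:
            state = "weak negative decoupling"
        elif e <= 1.2:
            state = "recessive coupling"
        else:
            state = "recessive decoupling"
    return e, state


# Hypothetical figures: emissions grow 4% while GDP grows 10%.
elasticity, state = tapio_decoupling(100.0, 104.0, 100.0, 110.0)
```

Under this classification, "weak decoupling" means emissions still rise, but more slowly than the economy, which matches the dominant type the abstract reports.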
Funding: National Key Research and Development Program of China (2019YFB1600400); National Natural Science Foundation of China (72174035); National Natural Science Foundation of China (71774018); Liaoning Revitalization Talents Program (XLYC2008030); Liaoning Provincial Natural Science Foundation Shipping Joint Foundation Program (2020-HYLH-20).
Abstract: It is urgent and important to explore the dynamic evolution of comprehensive transportation green efficiency (CTGE) in the context of green development. We constructed a social development index that reflects the social benefits of transportation services and incorporated it into the comprehensive transportation efficiency evaluation framework as an expected output. Based on panel data for 30 regions in China from 2003-2018, the CTGE in China was measured using the slacks-based measure-data envelopment analysis (SBM-DEA) model. Further, the dynamic evolution trends of CTGE were determined from a spatio-temporal perspective using the spatial Markov model and the exploratory spatio-temporal data analysis (ESTDA) technique. The results showed that the CTGE follows a U-shaped trend, but with an overall low level and significant regional differences. The state transition of CTGE has a strong spatial dependence, and there is a phenomenon of “club convergence”. Neighbourhood background has a significant impact on the CTGE transition types, and the spatial spillover effect is pronounced. The CTGE shows an obvious positive correlation and spatial agglomeration characteristics. The geometric characteristics of the LISA time path show that the evolution of the local spatial structure and local spatial dependence of China’s CTGE is stable, but the integration of spatial evolution is weak. The spatio-temporal transition results of LISA indicate that the CTGE has obvious transfer inertia, with a degree of path dependence and spatial lock-in, which will become the major difficulty in improving the CTGE.
Funding: This work was supported by the Ministry of Environmental Protection of China (No. 2110203) and the National Natural Science Foundation of China (Grant No. 41101138).
Abstract: The aim of this paper is to study the spatial-temporal differentiation of industrial eco-efficiency in China. Using methods based on the data envelopment analysis (DEA) model and exploratory spatial data analysis (ESDA), together with data for 30 provinces in China from 1985, 1995, 2005, and 2008, the changes in the spatial-temporal pattern of industrial eco-efficiency are discussed. The results show that: first, the patterns of industrial eco-efficiency are dominated by clustering of relatively low-efficiency provinces; second, the spatial relationships between the industrial eco-efficiencies of different provinces changed only slightly throughout the period, and the provinces persistently exhibit spatial concentration of relatively low industrial eco-efficiency; finally, there is an obvious trend towards polarization of industrial eco-efficiency, i.e., the higher-level spatial units are concentrated in eastern China, and the lower-level spatial units are mainly in western and central China.
Funding: National Social Science Foundation of China, No. 21FSHB014; National Natural Science Foundation of China, No. 42001196.
Abstract: Influenced by globalization, rural transition in developed Western countries has experienced processes of productivism, post-productivism, and multifunctional development. By contrast, rural transition in most developing countries has been accompanied by rapid urbanization, which has become a core topic in geography research. As the world’s largest developing country, China has undergone profound development since the reform and opening-up. Moreover, rural spaces in some eastern coastal areas have entered the stage of reconstruction after decades of industrialization and urbanization. This paper takes Suzhou as the case area and measures the process of rural transition from 1990 to 2015 by constructing an index system. It then analyzes the characteristics of space-time evolution using exploratory spatial data analysis (ESDA) methods to reveal the influence of economic and social development on rural transition. The results show that rural transition, which generally entails the weakening of rurality and the strengthening of urbanity on a macro scale, tends to be heterogeneous across different regions on a micro scale. This paper argues that multifunctionality will be the main future trend of rural transition in rapidly urbanizing areas. The experience in Suzhou could provide an example for establishing policies on sustainable development in rural spaces and achieving urban-rural co-governance.