Funding: Funded by the National 863 Program of China (No. 2002AA130406) and the Key Project of China Geological Survey (No. 200218310077).
Abstract: One of the core developments in geomathematics today is the use of digital data processing in mineral prospecting and assessment. Information discovery is based on multidisciplinary geoscientific data, so an integrated management approach is crucial. The lack of a standard description hinders interoperation in database search and discovery. A metadata hierarchy aims to provide a standard view of geoscientific data and to facilitate data description and discovery. In this research on an integrated geoscientific database, the metadata hierarchy used a standardized description for each collection in the content structure and was realized in the semantic structure. It recorded both dataset identification and the inner structures and relationships of objects, and thus differed from many other applications. There were four tiers in the content structure and three levels in the semantic structure. With its help, database users could determine how applicable a dataset is to a project and improve their queries to the database. The effectiveness of data access is significantly enhanced by the rich, consistent metadata.
Abstract: The similarity relation is one of the spatial relations studied in geographic information science and cartography. It is widely used in the retrieval of spatial databases, the recognition of spatial objects from images, and the description of spatial features on maps. However, little progress has been made on it so far. In this paper, the spatial similarity relation was put forward in connection with automated map generalization in the construction of multi-scale map databases; then a definition of spatial similarity relations was presented based on set theory, the concept of spatial similarity degree was given, and the characteristics of spatial similarity were discussed in detail, including reflexivity, symmetry, non-transitivity, self-similarity in multi-scale spaces, and scale-dependence. Finally, a classification system for spatial similarity relations in multi-scale map spaces was proposed. This research may be useful for automated map generalization, spatial similarity retrieval, and spatial reasoning.
Abstract: In the era of big data, national strategies and the rapid development of computer and storage technologies bring opportunities and challenges to library data services. Based on a survey of the literature on scientific data services in university libraries in the United States, the development of scientific data services is analyzed from three aspects: service types, service modes, and service contents. The author also discusses opportunities and challenges from five aspects (policy support, strengthened publicity, self-learning, self-positioning, and reliance on embedded subject librarians) in order to promote the development of library scientific data services.
Abstract: The least squares method of linear fitting treats part of the data as free of error under certain conditions. To make linear data fitting solve more accurately for the relationship between quantities in scientific experiments and engineering practice, this article analyzed the data error of the commonly used linear data fitting method and proposed an improved procedure, the least distance square method, based on the least squares method. Finally, the paper discussed the advantages and disadvantages of the two linear data fitting methods through a worked example and gave reasonable control conditions for their application.
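The contrast between the two fitting methods named in this abstract can be sketched as follows. The abstract does not define the "least distance square method", so this sketch assumes it means minimizing the squared perpendicular distances from the points to the line (orthogonal regression), which may differ from the paper's actual procedure. The function names `fit_ols` and `fit_orthogonal` are hypothetical and chosen only for illustration.

```python
import math


def _centered_sums(xs, ys):
    """Return the means and the centered sums Sxx, Syy, Sxy."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    return x_mean, y_mean, sxx, syy, sxy


def fit_ols(xs, ys):
    """Ordinary least squares: minimizes vertical residuals,
    i.e. treats the x values as error-free."""
    x_mean, y_mean, sxx, _, sxy = _centered_sums(xs, ys)
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def fit_orthogonal(xs, ys):
    """Assumed reading of 'least distance square': minimizes the squared
    perpendicular distances, so x and y are both allowed to carry error."""
    x_mean, y_mean, sxx, syy, sxy = _centered_sums(xs, ys)
    if sxy == 0:
        # No covariance: the best line is axis-aligned or undetermined.
        raise ValueError("orthogonal slope is not defined when Sxy == 0")
    diff = syy - sxx
    slope = (diff + math.sqrt(diff * diff + 4 * sxy * sxy)) / (2 * sxy)
    return slope, y_mean - slope * x_mean


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0, 3.0]
    ys = [0.1, 0.9, 2.2, 2.8]
    print("OLS        (slope, intercept):", fit_ols(xs, ys))
    print("Orthogonal (slope, intercept):", fit_orthogonal(xs, ys))
```

Under this assumption, the two fits agree when the points lie exactly on a line and diverge as noise grows. The orthogonal fit also gives the same line when x and y are swapped, which ordinary least squares does not.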
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 60775039 and 60875075); the Grant-in-aid for Scientific Research (Grant No. 18300053) from the Japanese Society for the Promotion of Science; the Support Center for Advanced Telecommunications Technology Research, Foundation; the Open Foundation of Key Laboratory of Multimedia and Intelligent Software Technology (Beijing University of Technology), Beijing; and the Doctoral Research Fund of Beijing University of Technology (Grant No. 00243).
Abstract: Although much is known about how humans psychologically perform data-driven scientific discovery, less is known about its brain mechanism. Number series completion is a typical data-driven scientific discovery task and has been demonstrated to exhibit a priming effect, which is attributed to regularity identification and its subsequent extrapolation. To reduce heterogeneity and make the experimental task suitable for a brain imaging study, the number magnitude and arithmetic operations involved in the number series completion tasks are further restricted. Behavioral performance in Experiment 1 shows a reliable priming effect for targets, as expected. Then, a factorial design (priming effect: prime vs. target; period length: simple vs. complex) of event-related functional magnetic resonance imaging (fMRI) is used in Experiment 2 to examine the neural basis of data-driven scientific discovery. The fMRI results reveal a double dissociation of the left DLPFC (dorsolateral prefrontal cortex) and the left APFC (anterior prefrontal cortex) between the simple (period length = 1) and the complex (period length = 2) number series completion tasks. The priming effect in the left DLPFC is more significant for the simple task than for the complex task, while the priming effect in the left APFC is more significant for the complex task than for the simple task. This reliable double dissociation may suggest different roles for the left DLPFC and the left APFC in data-driven scientific discovery. The left DLPFC (BA 46) may play a crucial role in rule identification, while the left APFC (BA 10) may be related to the mental set maintenance needed during rule identification and extrapolation.
Funding: Supported by the Research Grant Council of Hong Kong (Grant Nos. 509413 and 14311916); Direct Grants for Research of The Chinese University of Hong Kong (Grant Nos. 3132754 and 4053235); the Natural Science Foundation of Jiangxi Province (Grant No. 20161BAB201024); the Key Science Fund Project of Jiangxi Province Education Department (Grant No. GJJ150439); the National Natural Science Foundation of China (Grant Nos. 11461029, 11601197 and 61562030); and the Canadian Institutes of Health Research (Grant No. 145546).
Abstract: With the rapid growth in the size of scientific data across disciplines, feature screening plays an important role in reducing high dimensionality to a moderate scale in many scientific fields. In this paper, we introduce a unified and robust model-free feature screening approach for high-dimensional survival data with censoring, which has several advantages: it is a model-free approach under a general model framework, and hence avoids the complication of specifying an actual model form with a huge number of candidate variables; under mild conditions, without requiring the existence of any moment of the response, it enjoys the ranking consistency and sure screening properties in ultra-high dimensions. In particular, we impose a conditional independence assumption on the response and the censoring variable given each covariate, instead of assuming that the censoring variable is independent of the response and the covariates. Moreover, we also propose a more robust variant of the new procedure, which possesses desirable theoretical properties without any finite moment condition on the predictors and the response. The computation of the newly proposed methods does not require any complicated numerical optimization, and it is fast and easy to implement. Extensive numerical studies demonstrate that the proposed methods perform competitively in various configurations. The application is illustrated with an analysis of a genetic data set.