Funding: Supported by the National Key Research and Development Program of China [grant number 2016YFB0502204]; the Opening Research Fund of the State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing [grant number (16)Key04]; the Opening Fund of Guangxi Key Laboratory of Earth Surface Processes and Intelligent Simulation (Guangxi Teachers Education University) [grant number 2015GXESPKF02]; and the National Natural Science Foundation of China [grant number 41401524].
Abstract: There is a critical need to develop a means for fast, task-driven discovery of geospatial data found in geoportals. Existing geoportals, however, only provide metadata-based means for discovery, with little support for task-driven discovery, especially when considering spatial–temporal awareness. To address this gap, this paper presents a Case-Based Reasoning-supported Geospatial Data Discovery (CBR-GDD) method and implementation that accesses geospatial data by tasks. The advantage of the CBR-GDD approach is that it builds an analogical reasoning process that provides an internal mechanism bridging tasks and geospatial data with spatial–temporal awareness, thus providing solutions based on past tasks. The CBR-GDD approach includes a set of algorithms that were successfully implemented via three components as an extension of geoportals: an ontology-enhanced knowledge base, a similarity assessment model, and case retrieval nets. A set of experiments and case studies validates the CBR-GDD approach and application, and demonstrates its efficiency.
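To make the idea of a similarity assessment over past tasks more concrete, the following minimal sketch scores a query task against stored cases using keyword, spatial, and temporal overlap. Everything in it is an assumption for illustration: the case fields, the weights, the overlap measures, and the function names are hypothetical, and the linear scan stands in for the case retrieval nets named in the abstract, whose details are not given here.

```python
def jaccard(a, b):
    """Jaccard similarity of two keyword collections."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def interval_overlap(a, b):
    """Overlap ratio of two (start, end) intervals: shared length / spanned length."""
    inter = min(a[1], b[1]) - max(a[0], b[0])
    span = max(a[1], b[1]) - min(a[0], b[0])
    return max(inter, 0) / span if span > 0 else 0.0


def bbox_overlap(a, b):
    """Intersection over union of two (min_lon, min_lat, max_lon, max_lat) boxes."""
    width = max(min(a[2], b[2]) - max(a[0], b[0]), 0)
    height = max(min(a[3], b[3]) - max(a[1], b[1]), 0)
    inter = width * height
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def case_similarity(query, case, weights=(0.5, 0.3, 0.2)):
    """Weighted sum of thematic, spatial, and temporal similarity (assumed weights)."""
    w_theme, w_space, w_time = weights
    return (
        w_theme * jaccard(query["keywords"], case["keywords"])
        + w_space * bbox_overlap(query["bbox"], case["bbox"])
        + w_time * interval_overlap(query["years"], case["years"])
    )


def retrieve(query, case_base):
    """Return the stored case most similar to the query task."""
    return max(case_base, key=lambda case: case_similarity(query, case))


# Hypothetical case base: past tasks and the datasets that served them.
CASES = [
    {"task": "flood mapping", "keywords": ["flood", "rainfall", "dem"],
     "bbox": (100, 20, 110, 30), "years": (2000, 2010),
     "datasets": ["dem_tiles", "rain_gauges"]},
    {"task": "urban growth", "keywords": ["landuse", "population"],
     "bbox": (0, 40, 10, 50), "years": (1990, 2000),
     "datasets": ["landuse_maps"]},
]
QUERY = {"keywords": ["flood", "dem"], "bbox": (105, 20, 110, 30), "years": (2005, 2010)}

best = retrieve(QUERY, CASES)
print(best["task"], best["datasets"])
```

The retrieved case's dataset list is then the candidate answer for the new task; a real system would also need an ontology to match related keywords rather than exact strings.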
Funding: Publication of this article was funded in part by the George Mason University Libraries Open Access Publishing Fund.
Abstract: In the research field of spatiotemporal data discovery, how to utilize the semantic characteristics of spatiotemporal datasets is an important topic. This paper presents a content-based recommendation method that applies Bayesian networks and ontologies to the vocabulary recommendation process for spatiotemporal data discovery. The source data for this research came from the MUDROD (Mining and Utilizing Dataset Relevancy from Oceanographic Datasets) search platform. From the historical search log, major keywords were extracted and organized hierarchically according to ontologies. Using the search history, the posterior probability between each subclass and its superclass in the ontologies was calculated, indicating a recommendation likelihood. We created a Bayesian network model for inference based on the ontologies. This model addresses two objectives: (1) given one class in the ontology, the model can judge which class has the greatest likelihood of being selected for recommendation; (2) based on the search history of a user, the model can judge which class has the greatest probability of being recommended. Comparison experiments against an existing system and evaluation experiments using expert knowledge show that this method is particularly helpful for spatiotemporal data discovery.
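As a rough illustration of the probability step described in this abstract, the sketch below estimates the likelihood of each subclass given its superclass from keyword counts in a search log and recommends the most likely subclass. The ontology, the log entries, and the function names are hypothetical and are not taken from MUDROD; the relative-frequency estimate is an assumed simplification, not the paper's Bayesian network.

```python
from collections import Counter

# Hypothetical ontology: each subclass keyword maps to its superclass.
ONTOLOGY = {
    "sea surface temperature": "ocean temperature",
    "subsurface temperature": "ocean temperature",
    "wave height": "ocean waves",
}

# Hypothetical search log: one keyword per past query.
SEARCH_LOG = [
    "sea surface temperature",
    "sea surface temperature",
    "subsurface temperature",
    "wave height",
    "sea surface temperature",
]


def subclass_likelihoods(log, ontology):
    """Estimate P(subclass | superclass) as relative frequencies in the log."""
    sub_counts = Counter(k for k in log if k in ontology)
    super_counts = Counter(ontology[k] for k in log if k in ontology)
    return {
        sub: sub_counts[sub] / super_counts[parent]
        for sub, parent in ontology.items()
        if super_counts[parent] > 0
    }


def recommend(superclass, log, ontology):
    """Return the subclass of `superclass` with the highest estimated likelihood."""
    probs = subclass_likelihoods(log, ontology)
    candidates = {s: p for s, p in probs.items() if ontology[s] == superclass}
    return max(candidates, key=candidates.get) if candidates else None


print(recommend("ocean temperature", SEARCH_LOG, ONTOLOGY))
```

Restricting the log to one user's queries before calling `recommend` gives a per-user variant of the same estimate.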
Abstract: Important Dates: submission due November 15, 2005; notification of acceptance December 30, 2005; camera-ready copy due January 10, 2006. Workshop Scope: Intelligence and Security Informatics (ISI) can be broadly defined as the study of the development and use of advanced information technologies and systems for national and international security-related applications. The First and Second Symposiums on ISI were held in Tucson, Arizona, in 2003 and 2004, respectively. In 2005, the IEEE International Conference on ISI was held in Atlanta, Georgia. These ISI conferences have brought together academic researchers, law enforcement and intelligence experts, and information technology consultants and practitioners to discuss their research and practice related to various ISI topics, including ISI data management, data and text mining for ISI applications, terrorism informatics, deception detection, terrorist and criminal social network analysis, crime analysis, monitoring and surveillance, policy studies and evaluation, and information assurance, among others. We continue this stream of ISI conferences by organizing the Workshop on Intelligence and Security Informatics (WISI’06) in conjunction with the Pacific Asia Conference on Knowledge Discovery and Data Mining (PAKDD’06). WISI’06 will provide a stimulating forum for ISI researchers in Pacific Asia and other regions of the world to exchange ideas and report research progress. The workshop also welcomes contributions dealing with ISI challenges specific to the Pacific Asian region.
Funding: The C-SCALE project (https://c-scale.eu/), which has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 101017529.
Abstract: An adequate compute and storage infrastructure supporting the full exploitation of Copernicus and Earth Observation datasets is currently not available in Europe. This paper presents the cross-disciplinary open-source technologies being leveraged in the C-SCALE project to develop an open federation of compute and data providers as an alternative to monolithic infrastructures for processing and analysing Copernicus and Earth Observation data. Three critical aspects of the federation and the chosen technologies are elaborated upon: (1) federated data discovery, (2) federated access and (3) software distribution. With these technologies the open federation aims to provide homogeneous access to resources, thereby enabling its users to generate meaningful results quickly and easily. This will be achieved by abstracting the complexity of infrastructure resource access provisioning and orchestration, including discovery of data across distributed archives, away from the end-users. This abstraction is needed because end-users wish to focus on analysing ready-to-use data products and models rather than spending their time on the setup and maintenance of complex and heterogeneous IT infrastructures. The open federation will support processing and analysing the vast amounts of Copernicus and Earth Observation data that are critical for the implementation of the Destination Earth and Digital Twins vision for a high-precision digital model of the Earth to model, monitor and simulate natural phenomena and related human activities.
Funding: Project of Tianjin Water Resources Bureau (No. KY2007-09)
Abstract: The marking scheme method removes the low scores that experts give to a contractor's attributes when the overall score is calculated, which may allow a contractor with latent risks to win the project. To remedy this defect of the marking scheme method, an outlier detection model, outlier detection being one task of knowledge discovery in data, is established on the basis of the sum of similarity coefficients. The model is then applied to historical score data from tender evaluations for civil projects in Tianjin, China, so that outliers among the scores of the contractor's attributes can be detected and analyzed. Consequently, risk pre-warning can be carried out, and advice can be given to employers to help them prevent latent risks and improve the success rate of bidding projects.
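One way to picture outlier detection by a sum of similarity coefficients is sketched below: each score vector is compared with every other one, the pairwise similarities are summed, and vectors with an unusually low sum are flagged. The abstract does not say which coefficient or threshold the authors use, so cosine similarity, the `ratio` cut-off, the score data, and all names here are assumptions for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two score vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_sums(rows):
    """For each row, sum its similarity to every other row."""
    return [
        sum(cosine_similarity(row, other) for j, other in enumerate(rows) if j != i)
        for i, row in enumerate(rows)
    ]


def flag_outliers(rows, ratio=0.95):
    """Flag rows whose similarity sum is below `ratio` times the largest sum."""
    sums = similarity_sums(rows)
    cutoff = ratio * max(sums)
    return [i for i, s in enumerate(sums) if s < cutoff]


# Hypothetical attribute scores (one row per contractor, one column per attribute).
SCORES = [
    [90, 85, 88],
    [88, 86, 90],
    [91, 84, 87],
    [40, 95, 30],  # profile that differs sharply from the others
]

print(flag_outliers(SCORES))
```

Because the low scores stay in the vectors rather than being dropped, a contractor with one or two weak attributes remains visible as a low similarity sum.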
Funding: Partially supported by the Key Research and Development Plan of the National Ministry of Science and Technology (No. 2016YFB1000703); the Key Program of the National Natural Science Foundation of China (Nos. 61190115, 61472099, 61632010, and U1509216); the National Sci-Tech Support Plan (No. 2015BAH10F01); the Scientific Research Foundation for the Returned Overseas Chinese Scholars of Heilongjiang Province (No. LC2016026); and the MOE-Microsoft Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology.
Abstract: In this era of big data, data are often collected from multiple sources that have different reliabilities, and conflicts inevitably arise among the pieces of information that relate to the same object. One important task is to identify the most trustworthy value out of all the conflicting claims, and this is known as truth discovery. Existing truth discovery methods simultaneously identify the most trustworthy information and source reliability degrees, and are based on the idea that more reliable sources often provide more trustworthy information, and vice versa. However, there are often semantic constraints defined on a relational database, which can be violated by a single data source. To remove violations, an important task is to repair data to satisfy the constraints, and this is known as data cleaning. The two problems above may coexist, and considering them together can provide some benefits; to the authors' knowledge, this has not yet been the focus of any research. In this paper, therefore, a schema-decomposing based method is proposed to simultaneously discover the truth and clean the data, with the aim of improving accuracy. Experimental results using real-world data sets of notebooks and mobile phones, as well as simulated data sets, demonstrate the effectiveness and efficiency of the proposed method.
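The mutual dependence between source reliability and value trustworthiness can be sketched as a simple iteration: take a reliability-weighted vote for each object, then re-estimate each source's reliability from how often it agrees with the winning values. This is a generic baseline under assumed data and names, not the schema-decomposing method of the paper, and it does not handle the constraint repair (data cleaning) part at all.

```python
from collections import defaultdict


def discover_truth(claims, iterations=10):
    """Iterative truth discovery.

    claims: {source: {object: claimed_value}}
    Returns (truths, weights), where truths maps each object to its
    chosen value and weights maps each source to its reliability.
    """
    weights = {source: 1.0 for source in claims}
    truths = {}
    for _ in range(iterations):
        # Step 1: reliability-weighted vote for each object's value.
        votes = defaultdict(lambda: defaultdict(float))
        for source, observed in claims.items():
            for obj, value in observed.items():
                votes[obj][value] += weights[source]
        truths = {obj: max(vals, key=vals.get) for obj, vals in votes.items()}
        # Step 2: reliability = smoothed share of claims matching the truths.
        for source, observed in claims.items():
            agree = sum(1 for obj, value in observed.items() if truths[obj] == value)
            weights[source] = (agree + 1) / (len(observed) + 2)
    return truths, weights


# Hypothetical conflicting product specifications from three sources.
CLAIMS = {
    "site_a": {"laptop_x.ram": "16GB", "laptop_x.cpu": "i7", "phone_y.screen": "6.1in"},
    "site_b": {"laptop_x.ram": "16GB", "laptop_x.cpu": "i7", "phone_y.screen": "6.1in"},
    "site_c": {"laptop_x.ram": "8GB", "laptop_x.cpu": "i5", "phone_y.screen": "6.1in"},
}

truths, weights = discover_truth(CLAIMS)
print(truths)
print(weights)
```

The add-one smoothing in step 2 is an assumed choice that keeps a source with no agreements from dropping to zero weight.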
Abstract: Tsinghua Science and Technology has been published since its founding in 1996. It is an international academic journal sponsored by Tsinghua University and is published bimonthly. The journal aims to present up-to-date scientific achievements in computer science and other information technology fields. It is indexed by Ei and other abstracting and indexing services. Since 2013, the journal has been committed to open access through the IEEE Xplore Digital Library.
Funding: This work was supported in part by the Korea Institute of Science and Technology (KIST) Institutional Program (Project No. 2E24100).
Abstract: There are several issues with Web-based search interfaces on a Sensor Web data infrastructure. It can be difficult to (1) find the proper keywords for the formulation of queries and (2) explore the information if the user does not have previous knowledge about the particular sensor systems providing the information. We investigate how the visualization of sensor resources on a 3D Web-based Digital Earth globe organized by level-of-detail (LOD) can enhance search and exploration of information by easing the formulation of geospatial queries against the metadata of sensor systems. Our case study provides an approach inspired by geographical mashups in which freely available functionality and data are flexibly combined. We use PostgreSQL, PostGIS, PHP, and X3D-Earth technologies to allow the Web3D standard and its geospatial component to be used for visual exploration and LOD control of a dynamic scene. Our goal is to facilitate the dynamic exploration of the Sensor Web and to allow the user to seamlessly focus in on a particular sensor system from a set of registered sensor networks deployed across the globe. We present a prototype metadata exploration system featuring LOD for a multiscaled Sensor Web as a Digital Earth application.