Funding: The National Natural Science Foundation of China (No. 60403027).
Abstract: To integrate reasoning with text retrieval, an architecture for a semantic search engine that supports several kinds of queries is proposed, and the semantic search engine Smartch is designed and implemented. Based on a logical reasoning process and a graphical user-defined process, Smartch provides four kinds of search services: basic search, concept search, graphical user-defined query, and association relationship search. Experimental results show that Smartch improves on a traditional search engine in both recall and precision. Graphical user-defined queries can accurately locate the information users need. Association relationship search can find complicated relationships between concepts. Smartch can also perform some intelligent functions based on ontology inference.
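Recall and precision, the two measures this abstract reports, can be computed as in the minimal sketch below. The function name `precision_recall_f1` and the document identifiers are illustrative assumptions; the abstract does not describe the evaluation procedure actually used for Smartch.

```python
def precision_recall_f1(retrieved, relevant):
    """Return (precision, recall, F1) for one query.

    retrieved: document ids returned by the engine (hypothetical data)
    relevant:  document ids judged relevant (hypothetical data)
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were returned
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Illustrative data only: 4 documents returned, 3 documents relevant, 2 in common.
p, r, f = precision_recall_f1(["d1", "d2", "d3", "d4"], ["d2", "d4", "d5"])
```

Precision penalizes irrelevant results in the returned list, while recall penalizes relevant documents that were missed, so an engine has to improve both to claim a better overall result.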
Funding: The National Natural Science Foundation of China (No. 60425206, 90412003); the Foundation of Excellent Doctoral Dissertation of Southeast University (No. YBJJ0502).
Abstract: A new approach to automated ontology mapping using web search engines (such as Google) is presented. Based on lexico-syntactic patterns, the hyponymy relationships between ontology concepts are obtained from the web through search engines, and an initial candidate mapping set consisting of ontology concept pairs is generated. According to the concept hierarchies of the ontologies, a set of production rules is proposed to delete from the initial candidate mapping set the concept pairs that are inconsistent with the ontology semantics and to add the concept pairs that are consistent with them. Finally, ontology mappings are chosen automatically from the candidate mapping set with a mapping selection rule based on mutual information. Experimental results show that the F-measure reaches 75% to 100% and that the approach can effectively accomplish the mapping between ontologies.
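The final selection step can be illustrated with a minimal sketch, assuming that "mutual information" is estimated as pointwise mutual information (PMI) from search-engine hit counts. The abstract does not give the exact formula, so the functions `pmi` and `select_mapping`, the hit counts, and the index size below are all hypothetical.

```python
import math


def pmi(hits_a, hits_b, hits_ab, total_pages):
    """Pointwise mutual information of two terms from (hypothetical) hit counts."""
    if min(hits_a, hits_b, hits_ab) <= 0:
        return float("-inf")  # the terms never co-occur, or a term is unseen
    return math.log2(hits_ab * total_pages / (hits_a * hits_b))


def select_mapping(source, candidates, hits, pair_hits, total_pages):
    """Pick the candidate concept with the highest PMI against `source`."""
    scores = {
        c: pmi(hits[source], hits[c], pair_hits.get((source, c), 0), total_pages)
        for c in candidates
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Illustrative counts only; a real system would query a search engine.
hits = {"car": 1000, "automobile": 500, "vehicle": 5000}
pair_hits = {("car", "automobile"): 200, ("car", "vehicle"): 400}
best, score = select_mapping("car", ["automobile", "vehicle"], hits, pair_hits, 1_000_000)
```

In this toy data "vehicle" co-occurs with "car" more often in absolute terms, but "automobile" wins because PMI normalizes by how common each term is on its own.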
Abstract: As the tsunami of data has emerged, search engines have become the most powerful tool for obtaining scattered information on the internet. Traditional search engines organize their results with ranking algorithms such as term frequency and link analysis (the PageRank and HITS algorithms). However, these algorithms must rely on keyword frequency to determine the relevance between a user's query and the data in a computer system or on the internet. Ideally, search engines would interpret a user's query by its meaning rather than as literal strings. The Semantic Web is an intelligent network that can understand human language more semantically and ease communication between humans and computers. Current semantic search technology is hard to apply, however, because metadata must be annotated on each web page before a search engine can understand the user's intent, and annotating every web page is very time-consuming and inefficient. This study therefore designs an ontology-based approach that improves traditional keyword-based search and emulates the effects of semantic search, allowing the search engine to understand users more semantically once it has acquired the knowledge.
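One common way to emulate semantic search on top of keyword matching is ontology-based query expansion, sketched below. This is an assumption about the general idea rather than the study's actual design: the toy `ONTOLOGY` dictionary, the functions `expand_query` and `search`, and the sample documents are all hypothetical.

```python
# Toy ontology: each concept lists synonyms and narrower concepts (hypothetical data).
ONTOLOGY = {
    "car": {"synonyms": ["automobile"], "subclasses": ["sedan", "suv"]},
}


def expand_query(query, ontology):
    """Add synonyms and subclasses of each query word, keeping first-seen order."""
    terms = []
    for word in query.lower().split():
        terms.append(word)
        entry = ontology.get(word, {})
        terms.extend(entry.get("synonyms", []))
        terms.extend(entry.get("subclasses", []))
    return list(dict.fromkeys(terms))  # de-duplicate while preserving order


def search(query, documents, ontology):
    """Rank documents by how many expanded query terms they contain."""
    terms = set(expand_query(query, ontology))
    scored = []
    for doc_id, text in documents.items():
        score = len(terms & set(text.lower().split()))
        if score:
            scored.append((score, doc_id))
    # Highest score first; ties are broken by document id for determinism.
    return [doc_id for score, doc_id in sorted(scored, key=lambda p: (-p[0], p[1]))]


documents = {
    "d1": "used sedan for sale",
    "d2": "automobile repair guide",
    "d3": "fresh fruit market",
}
results = search("car", documents, ONTOLOGY)
```

None of the sample documents contains the literal string "car", so plain keyword matching would return nothing, whereas the expanded query reaches the documents about sedans and automobiles without any per-page annotation.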
Funding: Project supported by the Shanghai Leading Academic Discipline Project (Grant No. J50103) and the Graduate Student Innovation Foundation of Shanghai University (Grant No. SHUCX112167).
Abstract: Tree search is a widely used fundamental algorithm. Modern processors provide tremendous computing power by integrating multiple cores, each with a vector processing unit. This paper reviews studies that exploit the single instruction multiple data (SIMD) capability of processors to improve the performance of tree search, and proposes several improvements to previously reported SIMD tree search algorithms. Based on a blocking tree structure, blocking for memory alignment and dynamic blocking prefetch are proposed to reduce the overhead of memory access. Furthermore, as a form of non-linear loop unrolling, search branch unwinding shows that the number of branches can exceed the data width of SIMD instructions in the SIMD search algorithm. The experiments suggest that the blocking-optimized SIMD tree search algorithm can respond 1.6 times faster than the unoptimized algorithm.
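The branch-selection idea behind SIMD tree search can be sketched as a k-ary search over a sorted array: at each level the query is compared against several separator keys at once, and the number of separators smaller than the query selects the branch. The sketch below is plain Python, so the list of comparisons only stands in for a single vector compare; it does not reproduce the paper's memory alignment, prefetching, or branch unwinding, and the function `kary_search` is a hypothetical name.

```python
def kary_search(keys, target, k=4):
    """Return the index of `target` in the sorted list `keys`, or -1 if absent.

    The current range is split into at most k chunks. The last key of each
    chunk acts as a separator; on SIMD hardware all separators would be
    compared against the target with one vector instruction.
    """
    lo, hi = 0, len(keys)  # search window is keys[lo:hi]
    while lo < hi:
        step = -(-(hi - lo) // k)  # ceil((hi - lo) / k): chunk length
        last = [
            min(lo + (i + 1) * step, hi) - 1  # index of the last key of chunk i
            for i in range(k)
            if lo + i * step < hi
        ]
        # Stand-in for "vector compare, then count the set lanes".
        branch = sum(keys[j] < target for j in last)
        if branch == len(last):
            return -1  # target is larger than every key in the window
        new_lo = lo + branch * step
        if step == 1:  # chunks are single keys, so this is the only candidate
            return new_lo if keys[new_lo] == target else -1
        lo, hi = new_lo, last[branch] + 1
    return -1


keys = [1, 3, 5, 7, 9, 11, 13, 15, 17]
position = kary_search(keys, 11)
```

With k separators per step the search needs roughly log base k of n steps instead of log base 2 of n, which is why wider vector units can shorten the search path when the comparisons really do execute in parallel.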
Abstract: This paper reports our efforts to address the grand challenge of the Digital Earth vision in terms of intelligent data discovery from vast quantities of geo-referenced data. We propose an algorithm that combines LSA with a Two-Tier Ranking algorithm (LSATTR), based on a revised cosine similarity, to build a more efficient search engine, Semantic Indexing and Ranking (SIR), for semantic-enabled, more effective data discovery. In addition to handling subject-based search, we propose a mechanism that combines a geospatial taxonomy with Yahoo! GeoPlanet to automatically identify location information in a spatial query and automatically filter out datasets that are not spatially related. The metadata set, in ISO 19115 format, from NASA's SEDAC (Socio-Economic Data Application Center) is used as the corpus of SIR. Results show that our semantic search engine SIR, built on LSATTR methods, outperforms existing keyword-matching techniques such as Lucene in terms of both recall and precision. Moreover, the semantic associations among all existing words in the corpus are discovered. These associations provide substantial support for automating the population of spatial ontologies. We expect this work to support the operationalization of the Digital Earth vision by advancing semantic-based geospatial data discovery.
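As a baseline for the ranking step, the sketch below scores metadata records against a query with ordinary cosine similarity over raw term counts. It is deliberately simpler than the method in the abstract: it applies no LSA dimensionality reduction, and it uses the standard cosine formula rather than the paper's revised variant, which the abstract does not define. The functions `cosine` and `rank` and the sample records are hypothetical.

```python
import math
from collections import Counter


def cosine(text_a, text_b):
    """Standard cosine similarity between two texts using raw term counts."""
    va, vb = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query, records):
    """Return record ids ordered from most to least similar to the query."""
    return sorted(records, key=lambda rid: cosine(query, records[rid]), reverse=True)


# Illustrative metadata titles only, not taken from the SEDAC corpus.
records = {
    "m1": "population density grid",
    "m2": "sea surface temperature",
    "m3": "urban population estimates",
}
ordering = rank("population density", records)
```

A term-count baseline like this can only match words that literally appear in both texts; latent semantic analysis addresses that limitation by comparing documents in a reduced space where related words lie close together.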