Abstract: As data grows in size, search engines face new challenges in extracting more relevant content for users' searches. As a result, a number of retrieval and ranking algorithms have been employed to ensure that the results are relevant to the user's requirements. Unfortunately, most existing indexes and ranking algorithms crawl documents and web pages based on a limited set of criteria designed to meet user expectations, making it impossible to deliver exceptionally accurate results. This study therefore investigates and analyses how search engines work, as well as the elements that contribute to higher ranks. This paper addresses the issue of bias by proposing a new ranking algorithm based on the PageRank (PR) algorithm, one of the most widely used page ranking algorithms. We propose weighted PageRank (WPR) algorithms to test the relationship between these various measures. The Weighted Page Rank (WPR) model was used in three distinct trials to compare the rankings of documents and pages based on one or more user-preference criteria. The findings of using the Weighted Page Rank model showed that ranking final pages on multiple criteria is better than using only one, and that some criteria had a greater impact on ranking results than others.
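As a rough illustration of the weighted-PageRank idea referenced above (not the paper's exact algorithm or weighting criteria), the following Python sketch distributes a page's rank to its out-links in proportion to assumed popularity weights instead of splitting it evenly; the toy graph, damping factor, and weight definitions are illustrative assumptions.

```python
# A minimal sketch of a Weighted PageRank-style iteration (not the paper's exact
# algorithm): instead of splitting a page's rank evenly among its out-links,
# each target page u receives a share proportional to assumed "criteria" weights
# (here: in-link and out-link popularity). Graph and parameters are illustrative.

def weighted_pagerank(graph, damping=0.85, iterations=50):
    """graph: dict mapping page -> list of pages it links to."""
    pages = set(graph) | {u for targets in graph.values() for u in targets}
    in_links = {u: [v for v in graph if u in graph[v]] for u in pages}

    def w_in(v, u):   # u's share among v's out-link targets, by in-link count
        total = sum(len(in_links[p]) for p in graph.get(v, [])) or 1
        return len(in_links[u]) / total

    def w_out(v, u):  # u's share among v's out-link targets, by out-link count
        total = sum(len(graph.get(p, [])) or 1 for p in graph.get(v, [])) or 1
        return (len(graph.get(u, [])) or 1) / total

    rank = {u: 1.0 for u in pages}
    for _ in range(iterations):
        rank = {
            u: (1 - damping) + damping * sum(
                rank[v] * w_in(v, u) * w_out(v, u) for v in in_links[u]
            )
            for u in pages
        }
    return rank


if __name__ == "__main__":
    toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
    for page, score in sorted(weighted_pagerank(toy_web).items()):
        print(page, round(score, 3))
```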
Funding: The National Natural Science Foundation of China (No. 60403027)
Abstract: To integrate reasoning and text retrieval, the architecture of a semantic search engine supporting several kinds of queries is proposed, and the semantic search engine Smartch is designed and implemented. Based on a logical reasoning process and a graphic user-defined process, Smartch provides four kinds of search services: basic search, concept search, graphic user-defined query, and association relationship search. The experimental results show that, compared with a traditional search engine, the recall and precision of Smartch are improved. Graphic user-defined queries can accurately locate the information users need. Association relationship search can find complicated relationships between concepts. Smartch can perform some intelligent functions based on ontology inference.
Abstract: This study compares websites that take live data into account using search engine optimization (SEO). Search engine optimization is a series of steps that can help a website rank highly in search engine results. Static websites and dynamic websites are two different types of websites. Static websites require programming expertise compatible with SEO, whereas dynamic websites can use readily available plugins/modules. The fundamental issue for all website owners is the low level of page rank, congestion, utilization, and exposure of the website on the search engine. Here, the authors have studied the live data of four websites, as real-time data indicates how an SEO strategy may be applied to website page rank, page difficulty removal, brand query, etc. It is also necessary to choose relevant keywords for any website. The right keyword can help increase the brand query while also lowering the page difficulty both on and off the page. In order to calculate Off-page SEO, On-page SEO, and SEO Difficulty, the authors examined live data and chose four well-known Indian university and institute websites for this study: www.caluniv.ac.in, www.jnu.ac.in, www.iima.ac.in, and www.iitb.ac.in. Using live data and SEO, the authors estimated the Off-page SEO, On-page SEO, and SEO Difficulty. It was found that the Off-page SEO of www.caluniv.ac.in is lower than that of www.jnu.ac.in, www.iima.ac.in, and www.iitb.ac.in by 9%, 7%, and 7%, respectively; its On-page SEO is, in comparison, 4%, 1%, and 1% higher. Every university has continued to maintain its own brand query. Additionally, www.caluniv.ac.in has slightly less SEO Difficulty compared to the other websites. The final computed results have been displayed and compared.
Funding: Supported in part by the National Natural Science Foundation of China (NSFC) (60073012)
Abstract: A meta search engine provides service to users by dispatching their requests to existing search engines, and the search engines it selects determine the searching quality. Because the performance of the member search engines and the users' requests change dynamically, a fixed set of search engines is not favorable for optimizing the holistic performance of the meta search engine. This paper applies the genetic algorithm (GA) to realize the scheduling strategy of the agent manager in our meta search engine, GSE (general search engine), which simulates the evolutionary process of living things vividly and efficiently. By using the GA, the combination of search engines can be optimized and hence the holistic performance of GSE can be improved dramatically.
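The abstract does not specify the GA's chromosome encoding or fitness function; the sketch below assumes a bit-string chromosome (bit i = dispatch to engine i) and a placeholder fitness trading estimated relevance against cost, simply to illustrate how a combination of member engines could be evolved.

```python
import random

# A minimal GA sketch for choosing a subset of member engines in a meta search
# engine. The chromosome is a bit string (1 = include engine i); the fitness
# below is a placeholder assumption (estimated relevance minus a cost penalty),
# not the fitness used in GSE.

ENGINE_RELEVANCE = [0.9, 0.7, 0.6, 0.8, 0.5]   # assumed per-engine quality scores
ENGINE_COST      = [0.4, 0.2, 0.1, 0.5, 0.1]   # assumed per-engine latency/cost

def fitness(chrom):
    gain = sum(r for bit, r in zip(chrom, ENGINE_RELEVANCE) if bit)
    cost = sum(c for bit, c in zip(chrom, ENGINE_COST) if bit)
    return gain - 0.5 * cost

def evolve(pop_size=20, generations=40, p_mut=0.1):
    n = len(ENGINE_RELEVANCE)
    pop = [[random.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]                  # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, n)                # one-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - g if random.random() < p_mut else g for g in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

if __name__ == "__main__":
    best = evolve()
    print("selected engines:", [i for i, bit in enumerate(best) if bit])
```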
Funding: The National Natural Science Foundation of China (No. 60425206, 90412003); the Foundation of Excellent Doctoral Dissertation of Southeast University (No. YBJJ0502)
Abstract: A new approach for automated ontology mapping using web search engines (such as Google) is presented. Based on lexico-syntactic patterns, the hyponymy relationships between ontology concepts are obtained from the web via search engines, and an initial candidate mapping set consisting of ontology concept pairs is generated. According to the concept hierarchies of the ontologies, a set of production rules is proposed to delete concept pairs inconsistent with the ontology semantics from the initial candidate mapping set and to add concept pairs consistent with it. Finally, ontology mappings are chosen from the candidate mapping set automatically with a mapping selection rule based on mutual information. Experimental results show that the F-measure can reach 75% to 100% and that the approach can effectively accomplish the mapping between ontologies.
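The exact mutual-information selection rule is not given in the abstract; a common approximation is a pointwise-mutual-information score computed from search-engine hit counts, sketched below with invented counts and an assumed threshold.

```python
import math

# A sketch of a mutual-information-style selection rule over web hit counts.
# The paper's exact statistic and thresholds are not given in the abstract;
# here pointwise mutual information (PMI) between two concept labels is
# estimated from hypothetical search-engine hit counts, and candidate pairs
# scoring above a threshold are kept.

TOTAL_PAGES = 1e10   # assumed size of the indexed web, for probability estimates

def pmi(hits_a, hits_b, hits_ab):
    p_a, p_b, p_ab = hits_a / TOTAL_PAGES, hits_b / TOTAL_PAGES, hits_ab / TOTAL_PAGES
    if p_ab == 0:
        return float("-inf")
    return math.log(p_ab / (p_a * p_b))

def select_mappings(candidates, threshold=2.0):
    """candidates: list of (concept_a, concept_b, hits_a, hits_b, hits_both)."""
    scored = [(a, b, pmi(ha, hb, hab)) for a, b, ha, hb, hab in candidates]
    return [(a, b, s) for a, b, s in scored if s >= threshold]

if __name__ == "__main__":
    # Hit counts below are illustrative, not measured.
    cands = [("Car", "Automobile", 4e8, 2e8, 9e7),
             ("Car", "Banana", 4e8, 3e8, 1e5)]
    for a, b, score in select_mappings(cands):
        print(f"map {a} <-> {b}  (PMI = {score:.2f})")
```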
Funding: Supported by the Knowledge Innovation Program of the Chinese Academy of Sciences
Abstract: The problem of associating agricultural market names on web sites with their locations is essential for geographical analysis of agricultural products. In this paper, an algorithm that employs an administrative ontology and statistics from search results is proposed. Experiments with 100 market names collected from web sites were conducted. The experimental results demonstrate that the proposed algorithm obtains satisfactory performance in resolving the problem above, thus verifying the effectiveness of the method.
Funding: Supported by the National Natural Science Foundation of China (50674086), the Doctoral Foundation of the Ministry of Education of China (20060290508), and the Youth Scientific Research Foundation of CUMT (0D060125)
Abstract: At present, how to enable a search engine to construct an initial model of a user's personal interests, track the user's personalized information in a timely way, and provide personalized services accurately has become a hotspot in search engine research. Aiming at the problems of user model construction, and combining manual customization modeling with automatic analytical modeling, a User Interest Model (UIM) is proposed in this paper. On this basis, the corresponding establishment and update algorithms for the User Interest Profile (UIP) are presented. Simulation tests showed that the proposed UIM and the corresponding algorithms can enhance retrieval precision effectively and have superior adaptability.
Funding: Supported by the National Natural Science Foundation of China (60403027)
Abstract: Because the web is huge and web pages are updated frequently, the index maintained by a search engine has to refresh web pages periodically. This is extremely resource-consuming because the search engine needs to crawl the web and download pages to refresh its index. Based on present web-refreshing technologies, we present a cooperative schema between web servers and the search engine for maintaining the freshness of the web repository. The web server provides metadata, defined through the XML standard, to describe web sites. Before updating a web page, the crawler visits the metadata files; if the metadata indicates that the page has not been modified, the crawler does not update it, so the schema saves bandwidth. A primitive model based on the schema is implemented, and its cost and efficiency are analyzed.
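The abstract does not reproduce the XML schema; the sketch below assumes a hypothetical, sitemap-like metadata file listing each URL with a last-modified timestamp, and shows how a crawler could skip unchanged pages.

```python
import xml.etree.ElementTree as ET

# A sketch of the cooperative refresh idea: before re-crawling a page, the
# crawler consults a server-published XML metadata file and skips pages whose
# recorded modification time has not advanced. The element and attribute names
# (<page url="..." lastmod="...">) are an assumed, sitemap-like layout, not the
# schema defined in the paper.

METADATA_XML = """
<site>
  <page url="https://example.org/a.html" lastmod="2004-03-01T10:00:00"/>
  <page url="https://example.org/b.html" lastmod="2004-03-05T08:30:00"/>
</site>
"""

def pages_to_refresh(metadata_xml, index_timestamps):
    """index_timestamps: url -> lastmod string stored when the page was last crawled."""
    root = ET.fromstring(metadata_xml)
    stale = []
    for page in root.iter("page"):
        url, lastmod = page.get("url"), page.get("lastmod")
        if index_timestamps.get(url) != lastmod:    # changed or never crawled
            stale.append(url)
    return stale

if __name__ == "__main__":
    crawled = {"https://example.org/a.html": "2004-03-01T10:00:00",
               "https://example.org/b.html": "2004-02-20T12:00:00"}
    print(pages_to_refresh(METADATA_XML, crawled))   # only b.html needs a re-crawl
```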
Abstract: Search engines have greatly helped us find desired information on the Internet. Most search engines use keyword-matching techniques. This paper discusses a Dynamic Knowledge Base based Search Engine (DKBSE), which can expand the user's query using the keywords' concepts or meanings. To do this, the DKBSE constructs and maintains the knowledge base dynamically from the system's search results and the user's feedback. The DKBSE expands the user's initial query using the knowledge base and returns the information retrieved for the expanded query.
Abstract: Web search engines are very useful information service tools on the Internet. Current web search engines produce search results relating to the search terms and the information they have actually collected. Since the selection of earlier search results cannot affect future ones, the results may not cover most people's interests. In this paper, feedback information produced by the user's access lists is represented by a rough set, which can reconstruct the query string and influence the search results; the search engine thus gains self-adaptability. Keywords: WWW, search engine, query reconstruction, feedback. CLC number: TP 311.135.4. Funding: Supported by the National Natural Science Foundation of China (60373066), the National Grand Fundamental Research 973 Program of China (2002CB31200), the Opening Foundation of the State Key Laboratory of Software Engineering at Wuhan University, and the Opening Foundation of the Jiangsu Key Laboratory of Computer Information Processing Technology at Soochow University. Biography: ZHANG Wei-feng (1975-), male, Ph.D.; research directions: artificial intelligence, search engines, data mining, network language.
Abstract: As the tsunami of data has emerged, search engines have become the most powerful tool for obtaining scattered information on the internet. Traditional search engines return organized results by using ranking algorithms such as term frequency and link analysis (the PageRank and HITS algorithms), but these algorithms must rely on keyword frequency to determine the relevance between a user's query and the data in a computer system or on the internet. Moreover, we expect search engines to understand users' searches by the meaning of their content rather than by literal strings. The Semantic Web is an intelligent network that can understand human language more semantically and ease communication between humans and computers. However, current semantic search technology is hard to apply, because metadata must be annotated on each web page before the search engine can understand user intent, and annotating every web page is very time-consuming and inefficient. This study therefore designed an ontology-based approach that improves traditional keyword-based search and emulates the effect of semantic search, letting the search engine understand users more semantically once it acquires the knowledge.
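As a toy illustration of the keyword-expansion idea (not the study's actual ontology or rules), the sketch below rewrites a query using synonym and narrower-concept relations from a tiny hand-built ontology.

```python
# A minimal sketch of ontology-driven query expansion as a stand-in for full
# semantic search: instead of annotating every page, the engine rewrites the
# keyword query using an ontology's synonym and subclass relations. The tiny
# ontology below is illustrative, not the one built in the study.

ONTOLOGY = {
    "laptop": {"synonyms": ["notebook"], "narrower": ["ultrabook", "netbook"]},
    "engine": {"synonyms": ["motor"], "narrower": []},
}

def expand_query(query):
    expanded = []
    for term in query.lower().split():
        expanded.append(term)
        entry = ONTOLOGY.get(term, {})
        expanded.extend(entry.get("synonyms", []))
        expanded.extend(entry.get("narrower", []))
    return " OR ".join(dict.fromkeys(expanded))   # dedupe while keeping order

if __name__ == "__main__":
    print(expand_query("cheap laptop"))
    # -> "cheap OR laptop OR notebook OR ultrabook OR netbook"
```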
Abstract: COVID-19 is a devastating pandemic with widespread negative health, social, and economic consequences. Due to drastic changes in the business environment of tour and travel agencies, firms and marketing managers can now use search engine optimization to position themselves effectively. The study's main goal is to evaluate the effect of search engine optimization on the market performance of registered tour and travel agencies in Nairobi County, Kenya. Kenya's tourism ministry and state government work hard to improve the business climate for tour and travel companies. Despite the overall positive image, international tourist market growth rates in Kenya were 3.5 percent slower from 2017 to 2019 compared to previous years. This was further aggravated by the onset of the COVID-19 pandemic in 2020, when the growth rate of tour and travel agencies fell by 65%. This study adopted a positivist philosophy, and both descriptive and explanatory research designs were used. A self-administered semi-structured questionnaire was used to collect data from the 324 registered tour and travel agencies, from which a sample of 179 was drawn. Data analysis included the development and interpretation of both descriptive and inferential statistics, such as frequencies, means, percentages, and standard deviations, presented using tables and numerical values. The results of regression analysis established that search engine optimization had a positive and significant effect on the market performance of the sampled agencies. The study recommends that agency management ensure that the firm's website is easily accessible in order to improve agency performance. Using the internet to gain a large market share can help tour and travel agencies improve the performance and income of their websites.
Abstract: Notifiable infectious diseases are a major public health concern in China, causing about five million illnesses and twelve thousand deaths every year. Early detection of disease activity, when followed by a rapid response, can reduce both the social and medical impact of the disease. We aim to improve early detection by monitoring health-seeking behavior and disease-related news over the Internet. Specifically, we counted unique search queries submitted to the Baidu search engine in 2008 that contained disease-related search terms, and we counted the news articles aggregated by Baidu's robot programs that contained disease-related keywords. We found that both the search frequency data and the news count data have a distinct temporal association with disease activity. We adopted a linear model and used searches and news with 1–200-day lead times as explanatory variables to predict the number of infections and deaths attributable to four notifiable infectious diseases: scarlet fever, dysentery, AIDS, and tuberculosis. With the search frequency data and news count data, our approach can quantitatively estimate up-to-date epidemic trends 10–40 days ahead of the release of the Chinese Center for Disease Control and Prevention (Chinese CDC) reports. This approach may provide an additional tool for notifiable infectious disease surveillance.
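The paper's model, lag selection, and data are not reproduced here; the sketch below only shows the general mechanism on synthetic data: build a design matrix from predictors shifted by an assumed lead time, fit ordinary least squares, and nowcast the next not-yet-reported value.

```python
import numpy as np

# A sketch of the basic idea: regress reported case counts on search-frequency
# and news-count series shifted by a lead time, then use the fitted model to
# nowcast the next period before official reports appear. The data and the
# lead of 2 steps are illustrative; the study searched 1-200-day leads.

def lagged_design(searches, news, lead):
    """Predictors observed `lead` steps before each reported value, plus intercept."""
    X = np.column_stack([searches[:-lead], news[:-lead]])
    return np.column_stack([np.ones(len(X)), X])

def fit_and_nowcast(cases, searches, news, lead=2):
    X = lagged_design(searches, news, lead)
    y = cases[lead:]
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)          # ordinary least squares
    latest = np.array([1.0, searches[-lead], news[-lead]])
    return latest @ beta                                   # estimate for the next, unreported period

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    searches = rng.poisson(100, 30).astype(float)
    news = rng.poisson(20, 30).astype(float)
    # Synthetic "truth": cases follow the predictors with a 2-step lag plus noise.
    cases = 0.5 * np.roll(searches, 2) + 2.0 * np.roll(news, 2) + rng.normal(0, 5, 30)
    print("nowcast:", round(fit_and_nowcast(cases, searches, news), 1))
```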
Funding: This work was supported by the National Grand Fundamental Research 973 Program of China (Grant No. G1999032706).
Abstract: In this paper, we first study the distribution characteristics of user behaviors based on log data from a massive web search engine. Analysis shows that the stochastic distribution of user queries follows a power-law function and exhibits strong similarity, and that users' queries and clicked URLs present dramatic locality, which implies that a query cache and a 'hot click' cache can be employed to improve system performance. Three typical cache replacement policies are then compared: LRU, FIFO, and LFU with attenuation. In addition, the distribution characteristics of web information are analyzed, demonstrating that the link popularity and replica popularity of a URL have a positive influence on its importance. Finally, the variance between link popularity and user popularity, and between replica popularity and user popularity, is analyzed, yielding important insights that help us improve the ranking algorithms in a search engine.
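"LFU with attenuation" is not defined in detail in the abstract; a plausible reading, sketched below, is an LFU query-result cache whose hit counters decay periodically so that queries that were hot long ago eventually age out. Capacity, decay factor, and decay period are illustrative parameters.

```python
# A sketch of an "LFU with attenuation" query-result cache: hit counters are
# periodically multiplied by a decay factor so that queries that were hot long
# ago do not occupy the cache forever. All parameter values are illustrative,
# not those studied in the paper.

class AttenuatedLFUCache:
    def __init__(self, capacity=1000, decay=0.5, decay_every=10000):
        self.capacity, self.decay, self.decay_every = capacity, decay, decay_every
        self.store, self.freq, self.accesses = {}, {}, 0

    def _maybe_decay(self):
        self.accesses += 1
        if self.accesses % self.decay_every == 0:
            for q in self.freq:
                self.freq[q] *= self.decay           # attenuate old popularity

    def get(self, query):
        self._maybe_decay()
        if query in self.store:
            self.freq[query] += 1
            return self.store[query]
        return None                                   # miss: caller runs the real search

    def put(self, query, results):
        if query not in self.store and len(self.store) >= self.capacity:
            victim = min(self.freq, key=self.freq.get)    # evict least-frequent entry
            self.store.pop(victim)
            self.freq.pop(victim)
        self.store[query] = results
        self.freq[query] = self.freq.get(query, 0) + 1
```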
Abstract: Users' behavior analysis has become one of the most important research topics, especially in terms of performance optimization, architecture analysis, and system maintenance, due to the rapid growth in search engine users. By adequately analyzing log data, researchers and Internet companies can obtain guidance toward better search engines. In this paper, we perform our analysis on approximately 750 million search requests obtained from the log of a real commercial search engine. Several aspects of user behavior are studied, including query length, the ratio of query refinement, recommendation access, and so on. Different information needs may lead to different behaviors, and we address this discussion in this paper. We firmly believe that these analyses will be helpful for improving both the effectiveness and efficiency of search engines.
Abstract: The volume of publicly available geospatial data on the web is rapidly increasing due to advances in server-based technologies and the ease with which data can now be created. However, challenges remain in connecting individuals searching for geospatial data with the servers and websites where such data exist. The objective of this paper is to present a publicly available Geospatial Search Engine (GSE) that utilizes a web crawler built on top of the Google search engine in order to search the web for geospatial data. The crawler seeding mechanism combines search terms entered by users with predefined keywords that identify geospatial data services. A procedure runs daily to update map server layers and metadata, and to eliminate servers that go offline. The GSE supports Web Map Services, ArcGIS services, and websites that have geospatial data for download. We applied the GSE to search for all available geospatial services under these formats and provide search results including the spatial distribution of all obtained services. While enhancements to our GSE and to web crawler technology in general lie ahead, our work represents an important step toward realizing the potential of a publicly accessible tool for discovering the global availability of geospatial data.
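The GSE's actual keyword list and search API calls belong to the authors; the sketch below only illustrates the seeding idea of combining user terms with assumed service-identifying keywords into queries for a backing web search engine.

```python
# A sketch of the seeding step described above: user terms are combined with
# predefined keywords that tend to identify geospatial services, and each
# combination becomes a query handed to the backing web search engine. The
# keyword list and query syntax here are illustrative assumptions.

SERVICE_KEYWORDS = [
    '"wms" "GetCapabilities"',        # OGC Web Map Service endpoints
    '"arcgis/rest/services"',         # ArcGIS REST service directories
    'shapefile download',             # plain data-download pages
]

def seed_queries(user_terms):
    return [f"{user_terms} {kw}" for kw in SERVICE_KEYWORDS]

if __name__ == "__main__":
    for q in seed_queries("land cover oregon"):
        print(q)
```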
Funding: Supported by the National Science Foundation of China (No. 62072224)
Abstract: Nowadays, with increasing numbers of open knowledge graphs (KGs) being published on the Web, users depend on open data portals and search engines to find KGs. However, existing systems provide search services and present results using only metadata while ignoring the contents of KGs, i.e., their triples, which makes users' comprehension and relevance judgement difficult. To overcome this limitation of metadata, in this paper we propose a content-based search engine for open KGs named CKGSE. Our system provides keyword search, KG snippet generation, and KG profiling and browsing, all based on KGs' detailed, informative contents rather than their brief, limited metadata. To evaluate its usability, we implemented a prototype with Chinese KGs crawled from OpenKG.CN and report some preliminary results and findings.
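CKGSE's real indexing, ranking, and snippet pipeline is not described in this abstract; the toy sketch below only illustrates the content-based idea of matching keywords against triples and returning matching triples as a snippet, over invented data.

```python
# A toy sketch of content-based KG search: keywords are matched against the
# subjects, predicates, and objects of each graph's triples, and the matching
# triples double as a small result snippet. The data here is invented and the
# ranking (by number of matching triples) is an illustrative simplification.

KGS = {
    "geo-kg": [("Nanjing", "locatedIn", "Jiangsu"), ("Jiangsu", "partOf", "China")],
    "film-kg": [("Ang Lee", "directed", "Life of Pi")],
}

def search(keyword):
    results = []
    for kg_name, triples in KGS.items():
        hits = [t for t in triples
                if any(keyword.lower() in str(x).lower() for x in t)]
        if hits:
            results.append((kg_name, hits[:3]))       # first few triples as a snippet
    return sorted(results, key=lambda r: len(r[1]), reverse=True)

if __name__ == "__main__":
    for kg, snippet in search("Jiangsu"):
        print(kg, snippet)
```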
Abstract: Being too critical, outspoken, or engaging in behavior deemed immoral can get you into big trouble in China. And it's not the authorities we're talking about here. The power of the people has found a new outlet to vent
Abstract: The human brain is smaller than you might expect: one of them, dripping with formaldehyde, fits in a single gloved hand of a lab supervisor. Soon, this rubbery organ will be frozen solid, coated in glue, and then sliced into several thousand wispy slivers, each just 60 micrometers thick. A custom apparatus will scan those sections using 3D polarized light imaging (3DPLI) to measure the spatial orientation of nerve fibers at the micrometer level. The scans will be gathered into
Abstract: This paper starts with a description of the present status of the Digital Library of India Initiative. As part of this initiative, a large corpus of scanned text is available in many Indian languages and has stimulated a vast amount of research in Indian language technology, briefly described in this paper. Other than the Digital Library of India Initiative, which is part of the Million Books to the Web Project initiated by Prof. Raj Reddy of Carnegie Mellon University, there are a few more initiatives in India aimed at taking the heritage of the country to the Web. This paper presents the future directions for the Digital Library of India Initiative, both in terms of the growing collection and the technical challenges that managing such a large collection poses.