As data grows in size,search engines face new challenges in extracting more relevant content for users’searches.As a result,a number of retrieval and ranking algorithms have been employed to ensure that the results a...As data grows in size,search engines face new challenges in extracting more relevant content for users’searches.As a result,a number of retrieval and ranking algorithms have been employed to ensure that the results are relevant to the user’s requirements.Unfortunately,most existing indexes and ranking algo-rithms crawl documents and web pages based on a limited set of criteria designed to meet user expectations,making it impossible to deliver exceptionally accurate results.As a result,this study investigates and analyses how search engines work,as well as the elements that contribute to higher ranks.This paper addresses the issue of bias by proposing a new ranking algorithm based on the PageRank(PR)algorithm,which is one of the most widely used page ranking algorithms We pro-pose weighted PageRank(WPR)algorithms to test the relationship between these various measures.The Weighted Page Rank(WPR)model was used in three dis-tinct trials to compare the rankings of documents and pages based on one or more user preferences criteria.Thefindings of utilizing the Weighted Page Rank model showed that using multiple criteria to rankfinal pages is better than using only one,and that some criteria had a greater impact on ranking results than others.展开更多
The basic idea behind a personalized web search is to deliver search results that are tailored to meet user needs, which is one of the growing concepts in web technologies. The personalized web search presented in thi...The basic idea behind a personalized web search is to deliver search results that are tailored to meet user needs, which is one of the growing concepts in web technologies. The personalized web search presented in this paper is based on exploiting the implicit feedbacks of user satisfaction during her web browsing history to construct a user profile storing the web pages the user is highly interested in. A weight is assigned to each page stored in the user’s profile;this weight reflects the user’s interest in this page. We name this weight the relative rank of the page, since it depends on the user issuing the query. Therefore, the ranking algorithm provided in this paper is based on the principle that;the rank assigned to a page is the addition of two rank values R_rank and A_rank. A_rank is an absolute rank, since it is fixed for all users issuing the same query, it only depends on the link structures of the web and on the keywords of the query. Thus, it could be calculated by the PageRank algorithm suggested by Brin and Page in 1998 and used by the google search engine. While, R_rank is the relative rank, it is calculated by the methods given in this paper which depends mainly on recording implicit measures of user satisfaction during her previous browsing history.展开更多
Google’s algorithm on PageRank is analyzed in details. Some disadvantages of this algorithm is presented, for instance, preferring old pages, ignoring special sites and inaccurate judge of hyperlinks pointed out from...Google’s algorithm on PageRank is analyzed in details. Some disadvantages of this algorithm is presented, for instance, preferring old pages, ignoring special sites and inaccurate judge of hyperlinks pointed out from one page. Furthermore, author’s improved algorithm is described. Experiments show that the author’s consideration on evaluating the importance of pages can make an improvement over the original algorithm. Based on this improved algorithm a topic specific searching system have been developed.展开更多
A search strategy over encrypted cloud data based on keywords has been improved and has presented a method using different strategies on the client and the server to improve the search efficiency in this paper. The cl...A search strategy over encrypted cloud data based on keywords has been improved and has presented a method using different strategies on the client and the server to improve the search efficiency in this paper. The client uses the Chinese and English to achieve the synonym construction of the keywords, the establishment of the fuzzy-syllable words and synonyms set of keywords and the implementation of fuzzy search strategy over the encryption of cloud data based on keywords. The server side through the analysis of the user’s query request provides keywords for users to choose and topic words and secondary words are picked out. System will match topic words with historical inquiry in time order, and then the new query result of the request is directly gained. The analysis of the simulation experiment shows that the fuzzy search strategy can make better use of historical results on the basis of privacy protection for the realization of efficient data search, saving the search time and improving the efficiency of search.展开更多
Purpose: The aim of this paper is to discuss how the keyword concentration change ratio(KCCR) is used while identifying the stability-mutation feature of Web search keywords during information analyses and predictions...Purpose: The aim of this paper is to discuss how the keyword concentration change ratio(KCCR) is used while identifying the stability-mutation feature of Web search keywords during information analyses and predictions.Design/methodology/approach: By introducing the stability-mutation feature of keywords and its significance, the paper describes the function of the KCCR in identifying keyword stability-mutation features. By using Ginsberg's influenza keywords, the paper shows how the KCCR can be used to identify the keyword stability-mutation feature effectively.Findings: Keyword concentration ratio has close positive correlation with the change rate of research objects retrieved by users, so from the characteristic of the 'stability-mutation' of keywords, we can understand the relationship between these keywords and certain information. In general, keywords representing for mutation fit for the objects changing in short-term, while those representing for stability are suitable for long-term changing objects. Research limitations: It is difficult to acquire the frequency of keywords, so indexes or parameters which are closely related to the true search volume are chosen for this study.Practical implications: The stability-mutation feature identification of Web search keywords can be applied to predict and analyze the information of unknown public events through observing trends of keyword concentration ratio.Originality/value: The stability-mutation feature of Web search could be quantitatively described by the keyword concentration change ratio(KCCR). Through KCCR, the authors took advantage of Ginsberg's influenza epidemic data accordingly and demonstrated how accurate and effective the method proposed in this paper was while it was used in information analyses and predictions.展开更多
Traditionally, SQL query language is used to search the data in databases. However, it is inappropriate for end-users, since it is complex and hard to learn. It is the need of end-user, searching in databases with key...Traditionally, SQL query language is used to search the data in databases. However, it is inappropriate for end-users, since it is complex and hard to learn. It is the need of end-user, searching in databases with keywords, like in web search engines. This paper presents a survey of work on keyword search in databases. It also includes a brief introduction to the SEEKER system which has been developed.展开更多
Electronic medical records (EMR) facilitate the sharing of medical data, but existing sharing schemes suffer fromprivacy leakage and inefficiency. This article proposes a lightweight, searchable, and controllable EMR ...Electronic medical records (EMR) facilitate the sharing of medical data, but existing sharing schemes suffer fromprivacy leakage and inefficiency. This article proposes a lightweight, searchable, and controllable EMR sharingscheme, which employs a large attribute domain and a linear secret sharing structure (LSSS), the computationaloverhead of encryption and decryption reaches a lightweight constant level, and supports keyword search andpolicy hiding, which improves the high efficiency of medical data sharing. The dynamic accumulator technologyis utilized to enable data owners to flexibly authorize or revoke the access rights of data visitors to the datato achieve controllability of the data. Meanwhile, the data is re-encrypted by Intel Software Guard Extensions(SGX) technology to realize resistance to offline dictionary guessing attacks. In addition, blockchain technology isutilized to achieve credible accountability for abnormal behaviors in the sharing process. The experiments reflectthe obvious advantages of the scheme in terms of encryption and decryption computation overhead and storageoverhead, and theoretically prove the security and controllability in the sharing process, providing a feasible solutionfor the safe and efficient sharing of EMR.展开更多
文摘As data grows in size,search engines face new challenges in extracting more relevant content for users’searches.As a result,a number of retrieval and ranking algorithms have been employed to ensure that the results are relevant to the user’s requirements.Unfortunately,most existing indexes and ranking algo-rithms crawl documents and web pages based on a limited set of criteria designed to meet user expectations,making it impossible to deliver exceptionally accurate results.As a result,this study investigates and analyses how search engines work,as well as the elements that contribute to higher ranks.This paper addresses the issue of bias by proposing a new ranking algorithm based on the PageRank(PR)algorithm,which is one of the most widely used page ranking algorithms We pro-pose weighted PageRank(WPR)algorithms to test the relationship between these various measures.The Weighted Page Rank(WPR)model was used in three dis-tinct trials to compare the rankings of documents and pages based on one or more user preferences criteria.Thefindings of utilizing the Weighted Page Rank model showed that using multiple criteria to rankfinal pages is better than using only one,and that some criteria had a greater impact on ranking results than others.
文摘The basic idea behind a personalized web search is to deliver search results that are tailored to meet user needs, which is one of the growing concepts in web technologies. The personalized web search presented in this paper is based on exploiting the implicit feedbacks of user satisfaction during her web browsing history to construct a user profile storing the web pages the user is highly interested in. A weight is assigned to each page stored in the user’s profile;this weight reflects the user’s interest in this page. We name this weight the relative rank of the page, since it depends on the user issuing the query. Therefore, the ranking algorithm provided in this paper is based on the principle that;the rank assigned to a page is the addition of two rank values R_rank and A_rank. A_rank is an absolute rank, since it is fixed for all users issuing the same query, it only depends on the link structures of the web and on the keywords of the query. Thus, it could be calculated by the PageRank algorithm suggested by Brin and Page in 1998 and used by the google search engine. While, R_rank is the relative rank, it is calculated by the methods given in this paper which depends mainly on recording implicit measures of user satisfaction during her previous browsing history.
文摘Google’s algorithm on PageRank is analyzed in details. Some disadvantages of this algorithm is presented, for instance, preferring old pages, ignoring special sites and inaccurate judge of hyperlinks pointed out from one page. Furthermore, author’s improved algorithm is described. Experiments show that the author’s consideration on evaluating the importance of pages can make an improvement over the original algorithm. Based on this improved algorithm a topic specific searching system have been developed.
文摘A search strategy over encrypted cloud data based on keywords has been improved and has presented a method using different strategies on the client and the server to improve the search efficiency in this paper. The client uses the Chinese and English to achieve the synonym construction of the keywords, the establishment of the fuzzy-syllable words and synonyms set of keywords and the implementation of fuzzy search strategy over the encryption of cloud data based on keywords. The server side through the analysis of the user’s query request provides keywords for users to choose and topic words and secondary words are picked out. System will match topic words with historical inquiry in time order, and then the new query result of the request is directly gained. The analysis of the simulation experiment shows that the fuzzy search strategy can make better use of historical results on the basis of privacy protection for the realization of efficient data search, saving the search time and improving the efficiency of search.
基金supported by National Social Science Foundation of China(Grand No.13&ZD173)
文摘Purpose: The aim of this paper is to discuss how the keyword concentration change ratio(KCCR) is used while identifying the stability-mutation feature of Web search keywords during information analyses and predictions.Design/methodology/approach: By introducing the stability-mutation feature of keywords and its significance, the paper describes the function of the KCCR in identifying keyword stability-mutation features. By using Ginsberg's influenza keywords, the paper shows how the KCCR can be used to identify the keyword stability-mutation feature effectively.Findings: Keyword concentration ratio has close positive correlation with the change rate of research objects retrieved by users, so from the characteristic of the 'stability-mutation' of keywords, we can understand the relationship between these keywords and certain information. In general, keywords representing for mutation fit for the objects changing in short-term, while those representing for stability are suitable for long-term changing objects. Research limitations: It is difficult to acquire the frequency of keywords, so indexes or parameters which are closely related to the true search volume are chosen for this study.Practical implications: The stability-mutation feature identification of Web search keywords can be applied to predict and analyze the information of unknown public events through observing trends of keyword concentration ratio.Originality/value: The stability-mutation feature of Web search could be quantitatively described by the keyword concentration change ratio(KCCR). Through KCCR, the authors took advantage of Ginsberg's influenza epidemic data accordingly and demonstrated how accurate and effective the method proposed in this paper was while it was used in information analyses and predictions.
文摘Traditionally, SQL query language is used to search the data in databases. However, it is inappropriate for end-users, since it is complex and hard to learn. It is the need of end-user, searching in databases with keywords, like in web search engines. This paper presents a survey of work on keyword search in databases. It also includes a brief introduction to the SEEKER system which has been developed.
基金the Natural Science Foundation of Hebei Province under Grant Number F2021201052.
文摘Electronic medical records (EMR) facilitate the sharing of medical data, but existing sharing schemes suffer fromprivacy leakage and inefficiency. This article proposes a lightweight, searchable, and controllable EMR sharingscheme, which employs a large attribute domain and a linear secret sharing structure (LSSS), the computationaloverhead of encryption and decryption reaches a lightweight constant level, and supports keyword search andpolicy hiding, which improves the high efficiency of medical data sharing. The dynamic accumulator technologyis utilized to enable data owners to flexibly authorize or revoke the access rights of data visitors to the datato achieve controllability of the data. Meanwhile, the data is re-encrypted by Intel Software Guard Extensions(SGX) technology to realize resistance to offline dictionary guessing attacks. In addition, blockchain technology isutilized to achieve credible accountability for abnormal behaviors in the sharing process. The experiments reflectthe obvious advantages of the scheme in terms of encryption and decryption computation overhead and storageoverhead, and theoretically prove the security and controllability in the sharing process, providing a feasible solutionfor the safe and efficient sharing of EMR.