The basic idea behind personalized web search is to deliver search results tailored to user needs, one of the growing concepts in web technologies. The personalized web search presented in this paper exploits implicit feedback on user satisfaction, gathered from the user's web browsing history, to construct a user profile storing the web pages the user is most interested in. A weight is assigned to each page stored in the user's profile; this weight reflects the user's interest in the page. We call this weight the relative rank of the page, since it depends on the user issuing the query. The ranking algorithm provided in this paper is therefore based on the principle that the rank assigned to a page is the sum of two rank values, R_rank and A_rank. A_rank is an absolute rank: it is fixed for all users issuing the same query and depends only on the link structure of the web and the keywords of the query, so it can be calculated by the PageRank algorithm proposed by Brin and Page in 1998 and used by the Google search engine. R_rank is the relative rank, calculated by the methods given in this paper, which depend mainly on recording implicit measures of user satisfaction during the user's previous browsing history.
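The rank combination the abstract describes (total rank = A_rank + R_rank) can be sketched as follows. The PageRank step is the standard algorithm; `user_profile` and the function names are illustrative stand-ins for the paper's implicitly learned profile, not its actual interface.

```python
# Illustrative sketch: total rank = A_rank (query-independent PageRank)
# + R_rank (per-user weight learned from implicit feedback).
# The profile format and function names are assumptions.

def pagerank(links, damping=0.85, iters=50):
    """Standard PageRank over an adjacency dict {page: [outlinks]}."""
    pages = set(links) | {q for outs in links.values() for q in outs}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        new = {p: (1 - damping) / n for p in pages}
        for p in pages:
            outs = links.get(p, [])
            if outs:
                share = damping * rank[p] / len(outs)
                for q in outs:
                    new[q] += share
            else:  # dangling page: spread its rank evenly
                for q in pages:
                    new[q] += damping * rank[p] / n
        rank = new
    return rank

def personalized_rank(links, user_profile):
    """A_rank + R_rank: add the user's implicit-interest weight
    (R_rank) to the absolute PageRank score (A_rank)."""
    a_rank = pagerank(links)
    return {p: a_rank[p] + user_profile.get(p, 0.0) for p in a_rank}

links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
profile = {"b": 0.5}  # strong implicit interest in page "b"
ranks = personalized_rank(links, profile)
```

With the profile boost, page "b" overtakes the structurally stronger page "c" for this user, while other users still see the plain PageRank ordering.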
As the tsunami of data has emerged, search engines have become the most powerful tool for obtaining scattered information on the internet. Traditional search engines return organized results using ranking algorithms such as term frequency and link analysis (the PageRank and HITS algorithms). However, these algorithms must combine keyword frequency to determine the relevance between a user's query and the data on a computer system or the internet. Moreover, we expect search engines to understand users' searches by content meaning rather than literal strings. The Semantic Web is an intelligent network that can understand human language more semantically and ease communication between humans and computers. However, current semantic search technology is hard to apply: metadata must be annotated on every web page before the search engine can understand the user's intent, and annotating every web page is very time-consuming and inefficient. This study therefore designed an ontology-based approach to improve traditional keyword-based search and emulate the effects of semantic search, letting the search engine understand users more semantically once it acquires the knowledge.
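The idea of matching meaning rather than literal strings can be illustrated with a toy query-expansion step: look the query terms up in an ontology and add their related concepts before keyword matching. The ontology, documents, and function names below are illustrative assumptions, not the study's implementation.

```python
# Toy ontology-based query expansion: a query for "car" also matches
# documents mentioning "automobile" or "vehicle". The ontology here
# is a hand-made stand-in for a real concept hierarchy.
TOY_ONTOLOGY = {
    "car": {"automobile", "vehicle"},
    "automobile": {"car", "vehicle"},
}

def expand(query):
    """Add ontology-related concepts to the literal query terms."""
    terms = set(query.lower().split())
    for t in list(terms):
        terms |= TOY_ONTOLOGY.get(t, set())
    return terms

def search(query, docs):
    """Rank documents by how many expanded terms they contain."""
    terms = expand(query)
    hits = [(sum(t in d.lower().split() for t in terms), d) for d in docs]
    return [d for score, d in sorted(hits, reverse=True) if score > 0]
```

A purely keyword-based engine would miss the match below, since the document never contains the literal string "car".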
Text summarization models help biomedical clinicians and researchers acquire informative data from the enormous domain-specific literature in less time and with less effort. Evaluating and selecting the most informative sentences from biomedical articles is always challenging. This study aims to develop a dual-mode biomedical text summarization model that achieves enhanced coverage and information content. The research also examines the fit of appropriate graph ranking techniques for improved performance of the summarization model. The input biomedical text is mapped to a graph in which meaningful sentences are evaluated as central nodes together with the critical associations between them. The proposed framework uses a top-k similarity technique in combination with UMLS and a sampled probability-based clustering method, which helps unearth the relevant meanings of biomedical domain-specific word vectors and find the best possible associations between crucial sentences. The quality of the framework is assessed via parameters such as information retention, coverage, readability, cohesion, and ROUGE scores in clustering and non-clustering modes. The significant benefits of the suggested technique are capturing crucial biomedical information with increased coverage and reasonable memory consumption. The configurable settings of the combined parameters reduce execution time, enhance memory utilization, and extract relevant information, outperforming other biomedical baseline models. An improvement of 17% is achieved when the proposed model is checked against similar biomedical text summarizers.
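As a rough illustration of the sentence-graph idea (sentences as nodes, edges to each sentence's top-k most similar peers, the most central sentences kept), here is a minimal sketch using plain bag-of-words cosine similarity in place of the paper's UMLS-informed vectors and clustering:

```python
# Minimal sentence-graph extractive summarizer: score each sentence
# by the summed weight of its k strongest similarity edges, then keep
# the n_keep most central sentences. Bag-of-words cosine is a
# simplified stand-in for the paper's domain-specific word vectors.
import math
from collections import Counter

def cosine(a, b):
    """Bag-of-words cosine similarity between two sentences."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[t] * cb[t] for t in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

def summarize(sentences, k=2, n_keep=1):
    """Keep the n_keep sentences with the strongest top-k edges."""
    scores = []
    for i, s in enumerate(sentences):
        sims = sorted((cosine(s, t) for j, t in enumerate(sentences) if j != i),
                      reverse=True)
        scores.append(sum(sims[:k]))
    top = sorted(range(len(sentences)), key=lambda i: -scores[i])[:n_keep]
    return [sentences[i] for i in sorted(top)]
```

Sentences that share vocabulary with many peers sit at the center of the graph and are selected; off-topic sentences end up isolated and are dropped.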
The structure of Web sites has become more complex than before. During the design period of a Web site, the lack of models and methods results in an improper Web structure that depends on the designer's experience. From the point of view of software engineering, every period in the software life cycle must be evaluated before starting the next period's work, so it is important and essential to find methods for evaluating a Web structure before the site is completed. In this work, after studying the related work on Web structure mining and analyzing the major structure mining methods (PageRank and Hub/Authority), a PageRank-based method for Web structure evaluation at the design stage is proposed. A Web structure modeling language, WSML, is designed, and implementation strategies for a Web site structure evaluation system are given. Web structure mining has previously been used mainly in search engines; this is the first time Web structure mining technology has been employed to evaluate a Web structure during the design period of a Web site. The work contributes to the formalization of design documents for Web sites and to improving software engineering for large-scale Web sites, and the evaluation system is a practical tool for Web site construction.
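WSML and the full evaluation system are defined in the paper itself; as a stand-in, the sketch below shows one structural check a design-stage evaluator could run on a planned site map: finding pages that no navigation path reaches from the home page. The site-map format and function name are assumptions for illustration.

```python
# Design-stage structural check: given a planned site map as an
# adjacency dict {page: [linked pages]}, report pages unreachable
# from the home page via breadth-first search.
from collections import deque

def unreachable_pages(sitemap, home="index"):
    """Return the set of pages no link path reaches from `home`."""
    seen = {home}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for nxt in sitemap.get(page, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    all_pages = set(sitemap) | {q for outs in sitemap.values() for q in outs}
    return all_pages - seen
```

Running such checks on the model, before any page is built, is the kind of early feedback the design-stage evaluation argues for.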
Text summarization is an essential area in text mining that provides procedures for text extraction. In natural language processing, text summarization maps documents to a representative set of descriptive words; the objective of text extraction is therefore to obtain reduced, expressive content from text documents. Text summarization has two main areas: abstractive and extractive summarization. Extractive text summarization has two further approaches: the first applies a sentence score algorithm, and the second follows word embedding principles. All such text extractions are limited in conveying the basic theme of the underlying documents. In this paper, we perform text summarization with TF-IDF combined with PageRank keywords, a sentence score algorithm, and Word2Vec word embeddings. The study compared these forms of text summarization with the actual text by calculating cosine similarities. Furthermore, TF-IDF based PageRank keywords are extracted from the other two extractive summarizations, and an intersection over these three sets of TF-IDF keywords is performed to generate a more representative set of keywords for each text document. This technique generates variable-length keywords according to document diversity instead of selecting fixed-length keywords for each document. This form of abstractive summarization improves metadata similarity to the original text compared with all other forms of summarized text. It also solves the issue of deciding the number of representative keywords for a specific text document. To evaluate the technique, the study used a sample of more than eighteen hundred text documents. The abstractive summarization follows the principles of deep learning to create uniform similarity of the extracted words with the actual text and all other forms of text summarization. The proposed technique provides a stable measure of similarity compared with existing forms of text summarization.
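The keyword-intersection step can be sketched as follows: extract TF-IDF keywords from each summary variant and keep only the terms common to all, which naturally yields a variable-length keyword set per document. The tokenizer and scoring below are simplified stand-ins for the paper's pipeline.

```python
# Sketch of keyword intersection across summary variants: score terms
# by a simple TF-IDF against a reference corpus, take the top-k from
# each summary, and intersect. Surviving terms form a variable-length
# keyword set whose size depends on how much the summaries agree.
import math
from collections import Counter

def tfidf_keywords(doc, corpus, top_k=5):
    """Top-k terms of `doc` by TF-IDF against `corpus`."""
    tokens = doc.lower().split()
    tf = Counter(tokens)
    n_docs = len(corpus)
    def idf(term):
        df = sum(term in d.lower().split() for d in corpus)
        return math.log((1 + n_docs) / (1 + df)) + 1.0
    scores = {t: (tf[t] / len(tokens)) * idf(t) for t in tf}
    top = sorted(scores.items(), key=lambda kv: -kv[1])[:top_k]
    return {t for t, _ in top}

def intersect_keywords(summaries, corpus):
    """Keywords that survive in every summary variant."""
    keyword_sets = [tfidf_keywords(s, corpus) for s in summaries]
    return set.intersection(*keyword_sets)
```

Because the result is an intersection rather than a fixed-size top list, documents on which the summarizers agree strongly get more keywords than documents on which they diverge.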
The entirety of Amazon's sales is powered by Amazon Search, one of the leading e-commerce platforms around the globe. As a result, even slight boosts in relevance can have a major impact on profits as well as on the shopping experience of millions of users. In the beginning, Amazon's product search engine was made up of a number of manually tuned ranking processes that used a limited number of input features; since then, much has changed. Many people overlook the fact that Amazon is a search engine, indeed the biggest one for e-commerce. It is time to start treating Amazon as the top e-commerce search engine across the globe, as it currently serves 54% of all product queries. In this paper, the authors consider the two most important Amazon search engine algorithms, A10 and A11, and present a comparative study.
The ranking of network node importance is one of the most essential problems in the field of network science. Node ranking algorithms serve as an essential part of many application scenarios such as search engines, social networks, and recommendation systems. This paper presents a systematic review of three representative methods: node ranking based on centralities, the PageRank algorithm, and the HITS algorithm. Furthermore, we investigate the latest extensions and improvements of these representative methods, together with their main application fields. Inspired by the survey of the current literature, we propose promising directions for future research. The conclusions of this paper are enlightening and beneficial to both the academic and industrial communities.
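Of the three methods the review covers, HITS is the most compact to sketch: each iteration sets a node's authority score to the sum of the hub scores of nodes linking to it, and its hub score to the sum of the authority scores of nodes it links to, normalizing after each step. A minimal version:

```python
# Minimal HITS iteration over an adjacency dict {node: [targets]}.
# Authorities are nodes many good hubs point to; hubs are nodes that
# point to many good authorities.
def hits(links, iters=50):
    nodes = set(links) | {q for outs in links.values() for q in outs}
    hub = {v: 1.0 for v in nodes}
    auth = {v: 1.0 for v in nodes}
    for _ in range(iters):
        # authority: sum of hub scores of in-neighbors
        auth = {v: sum(hub[p] for p, outs in links.items() if v in outs)
                for v in nodes}
        norm = sum(x * x for x in auth.values()) ** 0.5 or 1.0
        auth = {v: x / norm for v, x in auth.items()}
        # hub: sum of authority scores of out-neighbors
        hub = {v: sum(auth[q] for q in links.get(v, [])) for v in nodes}
        norm = sum(x * x for x in hub.values()) ** 0.5 or 1.0
        hub = {v: x / norm for v, x in hub.items()}
    return hub, auth

links = {"a": ["c"], "b": ["c"], "c": []}
hub, auth = hits(links)
```

In this toy graph, "c" emerges as the sole authority (everything points to it) while "a" and "b" are equally good hubs.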
The ratings in many user-object online rating systems can reflect whether users like or dislike the objects, and in some online rating systems users can directly choose whether to like an object. Such systems can therefore be represented by signed bipartite networks, but the original unsigned node evaluation algorithms cannot be used directly on signed networks. This paper proposes the Signed PageRank algorithm for signed bipartite networks, which evaluates object and user nodes at the same time. Based on global information, the nodes can be sorted by their Signed PageRank values in descending order; the result is the SR Ranking. The authors analyze the characteristics of the top and bottom nodes of real networks and find that, for objects, the SR Ranking provides a more reasonable ranking that combines a node's degree and rating, and the algorithm can also help identify users with specific rating patterns. By discussing the location of negative edges and the sensitivity of the object SR Ranking to negative edges, the authors also show that negative edges play an important role in the algorithm and explain why bad reviews are more important in real networks.
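The paper's exact update rules are not reproduced here; the sketch below only illustrates the core intuition of signed propagation on a bipartite rating graph: positive edges transfer rank with a plus sign and negative edges with a minus, so consistently disliked objects sink in the ranking.

```python
# Illustrative signed rank propagation on a user-object bipartite
# graph. This is a simplified stand-in for the paper's Signed
# PageRank, not its actual formulation: each endpoint of an edge
# passes a degree-normalized, sign-weighted share of its rank.
def signed_rank(edges, users, objects, damping=0.85, iters=30):
    """edges: list of (user, object, sign) with sign in {+1, -1}."""
    nodes = list(users) + list(objects)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    degree = {v: 0 for v in nodes}
    for u, o, _ in edges:
        degree[u] += 1
        degree[o] += 1
    for _ in range(iters):
        new = {v: (1 - damping) / n for v in nodes}
        for u, o, sign in edges:
            # signed share flows across the edge in both directions
            new[o] += damping * sign * rank[u] / max(degree[u], 1)
            new[u] += damping * sign * rank[o] / max(degree[o], 1)
        rank = new
    return rank
```

With two users liking object "o1" and disliking object "o2", the disliked object ends up ranked below the liked one, which matches the abstract's observation that negative edges (bad reviews) carry decisive weight.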
Funding (node importance ranking review): the National Natural Science Foundation of China (Grant No. 71901205).
Funding (Signed PageRank study): supported by the National Natural Science Foundation of China under Grant Nos. 61573065 and 71731002.