15 articles found
Method of acquiring web features and its application in web search (Cited by: 1)
1
Authors: 薛晔伟, 沈钧毅, 张云, 鲍军鹏. Journal of Southeast University (English Edition) (EI, CAS), 2008, No. 3, pp. 330-334, 5 pages.
Focusing on the problem that it is hard to utilize multi-field web information of various forms in large-scale web search, a novel approach that automatically acquires features from web pages based on a set of well-defined rules is proposed. The features describe the contents of web pages from different aspects and can be used to improve the ranking performance of web search. The acquired features have the advantages of a unified form and less noise, and can easily be used in web page relevance ranking. A special specification for judging the relevance between user queries and acquired features is also proposed. Experimental results show that the features acquired by the proposed approach and the feature relevance specification can significantly improve the relevance ranking performance of web search.
Keywords: web search relevance ranking retrieval effectiveness
Ontology mapping approach using web search engine (Cited by: 1)
2
Authors: 李珂玥, 徐宝文, 汪鹏. Journal of Southeast University (English Edition) (EI, CAS), 2007, No. 3, pp. 352-356, 5 pages.
A new mapping approach for automated ontology mapping using web search engines (such as Google) is presented. Based on lexico-syntactic patterns, the hyponymy relationships between ontology concepts can be obtained from the web by search engines, and an initial candidate mapping set consisting of ontology concept pairs is generated. According to the concept hierarchies of the ontologies, a set of production rules is proposed to delete concept pairs inconsistent with the ontology semantics from the initial candidate mapping set and to add concept pairs consistent with the ontology semantics to it. Finally, ontology mappings are chosen from the candidate mapping set automatically with a mapping selection rule based on mutual information. Experimental results show that the F-measure can reach 75% to 100% and that the approach can effectively accomplish the mapping between ontologies.
Keywords: semantic web ONTOLOGY ontology mapping web search engine
Chinese college students' Web querying behaviors:A case study of Peking University
3
Authors: QU Peng, LIU Chang, LAI Maosheng. Chinese Journal of Library and Information Science, 2010, No. 4, pp. 23-36, 14 pages.
This study examined users' querying behaviors based on a sample of 30 Chinese college students from Peking University. The authors designed 5 search tasks, and each participant conducted two randomly selected search tasks during the experiment. The results show that when searching for pre-designed search tasks, users often have relatively clear goals and strategies before searching. When formulating their queries, users often select words from tasks, use concrete concepts directly, or extract 'central words' or keywords. When reformulating queries, seven query reformulation types were identified from users' behaviors, i.e. broadening, narrowing, issuing a new query, paralleling, changing search tools, reformulating syntax terms, and clicking on suggested queries. The results reveal that the search results and/or the contexts can also influence users' querying behaviors.
Keywords: web searching Query behavior Query formulation Query reformulation
An Efficient Multi-Keyword Query Processing Strategy on P2P Based Web Search (Cited by: 2)
4
Authors: SHEN Derong, LI Meifang, ZHU Hongkai, YU Ge. Wuhan University Journal of Natural Sciences (CAS), 2007, No. 5, pp. 881-886, 6 pages.
The paper presents a novel benefit-based query processing strategy for efficient query routing. Using a DHT as the overlay network, it first applies Nash equilibrium to construct the optimal peer group based on the correlations of keywords and the coverage and overlap of the peers, in order to decrease the time cost. It then presents a two-layered architecture for query processing that utilizes a Bloom filter as a compact representation to reduce bandwidth consumption. Extensive experiments conducted on a real-world dataset demonstrate that the approach clearly decreases processing time while also improving precision and recall.
Keywords: multi-keyword P2P web search CORRELATION coverage and overlap Nash equilibrium
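Note on entry 4: the abstract names a Bloom filter as the compact representation exchanged between peers but gives no construction. The following is a minimal, generic Bloom filter sketch in Python, not the authors' implementation. The class name `BloomFilter`, the sizes `num_bits` and `num_hashes`, the SHA-256-based hashing, and the `doc-*` identifiers are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Generic Bloom filter: a fixed-size bit array with k hash positions per item."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)  # all bits start at 0

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True means "possibly present"; False means "definitely absent".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


# Hypothetical usage: a peer summarises the documents matching one keyword.
peer_docs = BloomFilter()
for doc_id in ["doc-1", "doc-7", "doc-42"]:
    peer_docs.add(doc_id)
```

Under these assumptions a peer would send the 128-byte bit array instead of its full document list; membership tests can yield false positives but never false negatives, which is the usual trade-off behind the bandwidth saving.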
WEB BASED TRANSLATION OF CHINESE ORGANIZATION NAME
5
Authors: Yang Muyun, Liu Daxin, Zhao Tiejun, Qi Haoliang, Lin Kaiming. Journal of Electronics (China), 2009, No. 2, pp. 279-284, 6 pages.
A web-based translation method for Chinese organization names is proposed. After analyzing the structure of Chinese organization names, methods of bilingual query formulation and maximum-entropy-based translation re-ranking are suggested to retrieve the English translation from the web via a public search engine. Experiments on Chinese university names demonstrate the validity of this approach.
Keywords: Named entity Chinese organization name translation Bilingual query web search
Hybrid Algorithm to Evaluate E-Business Website Comments
6
Authors: Osama M. Rababah, Ahmad K. Hwaitat, Dana A. Al Qudah, Rula Halaseh. Communications and Network, 2016, No. 3, pp. 137-143, 7 pages.
Online reviews are considered an important indicator for users deciding on an activity, whether it is watching a movie, going to a restaurant, or buying a product. They also serve businesses by keeping track of user feedback. The sheer volume of online reviews makes it difficult for a human to process and extract all significant information to make purchasing choices. As a result, there has been a trend toward systems that can automatically summarize opinions from a set of reviews. In this paper, we present a hybrid algorithm that combines an auto-summarization algorithm with a sentiment analysis (SA) algorithm to offer a personalized user experience and to solve the semantic-pragmatic gap. The algorithm consists of six steps that start with the original text document and generate a summary of that text by choosing the N most relevant sentences in the text. The tagged texts are then processed and passed to a Naive Bayesian classifier along with their tags as training data. The raw data used in this paper belong to the tagged corpus of positive and negative processed movie reviews introduced in [1]. The measures used to gauge the performance of the SA and classification algorithm for all test cases are accuracy, recall, and precision. We describe in detail both the aspect extraction and sentiment detection modules of our system.
Keywords: Auto-Summarization Comments Evaluation web Search Semantic-Pragmatic Gap Natural Language Processing Machine Learning Sentiment Detection web 2.0
Stability-mutation feature identification of Web search keywords based on keyword concentration change ratio
7
Authors: Hongtao LU, Guanghui YE, Gang LI. Chinese Journal of Library and Information Science, 2014, No. 3, pp. 33-44, 12 pages.
Purpose: The aim of this paper is to discuss how the keyword concentration change ratio (KCCR) is used to identify the stability-mutation feature of Web search keywords during information analyses and predictions. Design/methodology/approach: By introducing the stability-mutation feature of keywords and its significance, the paper describes the function of the KCCR in identifying keyword stability-mutation features. Using Ginsberg's influenza keywords, the paper shows how the KCCR can be used to identify the keyword stability-mutation feature effectively. Findings: The keyword concentration ratio has a close positive correlation with the change rate of the research objects retrieved by users, so from the 'stability-mutation' characteristic of keywords we can understand the relationship between these keywords and certain information. In general, keywords representing mutation fit objects that change in the short term, while those representing stability suit objects that change over the long term. Research limitations: It is difficult to acquire the frequency of keywords, so indexes or parameters closely related to the true search volume are chosen for this study. Practical implications: The stability-mutation feature identification of Web search keywords can be applied to predict and analyze information on unknown public events by observing trends in the keyword concentration ratio. Originality/value: The stability-mutation feature of Web search can be quantitatively described by the keyword concentration change ratio (KCCR). Through the KCCR, the authors used Ginsberg's influenza epidemic data to demonstrate how accurate and effective the proposed method is when used in information analyses and predictions.
Keywords: web search web search keyword Information analysis and prediction Concentration change ratio Feature identification Influenza epidemic
ISTC: A New Method for Clustering Search Results (Cited by: 2)
8
Authors: ZHANG Wei, XU Baowen, ZHANG Weifeng, XU Junling. Wuhan University Journal of Natural Sciences (CAS), 2008, No. 4, pp. 501-504, 4 pages.
A new common phrase scoring method is proposed based on term frequency-inverse document frequency (TFIDF) and the independence of the phrase. Combining the two properties helps identify more reasonable common phrases, which improves the accuracy of clustering. An equation to measure the independence of a phrase is also proposed in this paper. The new algorithm, which improves the suffix tree clustering algorithm (STC), is named improved suffix tree clustering (ISTC). To validate the proposed algorithm, a prototype system is implemented and used to cluster several groups of web search results obtained from the Google search engine. Experimental results show that the improved algorithm offers higher accuracy than traditional suffix tree clustering.
Keywords: web search results clustering suffix tree term frequency-inverse document frequency (TFIDF) independence of phrases
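Note on entry 8: the abstract relies on TFIDF for phrase scoring but does not reproduce the scoring or independence equations. As a rough illustration of the TFIDF part only, here is a textbook tf-idf computation in Python. The function name `tf_idf`, the normalised term-frequency variant, the natural-log idf, and the sample snippets are assumptions; the paper's phrase-independence measure is not shown because the abstract does not define it.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Textbook tf-idf. docs is a list of non-empty token lists.

    Returns one {term: score} dict per document, where
    score = (count / doc_length) * ln(num_docs / doc_frequency).
    """
    num_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document

    scores = []
    for tokens in docs:
        counts = Counter(tokens)
        scores.append(
            {
                term: (count / len(tokens)) * math.log(num_docs / doc_freq[term])
                for term, count in counts.items()
            }
        )
    return scores


# Hypothetical search-result snippets, already tokenised.
snippets = [
    ["search", "results", "suffix", "tree"],
    ["search", "results", "clustering"],
    ["search", "results", "suffix", "tree", "clustering"],
]
snippet_scores = tf_idf(snippets)
```

In this sketch, terms that occur in every snippet (here "search" and "results") score zero, which is why tf-idf alone tends to favour phrases that distinguish one group of results from another.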
User behavior modeling for better Web search ranking (Cited by: 1)
9
Authors: Yiqun LIU, Chao WANG, Min ZHANG, Shaoping MA. Frontiers of Computer Science (SCIE, EI, CSCD), 2017, No. 6, pp. 923-936, 14 pages.
Modern search engines record user interactions and use them to improve search quality. In particular, user click-through has been successfully used to improve click-through rate (CTR), Web search ranking, and query recommendations and suggestions. Although click-through logs can provide implicit feedback of users' click preferences, deriving accurate absolute relevance judgments is difficult because of the existence of click noises and behavior biases. Previous studies showed that user clicking behaviors are biased toward many aspects such as "position" (user's attention decreases from top to bottom) and "trust" (Web site reputations will affect user's judgment). To address these problems, researchers have proposed several behavior models (usually referred to as click models) to describe users' practical browsing behaviors and to obtain an unbiased estimation of result relevance. In this study, we review recent efforts to construct click models for better search ranking and propose a novel convolutional neural network architecture for building click models. Compared to traditional click models, our model not only considers user behavior assumptions as input signals but also uses the content and context information of search engine result pages. In addition, our model uses parameters from traditional click models to restrict the meaning of some outputs in our model's hidden layer. Experimental results show that the proposed model can achieve considerable improvement over state-of-the-art click models based on the evaluation metric of click perplexity.
Keywords: user behavior click model web search
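Note on entry 9: the abstract evaluates click models by click perplexity without stating the formula. A commonly used definition, assumed here rather than taken from the paper, computes for one result position the value 2 raised to the negative mean log2-likelihood of the observed clicks. The function name `click_perplexity` and the sample values are illustrative.

```python
import math


def click_perplexity(clicks, probs):
    """Click perplexity at a single rank (commonly used definition, assumed here).

    clicks: observed click indicators (0 or 1), one per session.
    probs:  predicted click probabilities, each strictly between 0 and 1.
    """
    log_likelihood = 0.0
    for clicked, prob in zip(clicks, probs):
        log_likelihood += clicked * math.log2(prob) + (1 - clicked) * math.log2(1 - prob)
    return 2 ** (-log_likelihood / len(clicks))


# A model that always predicts 0.5 is no better than a coin flip.
coin_flip = click_perplexity([1, 0], [0.5, 0.5])
# A model whose predictions match the observed clicks scores closer to 1.
informed = click_perplexity([1, 0], [0.9, 0.1])
```

Under this definition a perfect model approaches 1 and lower values are better, so an improvement "based on click perplexity" means a reduction of this quantity.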
Comparison of Three Web Search Algorithms
10
Authors: Ying Bao, Zi-hu Zhu. Acta Mathematicae Applicatae Sinica (SCIE, CSCD), 2006, No. 3, pp. 517-528, 12 pages.
In this paper we discuss three important kinds of Markov chains used in Web search algorithms: the maximal irreducible Markov chain, the minimal irreducible Markov chain and the middle irreducible Markov chain. We discuss the stationary distributions, the convergence rates and the Maclaurin series of the stationary distributions of the three kinds of Markov chains. Among other things, our results show that the maximal and minimal Markov chains have the same stationary distribution and that the stationary distribution of the middle Markov chain reflects the real Web structure more objectively. Our results also prove that the maximal and middle Markov chains have the same convergence rate and that the maximal Markov chain converges faster than the minimal Markov chain when the damping factor α > 1/√2.
Keywords: PAGERANK web search Markov chain stationary distribution convergence rate
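Note on entry 10: the abstract does not define the maximal, minimal and middle chains, so they are not reproduced here. For orientation only, the sketch below computes the stationary distribution of the standard PageRank chain by power iteration with damping factor `alpha`. The function name `pagerank`, the uniform teleportation, the uniform handling of dangling pages, and the toy graph are assumptions.

```python
def pagerank(links, alpha=0.85, tol=1e-10, max_iter=1000):
    """Standard PageRank by power iteration (not the paper's three chains).

    links: dict mapping each page to a list of pages it links to.
    alpha: damping factor, the probability of following a link.
    """
    nodes = sorted(set(links) | {v for outs in links.values() for v in outs})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}

    for _ in range(max_iter):
        # Rank held by pages without out-links is spread uniformly.
        dangling = sum(rank[u] for u in nodes if not links.get(u))
        new_rank = {u: (1 - alpha) / n + alpha * dangling / n for u in nodes}
        for u in nodes:
            outs = links.get(u, [])
            for v in outs:
                new_rank[v] += alpha * rank[u] / len(outs)

        # Stop once the L1 change between iterations is negligible.
        if sum(abs(new_rank[u] - rank[u]) for u in nodes) < tol:
            return new_rank
        rank = new_rank
    return rank


# Toy graph: a links to b and c, b links to c, c links back to a.
toy_ranks = pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"]})
```

The number of iterations needed before the stopping test fires depends on `alpha`, which is the kind of convergence-rate question the abstract analyses for its three chains.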
Exploiting the Community Structure of Fraudulent Keywords for Fraud Detection in Web Search
11
Authors: Dong-Hui Yang, Zhen-Yu Li, Xiao-Hui Wang, Kavé Salamatian, Gao-Gang Xie. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2021, No. 5, pp. 1167-1183, 17 pages.
Internet users heavily rely on web search engines for their intended information. The major revenue of search engines is advertisements (or ads). However, search advertising suffers from fraud. Fraudsters generate fake traffic which does not reach the intended audience and increases the cost of the advertisers. Therefore, it is critical to detect fraud in web search. Previous studies solve this problem through fraudster detection (especially bots) by leveraging fraudsters' unique behaviors. However, they may fail to detect new means of fraud, such as crowdsourcing fraud, since crowd workers behave in part like normal users. To this end, this paper proposes an approach to detecting fraud in web search from the perspective of fraudulent keywords. We begin by using a unique dataset of 150 million web search logs to examine the discriminating features of fraudulent keywords. Specifically, we model the temporal correlation of fraudulent keywords as a graph, which reveals a very well-connected community structure. Next, we design DFW (detection of fraudulent keywords), which mines the temporal correlations between candidate fraudulent keywords and a given list of seeds. In particular, DFW leverages several refinements to filter out non-fraudulent keywords that co-occur with seeds occasionally. The evaluation using the search logs shows that DFW achieves high fraud detection precision (99%) and accuracy (93%). A further analysis reveals several typical temporal evolution patterns of fraudulent keywords and the co-existence of both bots and crowd workers as fraudsters for web search fraud.
Keywords: community structure fraud analysis fraudulent keyword detection web search
Visualization and level-of-detail of metadata for interactive exploration of Sensor Web
12
Authors: Byounghyun Yoo, V. Judson Harward. International Journal of Digital Earth (SCIE, EI), 2014, No. 11, pp. 847-869, 23 pages.
There are several issues with Web-based search interfaces on a Sensor Web data infrastructure. It can be difficult to (1) find the proper keywords for the formulation of queries and (2) explore the information if the user does not have previous knowledge about the particular sensor systems providing the information. We investigate how the visualization of sensor resources on a 3D Web-based Digital Earth globe organized by level-of-detail (LOD) can enhance search and exploration of information by easing the formulation of geospatial queries against the metadata of sensor systems. Our case study provides an approach inspired by geographical mashups in which freely available functionality and data are flexibly combined. We use PostgreSQL, PostGIS, PHP, and X3D-Earth technologies to allow the Web3D standard and its geospatial component to be used for visual exploration and LOD control of a dynamic scene. Our goal is to facilitate the dynamic exploration of the Sensor Web and to allow the user to seamlessly focus in on a particular sensor system from a set of registered sensor networks deployed across the globe. We present a prototype metadata exploration system featuring LOD for a multiscaled Sensor Web as a Digital Earth application.
Keywords: Sensor web data visualization Sensor web data discovery and search LEVEL-OF-DETAIL metadata visualization web3D standard extensible 3D graphics X3D geospatial component
Exploring the Linear and Nonlinear Causality Between Internet Big Data and Stock Markets (Cited by: 4)
13
Authors: DONG Jichang, DAI Wei, LI Jingjing. Journal of Systems Science & Complexity (SCIE, EI, CSCD), 2020, No. 3, pp. 783-798, 16 pages.
In the era of big data, stock markets are closely connected with Internet big data from diverse sources. This paper makes the first attempt to compare the linkage between stock markets and various Internet big data collected from search engines, public media and social media. To achieve this purpose, a big-data-based causality testing framework is proposed with three steps, i.e., data crawling, data mining and causality testing. Taking the Shanghai Stock Exchange and Shenzhen Stock Exchange as targets for stock markets, and web search data, news, and microblogs as samples of Internet big data, some interesting findings are obtained. 1) There is a strong bi-directional, linear and nonlinear Granger causality between stock markets and investors' web search behaviors due to some similar trends and uncertain factors. 2) News sentiments from public media have Granger causality with stock markets in a bi-directional linear way, while microblog sentiments from social media have Granger causality with stock markets in a unidirectional linear way, running from stock markets to microblog sentiments. 3) News sentiments can explain the changes in stock markets better than microblog sentiments due to their authority. The results of this paper might provide some valuable information for both stock market investors and modelers.
Keywords: Granger causality test internet big data investors' sentiment stock markets web search behavior
A Supervised Learning Approach to Search of Definitions (Cited by: 1)
14
Authors: 徐君, 曹云波, 李航, 赵珉, 黄亚楼. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2006, No. 3, pp. 439-449, 11 pages.
This paper addresses the issue of search of definitions. Specifically, for a given term, we are to find its definition candidates and rank the candidates according to their likelihood of being good definitions. This is in contrast to the traditional methods of either generating a single combined definition or outputting all retrieved definitions. Definition ranking is essential for tasks. A specification for judging the goodness of a definition is given. In the specification, a definition is categorized into one of three levels: good definition, indifferent definition, or bad definition. Methods of performing definition ranking are also proposed in this paper, which formalize the problem as either classification or ordinal regression. We employ SVM (Support Vector Machines) as the classification model and Ranking SVM as the ordinal regression model, respectively, and thus they rank definition candidates according to their likelihood of being good definitions. Features for constructing the SVM and Ranking SVM models are defined, which represent the characteristics of terms, definition candidates, and their relationship. Experimental results indicate that the use of SVM and Ranking SVM can significantly outperform baseline methods such as heuristic rules, conventional information retrieval (Okapi), or SVM regression. This holds both when the answers are paragraphs and when they are sentences. Experimental results also show that SVM or Ranking SVM models trained in one domain can be adapted to another domain, indicating that generic models for definition ranking can be constructed.
Keywords: definition search text mining web mining web search
A comprehensive review from hyperlink to intelligent technologies based personalized search systems (Cited by: 1)
15
Authors: Dheeraj Malhotra, O. P. Rishi. Journal of Management Analytics (EI), 2019, No. 4, pp. 365-389, 25 pages.
In the present era of big data, searching and ranking web pages efficiently on the World Wide Web to satisfy the specific search needs of the modern user is undoubtedly a major challenge for search engines. Even though a large number of web search techniques have been developed, some problems still exist when searching with generic search engines, as none of the search engines can index the entire web. The issue is not just the volume but also the relevance to the user's requirements. Moreover, if the search query is partially incomplete or ambiguous, most modern search engines tend to return results by interpreting all possible meanings of the query. Concerning search quality, more than half of the retrieved web pages have been reported to be irrelevant. Hence web search personalization is required to retrieve search results while incorporating the user's interests. In the proposed research work we highlight the strengths and weaknesses of various studies proposed in the literature for web search personalization by carrying out a detailed comparison among them. The in-depth comparative study with baselines leads to the recommendation of the Intelligent Meta Search System (IMSS) and the Advanced Cluster Vector Page Ranking (ACVPR) algorithm as among the best approaches proposed in the literature for web search personalization. Furthermore, the detailed discussion of the comparative analysis of all categories opens new opportunities for thinking in different research areas.
Keywords: web search personalization meta search tool machine learning big data analytics collaborative filtering logistic regression