2 articles found
1. Content-Based Publish/Subscribe System for Web Syndication (cited by: 1)
Authors: Zeinab Hmedeh, Harry Kourdounakis, Vassilis Christophides, Cedric du Mouza, Michel Scholl, Nicolas Travers. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2016, Issue 2, pp. 359-380 (22 pages)
Content syndication has become a popular way to deliver frequently updated information on the Web in a timely manner. Today, web syndication technologies such as RSS or Atom are used in a wide variety of applications, ranging from large-scale news broadcasting to medium-scale information sharing in scientific and professional communities. However, they exhibit serious limitations in dealing with information overload in Web 2.0. There is a vital need for efficient real-time filtering methods across feeds, to allow users to effectively follow the information that interests them. In this paper we investigate three indexing techniques for users' subscriptions, based on inverted lists or on an ordered trie, for exact and partial matching. We present analytical models for memory requirements and matching time, and we conduct a thorough experimental evaluation to show the impact of critical parameters of realistic web syndication workloads.
Keywords: pub/sub; subscription indexing; web syndication; partial matching; scalability
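The abstract above mentions indexing subscriptions with inverted lists for exact and partial matching. As a rough illustration only, the following is a minimal sketch of a generic count-based inverted-list matcher; it is not the paper's implementation. The class name `InvertedListIndex`, the `threshold` parameter, and the reading of "partial matching" as "at least a given fraction of a subscription's terms appear in the item" are all assumptions made for this example.

```python
from collections import defaultdict


class InvertedListIndex:
    """Hypothetical count-based inverted-list index over keyword subscriptions."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of subscriptions containing it
        self.sizes = {}                   # subscription id -> number of distinct terms

    def subscribe(self, sub_id, terms):
        terms = set(terms)
        if not terms:
            raise ValueError("a subscription needs at least one term")
        self.sizes[sub_id] = len(terms)
        for term in terms:
            self.postings[term].add(sub_id)

    def match(self, item_terms, threshold=1.0):
        """Return ids of subscriptions whose matched-term fraction >= threshold.

        threshold=1.0 requires every subscription term to occur in the item;
        a lower value is this sketch's (assumed) notion of partial matching.
        """
        hits = defaultdict(int)
        for term in set(item_terms):
            # Only the posting lists of terms present in the item are scanned.
            for sub_id in self.postings.get(term, ()):
                hits[sub_id] += 1
        return sorted(s for s, n in hits.items() if n >= threshold * self.sizes[s])


index = InvertedListIndex()
index.subscribe("s1", ["rss", "atom"])
index.subscribe("s2", ["rss", "news", "sports"])
index.subscribe("s3", ["database"])

item = "new atom and rss feed news".split()
exact = index.match(item)           # subscriptions fully covered by the item
partial = index.match(item, 0.6)    # subscriptions covered to at least 60%
```

Here `s1` is fully covered by the item, while `s2` has two of its three terms present and so only qualifies under the lower threshold.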
2. AS-Index: A Structure for String Search Using n-Grams and Algebraic Signatures (cited by: 1)
Authors: Camelia Constantin, Cedric du Mouza, Witold Litwin, Philippe Rigaux, Thomas Schwarz. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2016, Issue 1, pp. 147-166 (20 pages)
We present the AS-Index, a new index structure for exact string search in disk-resident databases. AS-Index relies on a classical inverted file structure; its main innovation is a probabilistic search based on the properties of algebraic signatures, which are used for both n-gram hashing and pattern search. Specifically, the properties of our signatures allow a search to be carried out by inspecting only two of the posting lists. The algorithm thus has the unique feature of requiring a constant number of disk accesses, independent of both the pattern size and the database size. We conduct extensive experiments on large datasets to evaluate the behavior of our index. They confirm that it steadily provides a search performance proportional to the two disk accesses necessary to obtain the posting lists. This makes our structure a choice of interest for the class of applications that require very fast lookups in large textual databases. We describe the index structure, our use of algebraic signatures, and the search algorithm. We discuss the operational trade-offs based on the parameters that affect the behavior of our structure, and present the theoretical and experimental performance analysis. We then compare the AS-Index with state-of-the-art alternatives and show that 1) its construction time matches that of its competitors, owing to the similarity of the structures, and 2) its search time consistently outperforms the standard approach, thanks to the economical access to data complemented by signature calculations, which is at the core of our search method.
Keywords: full text indexing; large-scale indexing; algebraic signature
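The abstract above relies on properties of algebraic signatures. The sketch below illustrates one such property as algebraic signatures are commonly defined over a Galois field: the signature of a concatenation can be derived from the signatures of its parts. It does not reproduce the AS-Index search algorithm. The field GF(2^8), the reduction polynomial `0x11D`, the element `ALPHA = 2`, and all function names are assumptions chosen for this example.

```python
def gf_mul(a, b, poly=0x11D):
    """Multiply two bytes as polynomials over GF(2), reduced by `poly` (assumed)."""
    result = 0
    while b:
        if b & 1:
            result ^= a        # addition in GF(2^8) is XOR
        a <<= 1
        if a & 0x100:          # degree reached 8: reduce by the polynomial
            a ^= poly
        b >>= 1
    return result


def gf_pow(a, n):
    """Raise a to the n-th power by repeated multiplication."""
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


ALPHA = 2  # assumed base element for the signature


def signature(data: bytes) -> int:
    """One-symbol signature: XOR-sum of data[i] * ALPHA**i in the field."""
    sig = 0
    for i, byte in enumerate(data):
        sig ^= gf_mul(byte, gf_pow(ALPHA, i))
    return sig


def concat_signature(sig_left, len_left, sig_right):
    """Signature of left + right, computed from the two partial signatures."""
    return sig_left ^ gf_mul(gf_pow(ALPHA, len_left), sig_right)


left, right = b"index", b"search"
combined = concat_signature(signature(left), len(left), signature(right))
```

Because field multiplication distributes over XOR, `combined` equals `signature(left + right)` without rescanning the bytes of either part.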