348 articles found
Query Expansion for Chinese Information Retrieval by Using a Decaying Co-occurrence Model (cited by: 3)
1
Authors: 贺宏朝, 何丕廉, 高剑峰, 黄昌宁 《Transactions of Tianjin University》 EI CAS, 2002, No. 3, pp. 183-186 (4 pages)
Query expansion with a thesaurus is one of the useful techniques in modern information retrieval (IR). In this paper, a method of query expansion for Chinese IR using a decaying co-occurrence model is proposed and realized. The model extends the traditional co-occurrence model with a decaying factor that decreases the mutual information as the distance between the terms increases. Experimental results on the TREC-9 collections show that this query expansion method yields significant improvements over IR without query expansion.
Keywords: query expansion; Chinese language; information retrieval
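The decaying co-occurrence idea in the abstract above can be illustrated with a minimal sketch. The listing does not give the paper's formula, so everything here is an assumption made for illustration: the exponential decay `exp(-alpha * (d - 1))`, the `window` and `alpha` parameters, the crude normalization by total token count, and the function names `decayed_cooccurrence`, `decayed_mi` and `expand_query`.

```python
import math
from collections import Counter, defaultdict


def decayed_cooccurrence(docs, window=5, alpha=0.5):
    """Accumulate pair weights that shrink as the token distance grows.

    docs   : list of token lists
    window : largest distance (in tokens) at which a pair is counted
    alpha  : decay rate; the exponential shape is an assumption
    """
    term_freq = Counter()
    pair_weight = defaultdict(float)
    for tokens in docs:
        term_freq.update(tokens)
        for i, left in enumerate(tokens):
            for d in range(1, window + 1):
                if i + d >= len(tokens):
                    break
                right = tokens[i + d]
                if left == right:
                    continue
                key = tuple(sorted((left, right)))
                # Adjacent terms (d == 1) count fully; farther pairs count less.
                pair_weight[key] += math.exp(-alpha * (d - 1))
    return term_freq, pair_weight


def decayed_mi(term_a, term_b, term_freq, pair_weight):
    """Mutual-information-style score computed from the decayed pair weights."""
    total = sum(term_freq.values())
    weight = pair_weight.get(tuple(sorted((term_a, term_b))), 0.0)
    if weight == 0.0:
        return float("-inf")
    p_ab = weight / total  # crude normalization, assumed for the sketch
    p_a = term_freq[term_a] / total
    p_b = term_freq[term_b] / total
    return math.log(p_ab / (p_a * p_b))


def expand_query(query_terms, term_freq, pair_weight, top_k=2):
    """Rank candidate terms by their summed score against the query terms."""
    scores = {}
    for cand in term_freq:
        if cand in query_terms:
            continue
        per_term = [decayed_mi(q, cand, term_freq, pair_weight) for q in query_terms]
        finite = [s for s in per_term if s != float("-inf")]
        if finite:
            scores[cand] = sum(finite)
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

With `alpha = 0`, every pair inside the window counts equally, which corresponds to a plain windowed co-occurrence count; larger `alpha` values favor terms that appear close to the query terms.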
A new approach to query expansion in information retrieval (cited by: 2)
2
Authors: 李卫疆, Zhao Tiejun, Wang Xian'gang 《High Technology Letters》 EI CAS, 2008, No. 1, pp. 77-80 (4 pages)
To eliminate the mismatch between the words of relevant documents and the user's query, and the serious negative effects it has on the performance of information retrieval, a method of query expansion based on a new term co-occurrence representation was put forward by analyzing the process of producing a query. The expansion terms were selected according to their correlation to the whole query. At the same time, the position information between terms was considered. The experimental results on the Text REtrieval Conference (TREC) data collection show that the proposed method consistently achieves an improvement of 5%~19% over the language modeling method without expansion. Compared to the popular query expansion approach, pseudo feedback, the precision of the proposed method is competitive.
Keywords: information retrieval; language model; query expansion
Enhancing Amharic Information Retrieval System Based on Statistical Co-Occurrence Technique
3
Authors: Abey Bruck, Tulu Tilahun 《Journal of Computer and Communications》 2015, No. 12, pp. 67-76 (10 pages)
Information retrieval (IR) systems are designed to help information seekers retrieve relevant information from vast document collections. The need for relevant information from a vast amount of documents gave birth to IR systems. Even though different IR systems exist, they cannot meet all users' expectations. Users' differing levels of knowledge cause queries to be expressed in different ways. As a result, the system may miss the core meaning of a user's query and retrieve unsatisfactory results. This happens mainly because of the ambiguity of words in natural languages and the expression mismatch between users and authors. The existing ambiguities in the Amharic language have negative impacts on the performance of Amharic IR systems. Some of the ambiguities behind this type of problem are spelling variants of the same word, and polysemous and synonymous terms. If users are not fully knowledgeable about the information domain, they will mostly formulate weak queries to retrieve documents, and thus end up frustrated with the results returned by an IR system. This research was conducted with the aim of improving the recall of previous work. A statistical co-occurrence technique has been used to expand query terms. The main reason for performing query expansion is to provide relevant documents for a user's query that can satisfy their information need. The statistical co-occurrence method considers terms that frequently appear with the query term, regardless of their position. The efficiency of the proposed technique has been tested on a prototype system and the result compared with that of the previous study. Accordingly, an improvement of 6% in recall and 2% in F-measure was achieved. Hence, the statistical co-occurrence method outperformed the bi-gram based IR system.
Keywords: statistical co-occurrence; information retrieval; query expansion; Amharic
Bilingual Dictionary Approach for Malay-English Cross-Language Information Retrieval
4
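Authors: Nurjannaton Hidayah Rais, Muhamad Taufik Abdullah, Rabiah Abdul Kadir 《通讯和计算机(中英文版)》 2011, No. 5, pp. 354-360 (7 pages)
Keywords: cross-language information retrieval; bilingual dictionary; English; query translation; language translation; translation method; language expression; automatic translation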
Query Performance Prediction for Information Retrieval Based on Covering Topic Score (cited by: 3)
5
Authors: 郎皓, 王斌, Gareth Jones, 李锦涛, 丁凡, 刘宜轩 《Journal of Computer Science & Technology》 SCIE EI CSCD, 2008, No. 4, pp. 590-601 (12 pages)
We present a statistical method called Covering Topic Score (CTS) to predict query performance for information retrieval. Estimation is based on how well the topic of a user's query is covered by documents retrieved from a certain retrieval system. Our approach is conceptually simple and intuitive, and can be easily extended to incorporate features beyond bag-of-words such as phrases and proximity of terms. Experiments demonstrate that CTS significantly correlates with query performance in a variety of TREC test collections, and in particular that CTS gains more prediction power from features of phrases and proximity of terms. We compare CTS with previous state-of-the-art methods for query performance prediction, including clarity score and robustness score. Our experimental results show that CTS consistently performs better than, or at least as well as, these other methods. In addition to its high effectiveness, CTS is also shown to have very low computational complexity, meaning that it can be practical for real applications.
Keywords: information storage and retrieval; information search and retrieval; query performance prediction; covering topic score
The Application of the Comparable Corpora in Chinese-English Cross-Lingual Information Retrieval
6
Authors: 杜林, 张毅波, 孙乐, 孙玉芳 《Journal of Computer Science & Technology》 SCIE EI CSCD, 2001, No. 4, pp. 351-358 (8 pages)
This paper proposes a novel Chinese-English Cross-Lingual Information Retrieval (CECLIR) model, PME, in which a bilingual dictionary and comparable corpora are used to translate the query terms. The proximity and mutual information of the term pairs in the Chinese and English comparable corpora are employed not only to resolve the translation ambiguities but also to perform query expansion so as to deal with the out-of-vocabulary issues in CECLIR. The evaluation results show that the query precision of the PME algorithm is about 84.4% of that of monolingual information retrieval.
Keywords: cross-lingual information retrieval; comparable corpus; mutual information; query expansion
Dominant Meaning Method for Intelligent Topic-Based Information Agent towards More Flexible MOOCs
7
Authors: Mohammed Abdel Razek 《Journal of Intelligent Learning Systems and Applications》 2014, No. 4, pp. 186-196 (11 pages)
The use of agent technology in dynamic environments is growing rapidly as one of the powerful technologies, and providing the benefits of the Intelligent Information Agent technique to massive open online courses (MOOCs) is important for several reasons, including the rapid growth of MOOC environments and their focus on static rather than updated information. One of the main problems in such environments is adapting the information to the needs of the student who is interacting at each moment. Using such technology can ensure more flexible information, less wasted time and hence higher learning gains. This paper presents an Intelligent Topic-Based Information Agent that offers updated knowledge, including various types of resources, to students. Using the dominant meaning method, the agent searches the Internet, controls the metadata coming from the Internet, and filters and presents the results as categorized content lists. Two experiments were conducted on the Intelligent Topic-Based Information Agent: one measures the improvement in retrieval effectiveness and the other measures the impact of the agent on learning. The experimental results indicate that our methodology for expanding the query yields a considerable improvement in retrieval effectiveness in all categories of the Google Web Search API. In addition, there is a positive impact on the performance of the learning session.
Keywords: Massive Open Online Courses; MOOCs; search engine; hypermedia systems; web-based services; query expansion; probabilistic model; information retrieval
Approximate querying between heterogeneous ontologies based on association matrix
8
Authors: 康达周, 徐宝文, 陆建江, 汪鹏 《Journal of Southeast University(English Edition)》 EI CAS, 2005, No. 1, pp. 1-5 (5 pages)
An approximate approach to querying between heterogeneous ontology-based information systems based on an association matrix is proposed. First, the association matrix is defined to describe relations between concepts in two ontologies. Then, a method of rewriting queries based on the association matrix is presented to solve the ontology heterogeneity problem. It rewrites the queries in one ontology into approximate queries in another ontology based on the subsumption relations between concepts. The method also uses vectors to represent queries, and then computes the vectors with the association matrix; the disjoint relations between concepts can be taken into account through the results. It can get better approximations than the methods currently in use, which do not consider disjoint relations. The method can be processed by machines automatically. It is simple to implement and expected to run quite fast.
Keywords: semantic web; information retrieval; ontology; query; association matrix
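Query Recommendation Based on a Term-Query-URL Heterogeneous Information Network (cited by: 3)
9
Authors: 刘钰峰, 李仁发 《湖南大学学报(自然科学版)》 EI CAS CSCD PKU Core, 2014, No. 5, pp. 106-112 (7 pages)
Query recommendation is a method that helps search engines better understand users' retrieval needs. The semantic relations between terms and queries are trained on the context snippets of queries; combined with the query-URL click graph and the sequential behavior within queries, a Term-Query-URL heterogeneous information network is constructed, and Random Walk with Restart (RWR) is used for query recommendation. By jointly exploiting semantic information and log information, the recommendation quality for sparse queries is improved. Term vectors of queries are constructed based on a probabilistic language model, so that recommendations can be made for new queries. Experiments on the query logs of a large-scale commercial search engine show that the proposed method improves performance by about 3%~10% over traditional query recommendation methods.
Keywords: information retrieval; query recommendation; click log; random walk with restart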
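Random Walk with Restart, the ranking step named in the abstract above, can be sketched on a toy Term-Query-URL graph. This is a generic RWR by power iteration and not the paper's implementation: the `restart` value, the dangling-node handling, the `"q:"` node-name prefix and the function names are assumptions made for illustration.

```python
def random_walk_with_restart(graph, seed, restart=0.15, max_iter=200, tol=1e-12):
    """Stationary visiting probabilities of a walker that restarts at `seed`.

    graph : dict mapping node -> {neighbor: edge weight}; every neighbor
            must also be a key of `graph`
    """
    scores = {node: 0.0 for node in graph}
    scores[seed] = 1.0
    for _ in range(max_iter):
        nxt = {node: 0.0 for node in graph}
        for node, edges in graph.items():
            total = sum(edges.values())
            if total == 0:
                # Dangling node: send its mass back to the seed (assumed policy).
                nxt[seed] += (1.0 - restart) * scores[node]
                continue
            for neighbor, weight in edges.items():
                nxt[neighbor] += (1.0 - restart) * scores[node] * weight / total
        # With probability `restart` the walker jumps back to the seed.
        nxt[seed] += restart
        delta = sum(abs(nxt[n] - scores[n]) for n in graph)
        scores = nxt
        if delta < tol:
            break
    return scores


def recommend_queries(graph, seed, top_k=5, prefix="q:"):
    """Rank the other query nodes by their RWR score from the seed query."""
    scores = random_walk_with_restart(graph, seed)
    candidates = [n for n in graph if n.startswith(prefix) and n != seed]
    return sorted(candidates, key=scores.get, reverse=True)[:top_k]


# Toy network: queries (q:), clicked URLs (u:) and terms (t:), symmetric edges.
toy_graph = {
    "q:1": {"u:1": 1.0, "t:1": 1.0},
    "q:2": {"u:1": 1.0, "t:1": 1.0, "t:2": 1.0},
    "q:3": {"t:2": 1.0},
    "u:1": {"q:1": 1.0, "q:2": 1.0},
    "t:1": {"q:1": 1.0, "q:2": 1.0},
    "t:2": {"q:2": 1.0, "q:3": 1.0},
}
```

Calling `recommend_queries(toy_graph, "q:1")` ranks `q:2`, which shares both a clicked URL and a term with the seed, ahead of `q:3`, which is reachable only through a term it shares with `q:2`.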
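Knowledge Graph-Based Pseudo-Query Generation for Zero-Shot Document Retrieval
10
Authors: 刘军平, 孙医贵, 朱强, 胡新荣, 彭涛, 姚迅, 王帮超 《软件导刊》 2024, No. 11, pp. 47-52 (6 pages)
To improve the performance of document retrieval models and reduce the workload of manually labeling training data, a knowledge graph-based pseudo-query generation method for zero-shot document retrieval, KGQG, is proposed. The method uses a knowledge graph to enhance pseudo-queries, combining external information with the pseudo-queries to generate richer and more informative ones. Experimental results show that, on 12 public datasets of the BEIR benchmark, KGQG improves the normalized discounted cumulative gain (NDCG) by 4.6, 11.88 and 7.96 percentage points over a classic sparse retrieval model, a dense retrieval model, and the latest zero-shot dense retrieval model based on external knowledge expansion, respectively. KGQG not only improves retrieval performance but also reduces the need for manually labeled training data, providing a useful reference for future research on and application of document retrieval models.
Keywords: dense retrieval; information retrieval; zero-shot learning; query expansion; knowledge graph; natural language processing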
A Full Text Retrieval System in a Digital Library Environment (cited by: 1)
11
Authors: Kehinde Daniel Aruleba, Dipo Theophilus Akomolafe, Babajide Afeni 《Intelligent Information Management》 2016, No. 1, pp. 1-8 (8 pages)
The volume of information being created, generated and stored is huge. Without adequate knowledge of Information Retrieval (IR) methods, the retrieval process for information would be cumbersome and frustrating. Studies have further revealed that IR methods are essential in information centres (for example, a Digital Library environment) for the storage and retrieval of information. With more than one billion people accessing the Internet, and millions of queries being issued daily, modern Web search engines face a problem of daunting scale. The main problem with existing search engines is how to avoid retrieving irrelevant information while retrieving the relevant information. In this study, existing library retrieval systems were studied, and the problems associated with them were analyzed in order to address this problem. The concepts of existing information retrieval models were studied, and the knowledge gained was used to design a digital library information retrieval system. It was successfully implemented using real-life data. Continuous evaluation of IR methods for effective and efficient full text retrieval systems is recommended.
Keywords: full text; information retrieval; library; digital library; queries; indexing; catalogue
Query Through Heterogeneous Ontologies Using Association Matrix
12
Authors: KANG Da-zhou, XU Bao-wen, LU Jian-jiang, WANG Peng, LI Yan-hui 《Wuhan University Journal of Natural Sciences》 EI CAS, 2004, No. 5, pp. 575-579 (5 pages)
This paper introduces the definition and calculation of the association matrix between ontologies. It uses the association matrix to describe the relations between concepts in different ontologies and uses concept vectors to represent queries; it then computes the vectors with the association matrix in order to rewrite queries. This paper proposes a simple method of querying through heterogeneous ontologies using an association matrix. This method is based on the correctness of approximate information filtering theory; it is simple to implement and expected to run quite fast.
CLC number: TP 391
Foundation item: Supported by the National Natural Science Foundation of China (60373066, 60303024), National Grand Fundamental Research 973 Program of China (2002CB312000) and National Research Foundation for the Doctoral Program of Higher Education of China (20020286004)
Biography: KANG Da-zhou (1980-), male, Master candidate; research direction: Semantic Web, knowledge representation on the Web.
Keywords: semantic Web; information retrieval; ontology; query; association matrix
Deep Neural Network and Pseudo Relevance Feedback Based Query Expansion
13
Authors: Abhishek Kumar Shukla, Sujoy Das 《Computers, Materials & Continua》 SCIE EI, 2022, No. 5, pp. 3557-3570 (14 pages)
Neural networks have attracted researchers immensely in the last couple of years due to their wide applications in areas such as data mining, natural language processing, image processing, and information retrieval. Word embedding has been applied by many researchers to information retrieval tasks. In this paper, a word embedding-based skip-gram model has been developed for the query expansion task. Vocabulary terms are obtained from the top "k" initially retrieved documents using the pseudo relevance feedback model, and they are then trained using the skip-gram model to find the expansion terms for the user query. The performance of the model based on mean average precision (MAP) is 0.3176. The proposed model is compared with other existing models. An improvement of 6.61%, 6.93%, and 9.07% in MAP is observed compared to the original query, the BM25 model, and query expansion with the Chi-Square model, respectively. The proposed model also retrieves 84, 25, and 81 additional relevant documents compared to the original query, query expansion with the Chi-Square model, and the BM25 model, respectively, and thus also improves recall. The per-query analysis reveals that the proposed model performs well on 30, 36, and 30 queries compared to the original query, query expansion with the Chi-Square model, and the BM25 model, respectively.
Keywords: information retrieval; query expansion; word embedding; neural network; deep neural network
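Design of a Go-Based Information Retrieval System for Safety-Mark (安标) Products
14
Authors: 汪学明 《软件》 2024, No. 3, pp. 34-36, 47 (4 pages)
This paper presents the implementation of a Go-based information management and query system for safety-mark products. The software obtains comprehensive information on safety-mark products through network requests, prints the query results, and automatically saves them to an Excel file. By searching on keywords, users can quickly obtain detailed information on the relevant products, enabling fast retrieval and querying of safety-mark product information. The paper describes in detail the Go-based software design approach, the implementation process and functional features, practical application scenarios, and directions for future improvement.
Keywords: Go language; safety-mark products; information query; fast retrieval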
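Reviewer Recommendation for GitHub Pull Requests Based on Time and Influence Factors (cited by: 5)
15
Authors: 卢松, 杨达, 胡军, 张潇 《计算机系统应用》 2016, No. 12, pp. 155-161 (7 pages)
The open-source community GitHub provides the pull request mechanism, which lets developers integrate their code into GitHub's open-source projects and thereby contribute to them. Code review of pull requests is a very important way for distributed software development communities such as GitHub to maintain the code quality of open-source projects. Assigning suitable code reviewers to a newly arrived pull request can effectively reduce the delay between its submission and the start of review. At present, reviewers on GitHub are assigned manually by core project members. To reduce this manual effort, we propose a code reviewer recommendation system that is based on information retrieval methods and takes into account the reviewers' influence factor and the time decay of reviews, automatically recommending the most relevant reviewers for a newly arrived pull request. Our method achieves 68% accuracy at top 1 and 78% recall at top 10.
Keywords: pull request; code review; information retrieval; time factor; influence factor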
Immune Algorithm For Document Query Optimization
16
Authors: Wang Ziqiang, Feng Boqin 《工程科学(英文版)》 2005, No. 1, pp. 89-93 (5 pages)
To efficiently retrieve relevant documents from rapidly proliferating large information collections, a novel immune algorithm for document query optimization is proposed. The essential idea of the immune algorithm is that the crossover and mutation operators are constructed according to the characteristics of information retrieval. An immune operator is adopted to avoid degeneracy. The relevant documents retrieved are merged into a single document list according to a rank formula. Experimental results show that the novel immune algorithm can lead to substantial improvements in relevant document retrieval effectiveness.
Keywords: immune algorithm; information retrieval; document query optimization; vector space model
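A Pseudo-Relevance Feedback Method for Dense Retrieval (cited by: 2)
17
Authors: 胡文浩, 罗景, 涂新辉 《计算机应用》 CSCD PKU Core, 2023, No. 4, pp. 1036-1042 (7 pages)
The pseudo-relevance feedback (PRF) mechanism is an automated query expansion (QE) technique that uses the information contained in the original query and the top-N documents of the first-pass retrieval to build a more accurate query, thereby further improving the performance of an information retrieval system. However, existing PRF methods for dense retrieval tend to lose semantic information because they truncate the text, and they have high space complexity in the retrieval stage. To address these problems, Dense-PRF, a passage-level PRF method suited to dense retrieval over long texts, is proposed. First, the vectors of relevant passages are obtained from the top-N first-pass documents by computing semantic distance. Second, the relevant passage vectors are average-pooled to obtain the QE term vector. Then, the original query vector and the QE term vector are combined according to weights to build a new query vector. Finally, the final retrieval results are obtained with the new query vector. Dense-PRF was compared with baseline models on two classic long-text test collections, Robust04 and WT2G; relative to the RepBERT+BM25 model, Dense-PRF improves precision over the top 20 documents and normalized discounted cumulative gain (NDCG) by 1.66 and 1.32 percentage points and by 2.30 and 1.91 percentage points, respectively. The results show that Dense-PRF effectively alleviates the vocabulary mismatch between queries and documents and improves retrieval precision.
Keywords: pseudo-relevance feedback; query expansion; information retrieval; dense retrieval; long text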
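The query-rebuilding steps listed in the abstract above (select relevant passage vectors, average-pool them, combine with the original query vector by weight) can be sketched as follows. The abstract does not specify the distance measure or the combination formula, so cosine similarity, the `top_m` cutoff, the linear interpolation `(1 - weight) * q + weight * e` and the function names are all assumptions made for illustration, not the Dense-PRF implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def dense_prf_query(query_vec, passage_vecs, top_m=2, weight=0.5):
    """Build a new query vector from the original query and feedback passages.

    query_vec    : embedding of the original query
    passage_vecs : embeddings of passages from the top-N first-pass documents
    top_m        : number of closest passages kept as feedback (assumed)
    weight       : share given to the pooled feedback vector (assumed)
    """
    if not passage_vecs:
        return list(query_vec)
    # Step 1: keep the passages semantically closest to the query.
    ranked = sorted(passage_vecs, key=lambda p: cosine(query_vec, p), reverse=True)
    selected = ranked[:top_m]
    # Step 2: average pooling over the selected passage vectors.
    dim = len(query_vec)
    pooled = [sum(p[i] for p in selected) / len(selected) for i in range(dim)]
    # Step 3: weighted combination of the original query and the pooled vector.
    return [(1.0 - weight) * q + weight * e for q, e in zip(query_vec, pooled)]
```

The returned vector would then be used for a second retrieval pass in place of the original query embedding; setting `weight = 0` falls back to the original query.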
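A Text Material Recommendation Method Combining Label Classification and Semantic Query Expansion
18
Authors: 孟怡悦, 彭蓉, 吕其标 《计算机科学》 CSCD PKU Core, 2023, No. 1, pp. 76-86 (11 pages)
When compiling planning documents and survey reports of all kinds, authors often have to collect and read large amounts of text material according to a draft table of contents or set of headings, then classify and organize it before selecting what to use; this involves a heavy workload, and quality cannot be guaranteed. To address this, a text material recommendation method combining label classification and semantic query expansion is proposed for the domain of digital-government planning document compilation. From an information retrieval perspective, the headings at each level of the table of contents are treated as queries and the text materials consulted as target documents, so that text materials can be retrieved and recommended. Based on a differential evolution algorithm, the method integrates a recommendation method based on word-vector averaging, one based on semantic query expansion, and one based on label classification, compensating for the shortcomings of traditional text material recommendation methods and enabling paragraph-level text material to be retrieved from the headings of a table of contents. Experimental validation on 10 datasets shows that the method improves performance significantly, greatly reduces the workload of manual material selection and of material classification, and lowers the difficulty of document compilation.
Keywords: text material recommendation; information retrieval; digital government; query expansion; differential evolution algorithm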
A comprehensive review of significant researches on content based indexing and retrieval of visual information (cited by: 3)
19
Authors: R. PRIYA, T. N. SHANMUGAM 《Frontiers of Computer Science》 SCIE EI CSCD, 2013, No. 5, pp. 782-799 (18 pages)
Developments in multimedia technologies have paved the way for the storage of huge collections of video documents on computer systems. It is essential to design tools for content-based access to the documents, so as to allow an efficient exploitation of these collections. Content based analysis provides a flexible and powerful way to access video data when compared with other traditional video analysis techniques. The area of content based video indexing and retrieval (CBVIR), focusing on automating the indexing, retrieval and management of video, has attracted extensive research in the last decade. CBVIR is a lively area of research with enduring acknowledgments from several domains. Herein, a vital assessment of contemporary research associated with the content-based indexing and retrieval of visual information is provided. In this paper, we present an extensive review of significant research on CBVIR. A concise description of content based video analysis, along with the techniques associated with content based video indexing and retrieval, is presented.
Keywords: multimedia information; content based video retrieval (CBVR); content based video indexing and retrieval (CBVIR); shot segmentation; object segmentation; feature extraction; indexing; motion estimation; querying; key frame retrieval and indexing
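A Pseudo-Relevance Feedback Information Retrieval Method Based on ConceptNet Semantics (cited by: 1)
20
Authors: 潘敏, 刘宇, 裴全力, 李腾 《湖北师范大学学报(自然科学版)》 2023, No. 2, pp. 28-37 (10 pages)
Pseudo-relevance feedback is widely used in information retrieval. While taking into account important features such as term frequency and inverse document frequency, traditional information retrieval methods tend to overlook the semantic information of the query terms themselves. A semantics-based pseudo-relevance feedback retrieval method, SPRF (Semantic Pseudo-Relevance Feedback), is proposed. It makes full use of ConceptNet to obtain semantic information, considering not only the term-frequency importance of query terms in documents but also integrating the semantic information of query terms into the pseudo-relevance feedback framework to improve the selection of query expansion terms. Experimental results on six TREC datasets show that SPRF achieves significant improvements in P@10 and MAP over relatively strong baseline models and several neural network-based methods.
Keywords: ConceptNet; pseudo-relevance feedback; query expansion; information retrieval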