563 articles found
Orbit Weighting Scheme in the Context of Vector Space Information Retrieval
1
Authors: Ahmad Ababneh, Yousef Sanjalawe, Salam Fraihat, Salam Al-E’mari, Hamzah Alqudah. 《Computers, Materials & Continua》 (SCIE, EI), 2024, No. 7, pp. 1347-1379 (33 pages)
This study introduces the Orbit Weighting Scheme (OWS), a novel approach aimed at enhancing the precision and efficiency of Vector Space information retrieval (IR) models, which have traditionally relied on weighting schemes like tf-idf and BM25. These conventional methods often struggle with accurately capturing document relevance, leading to inefficiencies in both retrieval performance and index size management. OWS proposes a dynamic weighting mechanism that evaluates the significance of terms based on their orbital position within the vector space, emphasizing term relationships and distribution patterns overlooked by existing models. Our research focuses on evaluating OWS’s impact on model accuracy using Information Retrieval metrics like Recall, Precision, Interpolated Average Precision (IAP), and Mean Average Precision (MAP). Additionally, we assess OWS’s effectiveness in reducing the inverted index size, crucial for model efficiency. We compare OWS-based retrieval models against others using different schemes, including tf-idf variations and BM25Delta. Results reveal OWS’s superiority, achieving a 54% Recall and 81% MAP, and a notable 38% reduction in the inverted index size. This highlights OWS’s potential in optimizing retrieval processes and underscores the need for further research in this underrepresented area to fully leverage OWS’s capabilities in information retrieval methodologies.
Keywords: information retrieval; orbit weighting scheme; semantic text analysis; tf-idf weighting scheme; vector space model
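For context, the tf-idf baseline that the abstract above contrasts OWS with can be sketched roughly as follows. This is only an illustrative sketch of standard tf-idf weighting with cosine ranking, not the OWS method itself (the abstract does not specify how OWS computes weights). The function names `tfidf_vectors`, `cosine`, and `rank_tfidf`, and the `log(N/df) + 1` idf variant, are assumptions chosen for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build one sparse tf-idf vector (dict: term -> weight) per tokenized document."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    # Assumed idf variant: log(N / df) + 1, so no term gets a zero weight.
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}
    vectors = [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_tfidf(query, docs):
    """Return document indices ordered by decreasing cosine similarity to the query."""
    vectors, idf = tfidf_vectors(docs)
    # Query terms unseen in the collection get weight 0.
    q = {t: c * idf.get(t, 0.0) for t, c in Counter(query).items()}
    scores = [(cosine(q, v), i) for i, v in enumerate(vectors)]
    return [i for _, i in sorted(scores, key=lambda p: (-p[0], p[1]))]


docs = [
    ["orbit", "weighting", "scheme"],
    ["vector", "space", "model"],
    ["inverted", "index", "size"],
]
ranking = rank_tfidf(["vector", "model"], docs)
```

Here only the second document shares terms with the query, so it is ranked first; ties among non-matching documents are broken by index.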
Hybrid Chinese Information Retrieval Model Based on the Combination of Keyword and Concept (cited by: 2)
2
Authors: 樊孝忠, 李宏乔, 李良富. 《Journal of Beijing Institute of Technology》 (EI, CAS), 2003, No. S1, pp. 120-123 (4 pages)
A hybrid model based on the combination of keywords and concepts is put forward. The hybrid model is built on the vector space model and a probabilistic reasoning network. It not only exploits the advantages of keyword retrieval and concept retrieval but also compensates for their shortcomings. Its parameters can be adjusted according to different usage in order to obtain the best information retrieval result, as confirmed by our experiments.
Keywords: hybrid information retrieval model; concept retrieval; vector space model; probabilistic reasoning network
INFORMATION RETRIEVAL FOR SHORT DOCUMENTS (cited by: 2)
3
Authors: Qi Haoliang, Li Mu, Gao Jianfeng, Li Sheng. 《Journal of Electronics(China)》 2006, No. 6, pp. 933-936 (4 pages)
The major problem with most current approaches to information models is that individual words provide unreliable evidence about the content of the texts. When the document is short, e.g. only the abstract is available, the word-use variability problem has a substantial impact on Information Retrieval (IR) performance. To solve the problem, a new technique for short document retrieval named Reference Document Model (RDM) is put forward in this letter. RDM obtains the statistical semantics of the query/document by pseudo feedback, for both the query and the document, from reference documents. The contributions of this model are three-fold: (1) pseudo feedback for both the query and the document; (2) building the query model and the document model from reference documents; (3) flexible indexing units, which can be any linguistic elements such as documents, paragraphs, sentences, n-grams, terms or characters. For short document retrieval, RDM achieves significant improvements over the classical probabilistic models on the task of ad hoc retrieval on Text REtrieval Conference (TREC) test sets. Results also show that the shorter the document, the better the RDM performance.
Keywords: information retrieval; short documents; Reference Document Model (RDM)
A new approach to query expansion in information retrieval (cited by: 2)
4
Authors: 李卫疆, Zhao Tiejun, Wang Xian'gang. 《High Technology Letters》 (EI, CAS), 2008, No. 1, pp. 77-80 (4 pages)
To eliminate the mismatch between the words of relevant documents and the user's query, and the serious negative effects it has on the performance of information retrieval, a method of query expansion on the basis of a new term co-occurrence representation was put forward by analyzing the process of producing a query. The expansion terms were selected according to their correlation to the whole query. At the same time, the position information between terms was considered. The experimental results on the Text REtrieval Conference (TREC) data collection show that the proposed method consistently achieves an improvement of 5%~19% over the language modeling method without expansion. Compared to the popular approach of query expansion, pseudo feedback, the precision of the proposed method is competitive.
Keywords: information retrieval; language model; query expansion
New Retrieval Method Based on Relative Entropy for Language Modeling with Different Smoothing Methods
5
Authors: 霍华, 刘俊强, 冯博琴. 《Journal of Southwest Jiaotong University(English Edition)》 2006, No. 2, pp. 113-120 (8 pages)
A language model for information retrieval is built by using a query language model to generate queries and a document language model to generate documents. The documents are ranked according to the relative entropies of the estimated document language models with respect to the estimated query language model. Two popular and relatively efficient smoothing methods, the Jelinek-Mercer method and the absolute discounting method, are used to smooth the document language model during estimation. A combined model composed of the feedback document language model and the collection language model is used to estimate the query model. A performance comparison between the new retrieval method and the existing method with feedback is made, and the retrieval performance of the proposed method with the two different smoothing techniques is evaluated on three Text Retrieval Conference (TREC) data sets. Experimental results show that the method is effective and performs better than the basic language modeling approach; moreover, the method using the Jelinek-Mercer technique performs better than that using the absolute discounting technique, and the performance is sensitive to the smoothing parameters.
Keywords: information retrieval; relative entropy; language modeling; smoothing
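The ranking idea in the abstract above (order documents by the relative entropy between a query language model and Jelinek-Mercer-smoothed document language models) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the query model here is a plain maximum-likelihood estimate rather than the feedback/collection mixture the abstract describes, the interpolation weight `lam` and the names `jm_model` and `rank_kl` are hypothetical, and query terms absent from the collection are simply dropped.

```python
import math
from collections import Counter


def jm_model(doc, coll_counts, coll_len, lam=0.5):
    """Jelinek-Mercer smoothing: p(t|d) = (1 - lam) * p_ml(t|d) + lam * p(t|C)."""
    tf = Counter(doc)
    dl = len(doc)

    def prob(term):
        p_doc = tf[term] / dl if dl else 0.0
        p_coll = coll_counts[term] / coll_len
        return (1.0 - lam) * p_doc + lam * p_coll

    return prob


def rank_kl(query, docs, lam=0.5):
    """Rank documents by increasing KL(query model || smoothed document model)."""
    coll = Counter(t for d in docs for t in d)
    coll_len = sum(coll.values())
    # Assumption: maximum-likelihood query model over in-collection terms only.
    qtf = Counter(t for t in query if t in coll)
    qlen = sum(qtf.values())
    q_model = {t: c / qlen for t, c in qtf.items()}
    scored = []
    for i, d in enumerate(docs):
        p_doc = jm_model(d, coll, coll_len, lam)
        kl = sum(pq * math.log(pq / p_doc(t)) for t, pq in q_model.items())
        scored.append((kl, i))
    # Smaller divergence means the document model is closer to the query model.
    return [i for _, i in sorted(scored)]


docs = [
    ["language", "model", "smoothing"],
    ["relative", "entropy", "ranking"],
    ["language", "model", "retrieval", "language"],
]
order = rank_kl(["language", "model"], docs)
```

With `lam` in (0, 1], every in-collection term has nonzero smoothed probability, so the logarithm is always defined. The abstract notes that performance is sensitive to the smoothing parameters, so `lam` would need tuning on real data.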
Information retrieval: a view from the Chinese IR community
6
Authors: Zhumin CHEN, Xueqi CHENG, Shoubin DONG, Zhicheng DOU, Jiafeng GUO, Xuanjing HUANG, Yanyan LAN, Chenliang LI, Ru LI, Tie-Yan LIU, Yiqun LIU, Jun MA, Bing QIN, Mingwen WANG, Jirong WEN, Jun XU, Min ZHANG, Peng ZHANG, Qi ZHANG. 《Frontiers of Computer Science》 (SCIE, EI, CSCD), 2021, No. 1, pp. 197-211 (15 pages)
During a two-day strategic workshop in February 2018, 22 information retrieval researchers met to discuss the future challenges and opportunities within the field. The outcome is a list of potential research directions, project ideas, and challenges. This report describes the major conclusions we obtained during the workshop. A key result is that we need to open our minds to embrace a broader IR field by rethinking the definition of information, retrieval, user, system, and evaluation of IR. By providing detailed discussions on these topics, this report is expected to inspire IR researchers in both academia and industry, and to help the future growth of the IR research community.
Keywords: information retrieval; redefinition; information; scope of retrieval; retrieval models; users; system architecture; evaluation
A Semantic Vector Retrieval Model for Desktop Documents
7
Authors: Sheng Li. 《Journal of Software Engineering and Applications》 2009, No. 1, pp. 55-59 (5 pages)
The paper provides a semantic vector retrieval model for desktop documents based on ontology. Compared with the traditional vector space model, the semantic model uses semantic and ontology technology to solve several problems that the traditional model could not overcome, such as the shortcomings of weight computing based on statistical methods, the expression of semantic relations between different keywords, the description of document semantic vectors, and similarity calculation. Finally, the experimental results show that the retrieval ability of the new model is significantly improved in both recall and precision.
Keywords: semantic desktop; information retrieval; ontology; vector retrieval model
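Context-Aware Information Retrieval Techniques and Models (cited by: 1)
8
Authors: 肖伟, 全惠云. 《计算机工程与应用》 (CSCD, PKU Core), 2005, No. 35, pp. 169-172 (4 pages)
Context-aware information retrieval is an intelligent information retrieval technology that is adaptive and self-organizing, and it is a challenging research direction. This article first describes the basic concepts related to context-awareness and the basic principles of its implementation, then introduces retrieval models based on different kinds of context information, and discusses the implementation techniques and scope of application of each model.
Keywords: information retrieval; context-aware; retrieval model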
Coverless Information Hiding Based on the Molecular Structure Images of Material (cited by: 10)
9
Authors: Yi Cao, Zhili Zhou, Xingming Sun, Chongzhi Gao. 《Computers, Materials & Continua》 (SCIE, EI), 2018, No. 2, pp. 197-207 (11 pages)
Traditional information hiding methods embed the secret information by modifying the carrier, which inevitably leaves traces of modification on the carrier. As a result, it is hard to resist detection by steganalysis algorithms. To address this problem, the concept of coverless information hiding was proposed. Coverless information hiding can effectively resist steganalysis, since it uses unmodified natural stego-carriers to represent and convey confidential information. However, the state-of-the-art method has a low hiding capacity, which makes it less appealing. Because the pixel values of different regions of the molecular structure images of material (MSIM) are usually different, this paper proposes a novel coverless information hiding method based on MSIM, which utilizes the average value of a sub-image’s pixels to represent the secret information, according to the mapping between pixel value intervals and secret information. In addition, we employ a pseudo-random label sequence to determine the position of sub-images and so improve the security of the method. The histogram of the Bag of Words (BOW) model is used to determine the number of sub-images in the image that convey secret information. Moreover, to improve retrieval efficiency, we build a multi-level inverted index structure. Furthermore, the proposed method can also be used for other natural images. Compared with the state of the art, experimental results and analysis show that our method performs better in anti-steganalysis, security and capacity.
Keywords: coverless information hiding; molecular structure images of material; pixel value; inverted index; image retrieval; bag of words model
Dominant Meaning Method for Intelligent Topic-Based Information Agent towards More Flexible MOOCs
10
Authors: Mohammed Abdel Razek. 《Journal of Intelligent Learning Systems and Applications》 2014, No. 4, pp. 186-196 (11 pages)
The use of agent technology in dynamic environments is growing rapidly as one of the most powerful technologies. Bringing the benefits of the Intelligent Information Agent technique to massive open online courses (MOOCs) is important for several reasons, including the rapid growth of MOOC environments and their focus on static rather than updated information. One of the main problems in such environments is updating the information to meet the needs of the student who is interacting at each moment. Using such technology can ensure more flexible information, less wasted time, and hence higher gains in learning. This paper presents an Intelligent Topic-Based Information Agent that offers updated knowledge, including various types of resources, to students. Using the dominant meaning method, the agent searches the Internet, controls the metadata coming from the Internet, and filters and displays them in categorized content lists. Two experiments were conducted on the Intelligent Topic-Based Information Agent: one measures the improvement in retrieval effectiveness and the other measures the impact of the agent on learning. The experimental results indicate that our methodology for expanding the query yields a considerable improvement in retrieval effectiveness in all categories of the Google Web Search API. In addition, there is a positive impact on the performance of the learning session.
Keywords: Massive Open Online Courses; MOOCs; search engine; hypermedia systems; web-based services; query expansion; probabilistic model; information retrieval
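Automatic Construction and Retrieval of Knowledge Graphs in the Electronic Information Field Based on Large Models
11
Authors: 谢明华. 《电讯技术》 (PKU Core), 2024, No. 8, pp. 1228-1234 (7 pages)
The growing body of valuable experiential knowledge accumulated in the electronic information field poses new challenges for knowledge utilization technology. Knowledge graph (KG) technology and large-scale pre-trained language model (Large Language Model, LLM) technology each have shortcomings in knowledge utilization, but the strengths and weaknesses of the two are complementary. Therefore, based on LLM technology, an automatic knowledge graph construction and retrieval-augmented question answering technique for the electronic information field is proposed. First, a knowledge graph for the electronic information field is constructed automatically using the semantic understanding capability of the LLM; then a knowledge question answering system is built on the knowledge graph and a retrieval-augmented large model. Experiments on the CoNLL2003 dataset and a purpose-built electronic information dataset show that the proposed method achieves good quality and that the question answering system performs well in practice. The proposed method can better meet practitioners' pressing need to extract relevant knowledge from massive document collections and to use knowledge more efficiently, and it offers a reference for applying large models combined with knowledge graphs in the electronic information vertical domain.
Keywords: electronic information field; knowledge graph construction; retrieval augmentation; large model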
DynamicRetriever: A Pre-trained Model-based IR System Without an Explicit Index
12
Authors: Yu-Jia Zhou, Jing Yao, Zhi-Cheng Dou, Ledell Wu, Ji-Rong Wen. 《Machine Intelligence Research》 (EI, CSCD), 2023, No. 2, pp. 276-288 (13 pages)
Web search provides a promising way for people to obtain information and has been extensively studied. With the surge of deep learning and large-scale pre-training techniques, various neural information retrieval models have been proposed, and they have demonstrated the power to improve search (especially ranking) quality. All these existing search methods follow a common paradigm, i.e., index-retrieve-rerank, where they first build an index of all documents based on document terms (i.e., sparse inverted index) or representation vectors (i.e., dense vector index), then retrieve and rerank retrieved documents based on the similarity between the query and documents via ranking models. In this paper, we explore a new paradigm of information retrieval without an explicit index but only with a pre-trained model. Instead, all of the knowledge of the documents is encoded into model parameters, which can be regarded as a differentiable indexer and optimized in an end-to-end manner. Specifically, we propose a pre-trained model-based information retrieval (IR) system called DynamicRetriever, which directly returns document identifiers for a given query. Under such a framework, we implement two variants to explore how to train the model from scratch and how to combine the advantages of dense retrieval models. Compared with existing search methods, the model-based IR system parameterizes the traditional static index with a pre-training model, which converts the document semantic mapping into a dynamic and updatable process. Extensive experiments conducted on the public search benchmark Microsoft machine reading comprehension (MS MARCO) verify the effectiveness and potential of our proposed new paradigm for information retrieval.
Keywords: information retrieval (IR); document retrieval; model-based IR; pre-trained language model; differentiable search index
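Research on a Vulnerability Information Retrieval System Based on Pre-trained Models (cited by: 1)
13
Authors: 刘烨, 杨良斌. 《情报杂志》 (CSSCI, PKU Core), 2024, No. 8, pp. 84-91 (8 pages)
[Research purpose] In threat intelligence, vulnerability information refers to information about vulnerabilities present in networks, systems, applications, or supply chains. Current search engines fall short in retrieving vulnerability information, and building a vulnerability retrieval system with pre-trained models can improve retrieval efficiency. [Research method] Using publicly available vulnerability information as the data source, a question answering dataset was constructed and Tiny Bert was incrementally pre-trained. The model was used to vectorize each query, the vulnerability information was built into a faiss vector database, and an HNSW index was used for multi-channel and single-channel recall retrieval. The model was then fine-tuned with contrastive learning to produce dual-tower and single-tower models, and a simple knowledge retrieval system was built with dual-tower recall and single-tower fine ranking. [Research conclusion] The experimental results show that pre-trained models can significantly improve retrieval performance: the dual-tower model fine-tuned with contrastive learning achieves a TOP1 recall of 92.17% on the constructed vulnerability information test set. This retrieval practice in the vulnerability information domain provides a reference for building threat intelligence retrieval systems.
Keywords: threat intelligence; pre-trained model; vulnerability information; multi-channel search technology; information retrieval system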
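Information Retrieval Method Based on Multi-granularity Semantic Fusion
14
Authors: 赵征宇, 罗景, 涂新辉. 《计算机应用》 (CSCD, PKU Core), 2024, No. 6, pp. 1775-1780 (6 pages)
Information retrieval (IR) is the process of organizing and processing information with specific techniques and methods to meet users' information needs. In recent years, dense retrieval methods based on pre-trained models have achieved great success; however, these methods use only the vector representations of texts and words to compute query-document relevance, ignoring semantic information at the phrase level. To address this problem, an IR method named MSIR (Multi-Scale IR) is proposed. The method improves IR performance by fusing semantic information at multiple granularities in the query and the document. First, semantic units at three granularities (word, phrase, and text) are constructed from the query and the document; second, a pre-trained model encodes each of the three kinds of semantic units to obtain their semantic representations; finally, the semantic representations are used to compute query-document relevance. Comparative experiments were conducted on three classic datasets of different sizes: Corvid-19, TREC2019, and Robust04. Compared with ColBERT (ranking model based on Contextualized late interaction over BERT (Bidirectional Encoder Representation from Transformers)), MSIR achieves improvements of about 8% on the P@10, P@20, NDCG@10, and NDCG@20 metrics on the Robust04 dataset, and also obtains some improvement on the Corvid-19 and TREC2019 datasets. The experimental results show that MSIR successfully fuses multiple semantic granularities and improves retrieval accuracy.
Keywords: semantic fusion; information retrieval; dense retrieval; pre-trained model; text retrieval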
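Survey of Research Progress on Knowledge Graph Question Answering Enhanced by Large Language Models
15
Authors: 冯拓宇, 李伟平, 郭庆浪, 王刚亮, 张雨松, 乔子剑. 《计算机科学与探索》 (CSCD, PKU Core), 2024, No. 11, pp. 2887-2900 (14 pages)
Knowledge graph question answering (KGQA) is a technique that processes natural language questions posed by users and obtains the relevant answers from a knowledge graph. Early KGQA techniques were limited by knowledge graph scale, computing power, and natural language processing capability, and their accuracy was low. In recent years, with advances in artificial intelligence, and especially the development of large language models (LLMs), the performance of KGQA has improved significantly. Large language models such as GPT-3 have been widely applied to enhance KGQA performance. To support the study of enhanced KGQA techniques, existing LLM-enhanced KGQA methods are summarized and analyzed. The survey reviews background on LLMs and KGQA, namely the technical principles and training methods of LLMs and the basic concepts of knowledge graphs, question answering, and KGQA. It reviews existing LLM-enhanced KGQA methods along two dimensions, semantic parsing and information retrieval, and analyzes the problems each method addresses and its limitations. It collects related resources and evaluation methods for LLM-enhanced KGQA and summarizes the performance of existing methods. Finally, in light of the limitations of existing methods, key directions for future research are analyzed and proposed.
Keywords: large language model; knowledge graph question answering; semantic parsing; information retrieval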
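Lightweight Code Generation Method Based on Prompt Learning
16
Authors: 徐一然, 周宇. 《计算机科学》 (CSCD, PKU Core), 2024, No. 6, pp. 61-67 (7 pages)
Automatic code generation is one of the effective ways to improve software development efficiency. Existing research generally treats code generation as a sequence-to-sequence task, and fine-tuning large-scale pre-trained language models often incurs high computational cost. This paper proposes a lightweight code generation method based on prompt learning (Prompt Learning based Parameter-Efficient Code Generation, PPECG). The method queries a code corpus for the result most similar to the current requirement and uses it as a prompt to guide the pre-trained language model in generating code, while keeping the vast majority of the model's parameters frozen to reduce computational cost. To verify the effectiveness of PPECG, two code generation datasets, CONCODE and Solidity4CG, were selected, and the BLEU, CodeBLEU, and Exact Match scores of the generated results were computed. The experimental results show that PPECG effectively reduces GPU memory consumption during fine-tuning, and that on these metrics it comes close to or even outperforms current SOTA methods, completing the code generation task well.
Keywords: code generation; prompt learning; pre-trained language model; information retrieval; smart contract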
User’s Relevance of PIR System Based on Cloud Models (cited by: 1)
17
Authors: 康海燕, 樊孝忠. 《Journal of Beijing Institute of Technology》 (EI, CAS), 2006, No. 2, pp. 181-185 (5 pages)
A new method for fuzzy evaluation of user's relevance on the basis of cloud models is proposed. All factors of a personalized information retrieval (PIR) system are taken into account in this method. Applied to a PIR system, the method can efficiently judge multi-valued relevance (quite relevant, comparatively relevant, commonly relevant, basically relevant, and completely non-relevant), realize a transformation between qualitative concepts and quantitative values, and improve the accuracy of relevance judgements. Experimental data show that the method is practical and valid, and that its evaluation results are more accurate and closer to the facts.
Keywords: personalized information retrieval (PIR); cloud model; user's relevance; fuzzy evaluation
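Framework Model of a Digital Library Information Retrieval System Based on the Semantic Web (cited by: 1)
18
Authors: 杨鸿. 《科技创新与应用》 2024, No. 16, pp. 120-123 (4 pages)
As Internet technology matures and becomes widespread, digital libraries have become important institutions for providing information resources. However, as data grow increasingly heterogeneous and dispersed, the traditional information retrieval systems of digital libraries no longer keep pace with users' rising retrieval demands: they struggle to judge users' search intent correctly, and they suffer from low retrieval efficiency and insufficiently accurate results. In response, this paper applies Semantic Web technology to design the modules, workflow, and system structure of a framework model for digital library information retrieval systems, and on that basis presents methods for integrating and constructing domain ontologies and for optimizing the semantic similarity algorithm. The aim is to provide a reference for the sound construction of digital library information retrieval systems and thereby make the fullest use of the value of digital library information resources.
Keywords: Semantic Web; digital library; information retrieval system; utilization value of information resources; system framework model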
A New Indexing Method Based on Word Proximity for Chinese Text Retrieval (cited by: 1)
19
Authors: 杜林, 孙玉芳. 《Journal of Computer Science & Technology》 (SCIE, EI, CSCD), 2000, No. 3, pp. 280-286 (7 pages)
This paper proposes a novel text representation and matching scheme for Chinese text retrieval. At present, the indexing methods of Chinese retrieval systems are either character-based or word-based. Character-based indexing methods, such as bi-gram or tri-gram indexing, have high false drops due to mismatches between queries and documents. On the other hand, it is difficult to efficiently identify all the proper nouns, terminology of different domains, and phrases in word-based indexing systems. The new indexing method uses both the proximity and the mutual information of word pairs to represent the text content, so as to overcome the high false drop, new word, and phrase problems that exist in character-based and word-based systems. The evaluation results indicate that the average query precision of proximity-based indexing is 5.2% higher than the best results of TREC-5.
Keywords: information retrieval; vector space model; automatic indexing; proximity-based indexing
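To make the "proximity and mutual information of word pairs" idea in the abstract above more concrete, the following is a rough sketch that counts ordered pairs of units co-occurring within a small window and scores them with pointwise mutual information (PMI). It is an illustration under assumptions, not the paper's indexing scheme: the function name `pair_pmi`, the window size, the use of single characters as units, and the PMI normalization are all hypothetical choices.

```python
import math
from collections import Counter


def pair_pmi(tokens, window=1):
    """Score ordered pairs (a, b), with b at most `window` positions after a, by PMI."""
    n = len(tokens)
    unigram = Counter(tokens)
    pairs = Counter()
    for i, a in enumerate(tokens):
        # Proximity constraint: only pair a with units inside the window.
        for b in tokens[i + 1 : i + 1 + window]:
            pairs[(a, b)] += 1
    total_pairs = sum(pairs.values())
    return {
        (a, b): math.log((c / total_pairs) / ((unigram[a] / n) * (unigram[b] / n)))
        for (a, b), c in pairs.items()
    }


# Character-level units, as in character-based Chinese indexing.
tokens = list("信息检索信息检索系统")
pmi = pair_pmi(tokens, window=1)
strong = pmi[("信", "息")]   # pair inside a recurring word
weak = pmi[("索", "信")]     # pair that only spans a word boundary
```

In this toy string, the pair inside the repeated word scores higher than the pair that straddles a word boundary, which is the kind of signal a proximity-based index could use to favor meaningful pairs.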
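Research on a Multimodal Retrieval Method for Historic Buildings
20
Authors: 袁嘉梦, 陈浪, 陈维亚, 骆汉宾. 《土木建筑工程信息技术》 2024, No. 4, pp. 7-13 (7 pages)
Querying information in an HBIM (Historic Building Information Modeling) database faces three problems: first, there is no universally applicable rule for judging the similarity between buildings; second, the historical and cultural information carried by the buildings themselves is not considered; third, query text is mostly keyword-based, making it hard to retrieve information the keywords do not cover. To address these problems, a multimodal retrieval method for historic buildings is proposed: users input an image or natural language text and retrieve buildings matching the input features, returned as a ranked list. For image-based retrieval, the "dino_vit16" model extracts image features, and the proposed image-to-building retrieval method achieves a retrieval precision of 90.08%. For text-based retrieval, the association between images and text is established with the CLIP (Contrastive Language-Image Pre-training) model; the weights of image-text similarity and text similarity were studied, and m=0.6, n=0.4 was selected as the best weight configuration. Experiments show that the proposed text-to-building retrieval algorithm performs best on queries containing an appearance feature and worst on queries describing a function or architectural style; when a query contains four or more mixed features and can describe the basic appearance of the building, buildings meeting the conditions can be retrieved accurately.
Keywords: historic building; HBIM; ViT; similarity measure; multimodal retrieval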
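The weighted combination mentioned in the abstract above can be illustrated with a small sketch. It assumes, based only on the order in which the abstract lists them, that m weights the image-text similarity and n weights the text similarity, and that the two are combined linearly; the function names `fuse` and `rank_buildings` and the example scores are hypothetical and are not taken from the paper.

```python
def fuse(image_text_sim, text_sim, m=0.6, n=0.4):
    """Linear fusion of two similarity scores (assumed form: m * s_img_txt + n * s_txt)."""
    return m * image_text_sim + n * text_sim


def rank_buildings(candidates, m=0.6, n=0.4):
    """Sort building names by fused score, highest first.

    candidates maps a building name to (image_text_sim, text_sim).
    """
    return sorted(
        candidates,
        key=lambda name: fuse(candidates[name][0], candidates[name][1], m, n),
        reverse=True,
    )


# Hypothetical similarity scores for three buildings.
candidates = {
    "A": (0.9, 0.2),
    "B": (0.5, 0.9),
    "C": (0.3, 0.3),
}
ranked = rank_buildings(candidates)
```

With these made-up scores, building B edges out A because its high text similarity outweighs A's stronger image-text match, which shows how the choice of m and n shifts the ranking.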