期刊文献+
共找到22,592篇文章
< 1 2 250 >
每页显示 20 50 100
Research on Cross-Language Retrieval Using Bilingual Word Vectors in Different Languages
1
作者 Yulong Li Dong Zhou 《国际计算机前沿大会会议论文集》 2019年第1期462-465,共4页
Bilingual word vectors have been exploited a lot in cross-language information retrieval research. However, most of the research is currently focused on similar language pairs. There are very few studies exploring the... Bilingual word vectors have been exploited a lot in cross-language information retrieval research. However, most of the research is currently focused on similar language pairs. There are very few studies exploring the impact of using bilingual word vectors for cross-language information retrieval in long-distance language pairs. In this paper, it systematically analyzes the retrieval performance of various European languages (English, German, Italian, French, Finnish, Dutch) as well as Asian languages (Chinese, Japanese) in the adhoc task of CLEF 2002–2003 campaign. Genetic proximity was used to visually represent the relationships between languages and compare their crosslingual retrieval performance in various settings. The results show that the differences in language vocabulary would dramatically affect the retrieval performance. At the same time, the term by term translation retrieval method performs slightly better than the simple vector addition retrieval methods. It proves that the translation-based retrieval model can still maintain its advantage under the new semantic scheme. 展开更多
关键词 cross-language information retrieval BILINGUAL word EMBEDDING Genetic PROXIMITY Language PAIRS
下载PDF
Database Search Behaviors: Insight from a Survey of Information Retrieval Practices
2
作者 Babita Trivedi Brijender Dahiya +2 位作者 Anjali Maan Rajesh Giri Vinod Prasad 《Intelligent Information Management》 2024年第5期195-218,共24页
This study examines the database search behaviors of individuals, focusing on gender differences and the impact of planning habits on information retrieval. Data were collected from a survey of 198 respondents, catego... This study examines the database search behaviors of individuals, focusing on gender differences and the impact of planning habits on information retrieval. Data were collected from a survey of 198 respondents, categorized by their discipline, schooling background, internet usage, and information retrieval preferences. Key findings indicate that females are more likely to plan their searches in advance and prefer structured methods of information retrieval, such as using library portals and leading university websites. Males, however, tend to use web search engines and self-archiving methods more frequently. This analysis provides valuable insights for educational institutions and libraries to optimize their resources and services based on user behavior patterns. 展开更多
关键词 Information retrieval Database Search User Behavior Patterns
下载PDF
A Deep-Learning and Transfer-Learning Hybrid Aerosol Retrieval Algorithm for FY4-AGRI:Development and Verification over Asia
3
作者 Disong Fu Hongrong Shi +9 位作者 Christian AGueymard Dazhi Yang Yu Zheng Huizheng Che Xuehua Fan Xinlei Han Lin Gao Jianchun Bian Minzheng Duan Xiangao Xia 《Engineering》 SCIE EI CAS CSCD 2024年第7期164-174,共11页
The Advanced Geosynchronous Radiation Imager(AGRI)is a mission-critical instrument for the Fengyun series of satellites.AGRI acquires full-disk images every 15 min and views East Asia every 5 min through 14 spectral b... The Advanced Geosynchronous Radiation Imager(AGRI)is a mission-critical instrument for the Fengyun series of satellites.AGRI acquires full-disk images every 15 min and views East Asia every 5 min through 14 spectral bands,enabling the detection of highly variable aerosol optical depth(AOD).Quantitative retrieval of AOD has hitherto been challenging,especially over land.In this study,an AOD retrieval algorithm is proposed that combines deep learning and transfer learning.The algorithm uses core concepts from both the Dark Target(DT)and Deep Blue(DB)algorithms to select features for the machinelearning(ML)algorithm,allowing for AOD retrieval at 550 nm over both dark and bright surfaces.The algorithm consists of two steps:①A baseline deep neural network(DNN)with skip connections is developed using 10 min Advanced Himawari Imager(AHI)AODs as the target variable,and②sunphotometer AODs from 89 ground-based stations are used to fine-tune the DNN parameters.Out-of-station validation shows that the retrieved AOD attains high accuracy,characterized by a coefficient of determination(R2)of 0.70,a mean bias error(MBE)of 0.03,and a percentage of data within the expected error(EE)of 70.7%.A sensitivity study reveals that the top-of-atmosphere reflectance at 650 and 470 nm,as well as the surface reflectance at 650 nm,are the two largest sources of uncertainty impacting the retrieval.In a case study of monitoring an extreme aerosol event,the AGRI AOD is found to be able to capture the detailed temporal evolution of the event.This work demonstrates the superiority of the transfer-learning technique in satellite AOD retrievals and the applicability of the retrieved AGRI AOD in monitoring extreme pollution events. 展开更多
关键词 Aerosol optical depth retrieval algorithm Deep learning Transfer learning Advanced Geosynchronous Radiation IMAGER
下载PDF
Importance-aware 3D volume visualization for medical content-based image retrieval-a preliminary study
4
作者 Mingjian LI Younhyun JUNG +1 位作者 Michael FULHAM Jinman KIM 《虚拟现实与智能硬件(中英文)》 EI 2024年第1期71-81,共11页
Background A medical content-based image retrieval(CBIR)system is designed to retrieve images from large imaging repositories that are visually similar to a user′s query image.CBIR is widely used in evidence-based di... Background A medical content-based image retrieval(CBIR)system is designed to retrieve images from large imaging repositories that are visually similar to a user′s query image.CBIR is widely used in evidence-based diagnosis,teaching,and research.Although the retrieval accuracy has largely improved,there has been limited development toward visualizing important image features that indicate the similarity of retrieved images.Despite the prevalence of 3D volumetric data in medical imaging such as computed tomography(CT),current CBIR systems still rely on 2D cross-sectional views for the visualization of retrieved images.Such 2D visualization requires users to browse through the image stacks to confirm the similarity of the retrieved images and often involves mental reconstruction of 3D information,including the size,shape,and spatial relations of multiple structures.This process is time-consuming and reliant on users'experience.Methods In this study,we proposed an importance-aware 3D volume visualization method.The rendering parameters were automatically optimized to maximize the visibility of important structures that were detected and prioritized in the retrieval process.We then integrated the proposed visualization into a CBIR system,thereby complementing the 2D cross-sectional views for relevance feedback and further analyses.Results Our preliminary results demonstrate that 3D visualization can provide additional information using multimodal positron emission tomography and computed tomography(PETCT)images of a non-small cell lung cancer dataset. 展开更多
关键词 Volume visualization DVR Medical CBIR retrieval Medical images
下载PDF
Orbit Weighting Scheme in the Context of Vector Space Information Retrieval
5
作者 Ahmad Ababneh Yousef Sanjalawe +2 位作者 Salam Fraihat Salam Al-E’mari Hamzah Alqudah 《Computers, Materials & Continua》 SCIE EI 2024年第7期1347-1379,共33页
This study introduces the Orbit Weighting Scheme(OWS),a novel approach aimed at enhancing the precision and efficiency of Vector Space information retrieval(IR)models,which have traditionally relied on weighting schem... This study introduces the Orbit Weighting Scheme(OWS),a novel approach aimed at enhancing the precision and efficiency of Vector Space information retrieval(IR)models,which have traditionally relied on weighting schemes like tf-idf and BM25.These conventional methods often struggle with accurately capturing document relevance,leading to inefficiencies in both retrieval performance and index size management.OWS proposes a dynamic weighting mechanism that evaluates the significance of terms based on their orbital position within the vector space,emphasizing term relationships and distribution patterns overlooked by existing models.Our research focuses on evaluating OWS’s impact on model accuracy using Information Retrieval metrics like Recall,Precision,InterpolatedAverage Precision(IAP),andMeanAverage Precision(MAP).Additionally,we assessOWS’s effectiveness in reducing the inverted index size,crucial for model efficiency.We compare OWS-based retrieval models against others using different schemes,including tf-idf variations and BM25Delta.Results reveal OWS’s superiority,achieving a 54%Recall and 81%MAP,and a notable 38%reduction in the inverted index size.This highlights OWS’s potential in optimizing retrieval processes and underscores the need for further research in this underrepresented area to fully leverage OWS’s capabilities in information retrieval methodologies. 展开更多
关键词 Information retrieval orbit weighting scheme semantic text analysis Tf-Idf weighting scheme vector space model
下载PDF
A Survey of Crime Scene Investigation Image Retrieval Using Deep Learning
6
作者 Ying Liu Aodong Zhou +1 位作者 Jize Xue Zhijie Xu 《Journal of Beijing Institute of Technology》 EI CAS 2024年第4期271-286,共16页
Crime scene investigation(CSI)image is key evidence carrier during criminal investiga-tion,in which CSI image retrieval can assist the public police to obtain criminal clues.Moreover,with the rapid development of deep... Crime scene investigation(CSI)image is key evidence carrier during criminal investiga-tion,in which CSI image retrieval can assist the public police to obtain criminal clues.Moreover,with the rapid development of deep learning,data-driven paradigm has become the mainstreammethod of CSI image feature extraction and representation,and in this process,datasets provideeffective support for CSI retrieval performance.However,there is a lack of systematic research onCSI image retrieval methods and datasets.Therefore,we present an overview of the existing worksabout one-class and multi-class CSI image retrieval based on deep learning.According to theresearch,based on their technical functionalities and implementation methods,CSI image retrievalis roughly classified into five categories:feature representation,metric learning,generative adversar-ial networks,autoencoder networks and attention networks.Furthermore,We analyzed the remain-ing challenges and discussed future work directions in this field. 展开更多
关键词 crime scene investigation(CSI)image image retrieval deep learning
下载PDF
Comparison between ozonesonde measurements and satellite retrievals over Beijing,China 被引量:2
7
作者 Jinqiang Zhang Yuejian Xuan +5 位作者 Jianchun Bian Holger Vomel Yunshu Zeng Zhixuan Bai Dan Li Hongbin Chen 《Atmospheric and Oceanic Science Letters》 CSCD 2024年第1期14-20,共7页
从2013年开始,作者团队使用自主研发电化学原理臭氧探空仪在华北平原北京地区进行每周一次观测.本研究首次使用2013-2019年期间北京地区臭氧探空数据评估Aqua卫星搭载大气红外探测仪(AIRS)和Aura卫星搭载微波临边探测器(MLS)反演垂直臭... 从2013年开始,作者团队使用自主研发电化学原理臭氧探空仪在华北平原北京地区进行每周一次观测.本研究首次使用2013-2019年期间北京地区臭氧探空数据评估Aqua卫星搭载大气红外探测仪(AIRS)和Aura卫星搭载微波临边探测器(MLS)反演垂直臭氧廓线,并对比臭氧探空,AIRS和Aura卫星搭载臭氧监测仪(OMI)臭氧柱总量结果.尽管臭氧探空与卫星反演垂直臭氧廓线在局部高度处差异较大,但整体来说两者较为接近(相对偏差大多<10%).臭氧探空,AIRS和OMI三种仪器测量臭氧柱总量的年变化特征较为一致,其年均臭氧柱总量分别为351.8±18.4 DU,348.8±19.5 DU和336.9±14.2 DU.后续对国内多站点观测数据分析将有助于进一步理解臭氧探空与卫星反演臭氧资料在不同区域的一致性. 展开更多
关键词 臭氧探空 卫星反演 垂直臭氧廓线 臭氧柱总量 华北平原
下载PDF
Region-Aware Fashion Contrastive Learning for Unified Attribute Recognition and Composed Retrieval
8
作者 WANG Kangping ZHAO Mingbo 《Journal of Donghua University(English Edition)》 CAS 2024年第4期405-415,共11页
Clothing attribute recognition has become an essential technology,which enables users to automatically identify the characteristics of clothes and search for clothing images with similar attributes.However,existing me... Clothing attribute recognition has become an essential technology,which enables users to automatically identify the characteristics of clothes and search for clothing images with similar attributes.However,existing methods cannot recognize newly added attributes and may fail to capture region-level visual features.To address the aforementioned issues,a region-aware fashion contrastive language-image pre-training(RaF-CLIP)model was proposed.This model aligned cropped and segmented images with category and multiple fine-grained attribute texts,achieving the matching of fashion region and corresponding texts through contrastive learning.Clothing retrieval found suitable clothing based on the user-specified clothing categories and attributes,and to further improve the accuracy of retrieval,an attribute-guided composed network(AGCN)as an additional component on RaF-CLIP was introduced,specifically designed for composed image retrieval.This task aimed to modify the reference image based on textual expressions to retrieve the expected target.By adopting a transformer-based bidirectional attention and gating mechanism,it realized the fusion and selection of image features and attribute text features.Experimental results show that the proposed model achieves a mean precision of 0.6633 for attribute recognition tasks and a recall@10(recall@k is defined as the percentage of correct samples appearing in the top k retrieval results)of 39.18 for composed image retrieval task,satisfying user needs for freely searching for clothing through images and texts. 展开更多
关键词 attribute recognition image retrieval contrastive language-image pre-training(CLIP) image text matching transformer
下载PDF
A Visual Indoor Localization Method Based on Efficient Image Retrieval
9
作者 Mengyan Lyu Xinxin Guo +1 位作者 Kunpeng Zhang Liye Zhang 《Journal of Computer and Communications》 2024年第2期47-66,共20页
The task of indoor visual localization, utilizing camera visual information for user pose calculation, was a core component of Augmented Reality (AR) and Simultaneous Localization and Mapping (SLAM). Existing indoor l... The task of indoor visual localization, utilizing camera visual information for user pose calculation, was a core component of Augmented Reality (AR) and Simultaneous Localization and Mapping (SLAM). Existing indoor localization technologies generally used scene-specific 3D representations or were trained on specific datasets, making it challenging to balance accuracy and cost when applied to new scenes. Addressing this issue, this paper proposed a universal indoor visual localization method based on efficient image retrieval. Initially, a Multi-Layer Perceptron (MLP) was employed to aggregate features from intermediate layers of a convolutional neural network, obtaining a global representation of the image. This approach ensured accurate and rapid retrieval of reference images. Subsequently, a new mechanism using Random Sample Consensus (RANSAC) was designed to resolve relative pose ambiguity caused by the essential matrix decomposition based on the five-point method. Finally, the absolute pose of the queried user image was computed, thereby achieving indoor user pose estimation. The proposed indoor localization method was characterized by its simplicity, flexibility, and excellent cross-scene generalization. Experimental results demonstrated a positioning error of 0.09 m and 2.14° on the 7Scenes dataset, and 0.15 m and 6.37° on the 12Scenes dataset. These results convincingly illustrated the outstanding performance of the proposed indoor localization method. 展开更多
关键词 Visual Indoor Positioning Feature Point Matching Image retrieval Position Calculation Five-Point Method
下载PDF
Query Expansion Using Wikipedia and a Concept Base in Cross-language Information Retrieval
10
作者 Pham Huy Anh Yukawa Takashi 《Computer Technology and Application》 2013年第10期522-531,共10页
The present paper describes the use of online free language resources for translating and expanding queries in CLIR (cross-language information retrieval). In a previous study, we proposed method queries that were t... The present paper describes the use of online free language resources for translating and expanding queries in CLIR (cross-language information retrieval). In a previous study, we proposed method queries that were translated by two machine translation systems on the Language Gridem. The queries were then expanded using an online dictionary to translate compound words or word phrases. A concept base was used to compare back translation words with the original query in order to delete mistranslated words. In order to evaluate the proposed method, we constructed a CLIR system and used the science documents of the NTCIR1 dataset. The proposed method achieved high precision. However~ proper nouns (names of people and places) appear infrequently in science documents. In information retrieval, proper nouns present unique problems. Since proper nouns are usually unknown words, they are difficult to find in monolingual dictionaries, not to mention bilingual dictionaries. Furthermore, the initial query of the user is not always the best description of the desired information. In order to solve this problem, and to create a better query representation, query expansion is often proposed as a solution. Wikipedia was used to translate compound words or word phrases. It was also used to expand queries together with a concept base. The NTCIRI and NTCIR 6 datasets were used to evaluate the proposed method. In the proposed method, the CLIR system was implemented with a high rate of precision. The proposed syst had a higher ranking than the NTCIRI and NTCIR6 participation systems. 展开更多
关键词 cross-language inlbrmation retrieval CLIR language resources concept base language grid Wikipedia.
下载PDF
Bilingual Dictionary Approach for Malay-English Cross-Language Information Retrieval
11
作者 Nurjannaton Hidayah Rais Muhamad Taufik Abdullah Rabiah Abdul Kadir 《通讯和计算机(中英文版)》 2011年第5期354-360,共7页
关键词 跨语言信息检索 双语词典 英语 查询转换 语言翻译 翻译方法 语言表达 自动翻译
下载PDF
Image Retrieval Based on Vision Transformer and Masked Learning 被引量:5
12
作者 李锋 潘煌圣 +1 位作者 盛守祥 王国栋 《Journal of Donghua University(English Edition)》 CAS 2023年第5期539-547,共9页
Deep convolutional neural networks(DCNNs)are widely used in content-based image retrieval(CBIR)because of the advantages in image feature extraction.However,the training of deep neural networks requires a large number... Deep convolutional neural networks(DCNNs)are widely used in content-based image retrieval(CBIR)because of the advantages in image feature extraction.However,the training of deep neural networks requires a large number of labeled data,which limits the application.Self-supervised learning is a more general approach in unlabeled scenarios.A method of fine-tuning feature extraction networks based on masked learning is proposed.Masked autoencoders(MAE)are used in the fine-tune vision transformer(ViT)model.In addition,the scheme of extracting image descriptors is discussed.The encoder of the MAE uses the ViT to extract global features and performs self-supervised fine-tuning by reconstructing masked area pixels.The method works well on category-level image retrieval datasets with marked improvements in instance-level datasets.For the instance-level datasets Oxford5k and Paris6k,the retrieval accuracy of the base model is improved by 7%and 17%compared to that of the original model,respectively. 展开更多
关键词 content-based image retrieval vision transformer masked autoencoder feature extraction
下载PDF
Image Retrieval with Text Manipulation by Local Feature Modification 被引量:2
13
作者 查剑宏 燕彩蓉 +1 位作者 张艳婷 王俊 《Journal of Donghua University(English Edition)》 CAS 2023年第4期404-409,共6页
The demand for image retrieval with text manipulation exists in many fields, such as e-commerce and Internet search. Deep metric learning methods are used by most researchers to calculate the similarity between the qu... The demand for image retrieval with text manipulation exists in many fields, such as e-commerce and Internet search. Deep metric learning methods are used by most researchers to calculate the similarity between the query and the candidate image by fusing the global feature of the query image and the text feature. However, the text usually corresponds to the local feature of the query image rather than the global feature. Therefore, in this paper, we propose a framework of image retrieval with text manipulation by local feature modification(LFM-IR) which can focus on the related image regions and attributes and perform modification. A spatial attention module and a channel attention module are designed to realize the semantic mapping between image and text. We achieve excellent performance on three benchmark datasets, namely Color-Shape-Size(CSS), Massachusetts Institute of Technology(MIT) States and Fashion200K(+8.3%, +0.7% and +4.6% in R@1). 展开更多
关键词 image retrieval text manipulation ATTENTION local feature modification
下载PDF
Toward Fine-grained Image Retrieval with Adaptive Deep Learning for Cultural Heritage Image 被引量:2
14
作者 Sathit Prasomphan 《Computer Systems Science & Engineering》 SCIE EI 2023年第2期1295-1307,共13页
Fine-grained image classification is a challenging research topic because of the high degree of similarity among categories and the high degree of dissimilarity for a specific category caused by different poses and scal... Fine-grained image classification is a challenging research topic because of the high degree of similarity among categories and the high degree of dissimilarity for a specific category caused by different poses and scales.A cul-tural heritage image is one of thefine-grained images because each image has the same similarity in most cases.Using the classification technique,distinguishing cultural heritage architecture may be difficult.This study proposes a cultural heri-tage content retrieval method using adaptive deep learning forfine-grained image retrieval.The key contribution of this research was the creation of a retrieval mod-el that could handle incremental streams of new categories while maintaining its past performance in old categories and not losing the old categorization of a cul-tural heritage image.The goal of the proposed method is to perform a retrieval task for classes.Incremental learning for new classes was conducted to reduce the re-training process.In this step,the original class is not necessary for re-train-ing which we call an adaptive deep learning technique.Cultural heritage in the case of Thai archaeological site architecture was retrieved through machine learn-ing and image processing.We analyze the experimental results of incremental learning forfine-grained images with images of Thai archaeological site architec-ture from world heritage provinces in Thailand,which have a similar architecture.Using afine-grained image retrieval technique for this group of cultural heritage images in a database can solve the problem of a high degree of similarity among categories and a high degree of dissimilarity for a specific category.The proposed method for retrieving the correct image from a database can deliver an average accuracy of 85 percent.Adaptive deep learning forfine-grained image retrieval was used to retrieve cultural heritage content,and it outperformed state-of-the-art methods infine-grained image retrieval. 展开更多
关键词 Fine-grained image adaptive deep learning cultural heritage image retrieval
下载PDF
Triplet Label Based Image Retrieval Using Deep Learning in Large Database 被引量:1
15
作者 K.Nithya V.Rajamani 《Computer Systems Science & Engineering》 SCIE EI 2023年第3期2655-2666,共12页
Recent days,Image retrieval has become a tedious process as the image database has grown very larger.The introduction of Machine Learning(ML)and Deep Learning(DL)made this process more comfortable.In these,the pair-wi... Recent days,Image retrieval has become a tedious process as the image database has grown very larger.The introduction of Machine Learning(ML)and Deep Learning(DL)made this process more comfortable.In these,the pair-wise label similarity is used tofind the matching images from the database.But this method lacks of limited propose code and weak execution of misclassified images.In order to get-rid of the above problem,a novel triplet based label that incorporates context-spatial similarity measure is proposed.A Point Attention Based Triplet Network(PABTN)is introduced to study propose code that gives maximum discriminative ability.To improve the performance of ranking,a corre-lating resolutions for the classification,triplet labels based onfindings,a spatial-attention mechanism and Region Of Interest(ROI)and small trial information loss containing a new triplet cross-entropy loss are used.From the experimental results,it is shown that the proposed technique exhibits better results in terms of mean Reciprocal Rank(mRR)and mean Average Precision(mAP)in the CIFAR-10 and NUS-WIPE datasets. 展开更多
关键词 Image retrieval deep learning point attention based triplet network correlating resolutions classification region of interest
下载PDF
An Efficient Long Short-Term Memory Model for Digital Cross-Language Summarization
16
作者 Y.C.A.Padmanabha Reddy Shyam Sunder Reddy Kasireddy +2 位作者 Nageswara Rao Sirisala Ramu Kuchipudi Purnachand Kollapudi 《Computers, Materials & Continua》 SCIE EI 2023年第3期6389-6409,共21页
The rise of social networking enables the development of multilingual Internet-accessible digital documents in several languages.The digital document needs to be evaluated physically through the Cross-Language Text Su... The rise of social networking enables the development of multilingual Internet-accessible digital documents in several languages.The digital document needs to be evaluated physically through the Cross-Language Text Summarization(CLTS)involved in the disparate and generation of the source documents.Cross-language document processing is involved in the generation of documents from disparate language sources toward targeted documents.The digital documents need to be processed with the contextual semantic data with the decoding scheme.This paper presented a multilingual crosslanguage processing of the documents with the abstractive and summarising of the documents.The proposed model is represented as the Hidden Markov Model LSTM Reinforcement Learning(HMMlstmRL).First,the developed model uses the Hidden Markov model for the computation of keywords in the cross-language words for the clustering.In the second stage,bi-directional long-short-term memory networks are used for key word extraction in the cross-language process.Finally,the proposed HMMlstmRL uses the voting concept in reinforcement learning for the identification and extraction of the keywords.The performance of the proposed HMMlstmRL is 2%better than that of the conventional bi-direction LSTM model. 展开更多
关键词 Text summarization reinforcement learning hidden markov model cross-language MULTILINGUAL
下载PDF
Learning Noise-Assisted Robust Image Features for Fine-Grained Image Retrieval
17
作者 Vidit Kumar Hemant Petwal +1 位作者 Ajay Krishan Gairola Pareshwar Prasad Barmola 《Computer Systems Science & Engineering》 SCIE EI 2023年第9期2711-2724,共14页
Fine-grained image search is one of the most challenging tasks in computer vision that aims to retrieve similar images at the fine-grained level for a given query image.The key objective is to learn discriminative fin... Fine-grained image search is one of the most challenging tasks in computer vision that aims to retrieve similar images at the fine-grained level for a given query image.The key objective is to learn discriminative fine-grained features by training deep models such that similar images are clustered,and dissimilar images are separated in the low embedding space.Previous works primarily focused on defining local structure loss functions like triplet loss,pairwise loss,etc.However,training via these approaches takes a long training time,and they have poor accuracy.Additionally,representations learned through it tend to tighten up in the embedded space and lose generalizability to unseen classes.This paper proposes a noise-assisted representation learning method for fine-grained image retrieval to mitigate these issues.In the proposed work,class manifold learning is performed in which positive pairs are created with noise insertion operation instead of tightening class clusters.And other instances are treated as negatives within the same cluster.Then a loss function is defined to penalize when the distance between instances of the same class becomes too small relative to the noise pair in that class in embedded space.The proposed approach is validated on CARS-196 and CUB-200 datasets and achieved better retrieval results(85.38%recall@1 for CARS-196%and 70.13%recall@1 for CUB-200)compared to other existing methods. 展开更多
关键词 Convolutional network zero-shot learning fine-grained image retrieval image representation image retrieval intra-class diversity feature learning
下载PDF
OSAP‐Loss:Efficient optimization of average precision via involving samples after positive ones towards remote sensing image retrieval
18
作者 Xin Yuan Xin Xu +4 位作者 Xiao Wang Kai Zhang Liang Liao Zheng Wang Chia‐Wen Lin 《CAAI Transactions on Intelligence Technology》 SCIE EI 2023年第4期1191-1212,共22页
In existing remote sensing image retrieval(RSIR)datasets,the number of images among different classes varies dramatically,which leads to a severe class imbalance problem.Some studies propose to train the model with th... In existing remote sensing image retrieval(RSIR)datasets,the number of images among different classes varies dramatically,which leads to a severe class imbalance problem.Some studies propose to train the model with the ranking‐based metric(e.g.,average precision[AP]),because AP is robust to class imbalance.However,current AP‐based methods overlook an important issue:only optimising samples ranking before each positive sample,which is limited by the definition of AP and is prone to local optimum.To achieve global optimisation of AP,a novel method,namely Optimising Samples after positive ones&AP loss(OSAP‐Loss)is proposed in this study.Specifically,a novel superior ranking function is designed to make the AP loss differentiable while providing a tighter upper bound.Then,a novel loss called Optimising Samples after Positive ones(OSP)loss is proposed to involve all positive and negative samples ranking after each positive one and to provide a more flexible optimisation strategy for each sample.Finally,a graphics processing unit memory‐free mechanism is developed to thoroughly address the non‐decomposability of AP optimisation.Extensive experimental results on RSIR as well as conventional image retrieval datasets show the superiority and competitive performance of OSAP‐Loss compared to the state‐of‐the‐art. 展开更多
关键词 computer vision image retrieval metric learning
下载PDF
TECMH:Transformer-Based Cross-Modal Hashing For Fine-Grained Image-Text Retrieval
19
作者 Qiqi Li Longfei Ma +2 位作者 Zheng Jiang Mingyong Li Bo Jin 《Computers, Materials & Continua》 SCIE EI 2023年第5期3713-3728,共16页
In recent years,cross-modal hash retrieval has become a popular research field because of its advantages of high efficiency and low storage.Cross-modal retrieval technology can be applied to search engines,crossmodalm... In recent years,cross-modal hash retrieval has become a popular research field because of its advantages of high efficiency and low storage.Cross-modal retrieval technology can be applied to search engines,crossmodalmedical processing,etc.The existing main method is to use amulti-label matching paradigm to finish the retrieval tasks.However,such methods do not use fine-grained information in the multi-modal data,which may lead to suboptimal results.To avoid cross-modal matching turning into label matching,this paper proposes an end-to-end fine-grained cross-modal hash retrieval method,which can focus more on the fine-grained semantic information of multi-modal data.First,the method refines the image features and no longer uses multiple labels to represent text features but uses BERT for processing.Second,this method uses the inference capabilities of the transformer encoder to generate global fine-grained features.Finally,in order to better judge the effect of the fine-grained model,this paper uses the datasets in the image text matching field instead of the traditional label-matching datasets.This article experiment on Microsoft COCO(MS-COCO)and Flickr30K datasets and compare it with the previous classicalmethods.The experimental results show that this method can obtain more advanced results in the cross-modal hash retrieval field. 展开更多
关键词 Deep learning cross-modal retrieval hash learning TRANSFORMER
下载PDF
ViT2CMH:Vision Transformer Cross-Modal Hashing for Fine-Grained Vision-Text Retrieval
20
作者 Mingyong Li Qiqi Li +1 位作者 Zheng Jiang Yan Ma 《Computer Systems Science & Engineering》 SCIE EI 2023年第8期1401-1414,共14页
In recent years,the development of deep learning has further improved hash retrieval technology.Most of the existing hashing methods currently use Convolutional Neural Networks(CNNs)and Recurrent Neural Networks(RNNs)... In recent years,the development of deep learning has further improved hash retrieval technology.Most of the existing hashing methods currently use Convolutional Neural Networks(CNNs)and Recurrent Neural Networks(RNNs)to process image and text information,respectively.This makes images or texts subject to local constraints,and inherent label matching cannot capture finegrained information,often leading to suboptimal results.Driven by the development of the transformer model,we propose a framework called ViT2CMH mainly based on the Vision Transformer to handle deep Cross-modal Hashing tasks rather than CNNs or RNNs.Specifically,we use a BERT network to extract text features and use the vision transformer as the image network of the model.Finally,the features are transformed into hash codes for efficient and fast retrieval.We conduct extensive experiments on Microsoft COCO(MS-COCO)and Flickr30K,comparing with baselines of some hashing methods and image-text matching methods,showing that our method has better performance. 展开更多
关键词 Hash learning cross-modal retrieval fine-grained matching TRANSFORMER
下载PDF
上一页 1 2 250 下一页 到第
使用帮助 返回顶部