AIM: To assess the feasibility and safety of the Soehendra stent retriever as a new technique for biliary access in endoscopic ultrasound-guided biliary drainage. METHODS: The medical records and endoscopic reports of patients who underwent endoscopic ultrasound-guided biliary drainage (EUS-BD) owing to failed endoscopic retrograde cholangiopancreatography at our institute between June 2011 and January 2014 were collected and reviewed. All procedures were performed in the endoscopic suite under intravenous sedation with propofol and full anaesthetic monitoring. The Soehendra stent retriever was used as new equipment for neo-tract creation and dilation during the EUS-BD procedures. The patients were observed in the recovery room for 1-2 h and then transferred to the regular ward. Patients' clinical data were reviewed and analysed, and clinical outcomes were defined using several different criteria. Data were analysed using SPSS 13 and presented as percentages, means, and medians. RESULTS: A total of 12 patients were enrolled. The most common indications for EUS-BD in this series were failed common bile duct cannulation, duodenal obstruction, failed selective intrahepatic duct cannulation, and surgically altered anatomy, accounting for 50%, 25%, 16.7%, and 8.3% of cases, respectively. Seven patients underwent EUS-guided hepaticogastrostomy (58.3%), and 5 underwent EUS-guided choledochoduodenostomy (41.7%). The technical success rate was 100%, while the clinical success rate was 91.7%. Major and minor complications occurred in 16.6% and 33.3% of patients, respectively, but there were no procedure-related deaths. CONCLUSION: The Soehendra stent retriever could be used as an alternative instrument for biliary access in endoscopic ultrasound-guided biliary drainage.
Bilioenteric or pancreatoenteric anastomotic strictures often occur after surgery for a pancreaticobiliary disorder. Therapeutic endoscopic retrograde cholangiopancreatography using balloon enteroscopy has been shown to be feasible and effective in patients with such strictures. However, when a benign anastomotic stricture is severe, a dilation catheter cannot pass through the stricture despite successful insertion of the guidewire. We report on the usefulness of the Soehendra Stent Retriever over a guidewire for dilating a severe bilioenteric or pancreatoenteric anastomotic stricture under short double-balloon enteroscopy in two patients with surgically altered anatomies.
Web search provides a promising way for people to obtain information and has been extensively studied. With the surge of deep learning and large-scale pre-training techniques, various neural information retrieval models have been proposed, and they have demonstrated the power to improve search (especially ranking) quality. All these existing search methods follow a common paradigm, i.e., index-retrieve-rerank: they first build an index of all documents based on document terms (i.e., a sparse inverted index) or representation vectors (i.e., a dense vector index), then retrieve documents and rerank them based on the similarity between the query and the documents via ranking models. In this paper, we explore a new paradigm of information retrieval without an explicit index but only with a pre-trained model. Instead, all of the knowledge of the documents is encoded into model parameters, which can be regarded as a differentiable indexer and optimized in an end-to-end manner. Specifically, we propose a pre-trained model-based information retrieval (IR) system called DynamicRetriever, which directly returns document identifiers for a given query. Under such a framework, we implement two variants to explore how to train the model from scratch and how to combine the advantages of dense retrieval models. Compared with existing search methods, the model-based IR system parameterizes the traditional static index with a pre-trained model, which converts the document semantic mapping into a dynamic and updatable process. Extensive experiments conducted on the public search benchmark Microsoft machine reading comprehension (MS MARCO) verify the effectiveness and potential of our proposed new paradigm for information retrieval.
The distinctive conditions present on the north and south slopes of Mount Qomolangma, along with the intricate variations in the underlying surfaces, result in notable variations in the surface energy flux patterns of the two slopes. In this paper, data from TESEBS (Topographical Enhanced Surface Energy Balance System), remote sensing data from eight cloud-free scenarios, and observational data from nine stations are utilized to examine the fluctuations in the surface heat flux on both slopes. The inclusion of MCD43A3 satellite data enhances the surface albedo, contributing to more accurate simulation outcomes. The model results are validated using observational data. The RMSEs of the net radiation, ground heat, sensible heat, and latent heat flux are 40.73, 17.09, 33.26, and 30.91 W m⁻², respectively. The net radiation flux is greater on the south slope and exhibits a rapid decline from summer to autumn. Due to the influence of the monsoon, on the north slope, the maximum sensible heat flux occurs in the pre-monsoon period in summer and the maximum latent heat flux occurs during the monsoon. The south slope experiences the highest latent heat flux in summer. The dominant flux on the north slope is sensible heat, while it is latent heat on the south slope. The seasonal variations in the ground heat flux are more pronounced on the south slope than on the north slope. Except in summer, the ground heat flux on the north slope surpasses that on the south slope.
The Advanced Geosynchronous Radiation Imager (AGRI) is a mission-critical instrument for the Fengyun series of satellites. AGRI acquires full-disk images every 15 min and views East Asia every 5 min through 14 spectral bands, enabling the detection of highly variable aerosol optical depth (AOD). Quantitative retrieval of AOD has hitherto been challenging, especially over land. In this study, an AOD retrieval algorithm is proposed that combines deep learning and transfer learning. The algorithm uses core concepts from both the Dark Target (DT) and Deep Blue (DB) algorithms to select features for the machine-learning (ML) algorithm, allowing for AOD retrieval at 550 nm over both dark and bright surfaces. The algorithm consists of two steps: ① a baseline deep neural network (DNN) with skip connections is developed using 10 min Advanced Himawari Imager (AHI) AODs as the target variable, and ② sunphotometer AODs from 89 ground-based stations are used to fine-tune the DNN parameters. Out-of-station validation shows that the retrieved AOD attains high accuracy, characterized by a coefficient of determination (R²) of 0.70, a mean bias error (MBE) of 0.03, and a percentage of data within the expected error (EE) of 70.7%. A sensitivity study reveals that the top-of-atmosphere reflectance at 650 and 470 nm, as well as the surface reflectance at 650 nm, are the two largest sources of uncertainty impacting the retrieval. In a case study of monitoring an extreme aerosol event, the AGRI AOD is found to be able to capture the detailed temporal evolution of the event. This work demonstrates the superiority of the transfer-learning technique in satellite AOD retrievals and the applicability of the retrieved AGRI AOD in monitoring extreme pollution events.
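The three validation statistics quoted in the AGRI abstract above (R², MBE, and the percentage within EE) can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the abstract does not define the EE envelope, so the form ±(0.05 + 0.15 × reference AOD) used here is an assumption borrowed from common Dark Target practice over land, and R² is assumed to be the squared Pearson correlation. All function and parameter names are hypothetical.

```python
def mean_bias_error(retrieved, reference):
    """Mean of (retrieved - reference); positive values mean overestimation."""
    return sum(r - t for r, t in zip(retrieved, reference)) / len(reference)


def squared_pearson_r(retrieved, reference):
    """R^2 taken as the squared Pearson correlation (an assumption)."""
    n = len(reference)
    mean_x = sum(retrieved) / n
    mean_y = sum(reference) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(retrieved, reference))
    sxx = sum((x - mean_x) ** 2 for x in retrieved)
    syy = sum((y - mean_y) ** 2 for y in reference)
    return (sxy * sxy) / (sxx * syy)


def percent_within_ee(retrieved, reference, abs_term=0.05, rel_term=0.15):
    """Percentage of retrievals inside +/-(abs_term + rel_term * reference).

    The default envelope is an assumed convention, not taken from the abstract.
    """
    inside = sum(
        1
        for r, t in zip(retrieved, reference)
        if abs(r - t) <= abs_term + rel_term * t
    )
    return 100.0 * inside / len(reference)


if __name__ == "__main__":
    # Made-up AOD values purely for demonstration.
    satellite = [0.10, 0.25, 0.50, 1.00]
    sunphotometer = [0.10, 0.20, 0.60, 0.70]
    print(mean_bias_error(satellite, sunphotometer))
    print(squared_pearson_r(satellite, sunphotometer))
    print(percent_within_ee(satellite, sunphotometer))
```

In this toy data the last retrieval differs from its reference by 0.30, which exceeds the assumed envelope of 0.155 at a reference AOD of 0.70, so three of four points fall within EE.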
With the development of the hyperspectral remote sensing technique, extensive chemical weathering profiles have been identified on Mars. These weathering sequences, formed through precipitation-driven leaching processes, can reflect the paleoenvironments and paleoclimates during pedogenic processes. The specific composition and stratigraphic profiles mirror the mineralogical and chemical trends observed in weathered basalts on Hainan Island in south China. In this study, we investigated the laboratory reflectance spectra of a 53-m-long drilling core of a thick basaltic weathering profile collected from Hainan Island. We established a quantitative spectral model by combining the genetic algorithm and partial least squares regression (GA-PLSR) to predict the chemical properties (SiO2, Al2O3, Fe2O3) and index of laterization (IOL). The entire sample set was divided into a calibration set of 25 samples and a validation set of 12 samples. Specifically, the GA was used to select the spectral subsets for each composition, which were then input into the PLSR model to derive the chemical concentration. The coefficient of determination (R²) values on the validation set for SiO2, Al2O3, Fe2O3, and the IOL were greater than 0.9. In addition, the effects of various spectral preprocessing techniques on the model accuracy were evaluated. We found that the spectral derivative treatment boosted the prediction accuracy of the GA-PLSR model. The improvement achieved with the second derivative was more pronounced than when using the first derivative. The quantitative model developed in this work has the potential to estimate the contents of similar weathering basalt products, and thus infer the degree of alteration and provide insights into paleoclimatic conditions. Moreover, the informative bands selected by the GA can serve as a guideline for designing spectral channels for the next generation of spectrometers.
Apricot has a long history of cultivation and has many varieties and types. The traditional variety identification methods are time-consuming and labor-consuming, posing grand challenges to apricot resource management. Tool development in this regard will help researchers quickly identify variety information. This study photographed apricot fruits outdoors and indoors and constructed a dataset that can precisely classify the fruits using a U-net model (F-score: 99%), which helps to obtain the fruit's size, shape, and color features. Meanwhile, a variety search engine was constructed, which can search and identify variety from the database according to the above features. Besides, a mobile and web application (ApricotView) was developed, and the construction mode can also be applied to other varieties of fruit trees. Additionally, we have collected four difficult-to-identify seed datasets and used the VGG16 model for training, with an accuracy of 97%, which provided an important basis for ApricotView. To address the difficulties in data collection bottlenecking apricot phenomics research, we developed the first apricot database platform of its kind (ApricotDIAP, http://apricotdiap.com/) to accumulate, manage, and publicize scientific data of apricot.
The developed system for eye and face detection using Convolutional Neural Networks (CNN) models, followed by eye classification and voice-based assistance, has shown promising potential in enhancing accessibility for individuals with visual impairments. The modular approach implemented in this research allows for a seamless flow of information and assistance between the different components of the system. This research significantly contributes to the field of accessibility technology by integrating computer vision, natural language processing, and voice technologies. By leveraging these advancements, the developed system offers a practical and efficient solution for assisting blind individuals. The modular design ensures flexibility, scalability, and ease of integration with existing assistive technologies. However, it is important to acknowledge that further research and improvements are necessary to enhance the system's accuracy and usability. Fine-tuning the CNN models and expanding the training dataset can improve eye and face detection as well as eye classification capabilities. Additionally, incorporating real-time responses through sophisticated natural language understanding techniques and expanding the knowledge base of ChatGPT can enhance the system's ability to provide comprehensive and accurate responses. Overall, this research paves the way for the development of more advanced and robust systems for assisting visually impaired individuals. By leveraging cutting-edge technologies and integrating them into a modular framework, this research contributes to creating a more inclusive and accessible society for individuals with visual impairments. Future work can focus on refining the system, addressing its limitations, and conducting user studies to evaluate its effectiveness and impact in real-world scenarios.
Recently, deep learning has yielded transformative success across optics and photonics, especially in optical metrology. Deep neural networks (DNNs) with a fully convolutional architecture (e.g., U-Net and its derivatives) have been widely implemented in an end-to-end manner to accomplish various optical metrology tasks, such as fringe denoising, phase unwrapping, and fringe analysis. However, the task of training a DNN to accurately identify an image-to-image transform from massive input and output data pairs seems at best naive, as the physical laws governing the image formation or other domain expertise pertaining to the measurement have not yet been fully exploited in current deep learning practice. To this end, we introduce a physics-informed deep learning method for fringe pattern analysis (PI-FPA) to overcome this limit by integrating a lightweight DNN with a learning-enhanced Fourier transform profilometry (LeFTP) module. By parameterizing conventional phase retrieval methods, the LeFTP module embeds the prior knowledge in the network structure and the loss function to directly provide reliable phase results for new types of samples, while circumventing the requirement of collecting a large amount of high-quality data in supervised learning methods. Guided by the initial phase from LeFTP, the phase recovery ability of the lightweight DNN is enhanced to further improve the phase accuracy at a low computational cost compared with existing end-to-end networks. Experimental results demonstrate that PI-FPA enables more accurate and computationally efficient single-shot phase retrieval, exhibiting excellent generalization to various objects unseen during training. The proposed PI-FPA shows that challenging issues in optical metrology can potentially be overcome through the synergy of physics-priors-based traditional tools and data-driven learning approaches, opening new avenues to achieve fast and accurate single-shot 3D imaging.
This study introduces the Orbit Weighting Scheme (OWS), a novel approach aimed at enhancing the precision and efficiency of Vector Space information retrieval (IR) models, which have traditionally relied on weighting schemes like tf-idf and BM25. These conventional methods often struggle with accurately capturing document relevance, leading to inefficiencies in both retrieval performance and index size management. OWS proposes a dynamic weighting mechanism that evaluates the significance of terms based on their orbital position within the vector space, emphasizing term relationships and distribution patterns overlooked by existing models. Our research focuses on evaluating OWS's impact on model accuracy using Information Retrieval metrics like Recall, Precision, Interpolated Average Precision (IAP), and Mean Average Precision (MAP). Additionally, we assess OWS's effectiveness in reducing the inverted index size, crucial for model efficiency. We compare OWS-based retrieval models against others using different schemes, including tf-idf variations and BM25Delta. Results reveal OWS's superiority, achieving a 54% Recall and 81% MAP, and a notable 38% reduction in the inverted index size. This highlights OWS's potential in optimizing retrieval processes and underscores the need for further research in this underrepresented area to fully leverage OWS's capabilities in information retrieval methodologies.
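For readers unfamiliar with the Mean Average Precision (MAP) metric named in the OWS abstract above, the following is a minimal sketch of its textbook (non-interpolated) definition. It does not reproduce the OWS evaluation; the function names and the toy document identifiers are hypothetical, and dividing by the total number of relevant documents is the usual convention assumed here.

```python
def average_precision(ranked_docs, relevant_docs):
    """Average of precision@rank taken at each rank holding a relevant document.

    Divides by the total number of relevant documents, so relevant documents
    that were never retrieved lower the score.
    """
    relevant = set(relevant_docs)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_docs, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(runs):
    """Mean of average precision over (ranked_docs, relevant_docs) pairs."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


if __name__ == "__main__":
    # Two toy queries with made-up rankings and relevance judgments.
    runs = [
        (["d1", "d2", "d3", "d4"], {"d1", "d3"}),
        (["a", "b"], {"b"}),
    ]
    print(mean_average_precision(runs))
```

For the first toy query, relevant documents sit at ranks 1 and 3, giving precisions 1/1 and 2/3 and an average precision of 5/6; the second query scores 1/2, so the mean is 2/3.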
This research introduces a novel approach to custom chatbot development that prioritizes efficiency alongside effectiveness. We achieve this by combining three key technologies: LangChain, Retrieval Augmented Generation (RAG), and large language models (LLMs) fine-tuned with performance-efficient techniques such as LoRA and QLoRA. LangChain allows for meticulous tailoring of chatbots to specific purposes, ensuring focused and meaningful interactions with users. RAG's web scraping capabilities enable these chatbots to access a vast store of information, allowing them to give comprehensive and informative responses to queries. This retrieved information is then strategically woven into response generation using LLMs that have been fine-tuned with an emphasis on performance efficiency. This combined approach offers a threefold benefit: improved effectiveness, enhanced user experience, and expanded access to information. Chatbots become adept at handling user queries accurately and efficiently, while informative and contextually relevant responses create a more natural and engaging interaction for users. Finally, web scraping enables chatbots to address a wider variety of queries by granting them access to a broader knowledge base. By delving into the intricacies of performance-efficient LLM fine-tuning and underlining the critical role of web-scraped data, this research offers a significant contribution to advancing custom chatbot design and implementation. The resulting chatbots showcase the immense potential of these technologies in creating informative, user-friendly, and efficient conversational agents, ultimately changing the way users interact with chatbots.
Under the influence of anthropogenic activity and climate change, the problems caused by the urban heat island (UHI) effect have become increasingly prominent. To promote sustainable urban development and improve the quality of human settlements, it is important to explore the evolution characteristics of the urban thermal environment and analyze its driving forces. Taking the Landsat series images as the basic data sources, the winter land surface temperature (LST) of the rapidly urbanizing area of Fuzhou City in China was quantitatively retrieved from 2001 to 2021. By combining the standard deviation ellipse model, profile analysis, and the GeoDetector model, the spatio-temporal evolution characteristics and influencing factors of the winter urban thermal environment were systematically analyzed. The results showed that the winter LST presented an increasing trend in the study area during 2001–2021, and the winter LST of the central urban regions was significantly higher than that of the suburbs. There was a strong UHI effect from 2001 to 2021, with a spatial expansion trend from the central urban regions to the suburbs and coastal areas. The LST of green lands and wetlands was significantly lower than that of croplands, artificial surfaces, and unvegetated lands. Vegetation and water bodies had a significant mitigation effect on UHI, especially at the micro-scale. The winter UHI was jointly driven by the underlying surface and socio-economic factors in a nonlinear or two-factor interactive enhancement mode, and socio-economic factors played a leading role. This research could provide data support and decision-making references for rationally planning urban layout and promoting sustainable urban development.
Clothing attribute recognition has become an essential technology, which enables users to automatically identify the characteristics of clothes and search for clothing images with similar attributes. However, existing methods cannot recognize newly added attributes and may fail to capture region-level visual features. To address the aforementioned issues, a region-aware fashion contrastive language-image pre-training (RaF-CLIP) model was proposed. This model aligned cropped and segmented images with category and multiple fine-grained attribute texts, achieving the matching of fashion regions and corresponding texts through contrastive learning. Clothing retrieval found suitable clothing based on the user-specified clothing categories and attributes, and to further improve the accuracy of retrieval, an attribute-guided composed network (AGCN) was introduced as an additional component on RaF-CLIP, specifically designed for composed image retrieval. This task aimed to modify the reference image based on textual expressions to retrieve the expected target. By adopting a transformer-based bidirectional attention and gating mechanism, it realized the fusion and selection of image features and attribute text features. Experimental results show that the proposed model achieves a mean precision of 0.6633 for attribute recognition tasks and a recall@10 (recall@k is defined as the percentage of correct samples appearing in the top k retrieval results) of 39.18 for the composed image retrieval task, satisfying user needs for freely searching for clothing through images and texts.
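The recall@k definition given in the RaF-CLIP abstract above can be made concrete with a short sketch. It assumes the setting common in composed image retrieval, where each query has exactly one correct target image, so recall@k is the percentage of queries whose target appears among the top k results; the abstract's wording permits other readings. The function name and sample identifiers are hypothetical.

```python
def recall_at_k(ranked_results, targets, k):
    """Percentage of queries whose single target is in the top-k results.

    ranked_results: one ranked list of item ids per query (best first).
    targets: the correct item id for each query, in the same order.
    """
    if len(ranked_results) != len(targets):
        raise ValueError("need exactly one target per query")
    if not targets:
        return 0.0
    hits = sum(
        1 for ranked, target in zip(ranked_results, targets) if target in ranked[:k]
    )
    return 100.0 * hits / len(targets)


if __name__ == "__main__":
    # Made-up rankings for two queries.
    rankings = [["img_a", "img_b", "img_c"], ["img_d", "img_e", "img_f"]]
    correct = ["img_b", "img_f"]
    print(recall_at_k(rankings, correct, k=2))
    print(recall_at_k(rankings, correct, k=3))
```

Under this reading, a reported recall@10 of 39.18 would mean that for roughly 39% of queries the expected target image was among the ten highest-ranked results.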
Crime scene investigation (CSI) images are key evidence carriers during criminal investigation, in which CSI image retrieval can assist the public police in obtaining criminal clues. Moreover, with the rapid development of deep learning, the data-driven paradigm has become the mainstream method of CSI image feature extraction and representation, and in this process, datasets provide effective support for CSI retrieval performance. However, there is a lack of systematic research on CSI image retrieval methods and datasets. Therefore, we present an overview of the existing works on one-class and multi-class CSI image retrieval based on deep learning. According to the research, based on their technical functionalities and implementation methods, CSI image retrieval methods are roughly classified into five categories: feature representation, metric learning, generative adversarial networks, autoencoder networks, and attention networks. Furthermore, we analyze the remaining challenges and discuss future work directions in this field.
Background: A medical content-based image retrieval (CBIR) system is designed to retrieve images from large imaging repositories that are visually similar to a user's query image. CBIR is widely used in evidence-based diagnosis, teaching, and research. Although the retrieval accuracy has largely improved, there has been limited development toward visualizing important image features that indicate the similarity of retrieved images. Despite the prevalence of 3D volumetric data in medical imaging such as computed tomography (CT), current CBIR systems still rely on 2D cross-sectional views for the visualization of retrieved images. Such 2D visualization requires users to browse through the image stacks to confirm the similarity of the retrieved images and often involves mental reconstruction of 3D information, including the size, shape, and spatial relations of multiple structures. This process is time-consuming and reliant on users' experience. Methods: In this study, we proposed an importance-aware 3D volume visualization method. The rendering parameters were automatically optimized to maximize the visibility of important structures that were detected and prioritized in the retrieval process. We then integrated the proposed visualization into a CBIR system, thereby complementing the 2D cross-sectional views for relevance feedback and further analyses. Results: Our preliminary results demonstrate that 3D visualization can provide additional information using multimodal positron emission tomography and computed tomography (PET-CT) images of a non-small cell lung cancer dataset.
This study examines the database search behaviors of individuals, focusing on gender differences and the impact of planning habits on information retrieval. Data were collected from a survey of 198 respondents, categorized by their discipline, schooling background, internet usage, and information retrieval preferences. Key findings indicate that females are more likely to plan their searches in advance and prefer structured methods of information retrieval, such as using library portals and leading university websites. Males, however, tend to use web search engines and self-archiving methods more frequently. This analysis provides valuable insights for educational institutions and libraries to optimize their resources and services based on user behavior patterns.
Objective: This paper aims to evaluate the implementation effect of the diversified course assessment method reform. Methods: A diversified assessment method was implemented for 196 undergraduate nursing students. Students' mastery of key knowledge in "Nursing Research" was assessed through group reports on topic selection and literature retrieval, as well as the proposition level of the final examination. Results: 81.6% of the students agreed with the course assessment method, and 97.9% believed studying "Nursing Research" would be helpful for future scientific research applications. Conclusion: Diversified assessment methods can help improve undergraduate nursing students' scientific research skills and comprehensive quality.
The task of indoor visual localization, utilizing camera visual information for user pose calculation, was a core component of Augmented Reality (AR) and Simultaneous Localization and Mapping (SLAM). Existing indoor localization technologies generally used scene-specific 3D representations or were trained on specific datasets, making it challenging to balance accuracy and cost when applied to new scenes. Addressing this issue, this paper proposed a universal indoor visual localization method based on efficient image retrieval. Initially, a Multi-Layer Perceptron (MLP) was employed to aggregate features from intermediate layers of a convolutional neural network, obtaining a global representation of the image. This approach ensured accurate and rapid retrieval of reference images. Subsequently, a new mechanism using Random Sample Consensus (RANSAC) was designed to resolve relative pose ambiguity caused by the essential matrix decomposition based on the five-point method. Finally, the absolute pose of the queried user image was computed, thereby achieving indoor user pose estimation. The proposed indoor localization method was characterized by its simplicity, flexibility, and excellent cross-scene generalization. Experimental results demonstrated a positioning error of 0.09 m and 2.14° on the 7Scenes dataset, and 0.15 m and 6.37° on the 12Scenes dataset. These results convincingly illustrated the outstanding performance of the proposed indoor localization method.
Deep convolutional neural networks (DCNNs) are widely used in content-based image retrieval (CBIR) because of their advantages in image feature extraction. However, the training of deep neural networks requires a large amount of labeled data, which limits their application. Self-supervised learning is a more general approach in unlabeled scenarios. A method of fine-tuning feature extraction networks based on masked learning is proposed. Masked autoencoders (MAE) are used to fine-tune the vision transformer (ViT) model. In addition, the scheme of extracting image descriptors is discussed. The encoder of the MAE uses the ViT to extract global features and performs self-supervised fine-tuning by reconstructing masked area pixels. The method works well on category-level image retrieval datasets, with marked improvements on instance-level datasets. For the instance-level datasets Oxford5k and Paris6k, the retrieval accuracy of the base model is improved by 7% and 17% compared to that of the original model, respectively.
Fine-grained image classification is a challenging research topic because of the high degree of similarity among categories and the high degree of dissimilarity within a specific category caused by different poses and scales. A cultural heritage image is a fine-grained image because the images closely resemble one another in most cases. Using the classification technique, distinguishing cultural heritage architecture may be difficult. This study proposes a cultural heritage content retrieval method using adaptive deep learning for fine-grained image retrieval. The key contribution of this research was the creation of a retrieval model that could handle incremental streams of new categories while maintaining its past performance in old categories and not losing the old categorization of a cultural heritage image. The goal of the proposed method is to perform a retrieval task for classes. Incremental learning for new classes was conducted to reduce the re-training process. In this step, the original class is not necessary for re-training, which we call an adaptive deep learning technique. Cultural heritage in the case of Thai archaeological site architecture was retrieved through machine learning and image processing. We analyze the experimental results of incremental learning for fine-grained images with images of Thai archaeological site architecture from world heritage provinces in Thailand, which have a similar architecture. Using a fine-grained image retrieval technique for this group of cultural heritage images in a database can solve the problem of a high degree of similarity among categories and a high degree of dissimilarity within a specific category. The proposed method for retrieving the correct image from a database can deliver an average accuracy of 85 percent. Adaptive deep learning for fine-grained image retrieval was used to retrieve cultural heritage content, and it outperformed state-of-the-art methods in fine-grained image retrieval.
Abstract: AIM: To assess the feasibility and safety of using the Soehendra stent retriever as a new technique for biliary access in endoscopic ultrasound-guided biliary drainage. METHODS: The medical records and endoscopic reports of patients who underwent endoscopic ultrasound-guided biliary drainage (EUS-BD) owing to failed endoscopic retrograde cholangiopancreatography in our institute between June 2011 and January 2014 were collected and reviewed. All procedures were performed in the endoscopic suite under intravenous sedation with propofol and full anaesthetic monitoring. We used the Soehendra stent retriever as new equipment for neo-tract creation and dilation when performing EUS-BD procedures. The patients were observed in the recovery room for 1-2 h and then transferred to the regular ward. Patients' clinical data were reviewed and analysed, and clinical outcomes were defined using several different criteria. Data were analysed using SPSS 13 and presented as percentages, means, and medians. RESULTS: A total of 12 patients were enrolled. The most common indications for EUS-BD in this series were failed common bile duct cannulation, duodenal obstruction, failed selective intrahepatic duct cannulation, and surgically altered anatomy, accounting for 50%, 25%, 16.7%, and 8.3%, respectively. Seven patients underwent EUS-guided hepaticogastrostomy (58.3%), and 5 underwent EUS-guided choledochoduodenostomy (41.7%). The technical success rate was 100%, while the clinical success rate was 91.7%. Major and minor complications occurred in 16.6% and 33.3% of patients, respectively, but there were no procedure-related deaths. CONCLUSION: The Soehendra stent retriever could be used as an alternative instrument for biliary access in endoscopic ultrasound-guided biliary drainage.
Abstract: Bilioenteric or pancreatoenteric anastomotic strictures often occur after surgery for a pancreaticobiliary disorder. Therapeutic endoscopic retrograde cholangiopancreatography using balloon enteroscopy has been shown to be feasible and effective in patients with such strictures. However, when a benign anastomotic stricture is severe, a dilation catheter cannot pass through the stricture despite successful insertion of the guidewire. We report on the usefulness of the Soehendra Stent Retriever over a guidewire for dilating a severe bilioenteric or pancreatoenteric anastomotic stricture under short double-balloon enteroscopy in two patients with surgically altered anatomies.
Funding: Supported by the National Natural Science Foundation of China (Nos. 61872370 and 61832017); the Beijing Outstanding Young Scientist Program (No. BJJWZYJH012019100020098); the Beijing Academy of Artificial Intelligence (BAAI); the Outstanding Innovative Talents Cultivation Funded Programs 2021 of Renmin University of China; and the Intelligent Social Governance Platform, Major Innovation & Planning Interdisciplinary Platform for the "Double-First Class" Initiative, Renmin University of China.
Abstract: Web search provides a promising way for people to obtain information and has been extensively studied. With the surge of deep learning and large-scale pre-training techniques, various neural information retrieval models have been proposed, and they have demonstrated their power for improving search (especially ranking) quality. All these existing search methods follow a common paradigm, i.e., index-retrieve-rerank: they first build an index of all documents based on document terms (i.e., a sparse inverted index) or representation vectors (i.e., a dense vector index), then retrieve candidate documents and rerank them based on the similarity between the query and the documents via ranking models. In this paper, we explore a new paradigm of information retrieval that uses no explicit index, only a pre-trained model. All of the knowledge of the documents is instead encoded into model parameters, which can be regarded as a differentiable indexer and optimized in an end-to-end manner. Specifically, we propose a pre-trained model-based information retrieval (IR) system called DynamicRetriever, which directly returns document identifiers for a given query. Under this framework, we implement two variants to explore how to train the model from scratch and how to combine the advantages of dense retrieval models. Compared with existing search methods, the model-based IR system parameterizes the traditional static index with a pre-trained model, which converts the document semantic mapping into a dynamic and updatable process. Extensive experiments conducted on the public search benchmark Microsoft machine reading comprehension (MS MARCO) verify the effectiveness and potential of our proposed new paradigm for information retrieval.
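For orientation, the conventional index-retrieve-rerank paradigm that the abstract contrasts itself with can be sketched in a few lines of Python. This is only a toy illustration of the classical pipeline, not of DynamicRetriever: the documents, the whitespace tokenizer, and the term-overlap scoring function (a stand-in for a learned ranking model) are all assumptions made for the example.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Index step: map each term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def retrieve(index, query):
    """Retrieve step: collect every document sharing at least one query term."""
    candidates = set()
    for term in query.lower().split():
        candidates |= index.get(term, set())
    return candidates


def rerank(docs, candidates, query):
    """Rerank step: order candidates by term overlap (hypothetical stand-in
    for a ranking model); ties are broken by document id."""
    q_terms = set(query.lower().split())

    def score(doc_id):
        return len(q_terms & set(docs[doc_id].lower().split()))

    return sorted(candidates, key=lambda d: (-score(d), d))


# Hypothetical toy corpus
docs = {
    "d1": "neural ranking models for web search",
    "d2": "dense vector index for retrieval",
    "d3": "surface energy flux on mount qomolangma",
}
index = build_inverted_index(docs)
query = "dense vector index search"
results = rerank(docs, retrieve(index, query), query)
print(results)
```

A model-based system in the sense of the abstract would replace all three functions with a single model call that maps the query directly to document identifiers.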
Funding: Financially supported by the National Natural Science Foundation of China [grant number 42230610]; the Second Tibetan Plateau Scientific Expedition and Research (STEP) program [grant number 2019QZKK0103]; the Natural Science Foundation of Sichuan Province [grant number 2022NSFSC0217]; and the Scientific Research Project of Chengdu University of Information Technology [grant number KYTZ201721].
Abstract: The distinctive conditions present on the north and south slopes of Mount Qomolangma, along with the intricate variations in the underlying surfaces, result in notable variations in the surface energy flux patterns of the two slopes. In this paper, data from TESEBS (Topographical Enhanced Surface Energy Balance System), remote sensing data from eight cloud-free scenarios, and observational data from nine stations are utilized to examine the fluctuations in the surface heat flux on both slopes. The inclusion of MCD43A3 satellite data enhances the surface albedo, contributing to more accurate simulation outcomes. The model results are validated using observational data. The RMSEs of the net radiation, ground heat, sensible heat, and latent heat flux are 40.73, 17.09, 33.26, and 30.91 W m⁻², respectively. The net radiation flux is greater on the south slope and exhibits a rapid decline from summer to autumn. Due to the influence of the monsoon, on the north slope, the maximum sensible heat flux occurs in the pre-monsoon period in summer and the maximum latent heat flux occurs during the monsoon. The south slope experiences the highest latent heat flux in summer. The dominant flux on the north slope is sensible heat, while it is latent heat on the south slope. The seasonal variations in the ground heat flux are more pronounced on the south slope than on the north slope. Except in summer, the ground heat flux on the north slope surpasses that on the south slope.
Funding: Supported by the National Natural Science Foundation of China (41825011, 42030608, 42105128, and 42075079) and the Opening Foundation of Key Laboratory of Atmospheric Sounding, the CMA and the CMA Research Center on Meteorological Observation Engineering Technology (U2021Z03).
Abstract: The Advanced Geosynchronous Radiation Imager (AGRI) is a mission-critical instrument for the Fengyun series of satellites. AGRI acquires full-disk images every 15 min and views East Asia every 5 min through 14 spectral bands, enabling the detection of highly variable aerosol optical depth (AOD). Quantitative retrieval of AOD has hitherto been challenging, especially over land. In this study, an AOD retrieval algorithm is proposed that combines deep learning and transfer learning. The algorithm uses core concepts from both the Dark Target (DT) and Deep Blue (DB) algorithms to select features for the machine-learning (ML) algorithm, allowing for AOD retrieval at 550 nm over both dark and bright surfaces. The algorithm consists of two steps: ① a baseline deep neural network (DNN) with skip connections is developed using 10 min Advanced Himawari Imager (AHI) AODs as the target variable, and ② sunphotometer AODs from 89 ground-based stations are used to fine-tune the DNN parameters. Out-of-station validation shows that the retrieved AOD attains high accuracy, characterized by a coefficient of determination (R²) of 0.70, a mean bias error (MBE) of 0.03, and a percentage of data within the expected error (EE) of 70.7%. A sensitivity study reveals that the top-of-atmosphere reflectance at 650 and 470 nm, as well as the surface reflectance at 650 nm, are the two largest sources of uncertainty impacting the retrieval. In a case study of monitoring an extreme aerosol event, the AGRI AOD is found to be able to capture the detailed temporal evolution of the event. This work demonstrates the superiority of the transfer-learning technique in satellite AOD retrievals and the applicability of the retrieved AGRI AOD in monitoring extreme pollution events.
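The three validation statistics named in the abstract can be computed as in the sketch below. Several things here are assumptions rather than the paper's stated definitions: R² is taken as 1 − SS_res/SS_tot (some AOD studies report the squared correlation coefficient instead), MBE as the mean of retrieved minus observed values, and the expected-error envelope as ±(abs_term + rel_term × observed AOD) with both terms left as caller-supplied parameters, because the abstract does not give the envelope. The sample values are made up.

```python
from typing import Sequence


def mean_bias_error(pred: Sequence[float], obs: Sequence[float]) -> float:
    """Mean of (retrieved - observed); positive means overestimation."""
    return sum(p - o for p, o in zip(pred, obs)) / len(obs)


def r_squared(pred: Sequence[float], obs: Sequence[float]) -> float:
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for p, o in zip(pred, obs))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def percent_within_envelope(pred, obs, abs_term: float, rel_term: float) -> float:
    """Percentage of retrievals with |pred - obs| <= abs_term + rel_term * obs.
    The envelope form and its two terms are assumptions of this sketch."""
    inside = sum(
        1 for p, o in zip(pred, obs) if abs(p - o) <= abs_term + rel_term * o
    )
    return 100.0 * inside / len(obs)


# Made-up sample values, for illustration only
obs = [0.10, 0.20, 0.30, 0.40]
pred = [0.12, 0.22, 0.33, 0.37]

mbe = mean_bias_error(pred, obs)
r2 = r_squared(pred, obs)
within = percent_within_envelope(pred, obs, abs_term=0.025, rel_term=0.0)
print(mbe, r2, within)
```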
Funding: National Key Research and Development Project (Grant No. 2019YFE0123300); National Natural Science Foundation of China (Grant Nos. 42072337, 42241111, and 42241129); Pandeng Program of National Space Science Center, Chinese Academy of Sciences. Xing Wu also acknowledges support from the Young Elite Scientists Sponsorship Program by the China Association for Science and Technology (Grant No. 2022QNRC001) and the China Postdoctoral Science Foundation (Grant No. 2021M700149).
Abstract: With the development of the hyperspectral remote sensing technique, extensive chemical weathering profiles have been identified on Mars. These weathering sequences, formed through precipitation-driven leaching processes, can reflect the paleoenvironments and paleoclimates during pedogenic processes. The specific composition and stratigraphic profiles mirror the mineralogical and chemical trends observed in weathered basalts on Hainan Island in south China. In this study, we investigated the laboratory reflectance spectra of a 53-m-long drilling core of a thick basaltic weathering profile collected from Hainan Island. We established a quantitative spectral model by combining the genetic algorithm and partial least squares regression (GA-PLSR) to predict the chemical properties (SiO2, Al2O3, Fe2O3) and index of laterization (IOL). The entire sample set was divided into a calibration set of 25 samples and a validation set of 12 samples. Specifically, the GA was used to select the spectral subsets for each composition, which were then input into the PLSR model to derive the chemical concentration. The coefficient of determination (R²) values on the validation set for SiO2, Al2O3, Fe2O3, and the IOL were greater than 0.9. In addition, the effects of various spectral preprocessing techniques on the model accuracy were evaluated. We found that the spectral derivative treatment boosted the prediction accuracy of the GA-PLSR model. The improvement achieved with the second derivative was more pronounced than that with the first derivative. The quantitative model developed in this work has the potential to estimate the contents of similar weathering basalt products, and thus infer the degree of alteration and provide insights into paleoclimatic conditions. Moreover, the informative bands selected by the GA can serve as a guideline for designing spectral channels for the next generation of spectrometers.
Funding: Supported by the Fundamental Research Funds for the Central Non-profit Research Institution of the Chinese Academy of Forestry (Grant No. CAFYBB2020ZY003); the Key S&T Project of Inner Mongolia (Grant No. 2021ZD0041-001-002); and the Central Public-interest Scientific Institution Basal Research Fund (Grant No. 11024316000202300001).
Abstract: Apricot has a long history of cultivation and has many varieties and types. Traditional variety identification methods are time-consuming and labor-intensive, posing major challenges to apricot resource management. Tool development in this regard will help researchers quickly identify variety information. This study photographed apricot fruits outdoors and indoors and constructed a dataset on which a U-net model can precisely segment the fruits (F-score: 99%), which helps to obtain the fruit's size, shape, and color features. Meanwhile, a variety search engine was constructed, which can search and identify varieties from the database according to the above features. In addition, a mobile and web application (ApricotView) was developed, and the construction mode can also be applied to other varieties of fruit trees. Additionally, we collected four difficult-to-identify seed datasets and used the VGG16 model for training, with an accuracy of 97%, which provided an important basis for ApricotView. To address the difficulties in data collection that bottleneck apricot phenomics research, we developed the first apricot database platform of its kind (ApricotDIAP, http://apricotdiap.com/) to accumulate, manage, and publicize scientific data on apricot.
Abstract: The developed system for eye and face detection using Convolutional Neural Network (CNN) models, followed by eye classification and voice-based assistance, has shown promising potential in enhancing accessibility for individuals with visual impairments. The modular approach implemented in this research allows for a seamless flow of information and assistance between the different components of the system. This research contributes to the field of accessibility technology by integrating computer vision, natural language processing, and voice technologies. By leveraging these advancements, the developed system offers a practical and efficient solution for assisting blind individuals. The modular design ensures flexibility, scalability, and ease of integration with existing assistive technologies. However, it is important to acknowledge that further research and improvements are necessary to enhance the system's accuracy and usability. Fine-tuning the CNN models and expanding the training dataset can improve eye and face detection as well as eye classification capabilities. Additionally, incorporating real-time responses through sophisticated natural language understanding techniques and expanding the knowledge base of ChatGPT can enhance the system's ability to provide comprehensive and accurate responses. Overall, this research paves the way for the development of more advanced and robust systems for assisting visually impaired individuals. By leveraging cutting-edge technologies and integrating them into a modular framework, this research contributes to creating a more inclusive and accessible society for individuals with visual impairments. Future work can focus on refining the system, addressing its limitations, and conducting user studies to evaluate its effectiveness and impact in real-world scenarios.
Funding: Funded by the National Key Research and Development Program of China (2022YFB2804603, 2022YFB2804604); National Natural Science Foundation of China (62075096, 62205147, U21B2033); China Postdoctoral Science Foundation (2023T160318, 2022M711630, 2022M721619); Jiangsu Funding Program for Excellent Postdoctoral Talent (2022ZB254); The Leading Technology of Jiangsu Basic Research Plan (BK20192003); The "333 Engineering" Research Project of Jiangsu Province (BRA2016407); The Jiangsu Provincial "One belt and one road" innovation cooperation project (BZ2020007); Open Research Fund of Jiangsu Key Laboratory of Spectral Imaging & Intelligent Sense (JSGP202105); Fundamental Research Funds for the Central Universities (30922010405, 30921011208, 30920032101, 30919011222); National Major Scientific Instrument Development Project (62227818).
Abstract: Recently, deep learning has yielded transformative success across optics and photonics, especially in optical metrology. Deep neural networks (DNNs) with a fully convolutional architecture (e.g., U-Net and its derivatives) have been widely implemented in an end-to-end manner to accomplish various optical metrology tasks, such as fringe denoising, phase unwrapping, and fringe analysis. However, the task of training a DNN to accurately identify an image-to-image transform from massive input and output data pairs seems at best naive, as the physical laws governing the image formation or other domain expertise pertaining to the measurement have not yet been fully exploited in current deep learning practice. To this end, we introduce a physics-informed deep learning method for fringe pattern analysis (PI-FPA) to overcome this limit by integrating a lightweight DNN with a learning-enhanced Fourier transform profilometry (LeFTP) module. By parameterizing conventional phase retrieval methods, the LeFTP module embeds the prior knowledge in the network structure and the loss function to directly provide reliable phase results for new types of samples, while circumventing the requirement of collecting a large amount of high-quality data in supervised learning methods. Guided by the initial phase from LeFTP, the phase recovery ability of the lightweight DNN is enhanced to further improve the phase accuracy at a low computational cost compared with existing end-to-end networks. Experimental results demonstrate that PI-FPA enables more accurate and computationally efficient single-shot phase retrieval and generalizes well to various objects unseen during training. The proposed PI-FPA shows that challenging issues in optical metrology can potentially be overcome through the synergy of physics-priors-based traditional tools and data-driven learning approaches, opening new avenues to achieve fast and accurate single-shot 3D imaging.
Abstract: This study introduces the Orbit Weighting Scheme (OWS), a novel approach aimed at enhancing the precision and efficiency of Vector Space information retrieval (IR) models, which have traditionally relied on weighting schemes like tf-idf and BM25. These conventional methods often struggle with accurately capturing document relevance, leading to inefficiencies in both retrieval performance and index size management. OWS proposes a dynamic weighting mechanism that evaluates the significance of terms based on their orbital position within the vector space, emphasizing term relationships and distribution patterns overlooked by existing models. Our research focuses on evaluating OWS's impact on model accuracy using information retrieval metrics like Recall, Precision, Interpolated Average Precision (IAP), and Mean Average Precision (MAP). Additionally, we assess OWS's effectiveness in reducing the inverted index size, which is crucial for model efficiency. We compare OWS-based retrieval models against others using different schemes, including tf-idf variations and BM25Delta. Results reveal OWS's superiority, achieving a 54% Recall and 81% MAP, and a notable 38% reduction in the inverted index size. This highlights OWS's potential in optimizing retrieval processes and underscores the need for further research in this underrepresented area to fully leverage OWS's capabilities in information retrieval methodologies.
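As background for the baselines mentioned in the abstract, the sketch below shows one textbook form of tf-idf weighting in a vector space model, with documents ranked by cosine similarity to the query. It does not implement OWS, whose formula the abstract does not give. The whitespace tokenizer, the idf variant log(N/df), and the toy documents are assumptions of the example.

```python
import math
from collections import Counter


def build_tfidf(docs):
    """Return one sparse tf-idf vector per document plus the idf table."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term
    df = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {term: math.log(n_docs / count) for term, count in df.items()}
    vectors = [
        {term: tf * idf[term] for term, tf in Counter(tokens).items()}
        for tokens in tokenized
    ]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query, vectors, idf):
    """Rank document indices by similarity to the query (ties by index)."""
    q_tf = Counter(t for t in query.lower().split() if t in idf)
    q_vec = {term: tf * idf[term] for term, tf in q_tf.items()}
    scores = [(cosine(q_vec, vec), i) for i, vec in enumerate(vectors)]
    return [i for _, i in sorted(scores, key=lambda s: (-s[0], s[1]))]


# Hypothetical toy corpus
docs = [
    "image retrieval with deep features",
    "aerosol optical depth retrieval",
    "deep learning for fringe analysis",
]
vectors, idf = build_tfidf(docs)
ranking = rank("aerosol retrieval", vectors, idf)
print(ranking)
```

The rare term "aerosol" receives a higher idf weight than the more common "retrieval", which is why the second document ranks first for this query.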
Abstract: This study introduces a novel approach to custom chatbot development that prioritizes efficiency alongside effectiveness. We achieve this by combining three key technologies: LangChain, Retrieval-Augmented Generation (RAG), and large language models (LLMs) fine-tuned with performance-efficient methods such as LoRA and QLoRA. LangChain allows chatbots to be carefully tailored to specific purposes, ensuring focused and meaningful interactions with users. RAG's web scraping capabilities enable these chatbots to access a vast store of information, so that they can give comprehensive and informative responses to queries. The retrieved information is then strategically woven into response generation using LLMs that have been fine-tuned with an emphasis on performance efficiency. This combined approach offers a threefold benefit: improved effectiveness, enhanced user experience, and expanded access to information. Chatbots become adept at handling user queries accurately and efficiently, while informative and contextually relevant responses make the interaction more natural and engaging for users. Finally, web scraping enables chatbots to address a wider variety of queries by giving them access to a broader knowledge base. By examining the details of performance-efficient LLM fine-tuning and emphasizing the essential role of web-scraped data, this study contributes to advancing custom chatbot design and implementation. The resulting chatbots showcase the potential of these technologies for creating informative, user-friendly, and efficient conversational agents, ultimately changing the way users interact with chatbots.
Funding: Under the auspices of the Social Science and Humanity on Young Fund of the Ministry of Education of China (No. 21YJCZH100); the Scientific Research Project on Outstanding Young of the Fujian Agriculture and Forestry University (No. XJQ201920); the Science and Technology Innovation Special Fund Project of Fujian Agriculture and Forestry University (No. CXZX2021032); and the Forestry Peak Discipline Construction Project of Fujian Agriculture and Forestry University (No. 72202200205).
Abstract: Under the influence of anthropogenic activity and climate change, the problems caused by the urban heat island (UHI) have become increasingly prominent. To promote sustainable urban development and improve the quality of human settlements, it is important to explore the evolution characteristics of the urban thermal environment and analyze its driving forces. Taking Landsat series images as the basic data source, the winter land surface temperature (LST) of the rapidly urbanizing area of Fuzhou City in China was quantitatively retrieved for 2001 to 2021. By combining the standard deviation ellipse model, profile analysis, and the GeoDetector model, the spatio-temporal evolution characteristics and influencing factors of the winter urban thermal environment were systematically analyzed. The results showed that the winter LST presented an increasing trend in the study area during 2001–2021, and the winter LST of the central urban regions was significantly higher than that of the suburbs. There was a strong UHI effect from 2001 to 2021, with a spatial expansion trend from the central urban regions to the suburbs and coastal areas. The LST of green lands and wetlands was significantly lower than that of croplands, artificial surfaces, and unvegetated lands. Vegetation and water bodies had a significant mitigation effect on the UHI, especially at the micro scale. The winter UHI had been jointly driven by the underlying surface and socio-economic factors in a nonlinear or two-factor interactive enhancement mode, and socio-economic factors had played a leading role. This research could provide data support and decision-making references for rationally planning urban layout and promoting sustainable urban development.
Funding: National Natural Science Foundation of China (No. 61971121).
Abstract: Clothing attribute recognition has become an essential technology, which enables users to automatically identify the characteristics of clothes and search for clothing images with similar attributes. However, existing methods cannot recognize newly added attributes and may fail to capture region-level visual features. To address these issues, a region-aware fashion contrastive language-image pre-training (RaF-CLIP) model was proposed. This model aligned cropped and segmented images with category and multiple fine-grained attribute texts, matching fashion regions with their corresponding texts through contrastive learning. Clothing retrieval found suitable clothing based on user-specified clothing categories and attributes. To further improve retrieval accuracy, an attribute-guided composed network (AGCN) was introduced as an additional component on RaF-CLIP, specifically designed for composed image retrieval. This task aimed to modify the reference image based on textual expressions to retrieve the expected target. By adopting a transformer-based bidirectional attention and gating mechanism, it realized the fusion and selection of image features and attribute text features. Experimental results show that the proposed model achieves a mean precision of 0.6633 for attribute recognition tasks and a recall@10 (recall@k is defined as the percentage of correct samples appearing in the top k retrieval results) of 39.18 for the composed image retrieval task, satisfying user needs for freely searching for clothing through images and texts.
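The recall@k definition quoted in the abstract can be made concrete with a short sketch. It assumes the common reading for composed image retrieval, namely one correct target per query, with recall@k being the percentage of queries whose target appears among the top k results. The function name and the toy rankings are illustrative and not taken from the paper.

```python
from typing import Sequence


def recall_at_k(
    ranked_results: Sequence[Sequence[str]], targets: Sequence[str], k: int
) -> float:
    """Percentage of queries whose correct target is in the top-k results."""
    if len(ranked_results) != len(targets):
        raise ValueError("one target per query is required")
    if not targets:
        return 0.0
    hits = sum(
        1 for ranked, target in zip(ranked_results, targets) if target in ranked[:k]
    )
    return 100.0 * hits / len(targets)


# Hypothetical retrieval output: one ranked list of image ids per query
ranked = [
    ["a", "b", "c"],
    ["d", "e", "f"],
    ["g", "h", "i"],
    ["j", "k", "l"],
]
targets = ["b", "f", "x", "j"]

r1 = recall_at_k(ranked, targets, 1)
r2 = recall_at_k(ranked, targets, 2)
r3 = recall_at_k(ranked, targets, 3)
print(r1, r2, r3)
```

Recall@k can only stay equal or grow as k increases, since a larger cutoff never removes a hit.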
Abstract: Crime scene investigation (CSI) images are key evidence carriers during criminal investigation, and CSI image retrieval can assist the police in obtaining criminal clues. Moreover, with the rapid development of deep learning, the data-driven paradigm has become the mainstream method of CSI image feature extraction and representation, and in this process, datasets provide effective support for CSI retrieval performance. However, there is a lack of systematic research on CSI image retrieval methods and datasets. Therefore, we present an overview of the existing works on one-class and multi-class CSI image retrieval based on deep learning. According to this research, based on their technical functionalities and implementation methods, CSI image retrieval methods are roughly classified into five categories: feature representation, metric learning, generative adversarial networks, autoencoder networks, and attention networks. Furthermore, we analyze the remaining challenges and discuss future work directions in this field.
Abstract: Background: A medical content-based image retrieval (CBIR) system is designed to retrieve images from large imaging repositories that are visually similar to a user's query image. CBIR is widely used in evidence-based diagnosis, teaching, and research. Although retrieval accuracy has largely improved, there has been limited development toward visualizing important image features that indicate the similarity of retrieved images. Despite the prevalence of 3D volumetric data in medical imaging such as computed tomography (CT), current CBIR systems still rely on 2D cross-sectional views for the visualization of retrieved images. Such 2D visualization requires users to browse through the image stacks to confirm the similarity of the retrieved images and often involves mental reconstruction of 3D information, including the size, shape, and spatial relations of multiple structures. This process is time-consuming and reliant on users' experience. Methods: In this study, we proposed an importance-aware 3D volume visualization method. The rendering parameters were automatically optimized to maximize the visibility of important structures that were detected and prioritized in the retrieval process. We then integrated the proposed visualization into a CBIR system, thereby complementing the 2D cross-sectional views for relevance feedback and further analyses. Results: Our preliminary results, obtained using multimodal positron emission tomography and computed tomography (PET-CT) images from a non-small cell lung cancer dataset, demonstrate that 3D visualization can provide additional information.
Abstract: This study examines the database search behaviors of individuals, focusing on gender differences and the impact of planning habits on information retrieval. Data were collected from a survey of 198 respondents, categorized by their discipline, schooling background, internet usage, and information retrieval preferences. Key findings indicate that females are more likely to plan their searches in advance and prefer structured methods of information retrieval, such as using library portals and leading university websites. Males, however, tend to use web search engines and self-archiving methods more frequently. This analysis provides valuable insights for educational institutions and libraries to optimize their resources and services based on user behavior patterns.
Funding: Nursing Research Outcome of the Pilot Project for Course Assessment Reform in Sanya University (Project number: SYJGKH2022138).
Abstract: Objective: This paper aims to evaluate the implementation effect of the diversified course assessment method reform. Methods: A diversified assessment method was implemented for 196 undergraduate nursing students. Students' mastery of key knowledge in "Nursing Research" was assessed through group reports on topic selection and literature retrieval, as well as the proposition level of the final examination. Results: 81.6% of the students agreed with the course assessment method, and 97.9% believed studying "Nursing Research" would be helpful for future scientific research applications. Conclusion: Diversified assessment methods can help improve undergraduate nursing students' scientific research skills and comprehensive quality.
Abstract: The task of indoor visual localization, which uses camera visual information to calculate the user's pose, is a core component of Augmented Reality (AR) and Simultaneous Localization and Mapping (SLAM). Existing indoor localization technologies generally use scene-specific 3D representations or are trained on specific datasets, making it challenging to balance accuracy and cost when applied to new scenes. To address this issue, this paper proposes a universal indoor visual localization method based on efficient image retrieval. First, a Multi-Layer Perceptron (MLP) is employed to aggregate features from intermediate layers of a convolutional neural network, obtaining a global representation of the image. This approach ensures accurate and rapid retrieval of reference images. Next, a new mechanism using Random Sample Consensus (RANSAC) is designed to resolve the relative pose ambiguity caused by the essential matrix decomposition based on the five-point method. Finally, the absolute pose of the queried user image is computed, thereby achieving indoor user pose estimation. The proposed indoor localization method is characterized by its simplicity, flexibility, and excellent cross-scene generalization. Experimental results demonstrate a positioning error of 0.09 m and 2.14° on the 7Scenes dataset, and 0.15 m and 6.37° on the 12Scenes dataset, illustrating the strong performance of the proposed indoor localization method.
Funding: The Project of Introducing Urgently Needed Talents in Key Supporting Regions of Shandong Province, China (No. SDJQP20221805).
Abstract: Deep convolutional neural networks (DCNNs) are widely used in content-based image retrieval (CBIR) because of their advantages in image feature extraction. However, the training of deep neural networks requires a large amount of labeled data, which limits their application. Self-supervised learning is a more general approach in unlabeled scenarios. A method of fine-tuning feature extraction networks based on masked learning is proposed. Masked autoencoders (MAE) are used to fine-tune the vision transformer (ViT) model. In addition, the scheme of extracting image descriptors is discussed. The encoder of the MAE uses the ViT to extract global features and performs self-supervised fine-tuning by reconstructing masked area pixels. The method works well on category-level image retrieval datasets, with marked improvements on instance-level datasets. For the instance-level datasets Oxford5k and Paris6k, the retrieval accuracy of the base model is improved by 7% and 17%, respectively, compared to that of the original model.
Funding: This research was funded by King Mongkut's University of Technology North Bangkok (Contract no. KMUTNB-62-KNOW-026).
Abstract: Fine-grained image classification is a challenging research topic because of the high degree of similarity among categories and the high degree of dissimilarity within a specific category caused by different poses and scales. Cultural heritage images are fine-grained images because, in most cases, the images closely resemble one another. Distinguishing cultural heritage architecture using classification techniques may therefore be difficult. This study proposes a cultural heritage content retrieval method using adaptive deep learning for fine-grained image retrieval. The key contribution of this research was the creation of a retrieval model that could handle incremental streams of new categories while maintaining its past performance on old categories and not losing the old categorization of a cultural heritage image. The goal of the proposed method is to perform a retrieval task for classes. Incremental learning for new classes was conducted to reduce the re-training process. In this step, the original classes are not necessary for re-training, which we call an adaptive deep learning technique. Cultural heritage in the case of Thai archaeological site architecture was retrieved through machine learning and image processing. We analyze the experimental results of incremental learning for fine-grained images with images of Thai archaeological site architecture from world heritage provinces in Thailand, which have a similar architecture. Using a fine-grained image retrieval technique for this group of cultural heritage images in a database can solve the problem of a high degree of similarity among categories and a high degree of dissimilarity within a specific category. The proposed method for retrieving the correct image from a database can deliver an average accuracy of 85 percent. Adaptive deep learning for fine-grained image retrieval was used to retrieve cultural heritage content, and it outperformed state-of-the-art methods in fine-grained image retrieval.