Translation is an important medium of cultural communication.It is not a mere transfer of two languages,but the interaction of two cultures.Cultural misreading,which results from cultural discrepancy and translator’s...Translation is an important medium of cultural communication.It is not a mere transfer of two languages,but the interaction of two cultures.Cultural misreading,which results from cultural discrepancy and translator’s subjectivity,truly reflects where the blockade and conflict in the cultural communication is.Cultural misreading is an objective phenomenon that exists in the entire process of translation.This paper intends to make a comprehensive analysis and discussion on The History of the Former Han Dynasty:a Critical Translation with Annotations translated by Homer Hasenpflug Dubs.As for the reasons of cultural misreading,this paper divides them into three types—language,thinking habit,traditional culture.It is to be hoped that this paper will draw more attention from the translation circle to the phenomena,and make contribution to the development of literary translation.展开更多
This paper mainly studies the basic types of annotation and the analysis of its effective functional usage,so as to pay more attention to annotation in the translation of poetry and Fu.The annotation of this study bel...This paper mainly studies the basic types of annotation and the analysis of its effective functional usage,so as to pay more attention to annotation in the translation of poetry and Fu.The annotation of this study belongs to the category of paratext.Annotation is attributed to the paratext,undertakes its special function,enriches and perfects the paratext system.展开更多
Creating large-scale and well-annotated datasets to train AI algorithms is crucial for automated tumor detection and localization.However,with limited resources,it is challenging to determine the best type of annotati...Creating large-scale and well-annotated datasets to train AI algorithms is crucial for automated tumor detection and localization.However,with limited resources,it is challenging to determine the best type of annotations when annotating massive amounts of unlabeled data.To address this issue,we focus on polyps in colonoscopy videos and pancreatic tumors in abdominal CT scans;Both applications require significant effort and time for pixel-wise annotation due to the high dimensional nature of the data,involving either temporary or spatial dimensions.In this paper,we develop a new annotation strategy,termed Drag&Drop,which simplifies the annotation process to drag and drop.This annotation strategy is more efficient,particularly for temporal and volumetric imaging,than other types of weak annotations,such as per-pixel,bounding boxes,scribbles,ellipses and points.Furthermore,to exploit our Drag&Drop annotations,we develop a novel weakly supervised learning method based on the watershed algorithm.Experimental results show that our method achieves better detection and localization performance than alternative weak annotations and,more importantly,achieves similar performance to that trained on detailed per-pixel annotations.Interestingly,we find that,with limited resources,allocating weak annotations from a diverse patient population can foster models more robust to unseen images than allocating per-pixel annotations for a small set of images.In summary,this research proposes an efficient annotation strategy for tumor detection and localization that is less accurate than per-pixel annotations but useful for creating large-scale datasets for screening tumors in various medical modalities.展开更多
Single-cell RNA sequencing(scRNA-seq)is revolutionizing the study of complex and dynamic cellular mechanisms.However,cell type annotation remains a main challenge as it largely relies on a priori knowledge and manual ...Single-cell RNA sequencing(scRNA-seq)is revolutionizing the study of complex and dynamic cellular mechanisms.However,cell type annotation remains a main challenge as it largely relies on a priori knowledge and manual curation,which is cumbersome and subjective.The increasing number of scRNA-seq datasets,as well as numerous published genetic studies,has motivated us to build a comprehensive human cell type reference atlas.Here,we present decoding Cell type Specificity(deCS),an automatic cell type annotation method augmented by a comprehensive collection of human cell type expression profiles and marker genes.We used deCS to annotate scRNAseq data from various tissue types and systematically evaluated the annotation accuracy under different conditions,including reference panels,sequencing depth,and feature selection strategies.Our results demonstrate that expanding the references is critical for improving annotation accuracy.Compared to many existing state-of-the-art annotation tools,deCS significantly reduced computation time and increased accuracy.deCS can be integrated into the standard scRNA-seq analytical pipeline to enhance cell type annotation.Finally,we demonstrated the broad utility of deCS to identify trait-cell type associations in 51 human complex traits,providing deep insights into the cellular mechanisms underlying disease pathogenesis.展开更多
The success of deep transfer learning in fault diagnosis is attributed to the collection of high-quality labeled data from the source domain.However,in engineering scenarios,achieving such high-quality label annotatio...The success of deep transfer learning in fault diagnosis is attributed to the collection of high-quality labeled data from the source domain.However,in engineering scenarios,achieving such high-quality label annotation is difficult and expensive.The incorrect label annotation produces two negative effects:1)the complex decision boundary of diagnosis models lowers the generalization performance on the target domain,and2)the distribution of target domain samples becomes misaligned with the false-labeled samples.To overcome these negative effects,this article proposes a solution called the label recovery and trajectory designable network(LRTDN).LRTDN consists of three parts.First,a residual network with dual classifiers is to learn features from cross-domain samples.Second,an annotation check module is constructed to generate a label anomaly indicator that could modify the abnormal labels of false-labeled samples in the source domain.With the training of relabeled samples,the complexity of diagnosis model is reduced via semi-supervised learning.Third,the adaptation trajectories are designed for sample distributions across domains.This ensures that the target domain samples are only adapted with the pure-labeled samples.The LRTDN is verified by two case studies,in which the diagnosis knowledge of bearings is transferred across different working conditions as well as different yet related machines.The results show that LRTDN offers a high diagnosis accuracy even in the presence of incorrect annotation.展开更多
Brassica oleracea has been developed into many important crops,including cabbage,kale,cauliflower,broccoli and so on.The genome and gene annotation of cabbage(cultivar JZS),a representative morphotype of B.oleracea,ha...Brassica oleracea has been developed into many important crops,including cabbage,kale,cauliflower,broccoli and so on.The genome and gene annotation of cabbage(cultivar JZS),a representative morphotype of B.oleracea,has been widely used as a common reference in biological research.Although its genome assembly has been updated twice,the current gene annotation still lacks information on untranslated regions(UTRs)and alternative splicing(AS).Here,we constructed a high-quality gene annotation(JZSv3)using a full-length transcriptome acquired by nanopore sequencing,yielding a total of 59452 genes and 75684 transcripts.Additionally,we re-analyzed the previously reported transcriptome data related to the development of different tissues and cold response using JZSv3 as a reference,and found that 3843 out of 11908 differentially expressed genes(DEGs)underwent AS during the development of different tissues and 309 out of 903 cold-related genes underwent AS in response to cold stress.Meanwhile,we also identified many AS genes,including BolLHCB5 and BolHSP70,that displayed distinct expression patterns within variant transcripts of the same gene,highlighting the importance of JZSv3 as a pivotal reference for AS analysis.Overall,JZSv3 provides a valuable resource for exploring gene function,especially for obtaining a deeper understanding of AS regulation mechanisms.展开更多
●AIM:To investigate a pioneering framework for the segmentation of meibomian glands(MGs),using limited annotations to reduce the workload on ophthalmologists and enhance the efficiency of clinical diagnosis.●METHODS...●AIM:To investigate a pioneering framework for the segmentation of meibomian glands(MGs),using limited annotations to reduce the workload on ophthalmologists and enhance the efficiency of clinical diagnosis.●METHODS:Totally 203 infrared meibomian images from 138 patients with dry eye disease,accompanied by corresponding annotations,were gathered for the study.A rectified scribble-supervised gland segmentation(RSSGS)model,incorporating temporal ensemble prediction,uncertainty estimation,and a transformation equivariance constraint,was introduced to address constraints imposed by limited supervision information inherent in scribble annotations.The viability and efficacy of the proposed model were assessed based on accuracy,intersection over union(IoU),and dice coefficient.●RESULTS:Using manual labels as the gold standard,RSSGS demonstrated outcomes with an accuracy of 93.54%,a dice coefficient of 78.02%,and an IoU of 64.18%.Notably,these performance metrics exceed the current weakly supervised state-of-the-art methods by 0.76%,2.06%,and 2.69%,respectively.Furthermore,despite achieving a substantial 80%reduction in annotation costs,it only lags behind fully annotated methods by 0.72%,1.51%,and 2.04%.●CONCLUSION:An innovative automatic segmentation model is developed for MGs in infrared eyelid images,using scribble annotation for training.This model maintains an exceptionally high level of segmentation accuracy while substantially reducing training costs.It holds substantial utility for calculating clinical parameters,thereby greatly enhancing the diagnostic efficiency of ophthalmologists in evaluating meibomian gland dysfunction.展开更多
Objective:To investigate the effect of Guangdong Shenqu(GSQ)on intestinal flora structure in mice with food stagnation through 16S rDNA sequencing.Methods: Mice were randomly assigned to control,model,GSQ low-dose(GSQ...Objective:To investigate the effect of Guangdong Shenqu(GSQ)on intestinal flora structure in mice with food stagnation through 16S rDNA sequencing.Methods: Mice were randomly assigned to control,model,GSQ low-dose(GSQL),GSQ medium-dose(GSQM),GSQ high-dose(GSQH),and lacidophilin tablets(LAB)groups,with each group containing 10 mice.A food stagnation and internal heat mouse model was established through intragastric administration of a mixture of beeswax and olive oil(1:15).The control group was administered normal saline,and the model group was administered beeswax and olive oil to maintain a state.The GSQL(2 g/kg),GSQM(4 g/kg),GSQH(8 g/kg),and LAB groups(0.625 g/kg)were administered corresponding drugs for 5 d.After administration,16S rDNA sequencing was performed to assess gut microbiota in mouse fecal samples.Results: The model group exhibited significant intestinal flora changes.Following GSQ administration,the abundance and diversity index of the intestinal flora increased significantly,the number of bacterial species was regulated,andαandβdiversity were improved.GSQ administration increased the abundance of probiotics,including Clostridia,Lachnospirales,and Lactobacillus,whereas the abundance of conditional pathogenic bacteria,such as Allobaculum,Erysipelotrichaceae,and Bacteroides decreased.Functional prediction analysis indicated that the pathogenesis of food stagnation and GSQ intervention were primarily associated with carbohydrate,lipid,and amino acid metabolism,among other metabolic pathways.Conclusion: The digestive mechanism of GSQ may be attributed to its role in restoring diversity and abundance within the intestinal flora,thereby improving the composition and structure of the intestinal flora in mice and subsequently influencing the regulation of metabolic pathways.展开更多
As Natural Language Processing(NLP)continues to advance,driven by the emergence of sophisticated large language models such as ChatGPT,there has been a notable growth in research activity.This rapid uptake reflects in...As Natural Language Processing(NLP)continues to advance,driven by the emergence of sophisticated large language models such as ChatGPT,there has been a notable growth in research activity.This rapid uptake reflects increasing interest in the field and induces critical inquiries into ChatGPT’s applicability in the NLP domain.This review paper systematically investigates the role of ChatGPT in diverse NLP tasks,including information extraction,Name Entity Recognition(NER),event extraction,relation extraction,Part of Speech(PoS)tagging,text classification,sentiment analysis,emotion recognition and text annotation.The novelty of this work lies in its comprehensive analysis of the existing literature,addressing a critical gap in understanding ChatGPT’s adaptability,limitations,and optimal application.In this paper,we employed a systematic stepwise approach following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses(PRISMA)framework to direct our search process and seek relevant studies.Our review reveals ChatGPT’s significant potential in enhancing various NLP tasks.Its adaptability in information extraction tasks,sentiment analysis,and text classification showcases its ability to comprehend diverse contexts and extract meaningful details.Additionally,ChatGPT’s flexibility in annotation tasks reducesmanual efforts and accelerates the annotation process,making it a valuable asset in NLP development and research.Furthermore,GPT-4 and prompt engineering emerge as a complementary mechanism,empowering users to guide the model and enhance overall accuracy.Despite its promising potential,challenges persist.The performance of ChatGP Tneeds tobe testedusingmore extensivedatasets anddiversedata structures.Subsequently,its limitations in handling domain-specific language and the need for fine-tuning in specific applications highlight the importance of further investigations to address these issues.展开更多
Published genomes frequently contain erroneous gene models that represent issues associated with identification of open reading frames,start sites,splice sites,and related structural features.The source of these incon...Published genomes frequently contain erroneous gene models that represent issues associated with identification of open reading frames,start sites,splice sites,and related structural features.The source of these inconsistencies is often traced back to integration across text file formats designed to describe long read alignments and predicted gene structures.In addition,the majority of gene prediction frameworks do not provide robust downstream filtering to remove problematic gene annotations,nor do they represent these annotations in a format consistent with current file standards.These frameworks also lack consideration for functional attributes,such as the presence or absence of protein domains that can be used for gene model validation.To provide oversight to the increasing number of published genome annotations,we present a software package,the Gene Filtering,Analysis,and Conversion(gFACs),to filter,analyze,and convert predicted gene models and alignments.The software operates across a wide range of alignment,analysis,and gene prediction files with a flexible framework for defining gene models with reliable structural and functional attributes.gFACs supports common downstream applications,including genome browsers,and generates extensive details on the filtering process,including distributions that can be visualized to further assess the proposed gene space.gFACs is freely available and implemented in Perl with support from BioPerl libraries at https://gitlab.com/PlantGenomicsLab/gFACs.展开更多
Collaborative social annotation systems allow users to record and share their original keywords or tag attachments to Web resources such as Web pages, photos, or videos. These annotations are a method for organizing a...Collaborative social annotation systems allow users to record and share their original keywords or tag attachments to Web resources such as Web pages, photos, or videos. These annotations are a method for organizing and labeling information. They have the potential to help users navigate the Web and locate the needed resources. However, since annotations axe posted by users under no central control, there exist problems such as spare and synonymous annotations. To efficiently use annotation information to facilitate knowledge discovery from the Web, it is advantageous if we organize social annotations from semantic perspective and embed them into algorithms for knowledge discovery. This inspires the Web page recommendation with annotations, in which users and Web pages are clustered so that semantically similar items can be related. In this paper we propose four graphic models which cluster users, Web pages and annotations and recommend Web pages for given users by assigning items to the right cluster first. The algorithms are then compared to the classical collaborative filtering recommendation method on a real-world data set. Our result indicates that the graphic models provide better recommendation performance and are robust to fit for the real applications.展开更多
The discovery of novel cancer genes is one of the main goals in cancer research.Bioinformatics methods can be used to accelerate cancer gene discovery,which may help in the understanding of cancer and the development ...The discovery of novel cancer genes is one of the main goals in cancer research.Bioinformatics methods can be used to accelerate cancer gene discovery,which may help in the understanding of cancer and the development of drug targets.In this paper,we describe a classifier to predict potential cancer genes that we have developed by integrating multiple biological evidence,including protein-protein interaction network properties,and sequence and functional features.We detected 55 features that were significantly different between cancer genes and non-cancer genes.Fourteen cancer-associated features were chosen to train the classifier.Four machine learning methods,logistic regression,support vector machines(SVMs),BayesNet and decision tree,were explored in the classifier models to distinguish cancer genes from non-cancer genes.The prediction power of the different models was evaluated by 5-fold cross-validation.The area under the receiver operating characteristic curve for logistic regression,SVM,Baysnet and J48 tree models was 0.834,0.740,0.800 and 0.782,respectively.Finally,the logistic regression classifier with multiple biological features was applied to the genes in the Entrez database,and 1976 cancer gene candidates were identified.We found that the integrated prediction model performed much better than the models based on the individual biological evidence,and the network and functional features had stronger powers than the sequence features in predicting cancer genes.展开更多
Anterior segment eye diseases account for a significant proportion of presentations to eye clinics worldwide,including diseases associated with corneal pathologies,anterior chamber abnormalities(e.g.blood or inflammat...Anterior segment eye diseases account for a significant proportion of presentations to eye clinics worldwide,including diseases associated with corneal pathologies,anterior chamber abnormalities(e.g.blood or inflammation),and lens diseases.The construction of an automatic tool for segmentation of anterior segment eye lesions would greatly improve the efficiency of clinical care.With research on artificial intelligence progressing in recent years,deep learning models have shown their superiority in image classification and segmentation.The training and evaluation of deep learning models should be based on a large amount of data annotated with expertise;however,such data are relatively scarce in the domain of medicine.Herein,the authors developed a new medical image annotation system,called EyeHealer.It is a large-scale anterior eye segment dataset with both eye structures and lesions annotated at the pixel level.Comprehensive experiments were conducted to verify its performance in disease classification and eye lesion segmentation.The results showed that semantic segmentation models outperformed medical segmentation models.This paper describes the establishment of the system for automated classification and segmentation tasks.The dataset will be made publicly available to encourage future research in this area.展开更多
The most prosperous and important eras in the history of communication between China and foreign countries were the Han, Tang and Ming dynasties. During the Han and Tang dynasties China’s foreign relations were large...The most prosperous and important eras in the history of communication between China and foreign countries were the Han, Tang and Ming dynasties. During the Han and Tang dynasties China’s foreign relations were largely confined to Asia. Apart from Japan and a few South Asian countries there was almost no maritime contact with the outside world by China. The展开更多
Daily newspapers publish a tremendous amount of information disseminated through the Internet.Freely available and easily accessible large online repositories are not indexed and are in an un-processable format.The ma...Daily newspapers publish a tremendous amount of information disseminated through the Internet.Freely available and easily accessible large online repositories are not indexed and are in an un-processable format.The major hindrance in developing and evaluating existing/new monolingual text in an image is that it is not linked and indexed.There is no method to reuse the online news images because of the unavailability of standardized benchmark corpora,especially for South Asian languages.The corpus is a vital resource for developing and evaluating text in an image to reuse local news systems in general and specifically for the Urdu language.Lack of indexing,primarily semantic indexing of the daily news items,makes news items impracticable for any querying.Moreover,the most straightforward search facility does not support these unindexed news resources.Our study addresses this gap by associating and marking the newspaper images with one of the widely spoken but under-resourced languages,i.e.,Urdu.The present work proposed a method to build a benchmark corpus of news in image form by introducing a web crawler.The corpus is then semantically linked and annotated with daily news items.Two techniques are proposed for image annotation,free annotation and fixed cross examination annotation.The second technique got higher accuracy.Build news ontology in protégéusing OntologyWeb Language(OWL)language and indexed the annotations under it.The application is also built and linked with protégéso that the readers and journalists have an interface to query the news items directly.Similarly,news items linked together will provide complete coverage and bring together different opinions at a single location for readers to do the analysis themselves.展开更多
At present days,object detection and tracking concepts have gained more importance among researchers and business people.Presently,deep learning(DL)approaches have been used for object tracking as it increases the per...At present days,object detection and tracking concepts have gained more importance among researchers and business people.Presently,deep learning(DL)approaches have been used for object tracking as it increases the perfor-mance and speed of the tracking process.This paper presents a novel robust DL based object detection and tracking algorithm using Automated Image Anno-tation with ResNet based Faster regional convolutional neural network(R-CNN)named(AIA-FRCNN)model.The AIA-RFRCNN method performs image anno-tation using a Discriminative Correlation Filter(DCF)with Channel and Spatial Reliability tracker(CSR)called DCF-CSRT model.The AIA-RFRCNN model makes use of Faster RCNN as an object detector and tracker,which involves region proposal network(RPN)and Fast R-CNN.The RPN is a full convolution network that concurrently predicts the bounding box and score of different objects.The RPN is a trained model used for the generation of the high-quality region proposals,which are utilized by Fast R-CNN for detection process.Besides,Residual Network(ResNet 101)model is used as a shared convolutional neural network(CNN)for the generation of feature maps.The performance of the ResNet 101 model is further improved by the use of Adam optimizer,which tunes the hyperparameters namely learning rate,batch size,momentum,and weight decay.Finally,softmax layer is applied to classify the images.The performance of the AIA-RFRCNN method has been assessed using a benchmark dataset and a detailed comparative analysis of the results takes place.The outcome of the experiments indicated the superior characteristics of the AIA-RFRCNN model under diverse aspects.展开更多
Every day,websites and personal archives create more and more photos.The size of these archives is immeasurable.The comfort of use of these huge digital image gatherings donates to their admiration.However,not all of ...Every day,websites and personal archives create more and more photos.The size of these archives is immeasurable.The comfort of use of these huge digital image gatherings donates to their admiration.However,not all of these folders deliver relevant indexing information.From the outcomes,it is dif-ficult to discover data that the user can be absorbed in.Therefore,in order to determine the significance of the data,it is important to identify the contents in an informative manner.Image annotation can be one of the greatest problematic domains in multimedia research and computer vision.Hence,in this paper,Adap-tive Convolutional Deep Learning Model(ACDLM)is developed for automatic image annotation.Initially,the databases are collected from the open-source system which consists of some labelled images(for training phase)and some unlabeled images{Corel 5 K,MSRC v2}.After that,the images are sent to the pre-processing step such as colour space quantization and texture color class map.The pre-processed images are sent to the segmentation approach for efficient labelling technique using J-image segmentation(JSEG).Thefinal step is an auto-matic annotation using ACDLM which is a combination of Convolutional Neural Network(CNN)and Honey Badger Algorithm(HBA).Based on the proposed classifier,the unlabeled images are labelled.The proposed methodology is imple-mented in MATLAB and performance is evaluated by performance metrics such as accuracy,precision,recall and F1_Measure.With the assistance of the pro-posed methodology,the unlabeled images are labelled.展开更多
文摘Translation is an important medium of cultural communication.It is not a mere transfer of two languages,but the interaction of two cultures.Cultural misreading,which results from cultural discrepancy and translator’s subjectivity,truly reflects where the blockade and conflict in the cultural communication is.Cultural misreading is an objective phenomenon that exists in the entire process of translation.This paper intends to make a comprehensive analysis and discussion on The History of the Former Han Dynasty:a Critical Translation with Annotations translated by Homer Hasenpflug Dubs.As for the reasons of cultural misreading,this paper divides them into three types—language,thinking habit,traditional culture.It is to be hoped that this paper will draw more attention from the translation circle to the phenomena,and make contribution to the development of literary translation.
基金This paper is sponsored by the Postgraduate Creative Foundation of Gannan Normal University entitled“A Study on Dynamic Equivalence of Ecological Translation Elements in Davis’English Translation of Tao Yuanming’s Works from the Perspective of Cultural Context”(“文化语境视域下戴维斯英译《陶渊明作品集》的生态翻译元素动态对等研究”,YCX21A004)“National Social Science Foundation of China western Project in2018”(“2018年国家社科基金西部项目”)entitled“A study of Chinese Ci fu in the English-speaking world”(“英语世界的中国辞赋研究”,18XZW017).
文摘This paper mainly studies the basic types of annotation and the analysis of its effective functional usage,so as to pay more attention to annotation in the translation of poetry and Fu.The annotation of this study belongs to the category of paratext.Annotation is attributed to the paratext,undertakes its special function,enriches and perfects the paratext system.
基金supported by the Lustgarten Foundation for Pancreatic Cancer Research and the Patrick J.McGovern Foundation Award.
文摘Creating large-scale and well-annotated datasets to train AI algorithms is crucial for automated tumor detection and localization.However,with limited resources,it is challenging to determine the best type of annotations when annotating massive amounts of unlabeled data.To address this issue,we focus on polyps in colonoscopy videos and pancreatic tumors in abdominal CT scans;Both applications require significant effort and time for pixel-wise annotation due to the high dimensional nature of the data,involving either temporary or spatial dimensions.In this paper,we develop a new annotation strategy,termed Drag&Drop,which simplifies the annotation process to drag and drop.This annotation strategy is more efficient,particularly for temporal and volumetric imaging,than other types of weak annotations,such as per-pixel,bounding boxes,scribbles,ellipses and points.Furthermore,to exploit our Drag&Drop annotations,we develop a novel weakly supervised learning method based on the watershed algorithm.Experimental results show that our method achieves better detection and localization performance than alternative weak annotations and,more importantly,achieves similar performance to that trained on detailed per-pixel annotations.Interestingly,we find that,with limited resources,allocating weak annotations from a diverse patient population can foster models more robust to unseen images than allocating per-pixel annotations for a small set of images.In summary,this research proposes an efficient annotation strategy for tumor detection and localization that is less accurate than per-pixel annotations but useful for creating large-scale datasets for screening tumors in various medical modalities.
基金supported by National Institutes of Health grants(Grant Nos.R01LM012806R,I01DE030122,and R01DE029818)support from Cancer Prevention and Research Institute of Texas(Grant Nos.CPRIT RP180734 and RP210045),United States.
文摘Single-cell RNA sequencing(scRNA-seq)is revolutionizing the study of complex and dynamic cellular mechanisms.However,cell type annotation remains a main challenge as it largely relies on a priori knowledge and manual curation,which is cumbersome and subjective.The increasing number of scRNA-seq datasets,as well as numerous published genetic studies,has motivated us to build a comprehensive human cell type reference atlas.Here,we present decoding Cell type Specificity(deCS),an automatic cell type annotation method augmented by a comprehensive collection of human cell type expression profiles and marker genes.We used deCS to annotate scRNAseq data from various tissue types and systematically evaluated the annotation accuracy under different conditions,including reference panels,sequencing depth,and feature selection strategies.Our results demonstrate that expanding the references is critical for improving annotation accuracy.Compared to many existing state-of-the-art annotation tools,deCS significantly reduced computation time and increased accuracy.deCS can be integrated into the standard scRNA-seq analytical pipeline to enhance cell type annotation.Finally,we demonstrated the broad utility of deCS to identify trait-cell type associations in 51 human complex traits,providing deep insights into the cellular mechanisms underlying disease pathogenesis.
基金the National Key R&D Program of China(2022YFB3402100)the National Science Fund for Distinguished Young Scholars of China(52025056)+4 种基金the National Natural Science Foundation of China(52305129)the China Postdoctoral Science Foundation(2023M732789)the China Postdoctoral Innovative Talents Support Program(BX20230290)the Open Foundation of Hunan Provincial Key Laboratory of Health Maintenance for Mechanical Equipment(2022JXKF JJ01)the Fundamental Research Funds for Central Universities。
文摘The success of deep transfer learning in fault diagnosis is attributed to the collection of high-quality labeled data from the source domain.However,in engineering scenarios,achieving such high-quality label annotation is difficult and expensive.The incorrect label annotation produces two negative effects:1)the complex decision boundary of diagnosis models lowers the generalization performance on the target domain,and2)the distribution of target domain samples becomes misaligned with the false-labeled samples.To overcome these negative effects,this article proposes a solution called the label recovery and trajectory designable network(LRTDN).LRTDN consists of three parts.First,a residual network with dual classifiers is to learn features from cross-domain samples.Second,an annotation check module is constructed to generate a label anomaly indicator that could modify the abnormal labels of false-labeled samples in the source domain.With the training of relabeled samples,the complexity of diagnosis model is reduced via semi-supervised learning.Third,the adaptation trajectories are designed for sample distributions across domains.This ensures that the target domain samples are only adapted with the pure-labeled samples.The LRTDN is verified by two case studies,in which the diagnosis knowledge of bearings is transferred across different working conditions as well as different yet related machines.The results show that LRTDN offers a high diagnosis accuracy even in the presence of incorrect annotation.
基金supported by the National Natural Science Foundation of China (Grant Nos.31972411,31722048,and 31630068)the Central Public-interest Scientific Institution Basal Research Fund (Grant No.Y2022PT23)+1 种基金the Innovation Program of the Chinese Academy of Agricultural Sciences,and the Key Laboratory of Biology and Genetic Improvement of Horticultural Crops,Ministry of Agriculture and Rural Affairs,P.R.Chinasupported by NIFA,the Department of Agriculture,via UC-Berkeley,USA。
文摘Brassica oleracea has been developed into many important crops,including cabbage,kale,cauliflower,broccoli and so on.The genome and gene annotation of cabbage(cultivar JZS),a representative morphotype of B.oleracea,has been widely used as a common reference in biological research.Although its genome assembly has been updated twice,the current gene annotation still lacks information on untranslated regions(UTRs)and alternative splicing(AS).Here,we constructed a high-quality gene annotation(JZSv3)using a full-length transcriptome acquired by nanopore sequencing,yielding a total of 59452 genes and 75684 transcripts.Additionally,we re-analyzed the previously reported transcriptome data related to the development of different tissues and cold response using JZSv3 as a reference,and found that 3843 out of 11908 differentially expressed genes(DEGs)underwent AS during the development of different tissues and 309 out of 903 cold-related genes underwent AS in response to cold stress.Meanwhile,we also identified many AS genes,including BolLHCB5 and BolHSP70,that displayed distinct expression patterns within variant transcripts of the same gene,highlighting the importance of JZSv3 as a pivotal reference for AS analysis.Overall,JZSv3 provides a valuable resource for exploring gene function,especially for obtaining a deeper understanding of AS regulation mechanisms.
基金Supported by Natural Science Foundation of Fujian Province(No.2020J011084)Fujian Province Technology and Economy Integration Service Platform(No.2023XRH001)Fuzhou-Xiamen-Quanzhou National Independent Innovation Demonstration Zone Collaborative Innovation Platform(No.2022FX5)。
文摘●AIM:To investigate a pioneering framework for the segmentation of meibomian glands(MGs),using limited annotations to reduce the workload on ophthalmologists and enhance the efficiency of clinical diagnosis.●METHODS:Totally 203 infrared meibomian images from 138 patients with dry eye disease,accompanied by corresponding annotations,were gathered for the study.A rectified scribble-supervised gland segmentation(RSSGS)model,incorporating temporal ensemble prediction,uncertainty estimation,and a transformation equivariance constraint,was introduced to address constraints imposed by limited supervision information inherent in scribble annotations.The viability and efficacy of the proposed model were assessed based on accuracy,intersection over union(IoU),and dice coefficient.●RESULTS:Using manual labels as the gold standard,RSSGS demonstrated outcomes with an accuracy of 93.54%,a dice coefficient of 78.02%,and an IoU of 64.18%.Notably,these performance metrics exceed the current weakly supervised state-of-the-art methods by 0.76%,2.06%,and 2.69%,respectively.Furthermore,despite achieving a substantial 80%reduction in annotation costs,it only lags behind fully annotated methods by 0.72%,1.51%,and 2.04%.●CONCLUSION:An innovative automatic segmentation model is developed for MGs in infrared eyelid images,using scribble annotation for training.This model maintains an exceptionally high level of segmentation accuracy while substantially reducing training costs.It holds substantial utility for calculating clinical parameters,thereby greatly enhancing the diagnostic efficiency of ophthalmologists in evaluating meibomian gland dysfunction.
基金supported by the National Natural Science Foundation of China(81872995).
文摘Objective:To investigate the effect of Guangdong Shenqu(GSQ)on intestinal flora structure in mice with food stagnation through 16S rDNA sequencing.Methods: Mice were randomly assigned to control,model,GSQ low-dose(GSQL),GSQ medium-dose(GSQM),GSQ high-dose(GSQH),and lacidophilin tablets(LAB)groups,with each group containing 10 mice.A food stagnation and internal heat mouse model was established through intragastric administration of a mixture of beeswax and olive oil(1:15).The control group was administered normal saline,and the model group was administered beeswax and olive oil to maintain a state.The GSQL(2 g/kg),GSQM(4 g/kg),GSQH(8 g/kg),and LAB groups(0.625 g/kg)were administered corresponding drugs for 5 d.After administration,16S rDNA sequencing was performed to assess gut microbiota in mouse fecal samples.Results: The model group exhibited significant intestinal flora changes.Following GSQ administration,the abundance and diversity index of the intestinal flora increased significantly,the number of bacterial species was regulated,andαandβdiversity were improved.GSQ administration increased the abundance of probiotics,including Clostridia,Lachnospirales,and Lactobacillus,whereas the abundance of conditional pathogenic bacteria,such as Allobaculum,Erysipelotrichaceae,and Bacteroides decreased.Functional prediction analysis indicated that the pathogenesis of food stagnation and GSQ intervention were primarily associated with carbohydrate,lipid,and amino acid metabolism,among other metabolic pathways.Conclusion: The digestive mechanism of GSQ may be attributed to its role in restoring diversity and abundance within the intestinal flora,thereby improving the composition and structure of the intestinal flora in mice and subsequently influencing the regulation of metabolic pathways.
文摘As Natural Language Processing(NLP)continues to advance,driven by the emergence of sophisticated large language models such as ChatGPT,there has been a notable growth in research activity.This rapid uptake reflects increasing interest in the field and induces critical inquiries into ChatGPT’s applicability in the NLP domain.This review paper systematically investigates the role of ChatGPT in diverse NLP tasks,including information extraction,Name Entity Recognition(NER),event extraction,relation extraction,Part of Speech(PoS)tagging,text classification,sentiment analysis,emotion recognition and text annotation.The novelty of this work lies in its comprehensive analysis of the existing literature,addressing a critical gap in understanding ChatGPT’s adaptability,limitations,and optimal application.In this paper,we employed a systematic stepwise approach following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses(PRISMA)framework to direct our search process and seek relevant studies.Our review reveals ChatGPT’s significant potential in enhancing various NLP tasks.Its adaptability in information extraction tasks,sentiment analysis,and text classification showcases its ability to comprehend diverse contexts and extract meaningful details.Additionally,ChatGPT’s flexibility in annotation tasks reducesmanual efforts and accelerates the annotation process,making it a valuable asset in NLP development and research.Furthermore,GPT-4 and prompt engineering emerge as a complementary mechanism,empowering users to guide the model and enhance overall accuracy.Despite its promising potential,challenges persist.The performance of ChatGP Tneeds tobe testedusingmore extensivedatasets anddiversedata structures.Subsequently,its limitations in handling domain-specific language and the need for fine-tuning in specific applications highlight the importance of further investigations to address these issues.
基金supported by the National Science Foundation Plant Genome Research Program of the United States(Grant No.1444573)
文摘Published genomes frequently contain erroneous gene models that represent issues associated with identification of open reading frames,start sites,splice sites,and related structural features.The source of these inconsistencies is often traced back to integration across text file formats designed to describe long read alignments and predicted gene structures.In addition,the majority of gene prediction frameworks do not provide robust downstream filtering to remove problematic gene annotations,nor do they represent these annotations in a format consistent with current file standards.These frameworks also lack consideration for functional attributes,such as the presence or absence of protein domains that can be used for gene model validation.To provide oversight to the increasing number of published genome annotations,we present a software package,the Gene Filtering,Analysis,and Conversion(gFACs),to filter,analyze,and convert predicted gene models and alignments.The software operates across a wide range of alignment,analysis,and gene prediction files with a flexible framework for defining gene models with reliable structural and functional attributes.gFACs supports common downstream applications,including genome browsers,and generates extensive details on the filtering process,including distributions that can be visualized to further assess the proposed gene space.gFACs is freely available and implemented in Perl with support from BioPerl libraries at https://gitlab.com/PlantGenomicsLab/gFACs.
基金supported in part by the National Natural Science Foundation of China under Grant Nos. 60621001, 60875028,60875049, and 70890084the Chinese Ministry of Science and Technology under Grant No. 2006AA010106,the Chinese Academy of Sciences under Grant Nos. 2F05N01, 2F08N03 and 2F07C01
文摘Collaborative social annotation systems allow users to record and share their original keywords or tag attachments to Web resources such as Web pages, photos, or videos. These annotations are a method for organizing and labeling information. They have the potential to help users navigate the Web and locate the needed resources. However, since annotations axe posted by users under no central control, there exist problems such as spare and synonymous annotations. To efficiently use annotation information to facilitate knowledge discovery from the Web, it is advantageous if we organize social annotations from semantic perspective and embed them into algorithms for knowledge discovery. This inspires the Web page recommendation with annotations, in which users and Web pages are clustered so that semantically similar items can be related. In this paper we propose four graphic models which cluster users, Web pages and annotations and recommend Web pages for given users by assigning items to the right cluster first. The algorithms are then compared to the classical collaborative filtering recommendation method on a real-world data set. Our result indicates that the graphic models provide better recommendation performance and are robust to fit for the real applications.
基金supported by the National Natural Science Foundation of China (31000591,31000587,31171266)
文摘The discovery of novel cancer genes is one of the main goals in cancer research.Bioinformatics methods can be used to accelerate cancer gene discovery,which may help in the understanding of cancer and the development of drug targets.In this paper,we describe a classifier to predict potential cancer genes that we have developed by integrating multiple biological evidence,including protein-protein interaction network properties,and sequence and functional features.We detected 55 features that were significantly different between cancer genes and non-cancer genes.Fourteen cancer-associated features were chosen to train the classifier.Four machine learning methods,logistic regression,support vector machines(SVMs),BayesNet and decision tree,were explored in the classifier models to distinguish cancer genes from non-cancer genes.The prediction power of the different models was evaluated by 5-fold cross-validation.The area under the receiver operating characteristic curve for logistic regression,SVM,Baysnet and J48 tree models was 0.834,0.740,0.800 and 0.782,respectively.Finally,the logistic regression classifier with multiple biological features was applied to the genes in the Entrez database,and 1976 cancer gene candidates were identified.We found that the integrated prediction model performed much better than the models based on the individual biological evidence,and the network and functional features had stronger powers than the sequence features in predicting cancer genes.
基金This study was funded by the National Key Research and Development Program of China(Grant No.2017YFC1104600)Recruitment Program of Leading Talents of Guangdong Province(Grant No.2016LJ06Y375).
文摘Anterior segment eye diseases account for a significant proportion of presentations to eye clinics worldwide,including diseases associated with corneal pathologies,anterior chamber abnormalities(e.g.blood or inflammation),and lens diseases.The construction of an automatic tool for segmentation of anterior segment eye lesions would greatly improve the efficiency of clinical care.With research on artificial intelligence progressing in recent years,deep learning models have shown their superiority in image classification and segmentation.The training and evaluation of deep learning models should be based on a large amount of data annotated with expertise;however,such data are relatively scarce in the domain of medicine.Herein,the authors developed a new medical image annotation system,called EyeHealer.It is a large-scale anterior eye segment dataset with both eye structures and lesions annotated at the pixel level.Comprehensive experiments were conducted to verify its performance in disease classification and eye lesion segmentation.The results showed that semantic segmentation models outperformed medical segmentation models.This paper describes the establishment of the system for automated classification and segmentation tasks.The dataset will be made publicly available to encourage future research in this area.
文摘The most prosperous and important eras in the history of communication between China and foreign countries were the Han, Tang and Ming dynasties. During the Han and Tang dynasties China’s foreign relations were largely confined to Asia. Apart from Japan and a few South Asian countries there was almost no maritime contact with the outside world by China. The
基金King Saud University through Researchers Supporting Project number(RSP-2021/387),King Saud University,Riyadh,Saudi Arabia.
文摘Daily newspapers publish a tremendous amount of information disseminated through the Internet.Freely available and easily accessible large online repositories are not indexed and are in an un-processable format.The major hindrance in developing and evaluating existing/new monolingual text in an image is that it is not linked and indexed.There is no method to reuse the online news images because of the unavailability of standardized benchmark corpora,especially for South Asian languages.The corpus is a vital resource for developing and evaluating text in an image to reuse local news systems in general and specifically for the Urdu language.Lack of indexing,primarily semantic indexing of the daily news items,makes news items impracticable for any querying.Moreover,the most straightforward search facility does not support these unindexed news resources.Our study addresses this gap by associating and marking the newspaper images with one of the widely spoken but under-resourced languages,i.e.,Urdu.The present work proposed a method to build a benchmark corpus of news in image form by introducing a web crawler.The corpus is then semantically linked and annotated with daily news items.Two techniques are proposed for image annotation,free annotation and fixed cross examination annotation.The second technique got higher accuracy.Build news ontology in protégéusing OntologyWeb Language(OWL)language and indexed the annotations under it.The application is also built and linked with protégéso that the readers and journalists have an interface to query the news items directly.Similarly,news items linked together will provide complete coverage and bring together different opinions at a single location for readers to do the analysis themselves.
文摘At present days,object detection and tracking concepts have gained more importance among researchers and business people.Presently,deep learning(DL)approaches have been used for object tracking as it increases the perfor-mance and speed of the tracking process.This paper presents a novel robust DL based object detection and tracking algorithm using Automated Image Anno-tation with ResNet based Faster regional convolutional neural network(R-CNN)named(AIA-FRCNN)model.The AIA-RFRCNN method performs image anno-tation using a Discriminative Correlation Filter(DCF)with Channel and Spatial Reliability tracker(CSR)called DCF-CSRT model.The AIA-RFRCNN model makes use of Faster RCNN as an object detector and tracker,which involves region proposal network(RPN)and Fast R-CNN.The RPN is a full convolution network that concurrently predicts the bounding box and score of different objects.The RPN is a trained model used for the generation of the high-quality region proposals,which are utilized by Fast R-CNN for detection process.Besides,Residual Network(ResNet 101)model is used as a shared convolutional neural network(CNN)for the generation of feature maps.The performance of the ResNet 101 model is further improved by the use of Adam optimizer,which tunes the hyperparameters namely learning rate,batch size,momentum,and weight decay.Finally,softmax layer is applied to classify the images.The performance of the AIA-RFRCNN method has been assessed using a benchmark dataset and a detailed comparative analysis of the results takes place.The outcome of the experiments indicated the superior characteristics of the AIA-RFRCNN model under diverse aspects.
文摘Every day,websites and personal archives create more and more photos.The size of these archives is immeasurable.The comfort of use of these huge digital image gatherings donates to their admiration.However,not all of these folders deliver relevant indexing information.From the outcomes,it is dif-ficult to discover data that the user can be absorbed in.Therefore,in order to determine the significance of the data,it is important to identify the contents in an informative manner.Image annotation can be one of the greatest problematic domains in multimedia research and computer vision.Hence,in this paper,Adap-tive Convolutional Deep Learning Model(ACDLM)is developed for automatic image annotation.Initially,the databases are collected from the open-source system which consists of some labelled images(for training phase)and some unlabeled images{Corel 5 K,MSRC v2}.After that,the images are sent to the pre-processing step such as colour space quantization and texture color class map.The pre-processed images are sent to the segmentation approach for efficient labelling technique using J-image segmentation(JSEG).Thefinal step is an auto-matic annotation using ACDLM which is a combination of Convolutional Neural Network(CNN)and Honey Badger Algorithm(HBA).Based on the proposed classifier,the unlabeled images are labelled.The proposed methodology is imple-mented in MATLAB and performance is evaluated by performance metrics such as accuracy,precision,recall and F1_Measure.With the assistance of the pro-posed methodology,the unlabeled images are labelled.