The method presented in this work is based on the fundamental concepts of Paraconsistent Annotated Logic with annotation of two values (PAL2v). PAL2v is a non-classical logic that admits contradiction, and in this paper we perform a study using a mathematical interpretation of its representative lattice. This study results in algorithms and equations that give an effective treatment to information signals representing situations found in uncertain knowledge databases. From the obtained equations, algorithms are elaborated for use in computational models of uncertainty-treatment systems. We present results obtained from analyses done with one of the algorithms that compose the paraconsistent analysis system for logical signals based on PAL2v. The paraconsistent reasoning system built according to the PAL2v methodology proves more effective than traditional approaches because it offers an appropriate treatment of contradictory information.
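As an illustration of the lattice interpretation mentioned above, the following is a minimal sketch assuming the usual PAL2v annotation (μ, λ) of favorable and unfavorable evidence, with the commonly cited degree of certainty Dc = μ − λ and degree of contradiction Dct = μ + λ − 1; the threshold and the state labels are illustrative assumptions, not the paper's actual algorithm.

```python
def pal2v_analyze(mu: float, lam: float, threshold: float = 0.5):
    """Classify a PAL2v annotation (mu, lam) on the representative lattice.

    mu  -- favorable evidence degree in [0, 1]
    lam -- unfavorable evidence degree in [0, 1]
    Returns the degree of certainty, degree of contradiction, and a
    coarse logical state (illustrative labels only).
    """
    dc = mu - lam            # degree of certainty
    dct = mu + lam - 1.0     # degree of contradiction
    if dc >= threshold:
        state = "true"
    elif dc <= -threshold:
        state = "false"
    elif dct >= threshold:
        state = "inconsistent"      # both evidence degrees high
    elif dct <= -threshold:
        state = "indeterminate"     # both evidence degrees low
    else:
        state = "uncertain"
    return dc, dct, state

# Example: strong favorable and strong unfavorable evidence -> contradiction
print(pal2v_analyze(0.9, 0.8))   # (0.1, 0.7, 'inconsistent')
```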
The success of deep transfer learning in fault diagnosis is attributed to the collection of high-quality labeled data from the source domain. However, in engineering scenarios, achieving such high-quality label annotation is difficult and expensive. Incorrect label annotation produces two negative effects: 1) the complex decision boundary of diagnosis models lowers the generalization performance on the target domain, and 2) the distribution of target domain samples becomes misaligned with the false-labeled samples. To overcome these negative effects, this article proposes a solution called the label recovery and trajectory designable network (LRTDN). LRTDN consists of three parts. First, a residual network with dual classifiers is built to learn features from cross-domain samples. Second, an annotation check module is constructed to generate a label anomaly indicator that can modify the abnormal labels of false-labeled samples in the source domain. With the training of relabeled samples, the complexity of the diagnosis model is reduced via semi-supervised learning. Third, adaptation trajectories are designed for sample distributions across domains. This ensures that the target domain samples are adapted only to the pure-labeled samples. LRTDN is verified by two case studies, in which the diagnosis knowledge of bearings is transferred across different working conditions as well as different yet related machines. The results show that LRTDN offers high diagnosis accuracy even in the presence of incorrect annotation.
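As a rough illustration of the annotation-check idea, the sketch below flags source-domain samples whose two classifier heads confidently agree on a class different from the given label and relabels them; the agreement rule, the confidence margin, and the relabeling step are assumptions made for illustration, not the LRTDN implementation.

```python
import numpy as np

def check_and_recover_labels(probs_a, probs_b, labels, margin=0.8):
    """Flag and relabel suspicious source-domain annotations.

    probs_a, probs_b -- (N, C) softmax outputs of the two classifier heads
    labels           -- (N,) given, possibly noisy, source labels
    margin           -- minimum confidence of both heads to override a label
    """
    pred_a = probs_a.argmax(axis=1)
    pred_b = probs_b.argmax(axis=1)
    conf = np.minimum(probs_a.max(axis=1), probs_b.max(axis=1))
    anomaly = (pred_a == pred_b) & (pred_a != labels) & (conf >= margin)
    recovered = np.where(anomaly, pred_a, labels)   # relabel flagged samples
    return anomaly, recovered

# Tiny example with three samples and two classes; the first label looks wrong.
pa = np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]])
pb = np.array([[0.95, 0.05], [0.1, 0.9], [0.4, 0.6]])
labels = np.array([1, 1, 0])
print(check_and_recover_labels(pa, pb, labels))
```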
We ascertain the modularity-like objective function whose optimization is equivalent to maximum likelihood in annotated networks. We demonstrate that this modularity-like objective function is a linear combination of modularity and conditional entropy. In contrast with statistical inference methods, in our method the influence of the metadata is adjustable; when its influence is strong enough, the metadata can be recovered. Conversely, when it is weak, the detection may correspond to another partition. Between the two, there is a transition. This paper provides a concept for expanding the scope of modularity methods.
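To make the stated linear combination concrete, a plausible form is sketched below, where Q(g) is Newman's modularity of a partition g, H(x | g) is the conditional entropy of the node metadata x given the partition, and γ ≥ 0 is the adjustable metadata weight; the notation and the sign convention are illustrative assumptions, not the paper's exact objective.

```latex
\tilde{Q}(g) \;=\; Q(g) \;-\; \gamma\, H(x \mid g),
\qquad
Q(g) \;=\; \frac{1}{2m}\sum_{ij}\left(A_{ij}-\frac{k_i k_j}{2m}\right)\delta(g_i, g_j)
```

Under this reading, a large γ forces the detected partition to agree with the metadata (low conditional entropy), while a small γ lets the modularity term dominate, which matches the transition behaviour described in the abstract.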
Brassica oleracea has been developed into many important crops, including cabbage, kale, cauliflower, and broccoli. The genome and gene annotation of cabbage (cultivar JZS), a representative morphotype of B. oleracea, have been widely used as a common reference in biological research. Although its genome assembly has been updated twice, the current gene annotation still lacks information on untranslated regions (UTRs) and alternative splicing (AS). Here, we constructed a high-quality gene annotation (JZSv3) using a full-length transcriptome acquired by nanopore sequencing, yielding a total of 59,452 genes and 75,684 transcripts. Additionally, we re-analyzed previously reported transcriptome data related to the development of different tissues and the cold response using JZSv3 as a reference, and found that 3,843 of 11,908 differentially expressed genes (DEGs) underwent AS during the development of different tissues and 309 of 903 cold-related genes underwent AS in response to cold stress. Meanwhile, we also identified many AS genes, including BolLHCB5 and BolHSP70, that displayed distinct expression patterns among variant transcripts of the same gene, highlighting the importance of JZSv3 as a pivotal reference for AS analysis. Overall, JZSv3 provides a valuable resource for exploring gene function, especially for obtaining a deeper understanding of AS regulation mechanisms.
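As a small example of how an annotation with isoform information such as JZSv3 can be queried, the sketch below counts transcripts per gene in a GTF file and reports genes with more than one annotated isoform; the file name is hypothetical and this is a generic GTF-parsing sketch, not the pipeline used in the paper.

```python
from collections import defaultdict

def alternatively_spliced_genes(gtf_path):
    """Count transcripts per gene in a GTF and return genes with >1 isoform."""
    isoforms = defaultdict(set)
    with open(gtf_path) as fh:
        for line in fh:
            if line.startswith("#"):
                continue
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 9 or fields[2] != "transcript":
                continue
            # Attribute column looks like: gene_id "X"; transcript_id "X.1";
            attrs = dict(
                item.strip().split(" ", 1)
                for item in fields[8].split(";")
                if item.strip()
            )
            gene = attrs.get("gene_id", "").strip('"')
            tx = attrs.get("transcript_id", "").strip('"')
            if gene and tx:
                isoforms[gene].add(tx)
    return {g: txs for g, txs in isoforms.items() if len(txs) > 1}

# Usage (path is hypothetical):
# as_genes = alternatively_spliced_genes("JZSv3.gtf")
# print(len(as_genes), "genes have more than one annotated transcript")
```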
●AIM: To investigate a pioneering framework for the segmentation of meibomian glands (MGs), using limited annotations to reduce the workload on ophthalmologists and enhance the efficiency of clinical diagnosis. ●METHODS: A total of 203 infrared meibomian images from 138 patients with dry eye disease, accompanied by corresponding annotations, were gathered for the study. A rectified scribble-supervised gland segmentation (RSSGS) model, incorporating temporal ensemble prediction, uncertainty estimation, and a transformation equivariance constraint, was introduced to address the constraints imposed by the limited supervision information inherent in scribble annotations. The viability and efficacy of the proposed model were assessed based on accuracy, intersection over union (IoU), and Dice coefficient. ●RESULTS: Using manual labels as the gold standard, RSSGS demonstrated an accuracy of 93.54%, a Dice coefficient of 78.02%, and an IoU of 64.18%. Notably, these performance metrics exceed the current weakly supervised state-of-the-art methods by 0.76%, 2.06%, and 2.69%, respectively. Furthermore, despite achieving a substantial 80% reduction in annotation costs, it lags behind fully annotated methods by only 0.72%, 1.51%, and 2.04%. ●CONCLUSION: An innovative automatic segmentation model is developed for MGs in infrared eyelid images, using scribble annotation for training. This model maintains an exceptionally high level of segmentation accuracy while substantially reducing training costs. It holds substantial utility for calculating clinical parameters, thereby greatly enhancing the diagnostic efficiency of ophthalmologists in evaluating meibomian gland dysfunction.
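For reference, the evaluation metrics quoted above (accuracy, IoU, Dice coefficient) can be computed from binary masks as in the minimal numpy sketch below; the variable names and the smoothing constant are illustrative and not taken from the RSSGS code.

```python
import numpy as np

def segmentation_metrics(pred, target, eps=1e-7):
    """Accuracy, IoU, and Dice coefficient for binary segmentation masks (0/1 arrays)."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    tp = np.logical_and(pred, target).sum()       # intersection
    union = np.logical_or(pred, target).sum()
    accuracy = (pred == target).mean()
    iou = tp / (union + eps)
    dice = 2 * tp / (pred.sum() + target.sum() + eps)
    return accuracy, iou, dice

# Example with two tiny masks.
pred = np.array([[1, 1, 0], [0, 1, 0]])
gt   = np.array([[1, 0, 0], [0, 1, 1]])
print(segmentation_metrics(pred, gt))
```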
As Natural Language Processing (NLP) continues to advance, driven by the emergence of sophisticated large language models such as ChatGPT, there has been notable growth in research activity. This rapid uptake reflects increasing interest in the field and prompts critical inquiries into ChatGPT's applicability in the NLP domain. This review paper systematically investigates the role of ChatGPT in diverse NLP tasks, including information extraction, Named Entity Recognition (NER), event extraction, relation extraction, Part of Speech (PoS) tagging, text classification, sentiment analysis, emotion recognition, and text annotation. The novelty of this work lies in its comprehensive analysis of the existing literature, addressing a critical gap in understanding ChatGPT's adaptability, limitations, and optimal application. In this paper, we employed a systematic stepwise approach following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) framework to direct our search process and identify relevant studies. Our review reveals ChatGPT's significant potential in enhancing various NLP tasks. Its adaptability in information extraction tasks, sentiment analysis, and text classification showcases its ability to comprehend diverse contexts and extract meaningful details. Additionally, ChatGPT's flexibility in annotation tasks reduces manual effort and accelerates the annotation process, making it a valuable asset in NLP development and research. Furthermore, GPT-4 and prompt engineering emerge as complementary mechanisms, empowering users to guide the model and enhance overall accuracy. Despite its promising potential, challenges persist. The performance of ChatGPT needs to be tested using more extensive datasets and diverse data structures. Its limitations in handling domain-specific language and the need for fine-tuning in specific applications highlight the importance of further investigation to address these issues.
Next-generation sequencing (NGS) technologies generate thousands to millions of genetic variants per sample. Identification of potential disease-causal variants is labor intensive, as it relies on filtering using various annotation metrics and consideration of multiple pathogenicity prediction scores. We have developed VPOT (variant prioritization ordering tool), a Python-based command line tool that allows researchers to create a single, fully customizable pathogenicity ranking score from any number of annotation values, each with a user-defined weighting. The use of VPOT can be informative when analyzing entire cohorts, as variants in a cohort can be prioritized. VPOT also provides additional functions to allow variant filtering based on a candidate gene list or by affected status in a family pedigree. VPOT outperforms similar tools in terms of efficacy, flexibility, scalability, and computational performance. VPOT is freely available for public use at GitHub (https://github.com/VCCRI/VPOT/). Documentation for installation, along with a user tutorial, a default parameter file, and test data, is provided.
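The weighted ranking idea can be illustrated as follows; this is a generic sketch of combining user-weighted annotation values into one priority score, not VPOT's actual scoring code, and the annotation names and weights are made up.

```python
def priority_score(variant, weights):
    """Combine annotation values into a single weighted priority score.

    variant -- dict mapping annotation name -> numeric value (missing -> 0)
    weights -- dict mapping annotation name -> user-defined weight
    """
    return sum(w * variant.get(name, 0.0) for name, w in weights.items())

# Hypothetical annotations and weights, for illustration only.
weights = {"CADD_phred": 1.0, "REVEL": 2.0, "rare_in_gnomAD": 0.5}
variants = [
    {"id": "var1", "CADD_phred": 25.1, "REVEL": 0.83, "rare_in_gnomAD": 1.0},
    {"id": "var2", "CADD_phred": 12.4, "REVEL": 0.10, "rare_in_gnomAD": 0.0},
]
ranked = sorted(variants, key=lambda v: priority_score(v, weights), reverse=True)
print([v["id"] for v in ranked])   # highest-priority variant first
```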
This paper aims to investigate the effectiveness of rubric-referenced student self-assessment (SSA) on students' English essay writing by employing a two-group pre-post quasi-experimental research design. The method was tested on 54 students at a Chinese university. During a 17-week experiment, the experimental group (EG) received the rubric and annotated samples, while the comparison group (CG) received only the rubric in self-assessment. Data sources included students' scores in the pre-test and post-test and interviews. Quantitative findings indicated that the EG made significantly stronger progress than the CG in the post-test. Interview results suggested that annotation-based rubric-referenced SSA can help students understand task requirements, initiate self-regulatory behaviors, and improve self-assessment confidence, although students still wanted to receive assistance from teachers, partly due to the Confucian-heritage culture settings in China. The findings are discussed in terms of the design features of sample annotations within the framework of self-regulated learning (SRL), as well as the implications of using this method in the classroom.
Automatic web image annotation is a practical and effective way to support both web image retrieval and image understanding. However, current annotation techniques make no further investigation of the statement-level syntactic correlation among annotated words, which makes it very difficult to render natural language interpretations for images such as "pandas eat bamboo". In this paper, we propose an approach to interpret image semantics by mining the visual and textual information hidden in images. This approach consists of two parts: first, the annotated words of target images are ranked according to two factors, namely visual correlation and pairwise co-occurrence; then the statement-level syntactic correlation among annotated words is explored and a natural language interpretation for the target image is obtained. Experiments conducted on real-world web images show the effectiveness of the proposed approach.
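The first stage, ranking annotated words by visual correlation and pairwise co-occurrence, can be sketched roughly as below; the linear combination and the weight alpha are assumptions for illustration and do not reproduce the paper's exact ranking formula.

```python
def rank_annotations(words, visual_corr, cooccur, alpha=0.6):
    """Rank candidate annotation words for one image.

    words       -- list of annotated words
    visual_corr -- dict word -> visual correlation score in [0, 1]
    cooccur     -- dict (word_a, word_b) -> co-occurrence count in the corpus
    alpha       -- illustrative trade-off between the two factors
    """
    def pair_score(w):
        others = [x for x in words if x != w]
        if not others:
            return 0.0
        total = sum(cooccur.get((w, o), 0) + cooccur.get((o, w), 0) for o in others)
        return total / len(others)

    # Normalize pairwise scores so both factors live on a comparable scale.
    pair = {w: pair_score(w) for w in words}
    max_pair = max(pair.values()) or 1.0
    score = {w: alpha * visual_corr.get(w, 0.0) + (1 - alpha) * pair[w] / max_pair
             for w in words}
    return sorted(words, key=score.get, reverse=True)

# Illustrative call with made-up scores.
print(rank_annotations(
    ["panda", "bamboo", "sky"],
    {"panda": 0.9, "bamboo": 0.8, "sky": 0.2},
    {("panda", "bamboo"): 12, ("panda", "sky"): 1},
))
```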
In this paper we address the problem of geometric multi-model fitting using a few weakly annotated data points, which has been little studied so far. In weak annotating (WA), most manual annotations are supposed to be correct yet are inevitably mixed with incorrect ones. Such WA data can naturally arise through interaction in various tasks. For example, in the case of homography estimation, one can easily annotate points on the same plane or object with a single label by observing the image. Motivated by this, we propose a novel method to make full use of WA data to boost multi-model fitting performance. Specifically, a graph for model proposal sampling is first constructed using the WA data, given the prior that WA data annotated with the same weak label have a high probability of belonging to the same model. By incorporating this prior knowledge into the calculation of edge probabilities, vertices (i.e., data points) lying on or near the latent model are likely to be associated and further form a subset or cluster for effective proposal generation. Having generated proposals, α-expansion is used for labeling, and our method in return updates the proposals. This procedure works in an iterative way. Extensive experiments validate our method and show that it produces noticeably better results than state-of-the-art techniques in most cases.
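One plausible way to fold the weak-label prior into edge probabilities is shown below; the exponential residual kernel and the same-label boost factor are assumptions made for illustration, not the paper's actual formulation.

```python
import numpy as np

def edge_probability(residual_i, residual_j, label_i, label_j,
                     sigma=1.0, same_label_boost=2.0):
    """Probability-like weight for the edge between two data points.

    residual_i/j -- point-to-hypothesis residuals (smaller = closer to a model)
    label_i/j    -- weak annotation labels, or None when a point is unannotated
    Points with small residuals get higher weights, and sharing a weak label
    multiplies the weight, so same-labeled points tend to cluster together.
    """
    base = np.exp(-(residual_i ** 2 + residual_j ** 2) / (2 * sigma ** 2))
    if label_i is not None and label_i == label_j:
        base *= same_label_boost
    return min(base, 1.0)

print(edge_probability(0.2, 0.3, "plane1", "plane1"))  # boosted, capped at 1.0
print(edge_probability(0.2, 0.3, "plane1", "plane2"))  # residual kernel only
```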
Daily newspapers publish a tremendous amount of information that is disseminated through the Internet. Freely available and easily accessible large online repositories are not indexed and are in an un-processable format. The major hindrance in developing and evaluating existing or new monolingual text-in-image systems is that the material is not linked and indexed. There is no method to reuse online news images because of the unavailability of standardized benchmark corpora, especially for South Asian languages. A corpus is a vital resource for developing and evaluating systems that reuse text-in-image local news in general, and specifically for the Urdu language. Lack of indexing, primarily semantic indexing, of daily news items makes them impracticable for querying. Moreover, even the most straightforward search facilities do not support these unindexed news resources. Our study addresses this gap by associating and marking newspaper images in one of the widely spoken but under-resourced languages, i.e., Urdu. The present work proposes a method to build a benchmark corpus of news in image form by introducing a web crawler. The corpus is then semantically linked and annotated with daily news items. Two techniques are proposed for image annotation: free annotation and fixed cross-examination annotation. The second technique achieved higher accuracy. A news ontology is built in Protégé using the Web Ontology Language (OWL), and the annotations are indexed under it. An application is also built and linked with Protégé so that readers and journalists have an interface to query the news items directly. Similarly, news items linked together provide complete coverage and bring together different opinions at a single location for readers to do the analysis themselves.
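A minimal sketch of semantically annotating one news image with RDF triples is shown below using the rdflib library; the namespace IRI, class, and property names are hypothetical placeholders, since the real ontology terms would come from the Protégé project described in the abstract.

```python
from rdflib import Graph, Literal, Namespace, RDF, URIRef

# Hypothetical namespace; the real IRIs would come from the news ontology.
NEWS = Namespace("http://example.org/urdu-news#")

g = Graph()
g.bind("news", NEWS)

image = URIRef("http://example.org/urdu-news/image/2021-05-01-0001")
g.add((image, RDF.type, NEWS.NewsImage))
g.add((image, NEWS.headline, Literal("Example headline", lang="ur")))  # placeholder text
g.add((image, NEWS.category, Literal("politics")))
g.add((image, NEWS.publishedBy, Literal("Daily Example")))

# Serialize the annotations so they can be loaded into the ontology store.
print(g.serialize(format="turtle"))
```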
At present, object detection and tracking concepts have gained importance among researchers and business people. Deep learning (DL) approaches have been used for object tracking, as they increase the performance and speed of the tracking process. This paper presents a novel, robust DL-based object detection and tracking algorithm using Automated Image Annotation with a ResNet-based Faster regional convolutional neural network (R-CNN), named the AIA-FRCNN model. The AIA-FRCNN method performs image annotation using a Discriminative Correlation Filter with Channel and Spatial Reliability tracker (DCF-CSRT) model. The AIA-FRCNN model makes use of Faster R-CNN as an object detector and tracker, which involves a region proposal network (RPN) and Fast R-CNN. The RPN is a fully convolutional network that concurrently predicts the bounding box and score of different objects. The RPN is a trained model used for the generation of high-quality region proposals, which are utilized by Fast R-CNN for the detection process. Besides, a Residual Network (ResNet 101) model is used as a shared convolutional neural network (CNN) for the generation of feature maps. The performance of the ResNet 101 model is further improved by the use of the Adam optimizer, which tunes the hyperparameters, namely learning rate, batch size, momentum, and weight decay. Finally, a softmax layer is applied to classify the images. The performance of the AIA-FRCNN method has been assessed using a benchmark dataset, and a detailed comparative analysis of the results is carried out. The outcome of the experiments indicated the superior characteristics of the AIA-FRCNN model under diverse aspects.
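For orientation, a minimal Faster R-CNN inference setup with an Adam optimizer is sketched below using torchvision (assuming version 0.13 or newer); torchvision ships a ResNet-50 FPN backbone rather than the ResNet-101 used in the paper, so this is a stand-in illustration, not the AIA-FRCNN implementation.

```python
import torch
import torchvision

# Pretrained Faster R-CNN with a ResNet-50 FPN backbone (stand-in for ResNet-101).
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

# Adam optimizer over the trainable parameters, as the abstract describes.
optimizer = torch.optim.Adam(
    [p for p in model.parameters() if p.requires_grad], lr=1e-4, weight_decay=1e-4
)

# Run detection on one dummy image (3 x H x W tensor with values in [0, 1]).
image = torch.rand(3, 480, 640)
with torch.no_grad():
    predictions = model([image])[0]
print(predictions["boxes"].shape, predictions["labels"][:5], predictions["scores"][:5])
```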
Every day, websites and personal archives create more and more photos, and the size of these archives is immense. The ease of use of these huge digital image collections contributes to their popularity. However, not all of these collections provide relevant indexing information, and as a result it is difficult to discover the data a user may be interested in. Therefore, in order to determine the significance of the data, it is important to identify the contents in an informative manner. Image annotation is one of the most challenging domains in multimedia research and computer vision. Hence, in this paper, an Adaptive Convolutional Deep Learning Model (ACDLM) is developed for automatic image annotation. Initially, the databases are collected from open-source systems and consist of some labelled images (for the training phase) and some unlabeled images (Corel 5K, MSRC v2). After that, the images are sent to pre-processing steps such as colour space quantization and texture colour class mapping. The pre-processed images are sent to the segmentation approach for an efficient labelling technique using J-image segmentation (JSEG). The final step is automatic annotation using ACDLM, which is a combination of a Convolutional Neural Network (CNN) and the Honey Badger Algorithm (HBA). Based on the proposed classifier, the unlabeled images are labelled. The proposed methodology is implemented in MATLAB, and performance is evaluated by metrics such as accuracy, precision, recall, and F1-measure.
In machine learning, sentiment analysis is a technique for finding and analyzing the sentiments hidden in text. For sentiment analysis, annotated data is a basic requirement. Generally, this data is annotated manually. Manual annotation is a time-consuming, costly, and laborious process. To overcome these resource constraints, this research proposes a fully automated annotation technique for aspect-level sentiment analysis. The dataset is created from the reviews of the ten most popular songs on YouTube. Reviews for five aspects (voice, video, music, lyrics, and song) are extracted. An N-gram based technique is proposed. The complete dataset consists of 369,436 reviews, which took 173.53 s to annotate using the proposed technique, whereas this dataset might have taken approximately 2.07 million seconds (575 h) if annotated manually. For validation of the proposed technique, a sub-dataset, Voice, is annotated manually as well as with the proposed technique. Cohen's Kappa statistic is used to evaluate the degree of agreement between the two annotations. The high Kappa value (0.9571) shows a high level of agreement between the two. This validates that the quality of annotation of the proposed technique is as good as manual annotation, even with far less computational cost. This research also contributes by consolidating the guidelines for the manual annotation process.
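The agreement check described above can be reproduced with scikit-learn's built-in Cohen's Kappa; the label lists below are made-up placeholders standing in for the manual and automatic annotations of the Voice sub-dataset.

```python
from sklearn.metrics import cohen_kappa_score

# Placeholder aspect-level sentiment labels for the same set of reviews,
# one list from manual annotation and one from the automatic N-gram technique.
manual    = ["pos", "pos", "neg", "neu", "pos", "neg", "neu", "pos"]
automatic = ["pos", "pos", "neg", "neu", "pos", "neg", "pos", "pos"]

kappa = cohen_kappa_score(manual, automatic)
print(f"Cohen's Kappa: {kappa:.4f}")
```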
To deal with issues such as overly simple image features and word noise interference in product image sentence annotation, a product image sentence annotation model focusing on image feature learning and key word summarization is described. Three kernel descriptors, namely gradient, shape, and color, are extracted. Feature late-fusion is then executed by the multiple kernel learning model to obtain more discriminant image features. The absolute rank and relative rank of the tag-rank model are used to boost the key words' weights. A new word integration algorithm named word sequence blocks building (WSBB) is designed to create N-gram word sequences. Sentences are generated according to the N-gram word sequences and predefined templates. Experimental results show that both the BLEU-1 and BLEU-2 scores of the generated sentences are superior to those of the state-of-the-art baselines.
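The BLEU-1 and BLEU-2 scores used for evaluation can be computed with NLTK as in the sketch below; the reference and generated sentences are made-up placeholders, not examples from the paper's dataset.

```python
from nltk.translate.bleu_score import sentence_bleu

# Made-up reference and generated sentences for one product image.
reference = [["red", "leather", "handbag", "with", "gold", "buckle"]]
candidate = ["red", "handbag", "with", "gold", "buckle"]

# BLEU-1 uses unigram precision only; BLEU-2 is the cumulative 2-gram score.
bleu1 = sentence_bleu(reference, candidate, weights=(1.0, 0, 0, 0))
bleu2 = sentence_bleu(reference, candidate, weights=(0.5, 0.5, 0, 0))
print(f"BLEU-1 = {bleu1:.3f}, BLEU-2 = {bleu2:.3f}")
```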
Du Yu's Annotation of Classics and Elucidations in "Spring and Autumn Annals" is considered a milestone in the study of the Spring and Autumn Annals. In Du's book, the Preface to the Spring and Autumn Annals is a theoretical summary and revision of the previous studies on the Spring and Autumn Annals and the Tso Chuan. Its combination of classics and elucidations makes the Tso Chuan a genuine elucidating study of the Spring and Autumn Annals. Du expounds his philosophy and political ideas in this book.