Abstract: Text classification, by automatically categorizing texts, is one of the foundational elements of natural language processing applications. This study investigates how text classification performance can be improved through the integration of entity-relation information obtained from Wikidata (Wikipedia database) and BERT-based pre-trained Named Entity Recognition (NER) models. Focusing on a significant challenge in the field of natural language processing (NLP), the research evaluates the potential of using entity and relational information to extract deeper meaning from texts. The methodology comprises text preprocessing, entity detection, and the integration of relational information. Experiments conducted on text datasets in both Turkish and English assess the performance of several classification algorithms: Support Vector Machine, Logistic Regression, Deep Neural Network, and Convolutional Neural Network. The results indicate that integrating entity-relation information can significantly enhance algorithm performance in text classification tasks and offers new perspectives for information extraction and semantic analysis in NLP applications. Contributions of this work include the use of distantly supervised entity-relation information in Turkish text classification, the development of a Turkish relational text classification approach, and the creation of a relational database. By demonstrating potential performance improvements through the integration of distantly supervised entity-relation information into Turkish text classification, this research aims to support the effectiveness of text-based artificial intelligence (AI) tools. Additionally, it contributes to the development of multilingual text classification systems by adding deeper meaning to text content, thereby providing a valuable addition to current NLP studies and a reference point for future research.
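The pipeline described in this abstract (entity detection, relation lookup, then classification on the enriched text) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `TOY_RELATIONS` table stands in for relations retrieved from Wikidata, the dictionary lookup in `detect_entities` stands in for a BERT-based NER model, and the `REL:` feature naming is hypothetical. The abstract does not specify how the authors encode relations as features.

```python
from collections import Counter

# Hypothetical stand-in for relations retrieved from a knowledge base such as Wikidata.
# Keys are entity surface forms; values are (relation, object) pairs.
TOY_RELATIONS = {
    "ankara": [("capital_of", "turkey")],
    "galatasaray": [("sport", "football"), ("country", "turkey")],
}


def detect_entities(tokens):
    """Toy entity detector: a dictionary lookup standing in for an NER model."""
    return [tok for tok in tokens if tok in TOY_RELATIONS]


def augment_with_relations(text):
    """Return bag-of-words counts for the text plus one feature per entity relation."""
    tokens = text.lower().split()
    features = Counter(tokens)
    for entity in detect_entities(tokens):
        for relation, obj in TOY_RELATIONS[entity]:
            # Relation features share the feature space with ordinary word counts.
            features[f"REL:{relation}={obj}"] += 1
    return features


features = augment_with_relations("Galatasaray won the match in Ankara")
print(sorted(name for name in features if name.startswith("REL:")))
```

The resulting feature counts could then be passed to any of the classifiers named in the abstract; the point of the sketch is only that relational knowledge becomes additional features alongside the words of the text.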
Abstract: As Natural Language Processing (NLP) continues to advance, driven by the emergence of sophisticated large language models such as ChatGPT, there has been a notable growth in research activity. This rapid uptake reflects increasing interest in the field and prompts critical inquiry into ChatGPT’s applicability in the NLP domain. This review paper systematically investigates the role of ChatGPT in diverse NLP tasks, including information extraction, Named Entity Recognition (NER), event extraction, relation extraction, Part of Speech (PoS) tagging, text classification, sentiment analysis, emotion recognition, and text annotation. The novelty of this work lies in its comprehensive analysis of the existing literature, addressing a critical gap in understanding ChatGPT’s adaptability, limitations, and optimal application. We employed a systematic stepwise approach following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) framework to direct our search process and identify relevant studies. Our review reveals ChatGPT’s significant potential in enhancing various NLP tasks. Its adaptability in information extraction, sentiment analysis, and text classification showcases its ability to comprehend diverse contexts and extract meaningful details. Additionally, ChatGPT’s flexibility in annotation tasks reduces manual effort and accelerates the annotation process, making it a valuable asset in NLP development and research. Furthermore, GPT-4 and prompt engineering emerge as complementary mechanisms, empowering users to guide the model and enhance overall accuracy. Despite this promising potential, challenges persist. The performance of ChatGPT needs to be tested using more extensive datasets and diverse data structures. Moreover, its limitations in handling domain-specific language and the need for fine-tuning in specific applications highlight the importance of further investigation to address these issues.
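Abstract: To address the low efficiency and high difficulty of identifying technical topics in patents, this article proposes FPC-Kmeans++ (Kmeans plus plus with feature phrase clusters), a method for patent clustering analysis and technical topic identification. The method uses feature phrases, rather than conventional word segmentation results, as the basis for patent data analysis. The article validates the method empirically on unmanned aerial vehicle (UAV) patents. The experimental results show that, compared with the conventional Kmeans++ (Kmeans plus plus) and LDAKmeans++ (Kmeans plus plus with Latent Dirichlet Allocation) methods, the proposed method determines the optimal number of topics more precisely and produces clusters with a more distinct hierarchy, demonstrating its advantage in patent topic identification. In addition, compared with the other algorithms evaluated, the proposed NER-FPP (Named Entity Recognition with Feature Phrase Probability) algorithm performs best at extracting patent feature phrases, achieving the highest F1 score of 93.36%.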
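For readers unfamiliar with the F1 score reported for feature phrase extraction, the following is a minimal sketch of how precision, recall, and F1 are commonly computed over extracted phrases. It assumes exact-match scoring over sets of phrases, which the abstract does not confirm, and the example phrases are hypothetical rather than taken from the article's data.

```python
def phrase_f1(predicted, gold):
    """Precision, recall, and F1 over sets of extracted phrases (exact match)."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical phrases, not taken from the article's data.
gold = ["flight control system", "rotor blade", "battery module", "obstacle avoidance"]
predicted = ["flight control system", "rotor blade", "battery module", "landing gear"]
print(phrase_f1(predicted, gold))
```

Because F1 is a harmonic mean, it is pulled toward the lower of precision and recall, so a high F1 requires both to be high.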
Funding: Supported by Foundation of Gansu Technology Committee (GKC-97-27-5); Youth Foundation of Tianshui Normal University (X4-25)
Abstract: [Objective] The aim was to study protein polymorphism in the blood of the Tibetan Mastiff and to provide a theoretical basis for the protection, development, and utilization of Tibetan Mastiff breeds. [Method] A total of 103 blood samples were taken from four populations: Hequ Tibetan Mastiff, Qinghai Tibetan Mastiff, Tibetan Spaniel, and native dogs of Qinghai. Seven blood protein loci (Tf, Po, Sα2, Hb, Alb, Pr and Amy) were investigated using vertical polyacrylamide gel electrophoresis with a discontinuous buffer system, and the genetic variation among the populations was then analyzed. [Result] Genetic variation was observed at Tf, Sα2 and Po in all four populations; the other loci were not polymorphic. There were three alleles at each of the Tf and Po loci and two alleles at the Sα2 locus. The effective number of alleles and Nei's average expected heterozygosity were 1.5324 and 0.2303, respectively, both higher in the Tibetan Mastiff than in the other populations. [Conclusion] Blood protein loci of the Tibetan Mastiff exhibit genetic variation.
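The two diversity statistics reported in this abstract can be sketched as follows. This uses the standard textbook definitions, with effective number of alleles 1/Σp² and expected heterozygosity 1 − Σp² for allele frequencies p at one locus, averaged over loci. The abstract does not state which estimator the authors applied; Nei's unbiased estimator adds a small-sample correction that is omitted here, and the frequencies below are hypothetical.

```python
def allele_diversity(frequencies):
    """Return (effective number of alleles, expected heterozygosity) for one locus."""
    if abs(sum(frequencies) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in frequencies)
    return 1.0 / homozygosity, 1.0 - homozygosity


def mean_diversity(loci):
    """Average both statistics over a list of loci (each a list of allele frequencies)."""
    results = [allele_diversity(freqs) for freqs in loci]
    n = len(results)
    return sum(r[0] for r in results) / n, sum(r[1] for r in results) / n


# Hypothetical frequencies: one polymorphic locus and one monomorphic locus.
print(mean_diversity([[0.5, 0.5], [1.0]]))
```

A monomorphic locus contributes an effective allele number of 1 and a heterozygosity of 0, which is why averages over a mix of polymorphic and monomorphic loci can be low even when some loci carry several alleles.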
Funding: Supported by the National 863 Project of the Tenth Five-Year Plan (2001AA241104, 2004AA241104); Key Breeding Project of Sichuan Province (200107001-16-01); Key Quality Project of Sichuan Province (200107001-1-7-4)
Abstract: Several factors influencing anther culture were studied in a preliminary experiment on anther culture of the restorers of new cytoplasmic male sterile (NER). The results were as follows: ① MS culture medium supplemented with 2,4-D 2 mg/L, 6-BA 0.5 mg/L and NAA 0.5 mg/L was the most suitable for callus induction of NER. ② The induction rate differed significantly between plant age groups. From day 110 to day 141, the induction rate increased with plant age, and the difference in induction rate was significant at the 0.01 level. The induction rate peaked on day 141 and then declined gradually. ③ The combined use of 2,4-D and 6-BA, with a moderate increase in 2,4-D, favored callus induction. ④ The green plantlet induction rate of NER increased when the concentration of 6-BA was raised from 2 mg/L to 4 mg/L. Adding ZT at 0.5 mg/L to 2 mg/L 6-BA led to a 2.47% increase in the green plantlet induction rate.