As Natural Language Processing (NLP) continues to advance, driven by the emergence of sophisticated large language models such as ChatGPT, there has been a notable growth in research activity. This rapid uptake reflects increasing interest in the field and prompts critical inquiries into ChatGPT's applicability in the NLP domain. This review paper systematically investigates the role of ChatGPT in diverse NLP tasks, including information extraction, Named Entity Recognition (NER), event extraction, relation extraction, Part-of-Speech (PoS) tagging, text classification, sentiment analysis, emotion recognition, and text annotation. The novelty of this work lies in its comprehensive analysis of the existing literature, addressing a critical gap in understanding ChatGPT's adaptability, limitations, and optimal application. We employed a systematic stepwise approach following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) framework to direct our search process and identify relevant studies. Our review reveals ChatGPT's significant potential for enhancing various NLP tasks. Its adaptability in information extraction, sentiment analysis, and text classification showcases its ability to comprehend diverse contexts and extract meaningful details. Additionally, ChatGPT's flexibility in annotation tasks reduces manual effort and accelerates the annotation process, making it a valuable asset in NLP development and research. Furthermore, GPT-4 and prompt engineering emerge as complementary mechanisms, empowering users to guide the model and enhance overall accuracy. Despite this promising potential, challenges persist. The performance of ChatGPT needs to be tested on more extensive datasets and diverse data structures, and its limitations in handling domain-specific language and the need for fine-tuning in specific applications highlight the importance of further investigation to address these issues.
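To make the prompt-driven usage surveyed above concrete, the sketch below sends a minimal zero-shot NER prompt to an OpenAI chat model; the model name, prompt wording, and `extract_entities` helper are illustrative assumptions rather than a setup taken from the reviewed studies.

```python
# Minimal zero-shot NER via prompting (illustrative sketch; model name and prompt are assumptions).
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def extract_entities(text: str) -> str:
    """Ask a chat model to tag PERSON/ORGANIZATION/LOCATION entities and return its raw reply."""
    prompt = (
        "Extract all PERSON, ORGANIZATION and LOCATION entities from the text below.\n"
        "Return one entity per line as <type>: <surface form>.\n\n" + text
    )
    response = client.chat.completions.create(
        model="gpt-4",                      # assumed model identifier
        messages=[{"role": "user", "content": prompt}],
        temperature=0,                      # deterministic output suits extraction tasks
    )
    return response.choices[0].message.content

print(extract_entities("Tim Cook announced new products at Apple's event in Cupertino."))
```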
Sentence classification is the process of categorizing a sentence based on its context. Sentence categorization requires more semantic cues than tasks such as dependency parsing, which rely more on syntactic elements. Most existing strategies focus on the general semantics of a conversation without involving the context of the sentence, recognizing its progress, or comparing impacts. An ensemble of pre-trained language models is used here to classify sentences from a conversation corpus. The conversational sentences are classified into four categories: information, question, directive, and commission. These label sequences are used for analyzing conversation progress and predicting the pecking order of the conversation. An ensemble of Bidirectional Encoder Representations from Transformers (BERT), Robustly Optimized BERT Pretraining Approach (RoBERTa), Generative Pre-trained Transformer (GPT), DistilBERT, and Generalized Autoregressive Pretraining for Language Understanding (XLNet) models is trained on the conversation corpus, and hyperparameter tuning is carried out for better sentence-classification performance. This Ensemble of Pre-trained Language Models with Hyperparameter Tuning (EPLM-HT) system is trained on an annotated conversation dataset. The proposed approach outperforms the base BERT, GPT, DistilBERT, and XLNet transformer models, and the ensemble model with fine-tuned parameters achieves an F1 score of 0.88.
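As a sketch of the ensembling mechanism described above, the snippet below averages the softmax outputs of several Hugging Face sequence classifiers over the four dialogue-act labels; the checkpoint names and label set are assumptions, and in practice each checkpoint would first be fine-tuned on the conversation corpus.

```python
# Soft-voting ensemble over transformer sentence classifiers (a sketch, not the EPLM-HT system).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

LABELS = ["information", "question", "directive", "commission"]
# Base checkpoints used as placeholders; the real system would load fine-tuned versions.
CHECKPOINTS = ["bert-base-uncased", "roberta-base", "distilbert-base-uncased"]

def ensemble_predict(sentence: str) -> str:
    probs = torch.zeros(len(LABELS))
    for name in CHECKPOINTS:
        tok = AutoTokenizer.from_pretrained(name)
        model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=len(LABELS))
        inputs = tok(sentence, return_tensors="pt", truncation=True)
        with torch.no_grad():
            logits = model(**inputs).logits
        probs += torch.softmax(logits, dim=-1).squeeze(0)   # soft voting: sum class probabilities
    return LABELS[int(torch.argmax(probs))]

print(ensemble_predict("Could you send me the meeting agenda?"))
```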
Sign language recognition is vital for enhancing communication accessibility for the Deaf and hard-of-hearing communities. In Japan, approximately 360,000 individuals with hearing and speech disabilities rely on Japanese Sign Language (JSL) for communication. However, existing JSL recognition systems have faced significant performance limitations due to inherent complexities. In response to these challenges, we present a novel JSL recognition system that employs a strategic fusion approach, combining joint skeleton-based handcrafted features and pixel-based deep learning features. Our system incorporates two distinct streams: the first stream extracts crucial handcrafted features, emphasizing the capture of hand and body movements within JSL gestures, while a deep learning-based transfer learning stream captures hierarchical representations of JSL gestures in the second stream. We then concatenate the critical information of the first stream with the hierarchical features of the second stream to produce multi-level fusion features, aiming to create a comprehensive representation of the JSL gestures. After reducing the dimensionality of the features, a feature selection approach and a kernel-based support vector machine (SVM) were used for classification. To assess the effectiveness of our approach, we conducted extensive experiments on our Lab JSL dataset and a publicly available Arabic Sign Language (ArSL) dataset. Our results demonstrate that our fusion approach significantly enhances JSL recognition accuracy and robustness compared to individual feature sets or traditional recognition methods.
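A minimal sketch of the fusion-then-classify stage follows, assuming placeholder feature matrices in place of the actual skeleton and transfer-learning streams: the two streams are concatenated, reduced by feature selection, and fed to a kernel SVM.

```python
# Two-stream feature fusion followed by selection and a kernel SVM (synthetic placeholders).
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 200
handcrafted = rng.normal(size=(n, 60))    # stream 1: joint-angle / motion statistics (placeholder)
deep = rng.normal(size=(n, 512))          # stream 2: transfer-learning embeddings (placeholder)
y = rng.integers(0, 10, size=n)           # 10 gesture classes (placeholder labels)

fused = np.hstack([handcrafted, deep])    # multi-level fusion by concatenation

clf = make_pipeline(
    StandardScaler(),
    SelectKBest(f_classif, k=128),        # feature selection after fusion
    SVC(kernel="rbf", C=10.0),            # kernel-based SVM classifier
)
clf.fit(fused[:150], y[:150])
print("held-out accuracy:", clf.score(fused[150:], y[150:]))
```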
Software project outcomes depend heavily on natural language requirements, which often cause diverse interpretations and issues such as ambiguities and incomplete or faulty requirements. Researchers are exploring machine learning to predict software bugs, but a more precise and general approach is needed. Accurate bug prediction is crucial for software evolution and user training, prompting an investigation into deep and ensemble learning methods. However, these studies do not generalize well or remain efficient when extended to other datasets. This paper therefore proposes a hybrid approach combining multiple techniques to explore their effectiveness on bug identification problems. The methods involve feature selection, which reduces the dimensionality and redundancy of features and selects only the relevant ones; transfer learning, which trains and tests the model on different datasets to analyze how much of the learning transfers to other datasets; and an ensemble method, which explores the increase in performance obtained by combining multiple classifiers in one model. Four National Aeronautics and Space Administration (NASA) and four Promise datasets are used in the study, which shows an increase in model performance through better Area Under the Receiver Operating Characteristic Curve (AUC-ROC) values when different classifiers are combined. The results reveal that an amalgam of techniques such as those used in this study, namely feature selection, transfer learning, and ensemble methods, helps optimize software bug prediction models and provides a high-performing, useful end model.
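The sketch below mirrors the hybrid recipe on synthetic data standing in for the NASA/Promise sets: select features, train a soft-voting ensemble on one "project", and score it on another with AUC-ROC; the specific classifiers and feature counts are assumptions, not the paper's configuration.

```python
# Feature selection + cross-project evaluation + soft-voting ensemble, scored with AUC-ROC.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline

# Two synthetic "projects": train on one, test on the other (transfer-style evaluation).
X_src, y_src = make_classification(n_samples=400, n_features=40, n_informative=10, random_state=1)
X_tgt, y_tgt = make_classification(n_samples=400, n_features=40, n_informative=10, random_state=2)

ensemble = VotingClassifier(
    estimators=[("lr", LogisticRegression(max_iter=1000)),
                ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
                ("nb", GaussianNB())],
    voting="soft",                                   # combine predicted probabilities
)
model = make_pipeline(SelectKBest(mutual_info_classif, k=15), ensemble)
model.fit(X_src, y_src)
auc = roc_auc_score(y_tgt, model.predict_proba(X_tgt)[:, 1])
print(f"cross-project AUC-ROC: {auc:.3f}")
```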
In response to the challenges of generating Attribute-Based Access Control (ABAC) policies, this paper proposes a deep learning-based method to automatically generate ABAC policies from natural language documents. The method is aimed at organizations such as companies and schools that are transitioning from traditional access control models to the ABAC model. The manual retrieval and analysis involved in this transition are inefficient, error-prone, and costly. Most organizations have high-level specifications defined for security policies that include a set of access control policies, which often exist in the form of natural language documents. Utilizing this rich source of information, our method effectively identifies and extracts the attributes and rules needed for access control from natural language documents, thereby constructing and optimizing access control policies. This work transforms the problem of automated policy generation into two tasks: extraction of access control statements and mining of access control attributes. First, the Chat General Language Model (ChatGLM) is employed to extract access control-related statements from a wide range of natural language documents by constructing unique prompts and leveraging the model's in-context learning to contextualize the statements. Then, the Iterated Dilated Convolutional Neural Network-Conditional Random Field (ID-CNN-CRF) model is used to annotate access control attributes within these extracted statements, including subject attributes, object attributes, and action attributes, thus reassembling new access control policies. Experimental results show that our method achieved the highest F1 score of 0.961 compared to baseline methods, confirming the model's effectiveness and accuracy.
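As an illustration of the final reassembly step, the sketch below turns a sequence of (token, attribute-label) pairs, such as those an ID-CNN-CRF tagger might emit, into a structured ABAC rule; the label names and rule format are assumptions, not the paper's schema.

```python
# Assemble an ABAC rule from tagged access-control attributes (illustrative labels and format).
from dataclasses import dataclass, field

@dataclass
class ABACRule:
    subject_attrs: dict = field(default_factory=dict)
    object_attrs: dict = field(default_factory=dict)
    action: str = ""

def build_rule(tagged_tokens: list[tuple[str, str]]) -> ABACRule:
    """tagged_tokens: (token, label) pairs such as ('doctor', 'SUBJECT_ROLE')."""
    rule = ABACRule()
    for token, label in tagged_tokens:
        if label.startswith("SUBJECT_"):
            rule.subject_attrs[label.removeprefix("SUBJECT_").lower()] = token
        elif label.startswith("OBJECT_"):
            rule.object_attrs[label.removeprefix("OBJECT_").lower()] = token
        elif label == "ACTION":
            rule.action = token
    return rule

# "A doctor may read patient records in the cardiology department."
print(build_rule([("doctor", "SUBJECT_ROLE"), ("read", "ACTION"),
                  ("patient records", "OBJECT_TYPE"), ("cardiology", "OBJECT_DEPARTMENT")]))
```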
Sentiment analysis, a subfield of Natural Language Processing (NLP), attempts to analyze and identify the sentiments in opinionated text data. People share their judgments, reactions, and feedback on the internet using various languages. Urdu is one of them, and it is frequently used worldwide. Urdu-speaking people prefer to communicate on social media in Roman Urdu (RU), an English scripting style for the Urdu language dialect. Researchers have developed versatile lexical resources for feature-rich, comprehensive languages, but limited linguistic resources are available to facilitate the sentiment classification of Roman Urdu. This effort encompasses extracting subjective expressions in Roman Urdu and determining the polarity of the implied opinionated text. The primary sources of the dataset are Daraz (an e-commerce platform), Google Maps, and manual effort. The contributions of this study include a Bilingual Roman Urdu Language Detector (BRULD) and a Roman Urdu Spelling Checker (RUSC). These integrated modules accept the user input, detect the text language, correct the spellings, categorize the sentiments, and return the input sentence's orientation with a sentiment intensity score. The developed system gradually gains strength with each input experience. The results show that the language detector achieves an accuracy of 97.1% on a closed-domain dataset, with an overall sentiment classification accuracy of 94.3%.
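The sketch below chains the described stages, language detection, spelling correction, and polarity scoring with an intensity score, using toy placeholder implementations; it illustrates the module interfaces only and is not the BRULD/RUSC system.

```python
# Toy pipeline: detect language, correct spelling, then score sentiment (placeholder modules).
LEXICON = {"acha": 1.0, "zabardast": 2.0, "bekar": -1.5, "ganda": -2.0}   # tiny polarity lexicon (assumed scores)
CORRECTIONS = {"achaa": "acha", "zbrdst": "zabardast"}                     # toy spelling normalizations

def detect_language(text: str) -> str:
    # Placeholder detector: treat text containing known Roman Urdu words as Roman Urdu.
    words = set(text.lower().split())
    return "roman_urdu" if words & (set(LEXICON) | set(CORRECTIONS)) else "english"

def correct_spelling(text: str) -> str:
    return " ".join(CORRECTIONS.get(w, w) for w in text.lower().split())

def sentiment(text: str) -> tuple[str, float]:
    score = sum(LEXICON.get(w, 0.0) for w in text.split())
    return ("positive" if score > 0 else "negative" if score < 0 else "neutral", score)

sentence = "yeh product zbrdst hai"
if detect_language(sentence) == "roman_urdu":
    print(sentiment(correct_spelling(sentence)))   # -> ('positive', 2.0)
```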
The exponential growth of the literature is constraining researchers' access to comprehensive information in related fields. While natural language processing (NLP) may offer an effective solution to literature classification, it remains hindered by the lack of labelled datasets. In this article, we introduce a novel method for generating literature classification models through semi-supervised learning, which can generate a labelled dataset iteratively with limited human input. We apply this method to train NLP models for classifying literature related to several research directions, i.e., battery, superconductor, topological material, and artificial intelligence (AI) in materials science. The trained NLP 'battery' model, applied to a larger dataset different from the training and testing datasets, achieves an F1 score of 0.738, which indicates the accuracy and reliability of this scheme. Furthermore, our approach demonstrates that, even with insufficient data, the not-yet-well-trained model in the first few cycles can identify the relationships among different research fields and facilitate the discovery and understanding of interdisciplinary directions.
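A self-training loop in the spirit of this iterative labelling scheme is sketched below: each cycle pseudo-labels the most confident unlabelled texts and retrains. TF-IDF with logistic regression and the 0.55 confidence threshold stand in for the paper's actual model and settings.

```python
# Iterative pseudo-labelling (self-training) with a simple text classifier (illustrative sketch).
import numpy as np
from scipy.sparse import vstack as sp_vstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

labelled_texts = ["lithium ion cathode capacity", "transformer language model training"]
labels = np.array([1, 0])                      # 1 = battery-related, 0 = other (toy seed labels)
unlabelled = ["solid state electrolyte cycling", "image classification benchmark",
              "anode degradation study", "reinforcement learning agent"]

vec = TfidfVectorizer().fit(labelled_texts + unlabelled)
X_lab, X_unlab = vec.transform(labelled_texts), vec.transform(unlabelled)

for cycle in range(3):                         # a few labelling cycles with no extra human input
    clf = LogisticRegression().fit(X_lab, labels)
    if X_unlab.shape[0] == 0:
        break
    proba = clf.predict_proba(X_unlab)
    confident = np.where(proba.max(axis=1) > 0.55)[0]   # pseudo-label only confident predictions
    if confident.size == 0:
        break
    X_lab = sp_vstack([X_lab, X_unlab[confident]])
    labels = np.concatenate([labels, proba[confident].argmax(axis=1)])
    remaining = np.setdiff1d(np.arange(X_unlab.shape[0]), confident)
    X_unlab = X_unlab[remaining]

print("labelled examples after self-training:", X_lab.shape[0])
```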
Objective: This study aimed to evaluate and compare the effectiveness of knowledge base-optimized and unoptimized large language models (LLMs) in the field of orthopedics, in order to explore optimization strategies for the application of LLMs in specific fields. Methods: This research constructed a specialized knowledge base using clinical guidelines from the American Academy of Orthopaedic Surgeons (AAOS) and authoritative orthopedic publications. A total of 30 orthopedic-related questions covering aspects such as anatomical knowledge, disease diagnosis, fracture classification, treatment options, and surgical techniques were input into both the knowledge base-optimized and unoptimized versions of GPT-4, ChatGLM, and Spark LLM, and their generated responses were recorded. The overall quality, accuracy, and comprehensiveness of these responses were evaluated by three experienced orthopedic surgeons. Results: Compared with their unoptimized counterparts, the optimized version of GPT-4 showed improvements of 15.3% in overall quality, 12.5% in accuracy, and 12.8% in comprehensiveness; ChatGLM showed improvements of 24.8%, 16.1%, and 19.6%, respectively; and Spark LLM showed improvements of 6.5%, 14.5%, and 24.7%, respectively. Conclusion: Knowledge base optimization significantly enhances the quality, accuracy, and comprehensiveness of the responses provided by the three models in the orthopedic field and is therefore an effective method for improving the performance of LLMs in specific fields.
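One generic way to realize this kind of knowledge-base optimization is retrieval-augmented prompting: retrieve the most relevant knowledge-base passages and prepend them to the question before querying the model. The sketch below illustrates that pattern with a TF-IDF retriever and a placeholder `ask_llm` call; it is a generic sketch, not the study's actual pipeline, and the knowledge-base entries are illustrative.

```python
# Retrieval-augmented prompting over a small domain knowledge base (illustrative sketch).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

KNOWLEDGE_BASE = [
    "The Garden classification grades femoral neck fractures from I (incomplete) to IV (fully displaced).",
    "Displaced femoral neck fractures in elderly patients are usually treated with arthroplasty.",
    "The rotator cuff consists of the supraspinatus, infraspinatus, teres minor and subscapularis.",
]

def retrieve(question: str, k: int = 2) -> list[str]:
    vec = TfidfVectorizer().fit(KNOWLEDGE_BASE + [question])
    sims = cosine_similarity(vec.transform([question]), vec.transform(KNOWLEDGE_BASE))[0]
    return [KNOWLEDGE_BASE[i] for i in sims.argsort()[::-1][:k]]

def ask_llm(prompt: str) -> str:
    # Placeholder for a GPT-4 / ChatGLM / Spark LLM call.
    return f"[model answer conditioned on a prompt of {len(prompt)} characters]"

question = "How are femoral neck fractures classified?"
context = "\n".join(retrieve(question))
print(ask_llm(f"Answer using the guideline excerpts below.\n{context}\n\nQuestion: {question}"))
```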
This letter evaluates the article by Gravina et al on ChatGPT's potential for providing medical information to inflammatory bowel disease patients. While promising, it highlights the need for advanced techniques such as reasoning + action and retrieval-augmented generation to improve accuracy and reliability. Emphasizing that simple question-and-answer testing is insufficient, it calls for more nuanced evaluation methods to truly gauge large language models' capabilities in clinical applications.
Foreign language teaching practice is developing rapidly, but research on foreign language teacher learning is currently relatively fragmented and unstructured. The book Foreign Language Teacher Learning, written by Professor Kang Yan of Capital Normal University and published in September 2022, provides a systematic introduction to foreign language teacher learning, which to some extent makes up for this shortcoming. The book presents the lineage of foreign language teacher learning research at home and abroad, analyzes both theoretical and practical aspects, reviews cutting-edge research results, and foresees future development trends, painting a complete research picture for researchers in the field of foreign language teaching and teacher education as well as front-line teachers interested in foreign language teacher learning. This is an important inspiration for conducting foreign language teacher learning research in the future. This paper reviews the book in terms of its content, major characteristics, contributions, and limitations.
The problematic use of social media has numerous negative impacts on individuals' daily lives, interpersonal relationships, physical and mental health, and more. Currently, there are few methods and tools to alleviate problematic social media use, and their potential is yet to be fully realized. Emerging large language models (LLMs) are becoming increasingly popular for providing information and assistance to people and are being applied in many aspects of life. In mitigating problematic social media use, LLMs such as ChatGPT can play a positive role by serving as conversational partners and outlets for users, providing personalized information and resources, monitoring and intervening in problematic social media use, and more. In this process, we should recognize both the enormous potential and endless possibilities of LLMs such as ChatGPT, leveraging their advantages to better address problematic social media use, while also acknowledging the limitations and potential pitfalls of ChatGPT technology, such as errors, limitations in issue resolution, privacy and security concerns, and potential overreliance. When we leverage the advantages of LLMs to address issues in social media usage, we must adopt a cautious and ethical approach, remaining vigilant of the potential adverse effects that LLMs may have in addressing problematic social media use, so as to better harness technology to serve individuals and society.
In this presentation, the author explores strategies and activities that create a welcoming and nurturing environment for LGBTQIA+ foreign language students at the college level in the United States. Indeed, as is commonly stated, there is a clear lack of visibility, acceptance/tolerance, and inclusion of LGBTQIA+ voices (active and proactive) in foreign language teaching and learning, in pedagogical material as well as in in-class attitudes and delivery. The reasons for this deficiency are diverse: they cover not only outward discrimination against and/or indifference (passive-aggressive behavior) toward the LGBTQIA+ community and its plight, but also the sheer physical lack of quality material that can be used to create, present, and foster a welcoming and all-encompassing classroom environment, one that addresses the diverse spectrum of the LGBTQIA+ community while its members try to learn a foreign language and (positively) negotiate how to express their very identity in their new language of choice. Moreover, the scant material that is available in the United States, as well as in other Western countries where the LGBTQIA+ community is by and large accepted, at least from the legal point of view, hardly represents and covers the subject matter in a satisfactory way. To address this gap, the author provides some examples in European Portuguese (EP) and Italian (IT), followed by a courtesy translation in English, to illustrate positive activities and strategies for including LGBTQIA+-related subject matter in the foreign language classroom at the college level. It is his desire that instructors of other foreign languages duplicate in their respective vernaculars the strategies and activities presented here, in order to create a welcoming and safe classroom environment that embraces the LGBTQIA+ community as it goes through the enjoyable journey of learning a new language and culture.
Continuous development of technology provides an opportunity to incorporate feedback into online assessments. The shift to online instruction during the pandemic was the most significant change required for continuity, as technology enabled every teacher and student to enter a virtual classroom and make sense of education. Feedback is part of language instruction and is a powerful key to improving students' learning performance; it plays an influential and crucial role in teaching and learning. Feedback is an invaluable learning tool that helps learners avoid repeating the same errors and creates impetus. Thus, knowing about formative exam feedback is students' right, because quality feedback appeals to them. Given students' eagerness, providing feedback is considered a good practice to be followed by all teaching faculty. With respect to online feedback, the present study examines how pedagogical agents provide online feedback in language assessments. The study also considers the characteristics of pedagogical conversational agents that are suitable for providing feedback in online language assessment. In short, the study finds that screen agents play an essential role in students' motivation and in the acceptability of learning through feedback.
This study examines the writing abilities of Iranian intermediate Korean learners, specifically their performance in the compositional writing section of the Test of Proficiency in Korean (TOPIK). Using the many-facet Rasch model (FACETS), we analyzed nine writing samples from the 52nd TOPIK. These samples were evaluated using a modified rubric ranging from 0 to 3, based on predefined criteria for written composition. The results underscored that the sections on“appropriateness of spacing and spelling,”“relevance of vocabulary,”and“content diversity”presented the most significant challenges for the learners. On the other hand,“the quantity of writing”emerged as the least challenging aspect. These findings reveal substantial disparities in various aspects of writing proficiency among learners. The study not only pinpoints problem areas in writing skills but also underscores the necessity of customized teaching strategies within the TOPIK framework to address these weaknesses. Consequently, it offers valuable insights that could bolster the effectiveness of teaching writing to Korean language learners. The findings are significant for the field of language education, contribute to a deeper understanding of the challenges faced by intermediate learners, and provide a roadmap for improving language instruction.
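For reference, a FACETS analysis of this kind typically rests on the many-facet Rasch model, commonly written as below; mapping the facets to examinee ability, task difficulty, rater severity, and score-category threshold is an assumption about the study's design rather than a detail reported in the abstract.

```latex
\[
\ln\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = B_n - D_i - C_j - F_k
\]
% B_n: ability of examinee n;   D_i: difficulty of writing task i;
% C_j: severity of rater j;     F_k: threshold of score category k relative to k-1.
```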
Large Language Models (LLMs) have revolutionized Generative Artificial Intelligence (GenAI) tasks, becoming an integral part of various applications in society, including text generation, translation, summarization, and more. However, their widespread usage emphasizes the critical need to enhance their security posture to ensure the integrity and reliability of their outputs and minimize harmful effects. Prompt injections and training data poisoning attacks are two of the most prominent vulnerabilities in LLMs, which could potentially lead to unpredictable and undesirable behaviors, such as biased outputs, misinformation propagation, and even malicious content generation. The Common Vulnerability Scoring System (CVSS) framework provides a standardized approach to capturing the principal characteristics of vulnerabilities, facilitating a deeper understanding of their severity within the security and AI communities. By extending the current CVSS framework, we generate scores for these vulnerabilities such that organizations can prioritize mitigation efforts, allocate resources effectively, and implement targeted security measures to defend against potential risks.
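To ground the scoring discussion, the sketch below walks through the standard CVSS v3.1 base-score arithmetic for an assumed vector describing a network-reachable prompt-injection scenario (AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:N); the metric weights follow the published v3.1 specification, while the paper's proposed extensions are not reproduced here.

```python
# CVSS v3.1 base-score arithmetic for an assumed prompt-injection vector (scope unchanged).
import math

AV, AC, PR, UI = 0.85, 0.77, 0.85, 0.85      # AV:N, AC:L, PR:N, UI:N (v3.1 weights)
C, I, A = 0.56, 0.56, 0.0                    # C:H, I:H, A:N (illustrative impact choice)

def roundup(x: float) -> float:              # CVSS-style round-up to one decimal place
    return math.ceil(x * 10) / 10

iss = 1 - (1 - C) * (1 - I) * (1 - A)        # impact sub-score
impact = 6.42 * iss                          # scope unchanged
exploitability = 8.22 * AV * AC * PR * UI
base = 0.0 if impact <= 0 else roundup(min(impact + exploitability, 10))
print(f"base score: {base}")                 # 9.1 for this vector
```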
This paper explores the lexical association patterns of English as a second language and their relationship with language proficiency. Through a vocabulary association test, the study analyzes the differences in vocabulary association between learners with different language levels. The participants were 100 non-native English-speaking undergraduate students from a top-200 university, such as the University of Nottingham, and a university outside the top 200, such as the University of Aberdeen; the two groups of learners differed in their vocabulary size and learning style. It was found that the two groups differed significantly in vocabulary size, language background, and learning experience. In addition, the study raises three core questions: first, learners' lexical association patterns; second, differences in association among learners with different language proficiency levels; and third, other variables that affect vocabulary association ability. The limitations of the study are that reaction time was not measured and the influence of native language background on word association was not fully considered; future research should further explore these aspects.
With the rapid development of artificial intelligence, large language models (LLMs) have demonstrated remarkable capabilities in natural language understanding and generation. These models have great potential to enhance database query systems, enabling more intuitive and semantic query mechanisms. Our model leverages the LLM's deep learning architecture to interpret and process natural language queries and translate them into accurate database queries. The system integrates an LLM-powered semantic parser that translates user input into structured queries that can be understood by the database management system. First, the user query is pre-processed: the text is normalized and ambiguity is removed. This is followed by semantic parsing, where the LLM interprets the pre-processed text and identifies key entities and relationships. Next comes query generation, which converts the parsed information into a structured query format tailored to the target database schema. Finally, there is query execution and feedback, where the resulting query is executed on the database and the results are returned to the user. The system also provides feedback mechanisms to improve and optimize future query interpretations. By using advanced LLMs for model implementation and fine-tuning on diverse datasets, the experimental results show that the proposed method significantly improves the accuracy and usability of database queries, making data retrieval easy for users without specialized knowledge.
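An end-to-end toy version of this pipeline is sketched below: a stand-in `nl_to_sql` function plays the role of the LLM semantic parser, and the generated SQL is executed against an in-memory SQLite schema. The schema, question, and returned query are illustrative assumptions, not the paper's system.

```python
# Toy NL-to-SQL pipeline: parse (stand-in), generate SQL, execute, return results.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary REAL)")
conn.executemany("INSERT INTO employees VALUES (?, ?, ?)",
                 [("Ava", "engineering", 95000), ("Ben", "sales", 62000), ("Chen", "engineering", 88000)])

SCHEMA = "employees(name, department, salary)"

def nl_to_sql(question: str, schema: str) -> str:
    # Stand-in for the LLM semantic parser: the real system would send the normalized
    # question and the schema description to the model and get back a structured query.
    return "SELECT name, salary FROM employees WHERE department = 'engineering' ORDER BY salary DESC"

query = nl_to_sql("Who are the highest-paid engineers?", SCHEMA)
for row in conn.execute(query):              # query execution and feedback step
    print(row)
```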
From the perspective of linguistic musicology, this paper discusses the musical features of Yulinling: Farewell in Autumn, a representative work of Yong Liu, a famous lyricist of the Song Dynasty. Through a systematic analysis of the core elements of the poem, such as rhyme, timbre, key, and rhythm, it reveals the intrinsic connection between poetry and music and proposes a new way of thinking about the modern interpretation of ancient poetic songs. This paper not only enriches the understanding of Yong Liu's works but also provides new theoretical support for modern music education of classical poetry. In addition, this paper explores the practical application value of poetic musicality analysis in the cultural inheritance and modern adaptation of classical poems, providing new insights into the contemporary inheritance of cultural heritage.
The challenge of transitioning from temporary humanitarian settlements to more sustainable human settlements stems from a significant increase in the number of forcibly displaced people over recent decades, difficulties in providing social services that meet the required standards, and the prolongation of emergencies. Despite this challenging context, short-term considerations continue to guide planning and management rather than more integrated, longer-term perspectives, thus preventing viable, sustainable development. Over the years, the design of humanitarian settlements has not been adapted to local contexts and perspectives, nor to the dynamics of urbanization, population growth, and data. In addition, the current approach to temporary settlement harms the environment and can strain limited resources. Inefficient land use and ad hoc development models have compounded difficulties and generated new challenges. As a result, living conditions in settlements have deteriorated over the last few decades and continue to pose new challenges. The stakes are such that major shortcomings have emerged along the way, leading to disruption and budget overruns in a context marked by a steady decline in funding. Some attempts have been made to shift towards more sustainable approaches, but these have mainly focused on vague, sector-oriented themes, failing to take systematic and integrated views. This study contributes to addressing these shortcomings by designing a model-driven solution that emphasizes an integrated system conceptualized as a system of systems. The paper proposes a new methodology for designing an integrated and sustainable human settlement model, based on Model-Based Systems Engineering and the Systems Modeling Language, to provide valuable insights toward sustainable solutions for displaced populations in line with the United Nations 2030 Agenda for Sustainable Development.
To deal effectively with fuzzy and uncertain information in public engineering emergencies, an emergency decision-making method based on multi-granularity linguistic information is proposed. First, decision makers select an appropriate linguistic term set according to their own situation and give preference information on the weight of each key indicator; the multi-granularity linguistic information is then transformed into a consistent form. On this basis, a sequential optimization technique that ranks alternatives by closeness to an approximately ideal solution is introduced to obtain the weight coefficient of each key indicator. Subsequently, the weighted average operator is used to aggregate the preference information of each alternative scheme with the relative importance of decision makers and the weights of the key indicators in sequence, and the comprehensive evaluation value of each scheme is obtained to determine the optimal scheme. Finally, the effectiveness and practicability of the method are verified by taking an earthwork collapse accident in the construction of a reservoir as an example.
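A simplified numeric sketch of the aggregation idea follows: linguistic assessments are mapped to a common scale, indicator weights are applied, and alternatives are ranked by TOPSIS-style closeness to the ideal scheme. The scale, weights, and scores are illustrative assumptions rather than the paper's data.

```python
# Rank alternative schemes by closeness to the ideal solution after mapping linguistic terms
# to a unified numeric scale (illustrative values throughout).
import numpy as np

SCALE = {"very poor": 0, "poor": 1, "fair": 2, "good": 3, "very good": 4}   # unified granularity

# Rows: rescue schemes; columns: key indicators (response time, cost, safety).
assessments = [["good", "fair", "very good"],
               ["very good", "poor", "good"],
               ["fair", "good", "good"]]
weights = np.array([0.5, 0.2, 0.3])                  # indicator weights (assumed, sum to 1)

X = np.array([[SCALE[v] for v in row] for row in assessments], dtype=float)
norm = X / np.linalg.norm(X, axis=0)                 # vector-normalize each indicator column
V = norm * weights                                   # weighted normalized decision matrix
ideal, anti_ideal = V.max(axis=0), V.min(axis=0)     # all indicators treated as benefits here
d_plus = np.linalg.norm(V - ideal, axis=1)
d_minus = np.linalg.norm(V - anti_ideal, axis=1)
closeness = d_minus / (d_plus + d_minus)             # higher = closer to the ideal scheme
print("best scheme:", int(np.argmax(closeness)) + 1, "closeness:", np.round(closeness, 3))
```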