Smaller & Smarter: Score-Driven Network Chaining of Smaller Language Models
1
Authors: Gunika Dhingra, Siddansh Chawla, Vijay K. Madisetti, Arshdeep Bahga. Journal of Software Engineering and Applications, 2024, No. 1, pp. 23-42 (20 pages)
With the continuous evolution and expanding applications of Large Language Models (LLMs), there has been a noticeable surge in the size of the emerging models. It is not solely the growth in model size, primarily measured by the number of parameters, but also the subsequent escalation in computational demands and in hardware and software prerequisites for training, all culminating in a substantial financial investment as well. In this paper, we present novel techniques like supervision, parallelization, and scoring functions to get better results out of chains of smaller language models, rather than relying solely on scaling up model size. Firstly, we propose an approach to quantify the performance of a Smaller Language Model (SLM) by introducing a corresponding supervisor model that incrementally corrects the encountered errors. Secondly, we propose an approach to utilize two smaller language models (in a network) performing the same task and retrieving the most relevant output from the two, ensuring peak performance for a specific task. Experimental evaluations establish the quantitative accuracy improvements on financial reasoning and arithmetic calculation tasks from utilizing techniques like supervisor models (in a network-of-models scenario), threshold scoring, and parallel processing over a baseline study.
Keywords: Large language models (LLMs); Smaller language models (SLMs); Finance; Networking; Supervisor model; Scoring function
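The abstract above mentions running two smaller models on the same task and keeping the better output according to a scoring function and a threshold. A minimal sketch of that selection step follows. It is not the authors' implementation: the function name `pick_best`, the callable model interface, the scoring signature, and the default threshold are all assumptions made for illustration.

```python
from typing import Callable, Optional, Sequence

# Hypothetical interfaces (assumptions, not from the paper):
#   a model maps a prompt string to an output string;
#   a scorer maps (prompt, output) to a float where higher is better.
Model = Callable[[str], str]
Scorer = Callable[[str, str], float]


def pick_best(prompt: str,
              models: Sequence[Model],
              score: Scorer,
              threshold: float = 0.5) -> Optional[str]:
    """Run every model on the prompt and return the highest-scoring output.

    Returns None when even the best output scores below the threshold,
    signalling that the caller should fall back to something else
    (for example a supervisor model).
    """
    best_output: Optional[str] = None
    best_score = float("-inf")
    for model in models:
        output = model(prompt)
        s = score(prompt, output)
        if s > best_score:
            best_output, best_score = output, s
    if best_score < threshold:
        return None
    return best_output
```

With two toy models that answer "4" and "5" to the prompt "2+2", and a scorer that rewards the correct answer, `pick_best` returns "4"; raising the threshold above every achievable score makes it return `None`.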
DeBERTa-GRU: Sentiment Analysis for Large Language Model
2
Authors: Adel Assiri, Abdu Gumaei, Faisal Mehmood, Touqeer Abbas, Sami Ullah. Computers, Materials & Continua (SCIE, EI), 2024, No. 6, pp. 4219-4236 (18 pages)
Modern technological advancements have made social media an essential component of daily life. Social media allow individuals to share thoughts, emotions, and ideas. Sentiment analysis evaluates whether the sentiment of a text is positive, negative, neutral, or any other personal emotion in order to understand the sentiment context of the text. Sentiment analysis is essential in business and society because it impacts strategic decision-making. Sentiment analysis involves challenges due to lexical variation, unlabeled datasets, and text distance correlations. The execution time increases due to the sequential processing of the sequence models. However, the calculation times for the Transformer models are reduced because of the parallel processing. This study uses a hybrid deep learning strategy to combine the strengths of the Transformer and Sequence models while ignoring their limitations. In particular, the proposed model integrates the Decoding-enhanced with Bidirectional Encoder Representations from Transformers (BERT) attention (DeBERTa) and the Gated Recurrent Unit (GRU) for sentiment analysis. Using the Decoding-enhanced BERT technique, the words are mapped into a compact, semantic word embedding space, and the Gated Recurrent Unit model can capture the distance contextual semantics correctly. The proposed hybrid model achieves F1-scores of 97% on the Twitter Large Language Model (LLM) dataset, which is much higher than the performance of new techniques.
Keywords: DeBERTa; GRU; Naive Bayes; LSTM; Sentiment analysis; Large language model
Enhancing Relational Triple Extraction in Specific Domains: Semantic Enhancement and Synergy of Large Language Models and Small Pre-Trained Language Models
3
Authors: Jiakai Li, Jianpeng Hu, Geng Zhang. Computers, Materials & Continua (SCIE, EI), 2024, No. 5, pp. 2481-2503 (23 pages)
In the process of constructing domain-specific knowledge graphs, the task of relational triple extraction plays a critical role in transforming unstructured text into structured information. Existing relational triple extraction models face multiple challenges when processing domain-specific data, including insufficient utilization of semantic interaction information between entities and relations, difficulties in handling challenging samples, and the scarcity of domain-specific datasets. To address these issues, our study introduces three innovative components: relation semantic enhancement, data augmentation, and a voting strategy, all designed to significantly improve the model's performance in tackling domain-specific relational triple extraction tasks. We first propose an innovative attention interaction module. This method significantly enhances the semantic interaction capabilities between entities and relations by integrating semantic information from relation labels. Second, we propose a voting strategy that effectively combines the strengths of large language models (LLMs) and fine-tuned small pre-trained language models (SLMs) to reevaluate challenging samples, thereby improving the model's adaptability in specific domains. Additionally, we explore the use of LLMs for data augmentation, aiming to generate domain-specific datasets to alleviate the scarcity of domain data. Experiments conducted on three domain-specific datasets demonstrate that our model outperforms existing comparative models in several aspects, with F1 scores exceeding the state-of-the-art models by 2%, 1.6%, and 0.6%, respectively, validating the effectiveness and generalizability of our approach.
Keywords: Relational triple extraction; Semantic interaction; Large language models; Data augmentation; Specific domains
Evolution and Prospects of Foundation Models: From Large Language Models to Large Multimodal Models
4
Authors: Zheyi Chen, Liuchang Xu, Hongting Zheng, Luyao Chen, Amr Tolba, Liang Zhao, Keping Yu, Hailin Feng. Computers, Materials & Continua (SCIE, EI), 2024, No. 8, pp. 1753-1808 (56 pages)
Since the 1950s, when the Turing Test was introduced, there has been notable progress in machine language intelligence. Language modeling, crucial for AI development, has evolved from statistical to neural models over the last two decades. Recently, transformer-based Pre-trained Language Models (PLMs) have excelled in Natural Language Processing (NLP) tasks by leveraging large-scale training corpora. Increasing the scale of these models enhances performance significantly, introducing abilities like context learning that smaller models lack. The advancement in Large Language Models, exemplified by the development of ChatGPT, has made significant impacts both academically and industrially, capturing widespread societal interest. This survey provides an overview of the development and prospects from Large Language Models (LLMs) to Large Multimodal Models (LMMs). It first discusses the contributions and technological advancements of LLMs in the field of natural language processing, especially in text generation and language understanding. Then, it turns to the discussion of LMMs, which integrate various data modalities such as text, images, and sound, demonstrating advanced capabilities in understanding and generating cross-modal content and paving new pathways for the adaptability and flexibility of AI systems. Finally, the survey highlights the prospects of LMMs in terms of technological development and application potential, while also pointing out challenges in data integration and cross-modal understanding accuracy, providing a comprehensive perspective on the latest developments in this field.
Keywords: Artificial intelligence; Large language models; Large multimodal models; Foundation models
LKPNR: Large Language Models and Knowledge Graph for Personalized News Recommendation Framework
5
Authors: Hao Chen, Runfeng Xie, Xiangyang Cui, Zhou Yan, Xin Wang, Zhanwei Xuan, Kai Zhang. Computers, Materials & Continua (SCIE, EI), 2024, No. 6, pp. 4283-4296 (14 pages)
Accurately recommending candidate news to users is a basic challenge of personalized news recommendation systems. Traditional methods usually have difficulty learning and acquiring complex semantic information in news texts, resulting in unsatisfactory recommendation results. Besides, these traditional methods are more friendly to active users with rich historical behaviors. However, they cannot effectively solve the long-tail problem of inactive users. To address these issues, this research presents a novel general framework that combines Large Language Models (LLMs) and Knowledge Graphs (KGs) into traditional methods. To learn the contextual information of news text, we use LLMs' powerful text understanding ability to generate news representations with rich semantic information, and the generated news representations are then used to enhance the news encoding in traditional methods. In addition, multi-hop relationships among news entities are mined and the structural information of news is encoded using a KG, thus alleviating the challenge of long-tail distribution. Experimental results demonstrate that, compared with various traditional models, the framework significantly improves recommendation performance on evaluation indicators such as AUC, MRR, nDCG@5, and nDCG@10. The successful integration of LLM and KG in our framework has established a feasible way of achieving more accurate personalized news recommendation. Our code is available at https://github.com/Xuan-ZW/LKPNR.
Keywords: Large language models; News recommendation; Knowledge graphs (KG)
Potential use of large language models for mitigating students' problematic social media use: ChatGPT as an example
6
Authors: Xin-Qiao Liu, Zi-Ru Zhang. World Journal of Psychiatry (SCIE), 2024, No. 3, pp. 334-341 (8 pages)
The problematic use of social media has numerous negative impacts on individuals' daily lives, interpersonal relationships, physical and mental health, and more. Currently, there are few methods and tools to alleviate problematic social media use, and their potential is yet to be fully realized. Emerging large language models (LLMs) are becoming increasingly popular for providing information and assistance to people and are being applied in many aspects of life. In mitigating problematic social media use, LLMs such as ChatGPT can play a positive role by serving as conversational partners and outlets for users, providing personalized information and resources, monitoring and intervening in problematic social media use, and more. In this process, we should recognize both the enormous potential and endless possibilities of LLMs such as ChatGPT, leveraging their advantages to better address problematic social media use, while also acknowledging the limitations and potential pitfalls of ChatGPT technology, such as errors, limitations in issue resolution, privacy and security concerns, and potential overreliance. When we leverage the advantages of LLMs to address issues in social media usage, we must adopt a cautious and ethical approach, being vigilant of the potential adverse effects that LLMs may have in addressing problematic social media use, to better harness technology to serve individuals and society.
Keywords: Problematic use of social media; Social media; Large language models; ChatGPT; Chatbots
Evaluating Privacy Leakage and Memorization Attacks on Large Language Models (LLMs) in Generative AI Applications
7
Authors: Harshvardhan Aditya, Siddansh Chawla, Gunika Dhingra, Parijat Rai, Saumil Sood, Tanmay Singh, Zeba Mohsin Wase, Arshdeep Bahga, Vijay K. Madisetti. Journal of Software Engineering and Applications, 2024, No. 5, pp. 421-447 (27 pages)
The recent interest in the deployment of Generative AI applications that use large language models (LLMs) has brought to the forefront significant privacy concerns, notably the leakage of Personally Identifiable Information (PII) and other confidential or protected information that may have been memorized during training, specifically during a fine-tuning or customization process. We describe different black-box attacks from potential adversaries and study their impact on the amount and type of information that may be recovered from commonly used and deployed LLMs. Our research investigates the relationship between PII leakage, memorization, and factors such as model size, architecture, and the nature of attacks employed. The study utilizes two broad categories of attacks: PII leakage-focused attacks (auto-completion and extraction attacks) and memorization-focused attacks (various membership inference attacks). The findings from these investigations are quantified using an array of evaluative metrics, providing a detailed understanding of LLM vulnerabilities and the effectiveness of different attacks.
Keywords: Large language models; PII leakage; Privacy; Memorization; Overfitting; Membership inference attack (MIA)
Security Vulnerability Analyses of Large Language Models (LLMs) through Extension of the Common Vulnerability Scoring System (CVSS) Framework
8
Authors: Alicia Biju, Vishnupriya Ramesh, Vijay K. Madisetti. Journal of Software Engineering and Applications, 2024, No. 5, pp. 340-358 (19 pages)
Large Language Models (LLMs) have revolutionized Generative Artificial Intelligence (GenAI) tasks, becoming an integral part of various applications in society, including text generation, translation, summarization, and more. However, their widespread usage emphasizes the critical need to enhance their security posture to ensure the integrity and reliability of their outputs and minimize harmful effects. Prompt injections and training data poisoning attacks are two of the most prominent vulnerabilities in LLMs, which could potentially lead to unpredictable and undesirable behaviors, such as biased outputs, misinformation propagation, and even malicious content generation. The Common Vulnerability Scoring System (CVSS) framework provides a standardized approach to capturing the principal characteristics of vulnerabilities, facilitating a deeper understanding of their severity within the security and AI communities. By extending the current CVSS framework, we generate scores for these vulnerabilities such that organizations can prioritize mitigation efforts, allocate resources effectively, and implement targeted security measures to defend against potential risks.
Keywords: Common Vulnerability Scoring System (CVSS); Large language models (LLMs); DALL-E; Prompt injections; Training data poisoning; CVSS metrics
Adapter Based on Pre-Trained Language Models for Classification of Medical Text
9
Author: Quan Li. Journal of Electronic Research and Application, 2024, No. 3, pp. 129-134 (6 pages)
We present an approach to classify medical text at a sentence level automatically. Given the inherent complexity of medical text classification, we employ adapters based on pre-trained language models to extract information from medical text, facilitating more accurate classification while minimizing the number of trainable parameters. Extensive experiments conducted on various datasets demonstrate the effectiveness of our approach.
Keywords: Classification of medical text; Adapter; Pre-trained language model
Joint On-Demand Pruning and Online Distillation in Automatic Speech Recognition Language Model Optimization
10
Authors: Soonshin Seo, Ji-Hwan Kim. Computers, Materials & Continua (SCIE, EI), 2023, No. 12, pp. 2833-2856 (24 pages)
Automatic speech recognition (ASR) systems have emerged as indispensable tools across a wide spectrum of applications, ranging from transcription services to voice-activated assistants. To enhance the performance of these systems, it is important to deploy efficient models capable of adapting to diverse deployment conditions. In recent years, on-demand pruning methods have obtained significant attention within the ASR domain due to their adaptability in various deployment scenarios. However, these methods often confront substantial trade-offs, particularly in terms of unstable accuracy when reducing the model size. To address these challenges, this study introduces two crucial empirical findings. Firstly, it proposes the incorporation of an online distillation mechanism during on-demand pruning training, which holds the promise of maintaining more consistent accuracy levels. Secondly, it proposes the utilization of the Mogrifier long short-term memory (LSTM) language model (LM), an advanced iteration of the conventional LSTM LM, as an effective alternative for pruning targets within the ASR framework. Through rigorous experimentation on the ASR system, employing the Mogrifier LSTM LM and training it using the suggested joint on-demand pruning and online distillation method, this study provides compelling evidence. The results show that the proposed methods significantly outperform a benchmark model trained solely with on-demand pruning methods. Impressively, the proposed strategic configuration successfully reduces the parameter count by approximately 39%, all the while minimizing trade-offs.
Keywords: Automatic speech recognition; Neural language model; Mogrifier long short-term memory; Pruning; Distillation; Efficient deployment; Optimization; Joint training
Statistical Language Model for Chinese Text Proofreading
11
Authors: 张仰森, 曹元大. Journal of Beijing Institute of Technology (EI, CAS), 2003, No. 4, pp. 441-445 (5 pages)
Statistical language modeling techniques are investigated so as to construct a language model for Chinese text proofreading. After the defects of the n-gram model are analyzed, a novel statistical language model for Chinese text proofreading is proposed. This model takes full account of the information located before and after the target word w_i, and the relationship between non-neighboring words w_i and w_j in the linguistic environment (LE). First, the word association degree between w_i and w_j is defined by using a distance-weighted factor, where w_j is l words apart from w_i in the LE; then the Bayes formula is used to calculate the LE related degree of word w_i; and lastly, the LE related degree is taken as the criterion to predict the reasonability of word w_i appearing in context. Comparing the proposed model with the traditional n-gram model in a Chinese text automatic error detection system, the experimental results show that the error detection recall rate and precision rate of the system have been improved.
Keywords: Statistical language model; n-gram; Linguistic environment; Text proofreading
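The abstract above describes a word association degree built from a distance-weighted factor over words both before and after the target word. The sketch below illustrates that general idea only. The 1/l weighting, the window size, the averaging step, and both function names are assumptions made for illustration; the abstract does not give the paper's actual formulas, and the Bayes step it mentions is not reproduced here.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Assoc = Dict[Tuple[str, str], float]


def association_counts(sentences: Sequence[List[str]], window: int = 3) -> Assoc:
    """Accumulate a symmetric, distance-weighted co-occurrence score.

    A pair of words that are l positions apart contributes 1/l
    (an assumed weighting), so close neighbours count more than
    distant ones.
    """
    assoc: Dict[Tuple[str, str], float] = defaultdict(float)
    for words in sentences:
        for i, wi in enumerate(words):
            for l in range(1, window + 1):
                j = i + l
                if j >= len(words):
                    break
                assoc[(wi, words[j])] += 1.0 / l
                assoc[(words[j], wi)] += 1.0 / l
    return dict(assoc)


def context_score(words: List[str], i: int, assoc: Assoc, window: int = 3) -> float:
    """Average association between words[i] and its neighbours on both sides.

    A low score suggests the word fits poorly in its context and may be
    a candidate error for a proofreading system to flag.
    """
    total, n = 0.0, 0
    for j in range(max(0, i - window), min(len(words), i + window + 1)):
        if j == i:
            continue
        total += assoc.get((words[i], words[j]), 0.0)
        n += 1
    return total / n if n else 0.0
```

Trained on a corpus of segmented sentences, `context_score` can be compared against a tuned cutoff: a word never seen near its current neighbours scores 0.0, while a word in a familiar context scores higher.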
Language Model Using Differentiable Neural Computer Based on Forget Gate-Based Memory Deallocation
12
Authors: Donghyun Lee, Hosung Park, Soonshin Seo, Changmin Kim, Hyunsoo Son, Gyujin Kim, Ji-Hwan Kim. Computers, Materials & Continua (SCIE, EI), 2021, No. 7, pp. 537-551 (15 pages)
A differentiable neural computer (DNC) is analogous to the Von Neumann machine with a neural network controller that interacts with an external memory through an attention mechanism. Such DNCs offer a generalized method for task-specific deep learning models and have demonstrated reliability with reasoning problems. In this study, we apply a DNC to a language model (LM) task. The LM task is one of the reasoning problems, because it can predict the next word using the previous word sequence. However, memory deallocation is a problem in DNCs as some information unrelated to the input sequence is not allocated and remains in the external memory, which degrades performance. Therefore, we propose a forget gate-based memory deallocation (FMD) method, which searches for the minimum value of elements in a forget gate-based retention vector. The forget gate-based retention vector indicates the retention degree of information stored in each external memory address. In experiments, we applied our proposed NTM architecture to LM tasks as a task-specific example and to rescoring for speech recognition as a general-purpose example. For LM tasks, we evaluated DNC using the Penn Treebank and enwik8 LM tasks. Although it does not yield SOTA results in LM tasks, the FMD method exhibits relatively improved performance compared with DNC in terms of bits-per-character. For the speech recognition rescoring tasks, FMD again showed a relative improvement using the LibriSpeech data in terms of word error rate.
Keywords: Forget gate-based memory deallocation; Differentiable neural computer; Language model; Forget gate-based retention vector
A Bit Progress on Word-Based Language Model
13
Authors: 陈勇, 陈国评. Journal of Shanghai University (English Edition) (CAS), 2003, No. 2, pp. 148-155 (8 pages)
A good language model is essential to a postprocessing algorithm for recognition systems. In the past, researchers have presented various language models, such as character-based language models, word-based language models, syntactical rule language models, hybrid models, etc. The word N-gram model is by far an effective and efficient model, but one has to address the problem of data sparseness in establishing the model. Katz and Kneser et al. respectively presented effective remedies to solve this challenging problem. In this study, we propose an improvement to their methods by incorporating Chinese language-specific information or Chinese word class information into the system.
Keywords: Language model; Pattern recognition; Chinese character recognition
Six-Writings multimodal processing with pictophonetic coding to enhance Chinese language models
14
Authors: Li WEIGANG, Mayara Chew MARINHO, Denise Leyi LI, Vitor Vasconcelos DE OLIVEIRA. Frontiers of Information Technology & Electronic Engineering (SCIE, EI, CSCD), 2024, No. 1, pp. 84-105 (22 pages)
While large language models (LLMs) have made significant strides in natural language processing (NLP), they continue to face challenges in adequately addressing the intricacies of the Chinese language in certain scenarios. We propose a framework called Six-Writings multimodal processing (SWMP) to enable direct integration of Chinese NLP (CNLP) with morphological and semantic elements. The first part of SWMP, known as Six-Writings pictophonetic coding (SWPC), is introduced with a suitable level of granularity for radicals and components, enabling effective representation of Chinese characters and words. We conduct several experimental scenarios, including the following: (1) We establish an experimental database consisting of images and SWPC for Chinese characters, enabling dual-mode processing and matrix generation for CNLP. (2) We characterize various generative modes of Chinese words, such as thousands of Chinese idioms, used as question-and-answer (Q&A) prompt functions, facilitating analogies by SWPC. The experiments achieve 100% accuracy in answering all questions in the Chinese morphological data set (CA8-Mor-10177). (3) A fine-tuning mechanism is proposed to refine word embedding results using SWPC, resulting in an average relative error of ≤25% for 39.37% of the questions in the Chinese wOrd Similarity data set (COS960). The results demonstrate that SWMP/SWPC methods effectively capture the distinctive features of Chinese and offer a promising mechanism to enhance CNLP with better efficiency.
Keywords: Chinese language model; Chinese natural language processing (CNLP); Generative language model; Multimodal processing; Six-Writings
The Life Cycle of Knowledge in Big Language Models: A Survey (cited by: 1)
15
Authors: Boxi Cao, Hongyu Lin, Xianpei Han, Le Sun. Machine Intelligence Research (EI, CSCD), 2024, No. 2, pp. 217-238 (22 pages)
Knowledge plays a critical role in artificial intelligence. Recently, the extensive success of pre-trained language models (PLMs) has raised significant attention about how knowledge can be acquired, maintained, updated, and used by language models. Despite the enormous amount of related studies, there is still a lack of a unified view of how knowledge circulates within language models throughout the learning, tuning, and application processes, which may prevent us from further understanding the connections between current progress or realizing existing limitations. In this survey, we revisit PLMs as knowledge-based systems by dividing the life cycle of knowledge in PLMs into five critical periods, and investigating how knowledge circulates when it is built, maintained, and used. To this end, we systematically review existing studies of each period of the knowledge life cycle, summarize the main challenges and current limitations, and discuss future directions.
Keywords: Pre-trained language model; Knowledge acquisition; Knowledge representation; Knowledge probing; Knowledge editing; Knowledge application
FAIR Enough: Develop and Assess a FAIR-Compliant Dataset for Large Language Model Training?
16
Authors: Shaina Raza, Shardul Ghuge, Chen Ding, Elham Dolatabadi, Deval Pandya. Data Intelligence (EI), 2024, No. 2, pp. 559-585 (27 pages)
The rapid evolution of Large Language Models (LLMs) highlights the necessity for ethical considerations and data integrity in AI development, particularly emphasizing the role of FAIR (Findable, Accessible, Interoperable, Reusable) data principles. While these principles are crucial for ethical data stewardship, their specific application in the context of LLM training data remains an under-explored area. This research gap is the focus of our study, which begins with an examination of existing literature to underline the importance of FAIR principles in managing data for LLM training. Building upon this, we propose a novel framework designed to integrate FAIR principles into the LLM development lifecycle. A contribution of our work is the development of a comprehensive checklist intended to guide researchers and developers in applying FAIR data principles consistently across the model development process. The utility and effectiveness of our framework are validated through a case study on creating a FAIR-compliant dataset aimed at detecting and mitigating biases in LLMs. We present this framework to the community as a tool to foster the creation of technologically advanced, ethically grounded, and socially responsible AI models.
Keywords: Responsible AI; Large language models; FAIR data principles; Ethical AI; Biases
RecBERT: Semantic recommendation engine with large language model enhanced query segmentation for k-nearest neighbors ranking retrieval
17
Author: Richard Wu. Intelligent and Converged Networks (EI), 2024, No. 1, pp. 42-52 (11 pages)
The increasing amount of user traffic on Internet discussion forums has led to a huge amount of unstructured natural language data in the form of user comments. Most modern recommendation systems rely on manual tagging, relying on administrators to label the features of a class, or story, which a user comment corresponds to. Another common approach is to use pre-trained word embeddings to compare class descriptions for textual similarity, then use a distance metric such as cosine similarity or Euclidean distance to find the top k neighbors. However, neither approach is able to fully utilize this user-generated unstructured natural language data, reducing the scope of these recommendation systems. This paper studies the application of domain adaptation on a transformer for the set of user comments to be indexed, and the use of simple contrastive learning for the sentence transformer fine-tuning process to generate meaningful semantic embeddings for the various user comments that apply to each class. In order to match a query containing content from multiple user comments belonging to the same class, the construction of a subquery channel for computing class-level similarity is proposed. This channel uses query segmentation of the aggregate query into subqueries, performing k-nearest neighbors (KNN) search on each individual subquery. RecBERT achieves state-of-the-art performance, outperforming other state-of-the-art models in accuracy, precision, recall, and F1 score for classifying comments between four and eight classes, respectively. RecBERT outperforms the most precise state-of-the-art model (distilRoBERTa) in precision by 6.97% for matching comments between eight classes.
Keywords: Sentence transformer; Simple contrastive learning; Large language models; Query segmentation; k-nearest neighbors
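The abstract above describes segmenting an aggregate query into subqueries, running a k-nearest neighbors search on each subquery, and combining the results into a class-level similarity. A minimal sketch of that retrieval step follows, using plain cosine similarity over precomputed vectors. It is an illustration under stated assumptions, not the RecBERT implementation: the embedding and segmentation steps are assumed to have happened already, and the function names and the sum-of-similarities aggregation are hypothetical choices.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def rank_classes(subquery_vecs: Sequence[Vector],
                 index: Sequence[Tuple[Vector, str]],
                 k: int = 2) -> List[Tuple[str, float]]:
    """Rank class labels by summed similarity over per-subquery KNN hits.

    index holds (embedding, class_label) pairs for the indexed comments.
    For each subquery the k most similar comments are retrieved, and each
    hit adds its similarity to the score of its class (assumed aggregation).
    """
    scores: Dict[str, float] = defaultdict(float)
    for q in subquery_vecs:
        sims = sorted(((cosine(q, vec), label) for vec, label in index),
                      key=lambda t: t[0], reverse=True)
        for sim, label in sims[:k]:
            scores[label] += sim
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

In practice the vectors would come from a sentence-embedding model and the index from an approximate nearest-neighbor library; the brute-force scan here only keeps the sketch self-contained.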
Kindergarten Teachers' Language Modeling Behaviors in Daily Activities and Free Play
18
Authors: WU Qiong, HU Biying, GUAN Lin. Frontiers of Education in China, 2024, No. 1, pp. 1-17 (17 pages)
Teachers' language modeling behaviors, including frequent conversation, open-ended questions, repetition and extension, self- and parallel talk, and advanced language, have significantly impacted young children's language learning and development. This study examined 60 classrooms from 20 kindergartens in Guangzhou, China, and analyzed 62 films of daily activities and 57 videos of free play. It aims to address the gap in existing research, which pays little attention to teachers' language modeling behaviors in daily activities and free play. The results indicate that the more frequent teachers' language modeling behaviors, the larger the vocabulary young children use and the better their performance in lexical richness. However, such behaviors in daily activities and free play are infrequent and superficial, failing to guide young children's language development effectively. To optimize their language modeling behaviors in daily activities and free play, teachers are expected to establish positive emotional bonds with young children in a kind and respectful manner and to receive training. Teachers are also encouraged to frequently communicate and engage in dialogues with young children, create contexts that facilitate the use of language, increase the frequency of stimuli for vocabulary learning, and guide and encourage young children's advanced language.
Keywords: Teachers' language modeling behaviors; Young children; Daily activities; Free play
Evaluating the role of large language models in inflammatory bowel disease patient information
19
Authors: Eun Jeong Gong, Chang Seok Bang. World Journal of Gastroenterology (SCIE, CAS), 2024, No. 29, pp. 3538-3540 (3 pages)
This letter evaluates the article by Gravina et al. on ChatGPT's potential in providing medical information for inflammatory bowel disease patients. While promising, it highlights the need for advanced techniques like reasoning+action and retrieval-augmented generation to improve accuracy and reliability. Emphasizing that simple question-and-answer testing is insufficient, it calls for more nuanced evaluation methods to truly gauge large language models' capabilities in clinical applications.
Keywords: Crohn's disease; Ulcerative colitis; Inflammatory bowel disease; Chat generative pre-trained transformer; Large language model; Artificial intelligence
Large language models for human–robot interaction: A review (cited by: 5)
20
Authors: Ceng Zhang, Junxin Chen, Jiatong Li, Yanhong Peng, Zebing Mao. Biomimetic Intelligence & Robotics (EI), 2023, No. 4, pp. 1-15 (15 pages)
The fusion of large language models and robotic systems has introduced a transformative paradigm in human–robot interaction, offering unparalleled capabilities in natural language understanding and task execution. This review paper offers a comprehensive analysis of this nascent but rapidly evolving domain, spotlighting the recent advances of Large Language Models (LLMs) in enhancing their structures and performances, particularly in terms of multimodal input handling, high-level reasoning, and plan generation. Moreover, it probes the current methodologies that integrate LLMs into robotic systems for complex task completion, from traditional probabilistic models to the utilization of value functions and metrics for optimal decision-making. Despite these advancements, the paper also reveals the formidable challenges that confront the field, such as contextual understanding, data privacy, and ethical considerations. To the best of our knowledge, this is the first study to comprehensively analyze the advances and considerations of LLMs in Human–Robot Interaction (HRI) based on recent progress, which provides potential avenues for further research.
Keywords: Large language models; Human-robot interaction; Task completion; Considerations and challenges