Abstract: Large Language Models (LLMs) have revolutionized Generative Artificial Intelligence (GenAI) tasks, becoming an integral part of various applications in society, including text generation, translation, summarization, and more. However, their widespread usage emphasizes the critical need to enhance their security posture to ensure the integrity and reliability of their outputs and minimize harmful effects. Prompt injections and training data poisoning attacks are two of the most prominent vulnerabilities in LLMs, which could potentially lead to unpredictable and undesirable behaviors, such as biased outputs, misinformation propagation, and even malicious content generation. The Common Vulnerability Scoring System (CVSS) framework provides a standardized approach to capturing the principal characteristics of vulnerabilities, facilitating a deeper understanding of their severity within the security and AI communities. By extending the current CVSS framework, we generate scores for these vulnerabilities such that organizations can prioritize mitigation efforts, allocate resources effectively, and implement targeted security measures to defend against potential risks.
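As a point of reference for the scoring idea in this abstract, the following is a minimal sketch of the standard CVSS v3.1 base score for the scope-unchanged case. It is not the extension the authors propose, which the abstract does not describe; the function names `roundup` and `base_score` are illustrative choices, and the vectors used with it are hypothetical examples.

```python
import math

# CVSS v3.1 metric weights for the scope-unchanged case only.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the CVSS v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the v3.1 base score for a vector such as 'AV:N/AC:L/.../A:H'."""
    metrics = dict(part.split(":") for part in vector.split("/"))
    if metrics.get("S", "U") != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    # Impact sub-score from confidentiality, integrity, and availability.
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))
```

A call such as `base_score("AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H")` scores a network-reachable, low-complexity issue with high impact on all three properties. How an LLM-specific vulnerability maps onto these metrics is exactly what an extended framework would have to define.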
Abstract: This paper introduces a novel multi-tiered defense architecture to protect language models from adversarial prompt attacks. We construct adversarial prompts using strategies like role emulation and manipulative assistance to simulate real threats. We introduce a comprehensive, multi-tiered defense framework named GUARDIAN (Guardrails for Upholding Ethics in Language Models) comprising a system prompt filter, a pre-processing filter leveraging a toxic classifier and ethical prompt generator, and a pre-display filter using the model itself for output screening. Extensive testing on Meta’s Llama-2 model demonstrates the capability to block 100% of attack prompts. The approach also auto-suggests safer prompt alternatives, thereby bolstering language model security. Quantitatively evaluated defense layers and an ethical substitution mechanism represent key innovations to counter sophisticated attacks. The integrated methodology not only fortifies smaller LLMs against emerging cyber threats but also guides the broader application of LLMs in a secure and ethical manner.
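The three tiers named in this abstract can be illustrated with a small sketch. Everything concrete here is an assumption for illustration and not the GUARDIAN implementation: the keyword check stands in for a trained toxicity classifier, `suggest_safer` stands in for the ethical prompt generator, `SYSTEM_PROMPT` is a hypothetical guard prompt, and `model` is any callable that maps a prompt string to a reply string.

```python
from typing import Callable, NamedTuple, Optional

# Hypothetical guard prompt and flag terms; a real system would use a trained classifier.
SYSTEM_PROMPT = "You are a helpful assistant. Refuse harmful requests."
FLAG_TERMS = ("pretend you are", "ignore previous instructions")


class Result(NamedTuple):
    blocked_at: Optional[str]  # name of the tier that blocked, or None if allowed
    text: str


def looks_adversarial(prompt: str) -> bool:
    """Toy stand-in for the pre-processing toxicity classifier."""
    lowered = prompt.lower()
    return any(term in lowered for term in FLAG_TERMS)


def suggest_safer(prompt: str) -> str:
    """Toy stand-in for the ethical prompt generator."""
    return "Could you explain the safety considerations around this topic?"


def guarded_generate(prompt: str, model: Callable[[str], str]) -> Result:
    # Pre-processing tier: block flagged prompts and offer a safer alternative.
    if looks_adversarial(prompt):
        return Result("pre-processing", suggest_safer(prompt))
    # System prompt tier: every request is wrapped in the guard prompt.
    draft = model(f"{SYSTEM_PROMPT}\n\nUser: {prompt}")
    # Pre-display tier: the model itself screens its own draft before display.
    verdict = model(f"Answer SAFE or UNSAFE only. Is this reply harmful?\n\n{draft}")
    if verdict.strip().upper().startswith("UNSAFE"):
        return Result("pre-display", "The response was withheld.")
    return Result(None, draft)
```

Passing the model in as a callable keeps the sketch independent of any particular LLM API, so the tiers can be exercised with a stub before being connected to a real model.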
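Abstract: Over the past 20 years, language modeling (LM) has become a principal approach to language understanding and generation, and has attracted wide attention as a key technology for downstream tasks in natural language processing (NLP). In recent years, large language models (LLMs) such as ChatGPT have made remarkable progress and have had a profound impact on the transformation and development of artificial intelligence and other fields. In view of the rapid development of LLMs, this paper first provides a comprehensive review of the evolution of LLM architectures and model scale, and summarizes model training methods, optimization techniques, and evaluation approaches. It then analyzes the current state of LLM applications in education, healthcare, finance, industry, and other fields, and discusses their strengths and limitations. It further explores the safety and alignment issues that LLMs raise with respect to social ethics, privacy, and security, along with the corresponding technical measures. Finally, it looks ahead to future research trends for LLMs, including model scale and efficiency, multimodal processing, and social impact. By comprehensively analyzing the current state of research and future directions, this paper aims to offer researchers insight and inspiration regarding LLMs and to promote further development of the field.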
Abstract: Artificial Intelligence (AI) is transforming organizational dynamics and revolutionizing corporate leadership practices. This research paper delves into the question of how AI influences corporate leadership, examining both its advantages and disadvantages. Positive impacts of AI are evident in communication, feedback systems, tracking mechanisms, and decision-making processes within organizations. AI-powered communication tools, as exemplified by Slack, facilitate seamless collaboration, transcending geographical barriers. Feedback systems, like Adobe’s Performance Management System, employ AI algorithms to provide personalized development opportunities, enhancing employee growth. AI-based tracking systems optimize resource allocation, as exemplified by studies like “AI-Based Tracking Systems: Enhancing Efficiency and Accountability.” Additionally, AI-powered decision support, demonstrated during the COVID-19 pandemic, showcases the capability to navigate complex challenges and maintain resilience. However, AI adoption poses challenges in human resources, potentially leading to job displacement and necessitating upskilling efforts. Managing AI errors becomes crucial, as illustrated by instances like Amazon’s biased recruiting tool. Data privacy concerns also arise, emphasizing the need for robust security measures. The proposed solution suggests leveraging Local Machine Learning Models (LLMs) to address data privacy issues. Approaches such as federated learning, on-device learning, differential privacy, and homomorphic encryption offer promising strategies. By exploring the evolving dynamics of AI and leadership, this research advocates for responsible AI adoption and proposes LLMs as a potential solution, fostering a balanced integration of AI benefits while mitigating associated risks in corporate settings.
Abstract: With the continuous evolution and expanding applications of Large Language Models (LLMs), there has been a noticeable surge in the size of the emerging models. It is not solely the growth in model size, primarily measured by the number of parameters, but also the subsequent escalation in computational demands and in hardware and software prerequisites for training, all culminating in a substantial financial investment as well. In this paper, we present novel techniques like supervision, parallelization, and scoring functions to get better results out of chains of smaller language models, rather than relying solely on scaling up model size. Firstly, we propose an approach to quantify the performance of a Smaller Language Model (SLM) by introducing a corresponding supervisor model that incrementally corrects the encountered errors. Secondly, we propose an approach to utilize two smaller language models (in a network) performing the same task and retrieving the best relevant output from the two, ensuring peak performance for a specific task. Experimental evaluations establish the quantitative accuracy improvements on financial reasoning and arithmetic calculation tasks from utilizing techniques like supervisor models (in a network-of-models scenario), threshold scoring, and parallel processing over a baseline study.
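The combination of parallel models, a scoring function, a threshold, and a supervisor can be sketched as follows. This is one possible reading of the abstract, not the authors' pipeline: the function `best_of`, its parameters, and the way the supervisor is invoked only when the best score falls below the threshold are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Optional, Sequence

Model = Callable[[str], str]
Scorer = Callable[[str, str], float]       # (task, output) -> score
Supervisor = Callable[[str, str], str]     # (task, draft) -> corrected output


def best_of(
    task: str,
    models: Sequence[Model],
    score: Scorer,
    threshold: float,
    supervisor: Optional[Supervisor] = None,
) -> str:
    """Run all models on the same task in parallel and keep the best-scoring output."""
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        outputs = list(pool.map(lambda m: m(task), models))
    best = max(outputs, key=lambda out: score(task, out))
    # If even the best output scores below the threshold, ask the supervisor to correct it.
    if supervisor is not None and score(task, best) < threshold:
        best = supervisor(task, best)
    return best
```

Threads are used here on the assumption that each model call is I/O-bound, such as a request to an inference server; for models that run in-process, a different form of parallelism may be more appropriate.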
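Abstract: Large language models (LLMs) have achieved remarkable breakthroughs in natural language processing (NLP). Ethnolinguistics, a discipline that studies the diversity and evolution of human languages and their relationship with culture, stands to gain new possibilities for linguistic research from its combination with LLM technology. Through an in-depth analysis of the application and influence of LLM technology in ethnolinguistic research, this paper approaches the topic from five aspects: ethnic language resource construction, language text generation, language translation and dialogue systems, language feature analysis and mining, and the study of language evolution and history. It reveals the broad application prospects and far-reaching influence of LLM technology in the field. It further analyzes the potential and value of LLM technology in ethnolinguistic research, and discusses the practical value and significance of this research direction for enhancing, in a "tangible, perceptible, and effective" way, the sense of national identity, strengthening national self-confidence, promoting national unity, and realizing the great rejuvenation of the Chinese nation.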
Funding: Supported in part by the National Natural Science Foundation of China under Grants 62125101 and 62341101; in part by the New Cornerstone Science Foundation through the XPLORER PRIZE; in part by the Guangdong Provincial Key Lab of Integrated Communication, Sensing and Computation for Ubiquitous Internet of Things under Grant 2023B1212010007; in part by the Guangzhou Municipal Science and Technology Project under Grant 2023A03J0011; and in part by the Guangdong Provincial Department of Education Major Research Project under Grant 2023ZDZX1037.
Abstract: Channel prediction is an effective approach for reducing the feedback or estimation overhead in massive multi-input multi-output (m-MIMO) systems. However, existing channel prediction methods lack precision due to model mismatch errors or network generalization issues. Large language models (LLMs) have demonstrated powerful modeling and generalization abilities, and have been successfully applied to cross-modal tasks, including time series analysis. Leveraging the expressive power of LLMs, we propose a pre-trained LLM-empowered channel prediction (LLM4CP) method to predict the future downlink channel state information (CSI) sequence based on the historical uplink CSI sequence. We fine-tune the network while freezing most of the parameters of the pre-trained LLM for better cross-modality knowledge transfer. To bridge the gap between the channel data and the feature space of the LLM, preprocessor, embedding, and output modules are specifically tailored by taking into account unique channel characteristics. Simulations validate that the proposed method achieves state-of-the-art (SOTA) prediction performance on full-sample, few-shot, and generalization tests with low training and inference costs.