Journal Articles
13 articles found
Evaluating Privacy Leakage and Memorization Attacks on Large Language Models (LLMs) in Generative AI Applications (Cited: 1)
Authors: Harshvardhan Aditya, Siddansh Chawla, Gunika Dhingra, Parijat Rai, Saumil Sood, Tanmay Singh, Zeba Mohsin Wase, Arshdeep Bahga, Vijay K. Madisetti. Journal of Software Engineering and Applications, 2024, No. 5, pp. 421-447.
The recent interest in the deployment of Generative AI applications that use large language models (LLMs) has brought to the forefront significant privacy concerns, notably the leakage of Personally Identifiable Information (PII) and other confidential or protected information that may have been memorized during training, specifically during a fine-tuning or customization process. We describe different black-box attacks from potential adversaries and study their impact on the amount and type of information that may be recovered from commonly used and deployed LLMs. Our research investigates the relationship between PII leakage, memorization, and factors such as model size, architecture, and the nature of attacks employed. The study utilizes two broad categories of attacks: PII leakage-focused attacks (auto-completion and extraction attacks) and memorization-focused attacks (various membership inference attacks). The findings from these investigations are quantified using an array of evaluative metrics, providing a detailed understanding of LLM vulnerabilities and the effectiveness of different attacks.
Keywords: Large Language Models, PII Leakage, Privacy, Memorization, Overfitting, Membership Inference Attack (MIA)
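The membership-inference attacks this paper evaluates exploit the gap between a model's loss on training data versus unseen data. A minimal, illustrative loss-threshold sketch of that idea (the function names, threshold, and loss values are assumptions for illustration, not the paper's implementation):

```python
def membership_inference(loss: float, threshold: float = 1.0) -> bool:
    """Flag a sample as a likely training member if the model's loss on it
    falls below a threshold (memorized data tends to have lower loss)."""
    return loss < threshold

# Per-sample losses obtained from some target model (illustrative values).
losses = {"sample_a": 0.21, "sample_b": 2.87, "sample_c": 0.05}
members = [name for name, loss in losses.items() if membership_inference(loss)]
```

Stronger variants calibrate the threshold per example or train shadow models rather than using a single fixed cutoff.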
Protecting LLMs against Privacy Attacks While Preserving Utility
Authors: Gunika Dhingra, Saumil Sood, Zeba Mohsin Wase, Arshdeep Bahga, Vijay K. Madisetti. Journal of Information Security, 2024, No. 4, pp. 448-473.
The recent interest in the deployment of Generative AI applications that use large language models (LLMs) has brought to the forefront significant privacy concerns, notably the leakage of Personally Identifiable Information (PII) and other confidential or protected information that may have been memorized during training, specifically during a fine-tuning or customization process. This inadvertent leakage of sensitive information typically occurs when the models are subjected to black-box attacks. To address the growing concerns of safeguarding private and sensitive information while simultaneously preserving its utility, we analyze the performance of Targeted Catastrophic Forgetting (TCF). TCF involves preserving targeted pieces of sensitive information within datasets through an iterative pipeline which significantly reduces the likelihood of such information being leaked or reproduced by the model during black-box attacks, such as the autocompletion attack in our case. The experiments conducted using TCF evidently demonstrate its capability to reduce the extraction of PII while still preserving the context and utility of the target application.
Keywords: Large Language Models, PII Leakage, Privacy, Memorization, Membership Inference Attack (MIA), Defenses, Generative Adversarial Networks (GANs), Synthetic Data
GUARDIAN: A Multi-Tiered Defense Architecture for Thwarting Prompt Injection Attacks on LLMs
Authors: Parijat Rai, Saumil Sood, Vijay K. Madisetti, Arshdeep Bahga. Journal of Software Engineering and Applications, 2024, No. 1, pp. 43-68.
This paper introduces a novel multi-tiered defense architecture to protect language models from adversarial prompt attacks. We construct adversarial prompts using strategies like role emulation and manipulative assistance to simulate real threats. We introduce a comprehensive, multi-tiered defense framework named GUARDIAN (Guardrails for Upholding Ethics in Language Models) comprising a system prompt filter, pre-processing filter leveraging a toxic classifier and ethical prompt generator, and pre-display filter using the model itself for output screening. Extensive testing on Meta’s Llama-2 model demonstrates the capability to block 100% of attack prompts. The approach also auto-suggests safer prompt alternatives, thereby bolstering language model security. Quantitatively evaluated defense layers and an ethical substitution mechanism represent key innovations to counter sophisticated attacks. The integrated methodology not only fortifies smaller LLMs against emerging cyber threats but also guides the broader application of LLMs in a secure and ethical manner.
Keywords: Large Language Models (LLMs), Adversarial Attack, Prompt Injection, Filter, Defense, Artificial Intelligence, Machine Learning, Cybersecurity
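The three tiers the GUARDIAN abstract describes (system prompt filter, pre-processing filter, pre-display output screen) can be sketched as a simple pipeline. The keyword checks below are toy stand-ins for the paper's toxic classifier and output screen; every name and marker string is an illustrative assumption:

```python
# Toy stand-in for a toxicity/jailbreak classifier's decision rule.
TOXIC_MARKERS = {"ignore previous instructions", "pretend you are"}

def system_prompt_filter(prompt: str) -> bool:
    # Tier 1: reject prompts that try to tamper with the system prompt.
    return "system prompt" not in prompt.lower()

def preprocessing_filter(prompt: str) -> bool:
    # Tier 2: reject prompts matching known manipulation patterns.
    p = prompt.lower()
    return not any(marker in p for marker in TOXIC_MARKERS)

def predisplay_filter(output: str) -> str:
    # Tier 3: screen the model's output before it reaches the user.
    return output if "BEGIN LEAK" not in output else "[response withheld]"

def guarded_generate(prompt: str, model) -> str:
    """Run a prompt through all three tiers around a model callable."""
    if not (system_prompt_filter(prompt) and preprocessing_filter(prompt)):
        return "[prompt blocked]"
    return predisplay_filter(model(prompt))
```

In the paper's design the middle tier also generates a safer alternative prompt rather than only rejecting; that substitution step is omitted here.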
Smaller & Smarter: Score-Driven Network Chaining of Smaller Language Models
Authors: Gunika Dhingra, Siddansh Chawla, Vijay K. Madisetti, Arshdeep Bahga. Journal of Software Engineering and Applications, 2024, No. 1, pp. 23-42.
With the continuous evolution and expanding applications of Large Language Models (LLMs), there has been a noticeable surge in the size of the emerging models. It is not solely the growth in model size, primarily measured by the number of parameters, but also the subsequent escalation in computational demands, hardware and software prerequisites for training, all culminating in a substantial financial investment as well. In this paper, we present novel techniques like supervision, parallelization, and scoring functions to get better results out of chains of smaller language models, rather than relying solely on scaling up model size. Firstly, we propose an approach to quantify the performance of a Smaller Language Model (SLM) by introducing a corresponding supervisor model that incrementally corrects the encountered errors. Secondly, we propose an approach to utilize two smaller language models (in a network) performing the same task and retrieving the best relevant output from the two, ensuring peak performance for a specific task. Experimental evaluations establish the quantitative accuracy improvements on financial reasoning and arithmetic calculation tasks from utilizing techniques like supervisor models (in a network of models scenario), threshold scoring and parallel processing over a baseline study.
Keywords: Large Language Models (LLMs), Smaller Language Models (SLMs), Finance, Networking, Supervisor Model, Scoring Function
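The network-of-two-SLMs idea with threshold scoring can be sketched as follows. The scoring heuristic, threshold value, and fallback behavior are illustrative assumptions, not the paper's actual scoring function:

```python
def score(answer: str) -> float:
    """Toy stand-in for a task-specific scoring function: rewards a
    non-empty answer, and rewards a numeric answer (arithmetic tasks)."""
    s = answer.strip()
    return float(len(s) > 0) + (1.0 if s.isdigit() else 0.0)

def best_of_two(task: str, model_a, model_b, threshold: float = 1.5):
    """Run two SLMs on the same task and keep the higher-scoring output.
    Threshold scoring: if neither clears the bar, defer (e.g. to a
    supervisor model) by returning None."""
    out_a, out_b = model_a(task), model_b(task)
    best = out_a if score(out_a) >= score(out_b) else out_b
    return best if score(best) >= threshold else None
```

Here the numeric answer "4" outscores the word "four" and clears the threshold, while two empty outputs trigger the deferral path.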
Whispered Tuning: Data Privacy Preservation in Fine-Tuning LLMs through Differential Privacy
Authors: Tanmay Singh, Harshvardhan Aditya, Vijay K. Madisetti, Arshdeep Bahga. Journal of Software Engineering and Applications, 2024, No. 1, pp. 1-22.
The proliferation of Large Language Models (LLMs) across various sectors underscored the urgency of addressing potential privacy breaches. Vulnerabilities, such as prompt injection attacks and other adversarial tactics, could make these models inadvertently disclose their training data. Such disclosures could compromise personally identifiable information, posing significant privacy risks. In this paper, we proposed a novel multi-faceted approach called Whispered Tuning to address privacy leaks in large language models (LLMs). We integrated a PII redaction model, differential privacy techniques, and an output filter into the LLM fine-tuning process to enhance confidentiality. Additionally, we introduced novel ideas like the Epsilon Dial for adjustable privacy budgeting for differentiated Training Phases per data handler role. Through empirical validation, including attacks on non-private models, we demonstrated the robustness of our proposed solution SecureNLP in safeguarding privacy without compromising utility. This pioneering methodology significantly fortified LLMs against privacy infringements, enabling responsible adoption across sectors.
Keywords: NLP, Differential Privacy, Adversarial Attacks, Informed Decisions
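Differential privacy in fine-tuning is typically achieved DP-SGD-style: clip each gradient to a fixed norm, then add calibrated Gaussian noise. A minimal sketch of that generic mechanism (not the paper's Whispered Tuning pipeline; function and parameter names are assumptions):

```python
import random

def dp_noisy_gradient(grad, clip_norm=1.0, sigma=0.5, rng=None):
    """Clip a gradient vector to L2 norm clip_norm, then add Gaussian
    noise with standard deviation sigma * clip_norm (DP-SGD-style)."""
    rng = rng or random.Random(0)  # fixed seed here for reproducibility
    norm = sum(g * g for g in grad) ** 0.5
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    clipped = [g * scale for g in grad]
    return [g + rng.gauss(0.0, sigma * clip_norm) for g in clipped]

# A larger sigma means more noise and stronger privacy (smaller epsilon);
# an "epsilon dial" would expose exactly this trade-off as a setting.
noisy = dp_noisy_gradient([3.0, 4.0], clip_norm=1.0, sigma=0.5)
```

Production implementations clip per-sample gradients within a batch and track the cumulative privacy budget across training steps.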
Blockchain Platform for Industrial Internet of Things (Cited: 45)
Authors: Arshdeep Bahga, Vijay K. Madisetti. Journal of Software Engineering and Applications, 2016, No. 10, pp. 533-546.
Internet of Things (IoT) are being adopted for industrial and manufacturing applications such as manufacturing automation, remote machine diagnostics, prognostic health management of industrial machines and supply chain management. Cloud-Based Manufacturing is a recent on-demand model of manufacturing that is leveraging IoT technologies. While Cloud-Based Manufacturing enables on-demand access to manufacturing resources, a trusted intermediary is required for transactions between the users who wish to avail manufacturing services. We present a decentralized, peer-to-peer platform called BPIIoT for Industrial Internet of Things based on the Blockchain technology. With the use of Blockchain technology, the BPIIoT platform enables peers in a decentralized, trustless, peer-to-peer network to interact with each other without the need for a trusted intermediary.
Keywords: Internet of Things, Blockchain, Smart Contracts, Cloud-Based Manufacturing
Object Detection Meets LLMs: Model Fusion for Safety and Security
Authors: Zeba Mohsin Wase, Vijay K. Madisetti, Arshdeep Bahga. Journal of Software Engineering and Applications, 2023, No. 12, pp. 672-684.
This paper proposes a novel model fusion approach to enhance predictive capabilities of vision and language models by strategically integrating object detection and large language models. We have named this multimodal integration approach VOLTRON (Vision Object Linguistic Translation for Responsive Observation and Narration). VOLTRON is aimed at improving responses for self-driving vehicles in detecting small objects crossing roads and identifying merged or narrower lanes. The models are fused using a single layer to provide LLaMA2 (Large Language Model Meta AI) with object detection probabilities from YoloV8-n (You Only Look Once) translated into sentences. Experiments using specialized datasets showed accuracy improvements up to 88.16%. We provide a comprehensive exploration of the theoretical aspects that inform our model fusion approach, detailing the fundamental principles upon which it is built. Moreover, we elucidate the intricacies of the methodologies employed for merging these two disparate models, shedding light on the techniques and strategies used.
Keywords: Computer Vision, Large Language Models, Self-Driving Vehicles
A Value Token Transfer Protocol (VTTP) for Decentralized Finance
Authors: Arshdeep Bahga, Vijay K. Madisetti. Journal of Software Engineering and Applications, 2020, No. 11, pp. 303-311.
We present Value Token Transfer Protocol (VTTP), a decentralized finance protocol for exchange of value or tokens within and between participating blockchain networks, fiat bank accounts and fiat wallets. The protocol allows intra-chain or inter-chain transfers of cryptocurrencies or tokens. VTTP works in both client-server and peer-to-peer models. The protocol comprises receiving from a client a transfer request to transfer value in a form of a cryptocurrency or a token, determining if the transfer request is intra-chain or inter-chain, transmitting to the client a response to the transfer request, the response comprising a raw transaction, receiving from the client a response to the raw transaction wherein a private key of a user is used to sign the raw transaction, defining a signed transaction, verifying a signature of the signed transaction and broadcasting the signed transaction to the sending and receiving blockchain networks.
Keywords: Blockchain, Decentralized Finance, Open Finance
Result-as-a-Service (RaaS): Persistent Helper Functions in a Serverless Offering
Authors: Arshdeep Bahga, Vijay K. Madisetti, Joel R. Corporan. Journal of Software Engineering and Applications, 2020, No. 10, pp. 278-287.
Serverless Computing or Functions-as-a-Service (FaaS) is an execution model for cloud computing environments where the cloud provider executes a piece of code (a function) by dynamically allocating resources. When a function has not been executed for a long time or is being executed for the first time, a new container has to be created and the execution environment has to be initialized, resulting in a cold start. Cold start can result in a higher latency. We propose a new computing and execution model for cloud environments called Result-as-a-Service (RaaS), which aims to reduce the computational cost and overhead while achieving high availability. In between successive calls to a function, a persistent helper function can precompute the function for different possible arguments and then distribute the results when a matching function call is found.
Keywords: Serverless Computing, Functions-as-a-Service, Lambda Functions
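The RaaS idea of a persistent helper that precomputes results between calls can be sketched as a precomputing cache; the class and method names are illustrative assumptions, not the paper's implementation:

```python
class ResultCache:
    """Persistent helper that precomputes a function's results for likely
    arguments, so a matching call is served instantly (no cold start)."""

    def __init__(self, fn, likely_args):
        self.fn = fn
        # Precompute between calls, as the persistent helper would.
        self.cache = {arg: fn(arg) for arg in likely_args}

    def call(self, arg):
        # Serve a precomputed result if available; else compute on demand.
        if arg not in self.cache:
            self.cache[arg] = self.fn(arg)
        return self.cache[arg]

# Usage: precompute squares for the arguments we expect to see.
square = ResultCache(lambda x: x * x, likely_args=[2, 3])
```

The interesting part of a real RaaS system is predicting which arguments are worth precomputing; here that set is simply given up front.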
Software Defined Things in Manufacturing Networks
Authors: Arshdeep Bahga, Vijay K. Madisetti, Raj K. Madisetti, Andrew Dugenske. Journal of Software Engineering and Applications, 2016, No. 9, pp. 425-438.
IoT technologies are being rapidly adopted for manufacturing automation, remote machine diagnostics, prognostic health management of industrial machines and supply chain management. A recent on-demand model of manufacturing that is leveraging IoT technologies is called Cloud-Based Manufacturing. We propose a Software-Defined Industrial Internet of Things (SD-IIoT) platform as a key enabler for cloud manufacturing, allowing flexible integration of legacy shop floor equipment into the platform. SD-IIoT enables access to manufacturing resources and allows exchange of data between industrial machines and cloud-based manufacturing applications.
Keywords: Software-Defined Things, Cloud-Based Manufacturing, NETCONF, LoRa
Synthetic Workload Generation for Cloud Computing Applications (Cited: 1)
Authors: Arshdeep Bahga, Vijay Krishna Madisetti. Journal of Software Engineering and Applications, 2011, No. 7, pp. 396-410.
We present techniques for characterization, modeling and generation of workloads for cloud computing applications. Methods for capturing the workloads of cloud computing applications in two different models - benchmark application and workload models - are described. We give the design and implementation of a synthetic workload generator that accepts the benchmark and workload model specifications generated by the characterization and modeling of workloads of cloud computing applications. We propose the Georgia Tech Cloud Workload Specification Language (GT-CWSL) that provides a structured way for specification of application workloads. The GT-CWSL combines the specifications of benchmark and workload models to create workload specifications that are used by a synthetic workload generator to generate synthetic workloads for performance evaluation of cloud computing applications.
Keywords: Synthetic Workload, Benchmarking, Analytical Modeling, Cloud Computing, Workload Specification Language
Cloud-Based Information Technology Framework for Data Driven Intelligent Transportation Systems
Authors: Arshdeep Bahga, Vijay K. Madisetti. Journal of Transportation Technologies, 2013, No. 2, pp. 131-141.
We present a novel cloud based IT framework, CloudTrack, for data driven intelligent transportation systems. We describe how the proposed framework can be leveraged for real-time fresh food supply tracking and monitoring. CloudTrack allows efficient storage, processing and analysis of real-time location and sensor data collected from fresh food supply vehicles. This paper describes the architecture, design, and implementation of CloudTrack, and how the proposed cloud-based IT framework leverages the parallel computing capability of a computing cloud based on a large-scale distributed batch processing infrastructure. A dynamic vehicle routing approach is adopted where the alerts trigger the generation of new routes. CloudTrack provides the global information of the entire fleet of food supply vehicles and can be used to track and monitor a large number of vehicles in real-time. Our approach leverages the advantages of the IT capabilities of a computing cloud into the operations and supply chain.
Keywords: Cloud Computing, Vehicle Routing, Supply Chain, Tracking, Hadoop
Performance Evaluation Approach for Multi-Tier Cloud Applications
Authors: Arshdeep Bahga, Vijay K. Madisetti. Journal of Software Engineering and Applications, 2013, No. 2, pp. 74-83.
Complex multi-tier applications deployed in cloud computing environments can experience rapid changes in their workloads. To ensure market readiness of such applications, adequate resources need to be provisioned so that the applications can meet the demands of specified workload levels and at the same time ensure that service level agreements are met. Multi-tier cloud applications can have complex deployment configurations with load balancers, web servers, application servers and database servers. Complex dependencies may exist between servers in various tiers. To support provisioning and capacity planning decisions, performance testing approaches with synthetic workloads are used. Accuracy of a performance testing approach is determined by how closely the generated synthetic workloads mimic the realistic workloads. Since multi-tier applications can have varied deployment configurations and characteristic workloads, there is a need for a generic performance testing methodology that allows accurately modeling the performance of applications. We propose a methodology for performance testing of complex multi-tier applications. The workloads of multi-tier cloud applications are captured in two different models - benchmark application and workload models. An architecture model captures the deployment configurations of multi-tier applications. We propose a rapid deployment prototyping methodology that can help in choosing the best and most cost effective deployments for multi-tier applications that meet the specified performance requirements. We also describe a system bottleneck detection approach based on experimental evaluation of multi-tier applications.
Keywords: Performance Modeling, Synthetic Workload, Benchmarking, Multi-Tier Applications, Cloud Computing