S3ML: Secure Serving System for Machine Learning Inference
Cited by: 1
Abstract: As privacy protection draws increasing attention in machine learning (ML), building ML serving systems with data security guarantees is becoming more important. Meanwhile, trusted execution environments such as Intel SGX are increasingly used to develop trusted applications and systems. SGX provides developers with hardware-based secure containers, called enclaves, that guarantee application confidentiality and integrity. This paper presents S3ML, an SGX-based secure serving system for ML inference. S3ML runs ML models inside SGX enclaves to protect user privacy. To build a practical SGX-based secure serving system, S3ML addresses challenges from two directions.
First, to ensure high availability and scalability, an ML inference service typically consists of multiple backend model server instances. When these instances run inside SGX enclaves, new system architectures and protocols are needed to synchronize cryptographic certificates and keys so that a secure distributed enclave cluster can be built. S3ML designs a dedicated attestation-based enclave configuration service that generates, persists, and distributes certificates and keys among clients and model server instances. S3ML can then reuse existing infrastructure to perform transparent load balancing and failover, ensuring high availability and scalability of the service.
Second, SGX enclaves run in a special memory region called the enclave page cache (EPC), which is limited in size and contended for by all enclaves on a host. The performance of applications running in enclaves is therefore vulnerable to EPC interference. To satisfy the service-level objectives (SLOs) of ML inference services, S3ML builds its model servers from lightweight ML frameworks and models to reduce EPC consumption. In addition, offline analysis shows that EPC paging throughput is a feasible indirect monitoring metric for safeguarding SLOs. Based on this finding, S3ML uses real-time EPC paging information to control the load balancing and horizontal scaling of the service. S3ML is implemented on top of Kubernetes, TensorFlow Lite, and Occlum, and its system overhead, feasibility, and effectiveness are evaluated through extensive experiments on a series of popular ML models.
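As a rough illustration of the last idea in the abstract, the sketch below routes requests and triggers scale-out based on an EPC paging metric. It is not taken from the paper: the `Replica` type, the `epc_paging_pages_per_s` field, the threshold value, and both policies (route to the least-paging replica, scale out when every replica exceeds the threshold) are assumptions made only for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Replica:
    """Hypothetical view of one model server instance running in an enclave."""
    name: str
    # Assumed metric: EPC pages swapped per second on the replica's host.
    epc_paging_pages_per_s: float


def pick_replica(replicas: List[Replica]) -> Replica:
    """Route the next request to the replica with the lowest EPC paging rate."""
    return min(replicas, key=lambda r: r.epc_paging_pages_per_s)


def should_scale_out(replicas: List[Replica], threshold: float) -> bool:
    """Request another replica only when every existing one exceeds the threshold."""
    return bool(replicas) and all(
        r.epc_paging_pages_per_s > threshold for r in replicas
    )


if __name__ == "__main__":
    fleet = [
        Replica("server-a", 1200.0),
        Replica("server-b", 300.0),
        Replica("server-c", 800.0),
    ]
    print(pick_replica(fleet).name)
    print(should_scale_out(fleet, threshold=1000.0))
```

In a real deployment the paging rate would have to come from a host-level monitor, and the threshold would have to be calibrated against the measured relationship between paging throughput and inference latency for each model.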
Authors: MA Jun-Ming (马俊明); WU Bing-Zhe (吴秉哲); YU Chao-Fan (余超凡); ZHOU Ai-Hui (周爱辉); WU Xi-Bin (巫锡斌); CHEN Xiang-Qun (陈向群)
Affiliations: School of Software and Microelectronics, Peking University, Beijing 102600, China; School of Electronics Engineering and Computer Science, Peking University, Beijing 100871, China; Key Laboratory of High Confidence Software Technologies of Ministry of Education (Peking University), Beijing 100871, China; Ant Group, Hangzhou 310013, China
Source: Journal of Software (indexed in EI, CSCD, and the Peking University Core Journals list), 2022, Issue 9, pp. 3312-3330 (19 pages)
Funding: National Key Research and Development Program of China (2017YFE0123600)
Keywords: machine learning inference; serving system; SGX; trusted computing; privacy-preserving