Funding: Supported by ZTE Industry-University-Institute Cooperation Funds under Grant No. HC-CN-20181128026.
Abstract: Byte-addressable non-volatile memory (NVM), a new participant in the storage hierarchy, delivers extremely high storage performance and forces changes to current filesystem designs. The page cache, once a significant mechanism for bridging the performance gap between dynamic random access memory (DRAM) and block devices, is now a liability that heavily hinders the write performance of NVM filesystems. State-of-the-art NVM filesystems therefore use direct access (DAX) to bypass the page cache entirely. However, DRAM still provides higher bandwidth than NVM, so bypassing the page cache prevents skewed read workloads from benefiting from DRAM's higher bandwidth and leads to sub-optimal system performance. In this paper, we propose RCache, a read-intensive workload-aware page cache for NVM filesystems. Unlike traditional caching mechanisms, in which all reads go through DRAM, RCache uses a tiered page cache design: it assigns hot data to DRAM and cold data to NVM, and serves reads from both. To avoid copying data to DRAM on the critical path, RCache migrates data from NVM to DRAM in a background thread. Additionally, RCache manages data in DRAM in a lock-free manner for better latency and scalability. Evaluations on Intel Optane Data Center (DC) Persistent Memory Modules show that, compared with NOVA, RCache achieves 3 times higher bandwidth for read-intensive workloads and introduces little performance loss for write operations.
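The tiered read path described in this abstract can be illustrated with a toy model. This is only a sketch under stated assumptions, not RCache's implementation: the class name `TieredReadCache`, the read-count threshold, the DRAM capacity limit, and the invalidate-on-write policy are all hypothetical, and plain dictionaries stand in for DRAM and NVM. The migration pass is an ordinary method here so the behaviour is deterministic; in a real system a background thread would run it, and the lock-free DRAM management mentioned in the abstract is not modelled.

```python
from collections import Counter


class TieredReadCache:
    """Toy model: 'nvm' holds every page, 'dram' holds copies of hot pages."""

    def __init__(self, nvm_pages, hot_threshold=3, dram_capacity=2):
        self.nvm = dict(nvm_pages)          # page number -> bytes (stand-in for NVM)
        self.dram = {}                      # hot-page copies (stand-in for DRAM)
        self.hits = Counter()               # read count per page
        self.hot_threshold = hot_threshold  # hypothetical hotness criterion
        self.dram_capacity = dram_capacity  # hypothetical DRAM budget, in pages

    def read(self, page_no):
        """Serve from DRAM when a copy exists, otherwise straight from NVM.

        A miss never copies data into DRAM, so the read path stays short.
        """
        self.hits[page_no] += 1
        if page_no in self.dram:
            return self.dram[page_no], "dram"
        return self.nvm[page_no], "nvm"

    def migrate_once(self):
        """One migration pass: copy the hottest uncached pages into DRAM."""
        for page_no, count in self.hits.most_common():
            if len(self.dram) >= self.dram_capacity or count < self.hot_threshold:
                break
            if page_no not in self.dram:
                self.dram[page_no] = self.nvm[page_no]

    def write(self, page_no, data):
        """Assumed policy: write to NVM and drop any stale DRAM copy."""
        self.nvm[page_no] = data
        self.dram.pop(page_no, None)
```

For example, a page read twice with `hot_threshold=2` is still served from the NVM stand-in until `migrate_once()` runs, after which reads of that page come from the DRAM stand-in while cold pages continue to be read from NVM.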
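Abstract: Persistent memory (PMEM) combines the low-latency byte addressability of memory with the persistence of disks, and it will have a revolutionary and far-reaching impact on existing software architectures. Distributed storage is widely used in cloud computing and data centers, but existing back-end storage engines, typified by Ceph BlueStore, were designed for traditional hard disk drives and solid-state drives (SSDs), and their optimizations do not let the advantages of PMEM come into play. This paper proposes MixStore, a back-end store based on persistent memory and SSD. Using volatile segment marking and a pending-deletion list, MixStore implements a concurrent skip list suited to persistent memory, which replaces RocksDB for metadata management. This guarantees transactional consistency, eliminates problems such as the performance jitter caused by BlueStore's compaction, and improves concurrent metadata access performance. Building on this metadata management mechanism, an optimized data object storage design places unaligned small data objects in PMEM and aligned large data objects on SSD, exploiting the byte addressability and persistence of PMEM together with the large capacity and low cost of SSD. Combined with delayed writes and copy-on-write (CoW), it optimizes the data update strategy, eliminates the write amplification caused by BlueStore's write-ahead log (WAL), and improves small-write performance. Test results show that, on the same hardware, MixStore improves write throughput by 59% and reduces write latency by 37% compared with BlueStore, effectively improving system performance.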
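The placement rule in this abstract (unaligned small data in PMEM, aligned large blocks on SSD) can be sketched as a function that splits one write into pieces. This is a hedged illustration, not MixStore's code: the function name `split_write`, the 4096-byte block size, and the idea of splitting a single write into an unaligned head, an aligned middle, and an unaligned tail are assumptions made for the example; the abstract only states which kind of data goes to which device.

```python
def split_write(offset, length, block=4096):
    """Split a write into (tier, offset, length) pieces.

    Hypothetical policy: whole aligned blocks go to "ssd"; any unaligned
    head or tail, and any write smaller than one aligned block, goes to "pmem".
    """
    if length <= 0:
        return []
    end = offset + length
    first_aligned = -(-offset // block) * block  # offset rounded up to a block boundary
    last_aligned = (end // block) * block        # end rounded down to a block boundary
    if first_aligned >= last_aligned:
        # No complete aligned block is covered: the whole write is "small".
        return [("pmem", offset, length)]
    pieces = []
    if offset < first_aligned:
        pieces.append(("pmem", offset, first_aligned - offset))
    pieces.append(("ssd", first_aligned, last_aligned - first_aligned))
    if last_aligned < end:
        pieces.append(("pmem", last_aligned, end - last_aligned))
    return pieces
```

Under these assumptions, a 200-byte write at offset 100 goes entirely to the PMEM tier, while an 8300-byte write at offset 4000 yields a 96-byte PMEM head, two aligned SSD blocks, and a 12-byte PMEM tail.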
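Abstract: In April 2019, Intel officially released Optane DC persistent memory, based on 3D XPoint technology, which offers new opportunities for building efficient persistent memory storage systems. However, existing storage system software cannot make good use of its byte addressability, so the performance of persistent memory is hard to exploit fully. This paper proposes HDPM, a hybrid management mechanism for filesystem data pages. By selectively using copy-on-write and a log structure to manage file data, HDPM takes full advantage of the byte addressability of persistent memory and avoids the write amplification that a traditional single-mode design incurs on unaligned or small writes. To avoid hurting read performance, HDPM introduces a reverse scan mechanism that reconstructs data pages from the log structure without extra data copies. HDPM also proposes a multi-level garbage collection mechanism for log cleaning: when a single log structure grows too large, it is reclaimed proactively on the read/write path; when persistent memory space is constrained, a background thread releases log space asynchronously using a lock-free mechanism. Experiments show that, compared with the NOVA filesystem, HDPM reduces single-thread write latency by up to 58% without affecting read latency; multi-threaded Filebench tests show that HDPM improves throughput by 33% over NOVA.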
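One way to read the reverse scan idea is that a page is rebuilt by walking its log from the newest entry to the oldest, so every output byte is written exactly once and older, overwritten data is never copied. The sketch below illustrates that reading only; it is an assumption about the mechanism, not HDPM's implementation, and the function name `rebuild_page` and the `(offset, data)` log entry layout are hypothetical.

```python
def rebuild_page(base, log):
    """Rebuild a page from its base copy plus a log of small writes.

    base: bytes of the page as last written in full (e.g. by copy-on-write).
    log:  list of (offset, data) entries, oldest first, all within the page.
    """
    size = len(base)
    out = bytearray(size)
    filled = [False] * size
    remaining = size
    # Newest entries win, so scan backwards and skip bytes already placed.
    for offset, data in reversed(log):
        for i, value in enumerate(data):
            pos = offset + i
            if not filled[pos]:
                out[pos] = value
                filled[pos] = True
                remaining -= 1
        if remaining == 0:
            break  # page fully covered; older entries cannot matter
    # Bytes never touched by the log come from the base page.
    for pos in range(size):
        if not filled[pos]:
            out[pos] = base[pos]
    return bytes(out)
```

A forward replay would produce the same page but would copy overwritten regions once per log entry that touches them; the backward scan touches each byte of the output once.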
Abstract: Cloud computing faces a series of challenges, such as insufficient bandwidth, unsatisfactory real-time performance, privacy protection, and energy consumption. Edge computing has emerged to overcome these challenges. Edge computing refers to a process in which an open platform that converges the core capabilities of networks, computing, storage, and applications provides intelligent services at the network edge, near the source of the objects or data, to meet the critical requirements of industry digitization for agile connection, real-time services, data optimization, application intelligence, and security and privacy protection. Edge computing consists of three elements: edge, computing, and intelligence. Edge computing and the Internet of Things (IoT) reinforce each other, and edge computing and cloud computing complement each other. In the edge computing architecture, resources are distributed to the edge nodes, so the storage system is near users and the computation function is near data. In this way, the stress on the backbone network can be lessened. With this architecture, the existing key technologies for computation, networks, and storage will change significantly. ZTE's edge computing solutions can ensure the service quality of operators and greatly enhance the experience of mobile users.
Abstract: Traditional named entity recognition methods require professional domain knowledge and substantial human effort to extract features, while Chinese named entity recognition methods based on neural network models suffer from overly uniform character vector representations. To solve these problems, we propose a Chinese named entity recognition method based on the BERT-BiLSTM-ATT-CRF model. First, we use the bidirectional encoder representations from transformers (BERT) pre-trained language model to obtain the semantic vector of each word according to its context. Second, the word vectors produced by BERT are fed into a bidirectional long short-term memory network with an embedded attention mechanism (BiLSTM-ATT) to capture the most important semantic information in the sentence. Finally, a conditional random field (CRF) is used to learn the dependencies between adjacent tags and obtain the globally optimal sentence-level tag sequence. The experimental results show that the proposed model achieves state-of-the-art performance on both the Microsoft Research Asia (MSRA) corpus and the People's Daily corpus, with F1 values of 94.77% and 95.97%, respectively.
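The final step, in which the CRF picks the globally optimal tag sequence from per-token scores and tag-to-tag dependencies, is commonly done with Viterbi decoding. The sketch below shows that standard algorithm in plain Python; it is not the paper's model, and the function name `viterbi` and all score values in the usage example are made up for illustration (in the described pipeline, the per-token scores would come from the BiLSTM-ATT layer and the transition scores would be learned by the CRF).

```python
def viterbi(emissions, transitions):
    """Return (best_path, best_score) for a linear-chain CRF.

    emissions[t][k]:   score of tag k at position t.
    transitions[j][k]: score of moving from tag j to tag k.
    """
    n_tags = len(transitions)
    score = list(emissions[0])  # best score of any path ending in each tag
    back = []                   # back[t][k]: best previous tag for tag k at t+1
    for emit in emissions[1:]:
        prev_best, new_score = [], []
        for k in range(n_tags):
            j = max(range(n_tags), key=lambda j: score[j] + transitions[j][k])
            prev_best.append(j)
            new_score.append(score[j] + transitions[j][k] + emit[k])
        back.append(prev_best)
        score = new_score
    # Trace the best path backwards from the best final tag.
    last = max(range(n_tags), key=lambda k: score[k])
    path = [last]
    for prev_best in reversed(back):
        path.append(prev_best[path[-1]])
    path.reverse()
    return path, score[last]


# Illustrative tags: 0 = O, 1 = B, 2 = I. The large negative score
# forbids the invalid transition O -> I.
emissions = [[2.0, 1.5, 0.0], [1.0, 0.0, 1.5], [2.0, 0.0, 0.5]]
transitions = [[0.0, 0.0, -100.0], [0.0, -1.0, 1.0], [0.0, -1.0, 0.5]]
path, total = viterbi(emissions, transitions)
```

With these made-up scores, choosing the highest-scoring tag at each position independently would give O, I, O, which contains the invalid O to I transition; decoding with the transition scores instead selects B, I, O, which is the point of learning dependencies between adjacent tags.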