Paper by Yunpeng Chai of the Department of Computer Science Accepted as a Full Paper at MSST 2014
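Recently, the paper "PLC-Cache: Endurable SSD Cache for Deduplication-based Primary Storage" by Yunpeng Chai, a faculty member in the Department of Computer Science, was accepted as a full paper at MSST 2014 (30th International Conference on Massive Storage Systems and Technology). MSST accepts only about 14 full papers each year, an acceptance rate of roughly 10% for full papers, which makes this a rare achievement.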
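The storage field currently has two leading international conferences: USENIX FAST and MSST. MSST has a long history: it has been held since the early days of storage systems and has now run for 30 years. Most of its authors come from well-known U.S. universities or from top storage companies such as IBM and NetApp. The research published there has played an important role in advancing the storage field and carries considerable influence in industry. The conference will be held in June 2014 at Santa Clara University, in the heart of Silicon Valley in the United States. This is the first time a faculty member from our university has published a paper at this conference.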
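The paper targets data deduplication storage systems, which are developing rapidly, and proposes the PLC-Cache algorithm to address the performance degradation and shortened lifetime that the SSD cache in such systems suffers because of excessive writes. PLC-Cache improves performance by 23.4% over the classical LRU algorithm and can extend the service life of the SSD cache by up to 15.7 times.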
The abstract of the paper reads: Data deduplication techniques improve cost efficiency by dramatically reducing space needs of storage systems. SSD-based data cache has been adopted to remedy the declining I/O performance induced by deduplication operations in the latency-sensitive primary storage. Unfortunately, frequent data updates caused by classical cache algorithms (e.g., FIFO, LRU, and LFU) inevitably slow down SSDs’ I/O processing speed while significantly shortening SSDs’ lifetime. To address this problem, we propose a new approach – PLC-Cache – to greatly improve the I/O performance as well as write durability of SSDs. PLC-Cache is conducive to amplifying the proportion of the Popular and Long-term Cached (PLC) data, which is infrequently written and kept in SSD cache in a long time period to catalyze cache hits, in an entire SSD written data set. PLC-Cache advocates a two-phase approach. First, non-popular data are ruled out from being written into SSDs. Second, PLC-Cache makes an effort to convert SSD written data into PLC-data as much as possible. Our experimental results based on a practical deduplication system indicate that compared with the existing caching schemes, PLC-Cache shortens data access latency by an average of 23.4%. Importantly, PLC-Cache improves the lifetime of SSD-based caches by reducing the amount of data written to SSDs by a factor of 15.7.
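To give a rough feel for why keeping non-popular data out of the SSD reduces writes, here is a toy Python sketch. It is not the PLC-Cache algorithm from the paper: it compares a plain LRU cache with an LRU cache that admits a block only after it has missed a given number of times. The class names, the `threshold` parameter, the miss-count popularity test, and the access trace are all assumptions made up for illustration.

```python
from collections import Counter, OrderedDict


class LRUCache:
    """Plain LRU: every miss writes the block into the (simulated) SSD."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # key order tracks recency, oldest first
        self.hits = 0
        self.ssd_writes = 0

    def admit(self, key):
        # Plain LRU admits every missed block.
        return True

    def access(self, key):
        if key in self.blocks:
            self.blocks.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return True
        if self.admit(key):
            if len(self.blocks) >= self.capacity:
                self.blocks.popitem(last=False)  # evict least recently used
            self.blocks[key] = True
            self.ssd_writes += 1
        return False


class PopularityGatedLRU(LRUCache):
    """Hypothetical variant: admit a block only once it has missed
    `threshold` times, so one-off blocks never reach the SSD."""

    def __init__(self, capacity, threshold):
        super().__init__(capacity)
        self.threshold = threshold
        self.miss_counts = Counter()

    def admit(self, key):
        self.miss_counts[key] += 1
        return self.miss_counts[key] >= self.threshold


def replay(cache, trace):
    """Feed an access trace to a cache and return (hits, ssd_writes)."""
    for key in trace:
        cache.access(key)
    return cache.hits, cache.ssd_writes


# Made-up trace: two hot blocks ("a", "b") interleaved with one-off blocks.
TRACE = ["a", "b", "c1", "a", "b", "c2", "a", "b", "c3", "a", "b"]

if __name__ == "__main__":
    print("plain LRU (hits, writes):", replay(LRUCache(2), TRACE))
    print("gated LRU (hits, writes):", replay(PopularityGatedLRU(2, 2), TRACE))
```

On this small trace the one-off blocks keep evicting the hot blocks under plain LRU, so every access becomes an SSD write, while the gated variant writes only the two hot blocks and then serves them from the cache. The paper's actual popularity test and its second phase (converting written data into PLC data) are not modeled here.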