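Abstract: Data centers built on hybrid storage that combines solid-state drives (SSDs) and hard disk drives (HDDs) have become a high-performance platform for big data computing, and data center workloads should be able to persist data with different characteristics to SSD or HDD on demand in order to improve overall system performance. Spark is an efficient big data computing framework widely used in industry and is especially well suited to applications with many iterations, because it can persist intermediate data in memory or on disk; persisting data to disk also removes the limit that insufficient memory capacity places on dataset size. However, the current Spark implementation does not provide an explicit SSD-oriented persistence interface. Although data can be distributed proportionally across different storage media according to configuration, users cannot specify the persistence medium of an RDD on demand according to the characteristics of the data, so the mechanism lacks specificity and flexibility. This not only becomes a bottleneck for further improving Spark performance but also seriously limits the performance of the hybrid storage system. In view of this, we propose, for the first time, an SSD-oriented data persistence strategy. We explore the principles of Spark data persistence, optimize Spark's persistence architecture for hybrid storage systems, and provide dedicated persistence APIs that let users explicitly and flexibly specify the persistence medium of an RDD. Experimental results based on SparkBench show that Spark optimized with this scheme improves performance by 14.02% on average compared with the native version.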
Funding: Supported in part by China NSFC Grants 61202377 and 61170076; the Guangdong Natural Science Foundation under Grant 2014A030313553; the China National High Technology Research and Development Program (863) under Grant 2015AA015305; the Joint Funds of the National Natural Science Foundation of China under Grant U1301252; and the Guangdong Province Key Laboratory Project under Grant 2012A061400024.
Abstract: Indoor localization is critical for medical care applications, e.g., locating or tracking patients inside a hospital building. Traditional Radio Frequency Identification (RFID) technologies are popular in this area because of their low cost. In such technologies, each tag acts as a transmitter, and the Received Signal Strength Indicator (RSSI) is measured at the readers. However, RSSI suffers severely from the multipath phenomenon, so localization accuracy degrades seriously in very large areas. To solve this problem, we introduce Wireless Sensor Networks (WSNs) with only a few nodes, each of which acts as both transmitter and receiver. In such networks, the change in signal strength (referred to as the dynamic of RSSI) is leveraged to select a cluster of reference tags as candidates. The final target location is then estimated from the RSSI relationships between the target tag and the candidate reference tags, which improves both localization accuracy and scalability. We propose two algorithms, SA-LANDMARC and COCKTAIL. Experiments show that the localization accuracy of the two algorithms reaches 0.7 m and 0.45 m, respectively. Compared with most traditional Radio Frequency (RF)-based approaches, localization accuracy is improved by at least 50%.
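The abstract does not give the details of SA-LANDMARC or COCKTAIL, so the following is only a minimal sketch of the classic LANDMARC-style step they build on: estimating a target position from the RSSI relationship between the target tag and reference tags at known positions. The function name `estimate_position`, its parameters, and the sample values are hypothetical, and the candidate selection based on the dynamic of RSSI is not modelled here.

```python
def estimate_position(target_rssi, reference_tags, k=4):
    """Weighted k-nearest-neighbour estimate in signal space (LANDMARC-style).

    target_rssi: RSSI readings (dBm) of the target tag, one per reader.
    reference_tags: list of ((x, y), rssi_list) for tags at known positions.
    k: number of nearest reference tags to combine (illustrative default).
    """
    scored = []
    for position, rssi in reference_tags:
        # Squared Euclidean distance between the two RSSI vectors.
        e2 = sum((t - r) ** 2 for t, r in zip(target_rssi, rssi))
        scored.append((e2, position))

    # Keep the k reference tags whose signal signature is closest.
    scored.sort(key=lambda pair: pair[0])
    nearest = scored[:k]

    # An identical signature would give an infinite weight; return it directly.
    for e2, position in nearest:
        if e2 == 0:
            return position

    # Weight each neighbour by 1 / E^2, then normalise by the weight sum.
    weights = [1.0 / e2 for e2, _ in nearest]
    total = sum(weights)
    x = sum(w * pos[0] for w, (_, pos) in zip(weights, nearest)) / total
    y = sum(w * pos[1] for w, (_, pos) in zip(weights, nearest)) / total
    return (x, y)


if __name__ == "__main__":
    # Hypothetical readings from two readers for four reference tags.
    tags = [
        ((0.0, 0.0), [-50, -50]),
        ((1.0, 0.0), [-60, -50]),
        ((0.0, 1.0), [-50, -60]),
        ((1.0, 1.0), [-60, -60]),
    ]
    print(estimate_position([-55, -55], tags))
```

Reference tags whose RSSI vectors are closer to the target's receive larger weights, so the estimate is pulled toward the positions that look most similar in signal space.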