Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 61802394, U1836209) and the Key Program of the National Natural Science Foundation of China (Grant No. 62032010).
Abstract: Open source software (OSS) code is widely reused in software development. However, reusing certain versions of OSS introduces 1-day vulnerabilities whose details are publicly available, which may be exploited and lead to serious security issues. Existing state-of-the-art OSS reuse detection approaches cannot reliably identify the specific versions of reused OSS: the features they select are not distinctive enough for version detection, and their matching scores are based only on similarity. This paper presents B2SMatcher, a fine-grained version identification tool for OSS in commercial off-the-shelf (COTS) software. We first discuss five kinds of version-sensitive code features that are trackable in both binary and source code. We categorize these features into program-level and function-level features and propose a two-stage version identification approach based on the two levels. B2SMatcher also identifies different types of OSS version reuse based on matching scores and matched feature instances. To extract source code features as accurately as possible, B2SMatcher uses machine learning methods to determine which source files are involved in compilation, and uses function abstraction and normalization to eliminate the cost of comparing functions that are redundant across versions. We evaluated B2SMatcher on 6351 candidate OSS versions and 585 binaries. The results show that B2SMatcher achieves a precision of up to 89.2% and outperforms state-of-the-art tools. Finally, we show how B2SMatcher can be used to evaluate real-world software and uncover security risks in practice.
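As a rough illustration of the two-stage idea described in the abstract, the sketch below first narrows the candidate versions using program-level features and then ranks the survivors using function-level features. It is not B2SMatcher's actual algorithm: the abstract does not name the five feature kinds or the scoring rule, so the set-overlap score, the `keep` cutoff, the function names `overlap` and `two_stage_match`, and the toy feature strings are all assumptions made for this example.

```python
from typing import Dict, List, Set, Tuple


def overlap(binary_feats: Set[str], version_feats: Set[str]) -> float:
    """Fraction of a version's features that also appear in the binary.

    Hypothetical score; the paper's real matching score is not given here.
    """
    if not version_feats:
        return 0.0
    return len(binary_feats & version_feats) / len(version_feats)


def two_stage_match(
    binary_prog: Set[str],
    binary_func: Set[str],
    versions: Dict[str, Dict[str, Set[str]]],
    keep: int = 3,
) -> List[Tuple[str, float]]:
    """Rank candidate versions for one binary (illustrative only)."""
    # Stage 1: coarse filter on program-level features, keep the best few.
    shortlist = sorted(
        versions,
        key=lambda v: overlap(binary_prog, versions[v]["program"]),
        reverse=True,
    )[:keep]
    # Stage 2: rank the shortlist on function-level features.
    scored = [(v, overlap(binary_func, versions[v]["function"])) for v in shortlist]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy data: feature strings are placeholders, not real OSS features.
versions = {
    "1.0": {"program": {"a", "b"}, "function": {"f1", "f2"}},
    "1.1": {"program": {"a", "b", "c"}, "function": {"f1", "f3"}},
    "2.0": {"program": {"x"}, "function": {"g"}},
}
ranked = two_stage_match({"a", "b", "c"}, {"f1", "f3"}, versions, keep=2)
print(ranked)
```

In this toy run, versions 1.0 and 1.1 tie on program-level features, and only the function-level stage separates them, which is the motivation for combining the two levels.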
Funding: Partially supported by the National Key Research and Development Program of China (No. 2016YFE0121800).
Abstract: The quality of measurement data is critical to the accuracy of both outdoor and indoor localization methods. Because measurement error is inevitable, analyzing the error data is essential for evaluating localization methods and identifying the effective ones. For indoor localization, Received Signal Strength (RSS) is a convenient, low-cost measurement that has been adopted in many localization approaches. However, using RSS data for localization raises a fundamental question: how accurate are these methods? The low accuracy of current RSS-based localization methods stems from oversimplified analysis of RSS measurement data. In this work, we adopt a generalized measurement model to find optimal estimators whose estimation error equals the Cramér-Rao Lower Bound (CRLB). Through mathematical techniques, we reveal the key factors that affect the accuracy of RSS-based localization methods and derive an analytical expression that captures the proportional relationship between localization accuracy and these factors. The significance of this result is twofold: first, we present a general expression for localization error data analytics, which can explain and predict the accuracy of range-based localization algorithms; second, further study of this general expression and its minimum can be used to optimize current localization algorithms.
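To make the CRLB idea concrete, the sketch below evaluates the textbook bound for estimating distance from a single RSS reading. It assumes the standard log-distance path-loss model with Gaussian shadowing in dB, which is not necessarily the generalized measurement model the paper uses. Under that assumption, the standard deviation of any unbiased range estimate is at least ln(10) * sigma * d / (10 * n), where d is the true distance, sigma is the shadowing standard deviation in dB, and n is the path-loss exponent. The function name `rss_range_crlb_std` and the parameter values are chosen only for this example.

```python
import math


def rss_range_crlb_std(distance_m: float, sigma_db: float, path_loss_exp: float) -> float:
    """Lower bound on the std. dev. of an unbiased range estimate from one RSS sample.

    Assumed model (not taken from the paper):
        RSS(d) = P0 - 10 * n * log10(d / d0) + X,  X ~ N(0, sigma_db^2)
    Fisher information for d is (10 * n / (d * ln 10))^2 / sigma_db^2,
    and the CRLB is its inverse; this returns the square root of the CRLB.
    """
    if distance_m <= 0 or path_loss_exp <= 0 or sigma_db < 0:
        raise ValueError("distance and path-loss exponent must be positive, sigma non-negative")
    return math.log(10) * sigma_db * distance_m / (10.0 * path_loss_exp)


# Example values are arbitrary: 10 m range, 4 dB shadowing, exponent 2.
bound_10m = rss_range_crlb_std(10.0, 4.0, 2.0)
bound_20m = rss_range_crlb_std(20.0, 4.0, 2.0)
print(f"std bound at 10 m: {bound_10m:.3f} m")
print(f"std bound at 20 m: {bound_20m:.3f} m")
```

Under this simple model the bound grows linearly with distance and with sigma, and shrinks as the path-loss exponent increases, which is one example of the kind of proportional relationship between accuracy and its governing factors that the abstract refers to.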