Abstract
Python is widely used in web crawling, machine learning, data analysis, and other fields. However, because the underlying system may be insecure, there is no guarantee that Python scripts remain trustworthy throughout their lifetime. When a system is attacked, the scripts stored on it are likely to be tampered with. The trustworthiness of Python scripts therefore needs to be checked through different configuration strategies, including integrity verification and vulnerability detection. In this paper, both integrity verification and vulnerability detection operate on two versions of a script, the original Python script and the current Python script, and the original script is assumed to have no vulnerabilities. By comparing the current script with the original, we can determine whether the current script is intact and, if its integrity has been broken, detect whether vulnerabilities have been introduced. Hash-based integrity verification is unsuitable in some cases, because it treats any change, even an added blank line, as illegal. We therefore propose loose integrity verification, which combines the UNIX diff tool with abstract syntax trees. Vulnerability detection starts from the premise that the original Python script has no vulnerabilities, and applies taint analysis on top of the vulnerability detection framework Bandit to find vulnerabilities. In addition, so that the way Python is used does not change, both the integrity verification and the vulnerability detection modules are embedded in the Python interpreter. Experiments show that the security analysis framework performs well and that Bandit with taint analysis greatly reduces false positives without affecting performance.
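The contrast between hash-based and loose integrity verification can be illustrated with a minimal sketch. It is not the paper's implementation: the paper combines the UNIX diff tool with abstract syntax trees, whereas this simplified version only compares whole-file AST dumps using Python's standard `ast` module. The function names `strict_equal` and `loose_equal` and the sample scripts are hypothetical, chosen only for illustration.

```python
import ast
import hashlib


def strict_equal(original: str, current: str) -> bool:
    """Hash-based check: any byte-level difference counts as a change."""
    def digest(source: str) -> str:
        return hashlib.sha256(source.encode("utf-8")).hexdigest()

    return digest(original) == digest(current)


def loose_equal(original: str, current: str) -> bool:
    """AST-based check (simplified assumption, no diff step):
    blank lines, comments, and spacing do not affect the result."""
    # ast.dump omits line and column attributes by default, so two
    # sources that parse to the same tree produce the same dump.
    return ast.dump(ast.parse(original)) == ast.dump(ast.parse(current))


# Hypothetical sample scripts.
original = "x = 1\nprint(x)\n"
reformatted = "x = 1\n\n# a harmless comment\nprint( x )\n"
tampered = "x = 2\nprint(x)\n"

if __name__ == "__main__":
    print("strict, reformatted:", strict_equal(original, reformatted))
    print("loose,  reformatted:", loose_equal(original, reformatted))
    print("loose,  tampered:   ", loose_equal(original, tampered))
```

In this sketch the reformatted script fails the hash check but passes the AST check, while the script whose assigned value was changed fails both. A diff step, as described in the abstract, would presumably narrow the comparison to the changed regions rather than the whole file.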
Funding
Supported by the National Natural Science Foundation of China (61572066).