Abstract: Smartphones are vulnerable to fraudulent use despite having strong authentication mechanisms. Active authentication based on behavioral biometrics is a solution to protect the privacy of data in smart devices. Machine-learning-based frameworks are effective for active authentication. However, the success of any machine-learning-based technique depends highly on the relevance of the available training data. In addition, training should be time-efficient. Keeping both issues in view, we have explored a novel fraudulent user detection method based solely on the app usage patterns of legitimate users. We hypothesized that every user has a unique pattern hidden in his/her usage of apps. Motivated by this hypothesis, we have designed a way to obtain training data that can be used by any machine learning model for effective authentication. To achieve better accuracy with reduced training time, we removed from the training samples those data instances of a specific user that did not contain any apps from the user-specific priority list. An information-theoretic app ranking scheme was used to prepare a user-targeted app priority list. The predictability of each instance related to a candidate app was calculated using a knockout approach. Finally, a weighted rank was calculated for each app specific to every user. Instances with low-ranked apps were removed to derive the reduced training set. Experiments with two datasets and seven classifiers revealed that our reduced training data significantly lowered the prediction error rates in classifying the legitimate user of a smartphone.
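The filtering idea in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not give the ranking formula, so the sketch assumes information gain as the information-theoretic score and omits the knockout and weighting steps entirely. The data layout (an `(owner, apps)` pair per instance) and all function names are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy in bits of a list of labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(instances, user, app):
    """Gain in predicting 'instance belongs to user' from 'instance contains app'.

    Assumption: this stands in for the unspecified ranking score in the abstract.
    """
    labels = [owner == user for owner, _ in instances]
    with_app = [owner == user for owner, apps in instances if app in apps]
    without_app = [owner == user for owner, apps in instances if app not in apps]
    n = len(instances)
    conditional = (len(with_app) / n) * entropy(with_app) \
        + (len(without_app) / n) * entropy(without_app)
    return entropy(labels) - conditional

def priority_list(instances, user, top_k):
    """Top-k apps used by `user`, ranked by information gain (ties: alphabetical)."""
    apps = sorted({a for owner, used in instances if owner == user for a in used})
    ranked = sorted(apps, key=lambda a: information_gain(instances, user, a),
                    reverse=True)
    return ranked[:top_k]

def reduce_training_set(instances, user, top_k):
    """Drop the user's instances that contain no app from the priority list."""
    keep = set(priority_list(instances, user, top_k))
    return [(owner, apps) for owner, apps in instances
            if owner != user or keep & set(apps)]

# Hypothetical toy data: each instance is (owner, set of apps seen in a session).
data = [
    ("alice", {"mail", "chess"}),
    ("alice", {"chess", "maps"}),
    ("alice", {"maps"}),
    ("bob", {"mail", "maps"}),
    ("bob", {"mail"}),
    ("bob", {"maps", "news"}),
]
top_apps = priority_list(data, "alice", top_k=1)
reduced = reduce_training_set(data, "alice", top_k=1)
print(top_apps, len(reduced))
```

In this toy data the app that only one user opens ranks highest for that user, and the user's session containing no priority-list app is dropped before training. The reduced set could then be passed to any classifier, as the abstract suggests.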
Funding: Supported by the National Natural Science Foundation of China (61103220), the Fundamental Research Funds for the Central Universities (6082013), the National Natural Science Foundation of Hubei (2011CDB456), and the Chenguang Program (2012710367).
Abstract: Focusing on the sensitive behaviors of malware, such as stealing private data and incurring charges, this paper proposes a new method to monitor software behaviors and detect malicious applications on the Android platform. Based on the theory and implementation of the Android Binder interprocess communication mechanism, a prototype system that integrates behavior monitoring and interception, malware detection, and identification is built in this work. The malware detection experiment uses 50 different samples, comprising 40 normal samples and 10 malicious samples. The theoretical analysis and experimental results demonstrate that this system is effective in malware detection and interception, with a true positive rate of 100% and a false positive rate below 3%.
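As a quick check on what the reported rates imply, the sketch below enumerates the false-positive counts consistent with them. It assumes, which the abstract does not state, that both rates are computed directly over the 40 normal and 10 malicious samples; the function name `rates` is hypothetical.

```python
def rates(tp, fn, fp, tn):
    """Return (true positive rate, false positive rate) from confusion counts."""
    return tp / (tp + fn), fp / (fp + tn)

benign, malicious = 40, 10

# A true positive rate of 100% means all malicious samples were flagged (fn = 0).
# Enumerate the false-positive counts that keep the false positive rate below 3%.
consistent = [fp for fp in range(benign + 1)
              if rates(malicious, 0, fp, benign - fp)[1] < 0.03]
print(consistent)
```

Under that assumption, at most one of the 40 normal samples can have been misclassified, since two false positives would already give a rate of 5%.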
Abstract: As the smartphone market leader, Android has been a prominent target for malware attacks. The number of malicious applications (apps) identified for it has increased continually over the past decade, creating an immense challenge for all parties involved. For market holders and researchers in particular, the large number of samples has made manual malware detection infeasible, leading to an influx of research investigating Machine Learning (ML) approaches to automate this process. However, while some of the proposed approaches achieve high performance, rapidly evolving Android malware has made them unable to maintain their accuracy over time. This has created a need in the community to conduct further research and build more flexible ML pipelines. Doing so, however, is currently hindered by the lack of a systematic overview of the existing literature from which to learn and on which to improve. Existing survey papers often focus only on parts of the ML process (e.g., data collection or model deployment), while omitting other important stages, such as model evaluation and explanation. In this paper, we address this problem with a review of 42 highly cited papers, spanning a decade of research (from 2011 to 2021). We introduce a novel procedural taxonomy of the published literature, covering how the papers have used ML algorithms, what features they have engineered, which dimensionality reduction techniques they have employed, what datasets they have used for training, and what their evaluation and explanation strategies are. Drawing from this taxonomy, we also identify gaps in knowledge and provide ideas for improvement and future work.