摘要
The National Institute of Standards and Technology(NIST)has identified natural language policies as the preferred expression of policy and implicitly called for an automated translation of ABAC natural language access control policy(NLACP)to a machine-readable form.To study the automation process,we consider the hierarchical ABAC model as our reference model since it better reflects the requirements of real-world organizations.Therefore,this paper focuses on the questions of:how can we automatically infer the hierarchical structure of an ABAC model given NLACPs;and,how can we extract and define the set of authorization attributes based on the resulting structure.To address these questions,we propose an approach built upon recent advancements in natural language processing and machine learning techniques.For such a solution,the lack of appropriate data often poses a bottleneck.Therefore,we decouple the primary contributions of this work into:(1)developing a practical framework to extract authorization attributes of hierarchical ABAC system from natural language artifacts,and(2)generating a set of realistic synthetic natural language access control policies(NLACPs)to evaluate the proposed framework.Our experimental results are promising as we achieved-in average-an F1-score of 0.96 when extracting attributes values of subjects,and 0.91 when extracting the values of objects’attributes from natural language access control policies.