Artificial intelligence (AI) is intrinsically data-driven. It calls for the application of statistical concepts through human-machine collaboration during the generation of data, the development of algorithms, and the evaluation of results. This paper discusses how such human-machine collaboration can be approached through the statistical concepts of population, question of interest, representativeness of training data, and scrutiny of results (PQRS). The PQRS workflow provides a conceptual framework for integrating statistical ideas with human input into AI products and research. These ideas include the experimental design principles of randomization and local control, as well as the principle of stability, which is used to gain reproducibility and interpretability of algorithms and data results. We discuss the use of these principles in the contexts of self-driving cars, automated medical diagnoses, and examples from the authors' collaborative research.
Funding: supported by the Army Research Office (No. W911NF1710005), the National Science Foundation (Nos. DMS-1613002 and IIS 1741340), the Center for Science of Information, a US National Science Foundation Science and Technology Center (No. CCF-0939370), and the National Library of Medicine of the NIH (No. T32LM012417).