Speech recognition technology has become increasingly common in our lives. Recently, a number of commercial smart speakers containing a personal assistant system based on speech recognition have come out. While smart speaker vendors have focused on the intelligence and convenience of their assistants, there has been little mention of the security of smart speakers. As smart speakers are becoming the hub for home automation, their security vulnerabilities can cause critical problems. In this paper, we categorize attack vectors and classify them into hardware-based, network-based, and software-based. Using these attack vectors, we describe detailed attack scenarios and show the results of tests on several commercial smart speakers. In addition, we suggest guidelines to mitigate various attacks against the smart speaker ecosystem.
The ability of a Voice User Interface (VUI) to understand how users will express their commands naturally and intuitively is an essential component of user experience, especially when the user is interacting with the VUI for the first time. Designing an automated method for testing the usability of a VUI is a challenge for two reasons. First, there are many different ways for a user to express the same intention, e.g. “play some music”, “put some music on”, etc., which are difficult to determine in advance. Second, many VUI apps today rely on the platform service provider (e.g. Amazon, Google, etc.) to perform many of the speech recognition and natural language processing tasks, and these services are provided as a black box. Consequently, it is difficult for the app developer to obtain information about errors and user feedback. In this paper, we propose a framework, VORI, to systematically evaluate the interactability of VUIs, as well as a new metric for quantifying the interactability of a VUI. We use VORI to analyze 127 applications on Alexa by sending over 82,931 commands. Our analysis results highlight that 41.7% of apps only accept strict input that has to exactly match the developer’s predefined sample commands, with an interactability score of 20% or less. This suggests developers should consider a better interactability strategy in the design of VUIs, and more research is needed to further explore the design space to improve interactability.
Funding: This work was supported by an Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2019-0-00231, Development of artificial intelligence based video security technology and systems for public infrastructure safety).