Funding: Financial support was received from Oak Ridge National Laboratory (ORNL)’s Liane Russell Distinguished Early Career Fellowship and grant no. TG0100000.
Abstract: Social media, including Twitter, has become an important source for disaster response. Yet most studies focus on a very limited amount of geotagged data (approximately 1% of all tweets) while discarding a rich body of data that contains location expressions in text. Location information is crucial to understanding the impact of disasters, including where damage has occurred and where the people who need help are situated. In this paper, we propose a novel two-stage machine learning- and deep learning-based framework for power outage detection from Twitter. First, we apply a probabilistic classification model using bag-of-n-grams features to find true power outage tweets. Second, we implement a new deep learning method, bidirectional long short-term memory (BiLSTM) networks, to extract outage locations from text. Results show a promising classification accuracy (86%) in identifying true power outage tweets, and approximately 20 times more usable tweets can be located compared with simply relying on geotagged tweets. The method of identifying location names used in this paper does not require language- or domain-specific external resources such as gazetteers or handcrafted features, so it can be extended to other situational awareness analyses and new applications.
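The first stage described in the abstract (a probabilistic classifier over bag-of-n-grams features) can be illustrated with a minimal sketch. The abstract does not name the classifier, so multinomial naive Bayes with add-one smoothing is assumed here purely for illustration; the class `NgramNaiveBayes`, the helper `ngrams`, and the toy tweets are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def ngrams(text, n_max=2):
    """Return word n-grams (unigrams up to n_max-grams) of a lowercased text."""
    tokens = text.lower().split()
    feats = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            feats.append(" ".join(tokens[i:i + n]))
    return feats


class NgramNaiveBayes:
    """Multinomial naive Bayes over bag-of-n-grams counts (assumed model)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)          # documents per class
        self.feature_counts = defaultdict(Counter)   # n-gram counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            feats = ngrams(text)
            self.feature_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def log_scores(self, text):
        """Return the unnormalized log posterior for each class."""
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + vocab_size  # add-one smoothing
            score = math.log(n_docs / total_docs)      # class prior
            for feat in ngrams(text):
                if feat in self.vocab:  # skip n-grams unseen in training
                    score += math.log((counts[feat] + 1) / denom)
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Hypothetical toy tweets, invented for this sketch only.
train_texts = [
    "power outage on main street no lights",
    "our power is out again in knoxville",
    "power outage reported downtown tonight",
    "this song gives me so much power",
    "power ranking of my favorite movies",
    "the power of a good breakfast",
]
train_labels = ["outage", "outage", "outage", "other", "other", "other"]

model = NgramNaiveBayes().fit(train_texts, train_labels)
```

Calling `model.predict("power outage in my street")` on this toy data selects the `"outage"` class, because n-grams such as "power outage" and "street" occur only in the outage examples. The second stage (BiLSTM-based location extraction) would need a deep learning library and labeled sequences, so it is not sketched here.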