Abstract
Geotagging is the process of labeling data and information with geographical identification metadata, and text mining is the process of deriving information from text through data analytics. Both techniques are used to mine rich sources of social media data, such as videos, websites, text, and Quick Response (QR) codes, and they have frequently been used to model consumer behavior and market trends. This study uses both techniques to understand the resilience of infrastructure in Chennai, India, using data mined from the 2015 flood. It is a conceptual study of the potential use of social media (Twitter in this case) to better understand infrastructure resilience. Using feature-extraction techniques, the research team extracted data from tweets generated by the Chennai population during the flood. The study shows that these techniques are useful in identifying the locations, defects, and failure intensities of infrastructure from the location metadata in geotags, words containing the locations, and the frequency of tweets from each location. However, more effort is needed to better utilize the text of the tweets, including a better understanding of the cultural contexts of the words used in the tweets, the contexts of the words used to describe the incidents, and the least frequently used words.