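Abstract
Traditional news retrieval services such as Baidu organize results by time or by focus topic, and do not organize or express news events along the temporal and spatial dimensions or in terms of their spatio-temporal development. This paper therefore proposes a method for the multi-dimensional description and spatio-temporal visualization of online Really Simple Syndication (RSS) news, helping users understand the spatio-temporal development and trends of hot news events comprehensively and intuitively. The method extracts news from the RSS news services of several websites, including Sina, Baidu and Google. To describe the temporal dimension, it approximates the occurrence time of a news event by the time the news was reported. To describe the spatial dimension, it dynamically parses and identifies Chinese place names in the news summary, then performs address matching and spatial positioning. Taking hot news on H7N9 avian influenza as an example, the paper visualizes the temporal dimension with transition colors and a statistical line chart, and the spatial dimension with circular symbols of graduated size, describing and displaying the development and trends of the H7N9 news event in multiple dimensions.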
Traditional news retrieval services such as Baidu return lists of related news sorted by time or by topic, and give no intuitive description of news events in the temporal and spatial dimensions or of their spatio-temporal development. This paper presents a method for the multi-dimensional description and spatio-temporal visualization of online RSS news events, which helps readers understand the spatio-temporal development of a whole news event. The method first pulls news from several well-known websites, such as Baidu, Sina and Google News, through their RSS (Really Simple Syndication) services, and then marks each news item with its temporal and spatial dimensions. The temporal description takes the publishing time of a news item as the occurrence time of the event. The spatial description dynamically parses and identifies Chinese geographical names in the news description and matches them with their geographical coordinates. Spatial description is the primary content of this article and proceeds in four stages: (i) XSL Transformation, which uses XSL (eXtensible Stylesheet Language) to transform a news RSS document into an HTML (Hypertext Markup Language) document; (ii) Description Extraction, which uses regular expressions to extract the news description from the news HTML document; (iii) Chinese Place Name Extraction, which uses ICTCLAS to extract geographical names from the description; and (iv) Geocoding, which uses the Google Geocoder API to obtain the geographical coordinates of each place name. Finally, the paper demonstrates the spatio-temporal visualization of news events and gives a brief analysis, taking H7N9 hot news as an example.
In the analysis, temporal visualization uses transition colors to show how the amount of news changes between two time nodes, and a line chart to show the trend in the total amount of news. Spatial visualization clusters news by province and uses symbols of different sizes to indicate the difference in news amounts between provinces.
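The description and aggregation steps outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it reads the RSS XML directly with the standard library rather than going through an XSL transformation, and a small hand-written gazetteer (`GAZETTEER`, hypothetical) stands in for both ICTCLAS place-name extraction and the Google Geocoder API. The sample feed, its headlines and the coordinates are invented for illustration, and the square-root sizing rule in `symbol_radius` is an assumption, since the abstract only says that symbol size varies with the amount of news.

```python
import math
import re
import xml.etree.ElementTree as ET
from collections import Counter
from email.utils import parsedate_to_datetime

# Hypothetical sample feed; items and dates are invented for illustration.
SAMPLE_RSS = """<rss version="2.0"><channel><title>demo</title>
<item><title>a</title><pubDate>Mon, 01 Apr 2013 08:00:00 +0800</pubDate>
<description>&lt;p&gt;上海市报告新增H7N9病例&lt;/p&gt;</description></item>
<item><title>b</title><pubDate>Mon, 01 Apr 2013 15:30:00 +0800</pubDate>
<description>&lt;p&gt;上海和杭州加强活禽市场监测&lt;/p&gt;</description></item>
<item><title>c</title><pubDate>Tue, 02 Apr 2013 09:10:00 +0800</pubDate>
<description>&lt;p&gt;南京发现疑似病例&lt;/p&gt;</description></item>
</channel></rss>"""

# Toy gazetteer (assumption): place name -> (province, latitude, longitude).
# It replaces the word segmentation and geocoding services named in the abstract.
GAZETTEER = {
    "上海": ("Shanghai", 31.23, 121.47),
    "杭州": ("Zhejiang", 30.27, 120.15),
    "南京": ("Jiangsu", 32.06, 118.80),
}

TAG_RE = re.compile(r"<[^>]+>")  # strips HTML tags embedded in the description


def parse_items(rss_text):
    """Yield (publication date, plain-text description) for each RSS item."""
    root = ET.fromstring(rss_text)
    for item in root.iter("item"):
        published = parsedate_to_datetime(item.findtext("pubDate"))
        description = TAG_RE.sub("", item.findtext("description", default=""))
        yield published.date(), description


def find_places(text):
    """Return the gazetteer place names that occur in the text."""
    return [name for name in GAZETTEER if name in text]


def summarize(rss_text):
    """Count news items per day and per province (each item once per province)."""
    by_day, by_province = Counter(), Counter()
    for day, description in parse_items(rss_text):
        by_day[day.isoformat()] += 1
        provinces = {GAZETTEER[name][0] for name in find_places(description)}
        for province in provinces:
            by_province[province] += 1
    return by_day, by_province


def symbol_radius(count, base=4.0):
    """Circle radius whose area grows linearly with the news count (assumed rule)."""
    return base * math.sqrt(count)


if __name__ == "__main__":
    days, provinces = summarize(SAMPLE_RSS)
    for day in sorted(days):
        print(day, days[day])
    for province, count in provinces.most_common():
        print(province, count, round(symbol_radius(count), 2))
```

The per-day counts correspond to the data behind the line chart, and the per-province counts to the data behind the sized symbols. Substring matching against a gazetteer is much cruder than segmentation-based place-name recognition, so it should be read only as a placeholder for that stage.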
Source
Journal of Geo-information Science (《地球信息科学学报》)
CSCD
Peking University Core Journal
2014, Issue 3, pp. 341-348 (8 pages)
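Funding
National Natural Science Foundation of China (41271394)
Specialized Research Fund for the Doctoral Program of Higher Education (20123718120001)
Open Research Fund of the Key Laboratory of Digital Mapping and Land Information Application Engineering, State Bureau of Surveying and Mapping (GCWD201107)
National Key Technology R&D Program (2011BAB01B04)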
Keywords
news event
RSS
multi-dimensional description
spatio-temporal visualization