Abstract
With advances in machine learning and data mining, the theory and methods of text analysis and text mining have attracted wide attention and produced valuable results in many fields. Most current text analysis, however, concentrates only on classifying text fragments and annotating text. This paper applies data mining methods to build a model based on statistics and a Markov process, and proposes a method that extracts place-name information from text and then uses a map service provider to complete that information into structured data. When the method was applied to extract address information from a sample case text, the great majority of place names were successfully found and structured.
Author
郭利荣
GUO Lirong (China Datacom Corporation Limited, Guangzhou, Guangdong 510630, China)
Source
《信息记录材料》
2022, No. 10, pp. 30-32 (3 pages)
Information Recording Materials