Abstract
Most semi-structured data exhibit a certain degree of structural regularity. Once converted into structured data stored in a relational database (RDB), they can be managed effectively with database management system (DBMS) techniques. Some semi-structured data, however, are difficult to transform because of their irregular structure; these are handled separately so that the mapping of the data as a whole remains lossless. We design an efficient algorithm and data structure to ensure lossless transformation. After completing the DTD conversion, we start from one of the simplest mapping schemes and propose an improved scheme: schema extraction is performed through data mining, different kinds of elements are transformed separately, and an effective method for handling overflow data is provided, so that a lossless mapping from semi-structured data to structured data can be achieved.
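The idea summarized above (extract a schema from the regular part of the data, store the irregular remainder as overflow data, and keep the mapping lossless) can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the frequency-based schema extraction, the `min_support` threshold, the function names, and the sample records are all assumptions made for illustration, and field values are assumed never to be `None`.

```python
from collections import Counter

def extract_schema(records, min_support=0.6):
    """Return the sorted field names occurring in at least min_support of the records.

    Assumption: a plain frequency threshold stands in for the data-mining step.
    """
    counts = Counter(field for rec in records for field in rec)
    threshold = min_support * len(records)
    return sorted(f for f, c in counts.items() if c >= threshold)

def to_relational(records, schema):
    """Split records into a fixed-column main table and an overflow table."""
    main_rows, overflow_rows = [], []
    for rec_id, rec in enumerate(records):
        # Regular fields become columns; a missing field is stored as None.
        main_rows.append((rec_id, *[rec.get(f) for f in schema]))
        # Fields outside the schema go to (record id, field, value) overflow rows.
        for field, value in rec.items():
            if field not in schema:
                overflow_rows.append((rec_id, field, value))
    return main_rows, overflow_rows

def from_relational(main_rows, overflow_rows, schema):
    """Rebuild the original records from the two tables."""
    records = {}
    for rec_id, *values in main_rows:
        records[rec_id] = {f: v for f, v in zip(schema, values) if v is not None}
    for rec_id, field, value in overflow_rows:
        records[rec_id][field] = value
    return [records[i] for i in sorted(records)]

# Hypothetical sample data: two irregular fields ("year", "note").
records = [
    {"title": "A", "author": "X", "year": "2001"},
    {"title": "B", "author": "Y"},
    {"title": "C", "author": "Z", "note": "draft"},
]
schema = extract_schema(records)
main_rows, overflow_rows = to_relational(records, schema)
restored = from_relational(main_rows, overflow_rows, schema)
```

In this sketch the mapping is lossless in the sense that `restored` equals `records`: every field that does not fit the extracted schema is kept in the overflow table rather than dropped.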
Funding
The Plan for Key University Faculty Members of the State Education Ministry and the "333" Talent Plan of Jiangsu Province