An Entity Schema Matching Algorithm (一种实体模式匹配算法)

Cited by: 1


Abstract: An entity schema matching algorithm for heterogeneous data sources is proposed. Working in a Chinese-English bilingual setting, the algorithm uses a Chinese-English word-sense similarity algorithm based on Chinese WordNet together with a Chinese word segmentation tool. It establishes mappings between different schemas of the same kind of entity using three factors: the distance between column names, the data types, and the part-of-speech (POS) composition of the data content. The algorithm can be used to analyze relationships between entities from different data sources in a dataspace, and to support Chinese-English semantic correlation analysis and entity schema matching in other research areas.
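
The three factors named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the signals are combined, so the weights, the threshold, the `Column` type, and all function names below are hypothetical. Plain string similarity from Python's `difflib` stands in for the WordNet-based Chinese-English word similarity, and a character-class profile stands in for the POS composition that a Chinese word segmentation tool would produce.

```python
import math
from collections import Counter
from dataclasses import dataclass
from difflib import SequenceMatcher

# Hypothetical weights: the source does not state how the factors are combined.
WEIGHTS = {"name": 0.5, "type": 0.2, "content": 0.3}


@dataclass
class Column:
    name: str
    dtype: str
    values: list


def name_similarity(a: str, b: str) -> float:
    # Stand-in for the WordNet-based word similarity: plain string similarity.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def type_similarity(a: str, b: str) -> float:
    # Simplest possible rule: identical declared types match, others do not.
    return 1.0 if a == b else 0.0


def char_class(ch: str) -> str:
    # Stand-in for POS tags: classify each character of the data content.
    if ch.isdigit():
        return "digit"
    if "\u4e00" <= ch <= "\u9fff":
        return "han"
    if ch.isalpha():
        return "latin"
    return "other"


def content_profile(values: list) -> dict:
    # Relative frequency of each character class over all values in a column.
    counts = Counter(char_class(ch) for v in values for ch in str(v))
    total = sum(counts.values()) or 1
    return {k: c / total for k, c in counts.items()}


def profile_similarity(p: dict, q: dict) -> float:
    # Cosine similarity between two frequency profiles.
    dot = sum(p.get(k, 0.0) * q.get(k, 0.0) for k in set(p) | set(q))
    norm = math.sqrt(sum(v * v for v in p.values())) * math.sqrt(
        sum(v * v for v in q.values())
    )
    return dot / norm if norm else 0.0


def column_similarity(c1: Column, c2: Column) -> float:
    # Weighted sum of the three factors.
    return (
        WEIGHTS["name"] * name_similarity(c1.name, c2.name)
        + WEIGHTS["type"] * type_similarity(c1.dtype, c2.dtype)
        + WEIGHTS["content"]
        * profile_similarity(content_profile(c1.values), content_profile(c2.values))
    )


def match_schemas(s1: list, s2: list, threshold: float = 0.5) -> dict:
    # For each column of s1, keep the best-scoring column of s2 if it
    # reaches the (hypothetical) threshold.
    mapping = {}
    for c1 in s1:
        best = max(s2, key=lambda c2: column_similarity(c1, c2), default=None)
        if best is not None and column_similarity(c1, best) >= threshold:
            mapping[c1.name] = best.name
    return mapping


if __name__ == "__main__":
    source = [Column("name", "str", ["Alice", "Bob"]), Column("age", "int", [30, 41])]
    target = [Column("full_name", "str", ["Carol"]), Column("years", "int", [25])]
    print(match_schemas(source, target))
```

In this sketch the name factor alone would not link `age` to `years`; the type and content factors carry that pair over the threshold, which is the kind of complementary evidence the abstract describes.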

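Authors: 吴思颖 (Wu Siying), 吴扬扬 (Wu Yangyang)

Affiliation: College of Computer Science and Technology, Huaqiao University (华侨大学计算机科学与技术学院)

Source: Journal of Zhengzhou University: Natural Science Edition (《郑州大学学报（理学版）》), indexed in CAS and the Peking University Core Journals list (北大核心), 2011, No. 1, pp. 50-56 (7 pages)

Funding: Key Project of the Fujian Provincial Science and Technology Plan (No. 2008I0021); Natural Science Foundation of Fujian Province (No. 2009J01289)

Keywords: schema matching; mapping; similarity

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]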


