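A Method for Acquiring Chinese Semantic Collocation Rules Based on Association Rule Mining (基于关联规则挖掘的汉语语义搭配规则获取方法); cited by: 5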

Automatic Acquisition of Chinese Semantic Collocation Rules Based on Association Rule Mining Technique

Abstract: Semantic collocations play an important role in parsing Chinese phrases and are useful for both lexical (word-sense) disambiguation and structural disambiguation in natural language processing systems. This paper proposes a corpus-based method for automatically acquiring semantic collocation rules for Chinese phrases. The method uses HowNet as its semantic knowledge resource and works on a Chinese phrase corpus annotated with syntactic and semantic information. It first applies metarule-guided, cross-level association rule mining, a data mining technique, to discover semantic collocation patterns in Chinese phrases automatically. It then filters the discovered patterns according to their statistics and generates a semantic collocation rule base from the best candidates. The experimental results show that the method is feasible and that the automatically acquired rules perform well in disambiguation.
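The mining step described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the concept hierarchy below is a hypothetical toy stand-in for HowNet, the verb-object pairs are invented, and the names `PARENT`, `PAIRS`, `ancestors`, and `mine_rules` are assumptions introduced only for this example. The metarule is taken to be "verb concept => object concept", and the filtering step is approximated by plain support and confidence thresholds.

```python
from collections import Counter
from itertools import product

# Hypothetical toy hierarchy (word or concept -> parent concept).
# This is NOT real HowNet data; it only mimics a semantic taxonomy.
PARENT = {
    "eat": "consume", "drink": "consume",
    "apple": "fruit", "pear": "fruit", "fruit": "food",
    "rice": "food", "tea": "beverage", "water": "beverage",
}

# Invented verb-object pairs standing in for an annotated phrase corpus.
PAIRS = [
    ("eat", "apple"), ("eat", "pear"), ("eat", "rice"),
    ("drink", "tea"), ("drink", "water"), ("eat", "apple"),
]


def ancestors(item):
    """Return the item followed by all of its ancestors, most specific first."""
    chain = [item]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def mine_rules(pairs, min_support=0.3, min_confidence=0.6):
    """Mine rules of the shape verb_concept => object_concept.

    The assumed metarule fixes the rule shape: one antecedent taken from
    the verb slot and one consequent taken from the object slot. The two
    sides may sit at different levels of the hierarchy (cross-level).
    """
    total = len(pairs)
    pair_count = Counter()
    verb_count = Counter()
    for verb, obj in pairs:
        verb_levels = ancestors(verb)
        # Count the verb at every level of generalisation.
        for v in verb_levels:
            verb_count[v] += 1
        # Count every cross-level combination of verb and object concepts.
        for v, o in product(verb_levels, ancestors(obj)):
            pair_count[(v, o)] += 1

    rules = []
    for (v, o), count in pair_count.items():
        support = count / total
        confidence = count / verb_count[v]
        # Assumed selection criterion: simple threshold filtering.
        if support >= min_support and confidence >= min_confidence:
            rules.append((v, o, support, confidence))
    # Highest confidence first, then highest support, then alphabetical.
    return sorted(rules, key=lambda r: (-r[3], -r[2], r[0], r[1]))


if __name__ == "__main__":
    for v, o, support, confidence in mine_rules(PAIRS):
        print(f"{v} => {o}  support={support:.2f}  confidence={confidence:.2f}")
```

Counting each pair at every level of the hierarchy lets a general rule such as "eat => food" pass the thresholds even when no single word pair is frequent enough on its own, which is the motivation for mining across levels rather than over surface words.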

Authors: Zheng Xuling (郑旭玲), Zhou Changle (周昌乐), Li Tangqiu (李堂秋), Chen Yidong (陈毅东)

Affiliation: Department of Computer Science, Xiamen University

Source: Journal of Xiamen University (Natural Science) (《厦门大学学报（自然科学版）》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2007, No. 3, pp. 331-336 (6 pages)

Funding: Supported by the National Natural Science Foundation of China (60373080)

Keywords: semantic rules; corpus; association rules; HowNet

Classification: TP391.2 [Automation and Computer Technology: Computer Application Technology]
