
The Method Research of Web-based Attribute Extraction for Training Classification Model (基于Web属性抽取训练分类模型的方法研究)

Cited by: 3
Abstract: General-purpose search engines return large volumes of information, answer queries imprecisely, and lack depth. To address these problems, this paper proposes Web-based product attribute extraction as a new search engine service model. Web-based product attribute extraction is in essence an automatic classification problem: given a classification scheme, the task is to decide automatically, according to the relevant product template, whether a candidate is an attribute or not. The key to the task is to find effective feature values and to determine the relevant classification rules; the classification algorithm is then evaluated with the P, R, and F measures.
Author: Wu Yueping (吴月萍)
Source: Journal of Shanghai Polytechnic University (《上海第二工业大学学报》), 2008, No. 1, pp. 29-34 (6 pages)
Keywords: attribute extraction; classification rule; feature value; maximum entropy
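
The abstract says the classification algorithm is evaluated with P, R, and F but does not define them. The sketch below assumes they are the usual precision, recall, and F-measure for a binary "attribute / not attribute" decision. The function name `precision_recall_f` and the default `beta = 1.0` (which gives the balanced F1 score) are illustrative assumptions, not details taken from the paper.

```python
from typing import Sequence, Tuple


def precision_recall_f(
    gold: Sequence[bool],
    predicted: Sequence[bool],
    beta: float = 1.0,  # assumption: beta = 1 gives the balanced F1 score
) -> Tuple[float, float, float]:
    """Return (P, R, F) for binary labels, where True means 'is an attribute'."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    # Count true positives, false positives, and false negatives.
    tp = sum(1 for g, p in zip(gold, predicted) if g and p)
    fp = sum(1 for g, p in zip(gold, predicted) if not g and p)
    fn = sum(1 for g, p in zip(gold, predicted) if g and not p)

    # Guard against empty denominators (no predicted or no gold positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    b2 = beta * beta
    denom = b2 * precision + recall
    f_score = (1 + b2) * precision * recall / denom if denom else 0.0
    return precision, recall, f_score


if __name__ == "__main__":
    # Hypothetical labels for four candidate strings: all are true attributes,
    # but the classifier accepts only the first one.
    gold = [True, True, True, True]
    predicted = [True, False, False, False]
    p, r, f = precision_recall_f(gold, predicted)
    print(f"P={p:.2f} R={r:.2f} F={f:.2f}")  # P=1.00 R=0.25 F=0.40
```

Reporting all three values matters for attribute extraction: a classifier that accepts very few candidates can reach high precision while missing most true attributes, and the F value exposes that trade-off.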

References (7)

  • 1 TJONG KIM SANG E F, DE MEULDER F. Introduction to the CoNLL-2003 shared task: language-independent named entity recognition [C]. Proceedings of CoNLL-2003, Edmonton, Canada, 2003: 142-147.
  • 2 LI Sujian, LIU Qun, YANG Zhifeng. Chunk parsing based on the maximum entropy model [J]. Chinese Journal of Computers, 2003, 26(12): 1722-1727 (in Chinese). Cited by: 58
  • 3 ZHOU Yaqian, GUO Yikun, HUANG Xuanjing, WU Lide. Chinese and English base noun phrase identification based on the maximum entropy method [J]. Journal of Computer Research and Development, 2003, 40(3): 440-446 (in Chinese). Cited by: 61
  • 4 BORTHWICK A, STERLING J, et al. Exploiting diverse knowledge sources via maximum entropy in named entity recognition [C]. Proceedings of the 6th Workshop on Very Large Corpora, Montreal, Canada, 1998: 152-160.
  • 5 SKUT W, BRANTS T. A maximum entropy partial parser for unrestricted text [C]. Proceedings of the 6th Workshop on Very Large Corpora, Montreal, Canada, 1998: 143-151.
  • 6 DARROCH J N, RATCLIFF D. Generalized iterative scaling for log-linear models [J]. Annals of Mathematical Statistics, 1972, 43(5): 1470-1480.
  • 7 ZHANG Le. Maximum Entropy Modeling Toolkit for Python and C++. URL http://homepages.inf.ed.ac.uk/s0450736/. 2004: 23-24.

