A Parameter Threshold Optimizing Approach to Natural Language Based on Genetic Algorithm
Cited by: 1
Abstract: This paper proposes a genetic-algorithm-based method for automatically optimizing the parameter thresholds for verb-verb collocation. The method has three main advantages. (1) It is a data-driven machine learning method, so it avoids, to some extent, the human error inherent in setting parameter thresholds empirically. (2) Whereas empirical methods determine one parameter threshold at a time, it optimizes all parameter thresholds jointly. (3) Whereas empirical methods apply the same set of thresholds to all data, it dynamically obtains thresholds suited to data of different sizes or from different domains. Comparative experiments show that the four thresholds obtained with this method clearly improve the F-value for verb-verb collocation. The method applies not only to threshold selection for verb-verb collocation but also to other multi-parameter threshold selection problems, such as rule boundary optimization and parameter threshold optimization in classification and clustering.
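The abstract gives no implementation details, so the following is only a minimal sketch of joint threshold optimization with a genetic algorithm, not the authors' method. It assumes that a candidate pair is accepted when each of its four scores reaches the corresponding threshold, that the F-value on labeled samples serves as the fitness function, and that all scores lie in [0, 1]. The function names, the synthetic data, the hidden thresholds, and the GA settings (truncation selection, one-point crossover, Gaussian mutation, elitism) are all hypothetical.

```python
import random


def f_value(thresholds, samples):
    """F-value of the rule: accept a candidate if every score >= its threshold."""
    tp = fp = fn = 0
    for scores, is_collocation in samples:
        accepted = all(s >= t for s, t in zip(scores, thresholds))
        if accepted and is_collocation:
            tp += 1
        elif accepted:
            fp += 1
        elif is_collocation:
            fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def optimize_thresholds(samples, initial, pop_size=30, generations=40,
                        mutation_rate=0.2, seed=0):
    """Search all thresholds jointly; `initial` is a hand-picked starting guess."""
    rng = random.Random(seed)
    n = len(initial)
    # Seed the population with the hand-picked thresholds plus random ones.
    population = [list(initial)]
    population += [[rng.random() for _ in range(n)] for _ in range(pop_size - 1)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda c: f_value(c, samples), reverse=True)
        elite = ranked[:2]                # elitism: the best two survive unchanged
        parents = ranked[:pop_size // 2]  # truncation selection
        children = []
        while len(children) < pop_size - len(elite):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)     # one-point crossover
            child = a[:cut] + b[cut:]
            for i in range(n):            # Gaussian mutation, clamped to [0, 1]
                if rng.random() < mutation_rate:
                    child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0, 0.1)))
            children.append(child)
        population = elite + children
    best = max(population, key=lambda c: f_value(c, samples))
    return best, f_value(best, samples)


# Synthetic labeled data (NOT the paper's corpus): a candidate counts as a
# true collocation when all four scores reach hypothetical hidden thresholds.
data_rng = random.Random(42)
hidden = (0.3, 0.5, 0.2, 0.6)
samples = []
for _ in range(300):
    scores = [data_rng.random() for _ in range(4)]
    samples.append((scores, all(s >= h for s, h in zip(scores, hidden))))

baseline = [0.5, 0.5, 0.5, 0.5]  # stand-in for empirically chosen thresholds
baseline_f = f_value(baseline, samples)
best, best_f = optimize_thresholds(samples, initial=baseline)
print("baseline F:", round(baseline_f, 3))
print("optimized thresholds:", [round(t, 3) for t in best])
print("optimized F:", round(best_f, 3))
```

Because the hand-picked thresholds are placed in the initial population and the best individuals are carried over unchanged, the optimized F-value on the training samples cannot fall below the baseline. Rerunning on a different sample set yields thresholds fitted to that data, which is the dynamic behavior described in advantage (3).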
Source: Journal of Test and Measurement Technology (《测试技术学报》), 2006, No. 1, pp. 75-81 (7 pages)
Keywords: collocation; parameter threshold optimizing; genetic algorithm; test analysis; natural language