Abstract
This paper proposes a genetic-algorithm-based method for automatically optimizing the parameter thresholds used for verb-verb collocation. The method has three main advantages. First, it is a data-driven machine learning method, so it avoids, to some extent, the human error inherent in setting parameter thresholds empirically. Second, whereas empirical methods determine one parameter threshold at a time, this method optimizes all thresholds jointly. Third, whereas empirical methods apply the same set of thresholds to all data, this method dynamically obtains thresholds suited to data of different sizes or from different domains. Comparative experiments show that the four thresholds obtained by the method clearly improve the F-value for verb-verb collocation. The method is not limited to verb-verb collocation: it also applies to other multi-parameter threshold selection problems, such as rule boundary optimization and parameter threshold optimization in classification and clustering.
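The idea of optimizing several thresholds jointly against the F-value can be illustrated with a minimal sketch. Everything concrete below is an assumption made for illustration and is not taken from the paper: the four scores per candidate are unnamed synthetic statistics scaled to [0, 1], a candidate is accepted when every score meets its threshold, and the genetic operators (tournament selection, uniform crossover, Gaussian mutation, elitism) are generic choices rather than the authors' configuration.

```python
import random


def f_value(thresholds, samples):
    """F-value when a candidate is accepted only if every score meets its threshold."""
    tp = fp = fn = 0
    for scores, is_collocation in samples:
        accepted = all(s >= t for s, t in zip(scores, thresholds))
        if accepted and is_collocation:
            tp += 1
        elif accepted:
            fp += 1
        elif is_collocation:
            fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def make_toy_samples(rng, n=200):
    """Synthetic data (hypothetical): true collocations tend to score higher."""
    samples = []
    for _ in range(n):
        is_collocation = rng.random() < 0.4
        low, high = (0.4, 1.0) if is_collocation else (0.0, 0.7)
        scores = [rng.uniform(low, high) for _ in range(4)]
        samples.append((scores, is_collocation))
    return samples


def optimize_thresholds(samples, seed_individual, pop_size=30,
                        generations=40, mutation_rate=0.2, rng=None):
    """Search all thresholds at once with a simple elitist genetic algorithm."""
    rng = rng or random.Random(0)
    n = len(seed_individual)

    def fitness(individual):
        return f_value(individual, samples)

    # Start from the hand-picked thresholds plus random individuals.
    population = [list(seed_individual)]
    population += [[rng.random() for _ in range(n)] for _ in range(pop_size - 1)]

    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        next_gen = [population[0][:]]  # elitism: the best individual survives
        while len(next_gen) < pop_size:
            # Tournament selection of two parents.
            p1 = max(rng.sample(population, 3), key=fitness)
            p2 = max(rng.sample(population, 3), key=fitness)
            # Uniform crossover: each threshold comes from either parent.
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            # Gaussian mutation, clamped to [0, 1].
            child = [
                min(1.0, max(0.0, g + rng.gauss(0, 0.1)))
                if rng.random() < mutation_rate else g
                for g in child
            ]
            next_gen.append(child)
        population = next_gen

    return max(population, key=fitness)


rng = random.Random(42)
samples = make_toy_samples(rng)
baseline = [0.5, 0.5, 0.5, 0.5]  # stand-in for empirically chosen thresholds
best = optimize_thresholds(samples, baseline, rng=rng)
print("baseline F:", round(f_value(baseline, samples), 3))
print("optimized F:", round(f_value(best, samples), 3))
```

Because the hand-picked thresholds are seeded into the initial population and the best individual is always carried forward, the optimized F-value on the training samples can never fall below the baseline. Rerunning the search on a different sample set yields a different threshold vector, which mirrors the abstract's point about adapting to data of different sizes or domains.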
Source
《测试技术学报》
2006, Issue 1, pp. 75-81 (7 pages)
Journal of Test and Measurement Technology
Keywords
collocation
parameter threshold optimization
genetic algorithm
test analysis
natural language