Abstract
Hidden Markov models (HMMs) traditionally used for Web information extraction are highly sensitive to their initial parameter values, and in practice training easily becomes trapped in a local optimum. To address this, a parameter optimization model for HMMs based on an improved particle swarm optimization (PSO) algorithm is proposed for Web information extraction. The likelihood is used as the fitness function, and the improved PSO is combined with the Baum-Welch algorithm to optimize the HMM parameters globally; the resulting model is used to extract information from Web pages. Experimental results show that the algorithm outperforms existing algorithms on measures such as precision and running time.
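As a rough illustration of the idea of using the likelihood as a PSO fitness function, the following minimal sketch runs a standard global-best PSO over the parameters of a small discrete HMM. It is not the paper's method: the abstract does not specify the improvement to PSO, so plain PSO is used here, and the Baum-Welch refinement step is omitted. The function names (`decode`, `log_likelihood`, `pso_hmm`), the flat parameter encoding, and the values of `w`, `c1`, `c2` are all assumptions made for the example.

```python
import math
import random


def normalize(row):
    """Scale a vector of positive numbers so that it sums to 1."""
    total = sum(row)
    return [x / total for x in row]


def decode(position, n_states, n_symbols):
    """Map a flat particle position to HMM parameters (pi, A, B).

    Hypothetical encoding: absolute values plus a small floor keep every
    entry strictly positive, then each row is normalized.
    """
    vals = [abs(x) + 1e-6 for x in position]
    pi = normalize(vals[:n_states])
    off = n_states
    A = [normalize(vals[off + i * n_states: off + (i + 1) * n_states])
         for i in range(n_states)]
    off += n_states * n_states
    B = [normalize(vals[off + i * n_symbols: off + (i + 1) * n_symbols])
         for i in range(n_states)]
    return pi, A, B


def log_likelihood(obs, pi, A, B):
    """Scaled forward algorithm: returns log P(obs | pi, A, B)."""
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    scale = sum(alpha)
    loglik = math.log(scale)
    alpha = [a / scale for a in alpha]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
                 for j in range(n)]
        scale = sum(alpha)
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return loglik


def pso_hmm(obs, n_states, n_symbols, n_particles=20, n_iters=50,
            w=0.7, c1=1.5, c2=1.5, seed=0):
    """Standard global-best PSO; fitness is the HMM log-likelihood."""
    rng = random.Random(seed)
    dim = n_states + n_states * n_states + n_states * n_symbols

    def fitness(p):
        return log_likelihood(obs, *decode(p, n_states, n_symbols))

    pos = [[rng.random() for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda k: pbest_fit[k])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for k in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[k][d] = (w * vel[k][d]
                             + c1 * r1 * (pbest[k][d] - pos[k][d])
                             + c2 * r2 * (gbest[d] - pos[k][d]))
                pos[k][d] += vel[k][d]
            f = fitness(pos[k])
            if f > pbest_fit[k]:
                pbest[k], pbest_fit[k] = pos[k][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[k][:], f
    return decode(gbest, n_states, n_symbols), gbest_fit
```

A call such as `pso_hmm([0, 1, 0, 1, 0, 1], n_states=2, n_symbols=2)` returns the best parameters found and their log-likelihood. In a hybrid scheme like the one the abstract describes, the parameters found by the swarm would presumably be refined further with Baum-Welch re-estimation, which this sketch leaves out.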
Source
Journal of Henan Normal University (Natural Science Edition) (《河南师范大学学报(自然科学版)》)
CAS
CSCD
Peking University Core Journal (北大核心)
2010, No. 5, pp. 65-68 (4 pages)
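Funding
Henan Provincial Department of Science and Technology Fund Project (102300410198)
Henan Normal University Youth Science Foundation (2008qk19, 2008qk20)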
Keywords
Particle Swarm Optimization (PSO)
Hidden Markov Model (HMM)
Web information extraction