Abstract
A method for constructing a web page semantic concept tree based on a covariance-feature crawler is proposed. A semantic concept decision tree algorithm is introduced to model the principal features. Following the probability-regularized training transfer rule of a semantic ternary-feature decision tree, the effective feature probability of the data set most recently obtained at each decision tree network node is derived, and the covariance-feature web crawler is used to improve the semantic concept tree construction algorithm. The covariance-feature crawler performs fast, independent separation of autocorrelated components, yielding a semantic autocorrelation retrieval code, so that the constructed semantic concept tree can guide information retrieval. Simulation results show that the algorithm mines data and builds the web page semantic concept tree effectively, and that it provides the optimal branching path for locating information, which enables accurate retrieval and localization of hot-topic information. The algorithm shows good web page recall and localization retrieval performance, with a marked improvement in data recall, indicating good application value.
Source
Bulletin of Science and Technology (《科技通报》)
Peking University Core Journal (北大核心)
2015, No. 4, pp. 85-87 (3 pages)
Funding
Guangxi Higher Education Teaching Reform Project (No. 2012JGB404)
Keywords
covariance
feature crawler
web page
semantic concept tree