Abstract
In information accessibility evaluation, to make it easier to predict how closely the evaluation results of web pages are related, this paper proposes a web page distance measure that fuses page features with URL similarity. The weight assigned to each feature is learned from the partial order of the evaluation results of a subset of samples whose results are already known, which strengthens the correlation between the partial order induced by web page distance and the partial order of the evaluation results. Experimental results show that the method characterizes the partial order of evaluation results better than traditional distance measures such as Euclidean distance: the closer two pages are under the weighted distance, the more similar their information accessibility evaluation results.
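The idea summarized in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual formulas, so everything below is an assumption chosen for illustration: the prefix-based `url_similarity`, the linear form of `weighted_distance`, and the hinge-loss update in `learn_weights` are hypothetical stand-ins, not the authors' method. Pages are assumed to be dictionaries with a numeric `features` list and a `url` string, and `scores` are the known evaluation results.

```python
from urllib.parse import urlparse


def url_similarity(u1, u2):
    """Assumed URL similarity: share of leading host/path segments in common."""
    def segments(u):
        p = urlparse(u)
        return [p.netloc] + [s for s in p.path.split("/") if s]

    a, b = segments(u1), segments(u2)
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return common / max(len(a), len(b))


def component_distances(p, q):
    """One distance per page feature, followed by a URL dissimilarity term."""
    diffs = [abs(x - y) for x, y in zip(p["features"], q["features"])]
    return diffs + [1.0 - url_similarity(p["url"], q["url"])]


def weighted_distance(w, p, q):
    """Weighted sum of the component distances (assumed linear form)."""
    return sum(wi * di for wi, di in zip(w, component_distances(p, q)))


def learn_weights(pages, scores, epochs=200, lr=0.05, margin=0.1):
    """Fit non-negative weights so that distance order follows score order.

    For every triple (i, j, k) in which page j's score is closer to page i's
    than page k's is, we want d(i, j) + margin <= d(i, k). Violated triples
    trigger a sub-gradient step on a hinge loss; weights are clipped at zero.
    """
    n = len(pages)
    w = [1.0] * (len(pages[0]["features"]) + 1)
    triples = [
        (i, j, k)
        for i in range(n) for j in range(n) for k in range(n)
        if len({i, j, k}) == 3
        and abs(scores[i] - scores[j]) < abs(scores[i] - scores[k])
    ]
    for _ in range(epochs):
        for i, j, k in triples:
            near = component_distances(pages[i], pages[j])
            far = component_distances(pages[i], pages[k])
            violation = margin + sum(
                wi * (a - b) for wi, a, b in zip(w, near, far)
            )
            if violation > 0:
                w = [max(0.0, wi - lr * (a - b))
                     for wi, a, b in zip(w, near, far)]
    return w
```

Calling `learn_weights(pages, scores)` on a handful of pages with known scores returns one weight per feature plus a final weight for the URL term; those weights can then be passed to `weighted_distance` to compare pages whose evaluation results are not yet known. A linear, non-negative weighting is used here only because it keeps the learned weights easy to interpret.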
Source
《科技通报》
2018, No. 9, pp. 195-200, 205 (7 pages in total)
Bulletin of Science and Technology
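Funding
Humanities and Social Sciences Research Project of Jiangxi Provincial Universities (SH17203)
National Key Technology R&D Program of China (2014BAK15B02)
National Natural Science Foundation of China (61173185, 61173186)
Zhejiang Provincial Natural Science Foundation (LZ13F020001)
Science and Technology Research Project of the Jiangxi Provincial Department of Education (GJJ161437)
Teaching Reform Research Project of Jiangxi Provincial Higher Education Institutions (JXJG-16-77-4)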
Keywords
data mining
web page distance
weight learning
semi-supervised learning