2 articles found
Meaningful String Extraction Based on Clustering for Improving Webpage Classification
1
Authors: Chen Jie, Tan Jianlong, +1 more author, Liao Hao, Zhou Yanquan. China Communications (SCIE, CSCD), 2012, Issue 3, pp. 68-77 (10 pages)
Webpage classification differs from traditional text classification in its irregular words and phrases and its massive, unlabeled features, which make it harder to obtain effective features. To cope with this problem, we propose two scenarios for extracting meaningful strings, based on document clustering and term clustering with multiple strategies, to optimize a Vector Space Model (VSM) and thereby improve webpage classification. The results show that document clustering works better than term clustering in coping with document content. However, better overall performance is obtained by spectral clustering with document clustering. Moreover, because images appear in the same webpage as the document content, the proposed method is also applied to extract meaningful terms for images, and experimental results also show its effectiveness in improving webpage classification.
Keywords: webpage classification; meaningful string extraction; document clustering; term clustering; K-means; spectral clustering
Automatic Input and Management System for Engineering Drawings (图纸自动输入及管理系统)
2
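Authors: Huang Hanwen (黄汉文), Li Xinyou (李新友), +5 more authors, Tang Zesheng (唐泽圣), Tang Long (唐龙), Zhao Zhige (赵致格), Zhang Fengchang (张风昌), Wang Deying (王德英), Zhang Feng (张锋). Journal of Tsinghua University (Science and Technology) (EI, CAS, CSCD, PKU Core), 1998, Issue S1, pp. 83-86 (4 pages)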
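This paper introduces the key technologies in the development of an automatic input and management system for engineering drawings. Working from raster images of engineering drawings that have been automatically input into a computer, the system supports editing, modification, and management of graphic and text information. The main key technologies include: adaptive block-wise local binarization of non-uniform blueprints; a character string extraction algorithm; and interactive picking of raster primitives (straight lines, arcs). The system was developed under UNIX, runs on workstations or microcomputers, has reached practical use, and has been provided to a large number of users for widespread use.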
Keywords: automatic drawing input; binarization; character string extraction; interactive primitive picking