A Personalized Web Information Auto-retrieval System Based on Small Samples (article in English; Chinese title: 基于少量示例的个性化Web信息自动获取系统) Cited by: 1

Abstract: Keyword-based search engines satisfy some user needs, but because they are general-purpose they cannot meet users' personalized demands. To address this, the paper presents the design and implementation of a novel example-based personalized Web information auto-retrieval system. The system adopts a new feature extraction algorithm and a new method for setting the filtering threshold, both based on a small set of example Web pages and term-frequency statistics from a corpus. Experimental results show that, compared with keyword-based search engines, the system takes full account of the user's interest preferences (expressed through the examples) and proactively provides a more accurate Web information retrieval service over the long term.
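The abstract names two components, feature extraction from a few example pages against corpus term frequencies and a filtering threshold, without giving their formulas. The following Python sketch illustrates one generic way such a pipeline could look; it is not the authors' algorithm. The scoring rule (example frequency weighted by its log-ratio to corpus frequency), the cosine similarity, the `margin` parameter, and all function names are assumptions made for illustration.

```python
import math
from collections import Counter


def term_freq(tokens):
    """Relative frequency of each term in a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}


def extract_features(example_docs, corpus_freq, top_k=3, floor=1e-6):
    """Keep terms that are far more frequent in the examples than in the corpus.

    Assumed scoring rule (not from the paper): f * log(f / corpus_f).
    """
    pooled = term_freq([t for doc in example_docs for t in doc])
    scores = {
        t: f * math.log(f / corpus_freq.get(t, floor)) for t, f in pooled.items()
    }
    ranked = sorted(scores, key=scores.get, reverse=True)[:top_k]
    return {t: scores[t] for t in ranked if scores[t] > 0}


def cosine(profile, doc_tokens):
    """Cosine similarity between the interest profile and one document."""
    tf = term_freq(doc_tokens)
    dot = sum(w * tf.get(t, 0.0) for t, w in profile.items())
    norm_p = math.sqrt(sum(w * w for w in profile.values()))
    norm_d = math.sqrt(sum(v * v for v in tf.values()))
    return dot / (norm_p * norm_d) if norm_p and norm_d else 0.0


def set_threshold(profile, example_docs, margin=0.8):
    """Assumed heuristic: a fraction of the lowest example similarity."""
    return margin * min(cosine(profile, d) for d in example_docs)


def accept(profile, threshold, doc_tokens):
    """A new page passes the filter if it is similar enough to the profile."""
    return cosine(profile, doc_tokens) >= threshold


# Toy data: two tokenized example pages and made-up corpus frequencies.
examples = [["web", "filter", "web", "the"], ["filter", "threshold", "the", "web"]]
corpus_freq = {"the": 0.5, "web": 0.01, "filter": 0.005, "threshold": 0.005}

profile = extract_features(examples, corpus_freq)
threshold = set_threshold(profile, examples)
print(sorted(profile))
print(accept(profile, threshold, ["web", "filter", "news"]))
print(accept(profile, threshold, ["the", "sports", "news"]))
```

In this toy run the common word "the" receives a negative score and is dropped from the profile, which is the intended effect of comparing against corpus statistics. A real system would also need Chinese word segmentation and a proper background corpus.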

Authors: Zhang Chunyuan (张春元), Kang Yaohong (康耀红), Lei Jingsheng (雷景生)

Affiliation: College of Information Science and Technology, Hainan University

Source: Journal of Zhengzhou University: Natural Science Edition (《郑州大学学报（理学版）》), CAS, 2006, No. 4, pp. 44-49 (6 pages)

Keywords: personalized Web information retrieval; Web document filtering; feature extraction; small samples of Web documents

Classification number: TP311.52 [Automation and Computer Technology: Computer Software and Theory]


