Multi-Modal Web Search Query Refinement Based on Semi-Supervised Learning (cited by: 2)

Abstract: Web search systems often refine queries through interaction with users in order to improve search performance. Besides text, Web pages contain a large amount of information in other modalities, such as images, audio, and video. Previous research on query refinement has rarely exploited this multi-modal information. This paper proposes M2S2QR, a multi-modal Web search query refinement method based on semi-supervised learning, which casts Web query refinement as a machine learning problem. First, a learner is trained for each modality using the Web pages that users have judged. Next, Web pages that users have not judged are used to improve the learners' performance. Finally, the learners for the different modalities are combined. Experiments validate the effectiveness of the proposed method.
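The three steps in the abstract can be illustrated with a minimal co-training-style sketch. It is not the M2S2QR algorithm itself, whose details the abstract does not give. Everything here is an assumption made for illustration: two modalities (text and image feature vectors), a toy nearest-centroid learner, confidence measured as the absolute score, and combination by summing the per-modality scores. All names (`CentroidLearner`, `fit_views`, `co_train`, `combined_score`) are hypothetical.

```python
from math import dist


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


class CentroidLearner:
    """Toy per-modality learner (an assumption, not the paper's learner)."""

    def fit(self, vectors, labels):
        # Requires at least one relevant (1) and one irrelevant (0) example.
        self.relevant = centroid([v for v, y in zip(vectors, labels) if y == 1])
        self.irrelevant = centroid([v for v, y in zip(vectors, labels) if y == 0])
        return self

    def score(self, vector):
        # Positive when the page is closer to the relevant centroid.
        return dist(vector, self.irrelevant) - dist(vector, self.relevant)


def fit_views(labeled, n_views=2):
    """Step 1: train one learner per modality on user-judged pages."""
    labels = [label for _, label in labeled]
    return [
        CentroidLearner().fit([views[v] for views, _ in labeled], labels)
        for v in range(n_views)
    ]


def co_train(labeled, unlabeled, rounds=3):
    """Step 2: let each modality pseudo-label its most confident unjudged page."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        learners = fit_views(labeled)
        if not pool:
            break
        for view, learner in enumerate(learners):
            if not pool:
                break
            best = max(pool, key=lambda page: abs(learner.score(page[view])))
            pool.remove(best)
            labeled.append((best, 1 if learner.score(best[view]) > 0 else 0))
    return fit_views(labeled)


def combined_score(learners, page):
    """Step 3: combine the modalities by summing their scores."""
    return sum(learner.score(page[v]) for v, learner in enumerate(learners))


# Each page is (text_vector, image_vector); the values are made up.
judged = [(((1.0, 0.0), (0.9, 0.1)), 1), (((0.0, 1.0), (0.1, 0.9)), 0)]
unjudged = [((0.8, 0.2), (0.7, 0.3)), ((0.2, 0.8), (0.3, 0.7))]
learners = co_train(judged, unjudged)

relevant_like = ((0.9, 0.1), (0.8, 0.2))
irrelevant_like = ((0.1, 0.9), (0.2, 0.8))
ranked = sorted(
    [irrelevant_like, relevant_like],
    key=lambda page: combined_score(learners, page),
    reverse=True,
)
```

In this sketch the refined result list is simply the candidate pages re-ranked by `combined_score`; a real system would substitute proper text and image features and a stronger learner for each modality.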

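Authors: Jiang Yuan (姜远), Li Ming (黎铭), Zhou Zhi-Hua (周志华)

Affiliation: State Key Laboratory for Novel Software Technology, Nanjing University

Source: Chinese Journal of Computers (《计算机学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2009, No. 10, pp. 2099-2106 (8 pages)

Funding: Supported by the National Natural Science Foundation of China (60635030, 60721002) and the Natural Science Foundation of Jiangsu Province (BK2008018)

Keywords: machine learning; semi-supervised learning; multi-modal information; Web search; query refinement

Classification: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]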


