
Machine Learning and Its Application in Libraries: Taking TensorFlow as an Example (cited by: 31)
Abstract: Machine learning is an important branch of artificial intelligence, and TensorFlow is Google's second-generation open-source machine learning platform. This paper introduces the basic principles of machine learning (mainly deep neural networks) and the basic methods of applying it with TensorFlow, and explores possible application scenarios in libraries. Taking the automatic classification of records in the National Index of Newspapers and Magazines (《全国报刊索引》) as the experimental task, the authors built a TensorFlow deep learning model on two graphics workstations and, by setting parameters and thresholds and tuning the system, carried out the complete process of applying TensorFlow and demonstrated its feasibility. The experiment trained and tested on more than 1.7 million bibliographic records. It overcame the tension between the sparseness of the index data and the very fine granularity of the Chinese Library Classification categories, reaching nearly 80% accuracy on top-level classes and nearly 70% overall accuracy on four-level classification (with class TP reaching 91%). The authors conclude that the approach can largely replace the manual classification workflow and offers a practical tool for semi-automating classification of the index, which is expected to save substantial labor costs. As a next step, they plan to keep using TensorFlow's optimization features, incorporate more field attributes, and continue system tuning, aiming for an automatic classification accuracy above 90%.
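Affiliation: Shanghai Library
Source: Journal of Academic Libraries (《大学图书馆学报》), CSSCI, PKU Core Journal, 2017, No. 6, pp. 31-40 (10 pages)
Funding: One of the research outcomes of the National Social Science Fund of China major project "Research on the Mechanism and Application of Mobile Visual Search for Digital Libraries Oriented to Big Data" (No. 15ZDB126)
Keywords: smart library; artificial intelligence; machine learning; TensorFlow; automatic classification; neural network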
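The abstract mentions setting thresholds so that the classifier supports a semi-automatic workflow. The record does not describe how the authors implemented this, so the following is only a minimal sketch of one common approach, under stated assumptions: a model emits one raw score per class, the scores are turned into probabilities with a softmax, and a record is labeled automatically only when the top probability reaches a threshold. The function names, the 0.8 threshold, the three class labels, and the scores are all hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_record(scores, labels, threshold=0.8):
    """Return (label, probability, decision) for one index record.

    decision is "auto" when the top probability reaches the threshold
    (an assumed value), otherwise "manual" so a cataloguer reviews it.
    """
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    decision = "auto" if probs[best] >= threshold else "manual"
    return labels[best], probs[best], decision


# Hypothetical scores for three top-level classes.
labels = ["TP", "G", "R"]
confident = route_record([4.0, 0.5, 0.2], labels)  # one class dominates
uncertain = route_record([1.0, 0.9, 0.8], labels)  # scores nearly tied

print(confident)
print(uncertain)
```

In this sketch, raising the threshold sends more records to manual review and lowers the risk of wrong automatic labels, while lowering it does the opposite. A suitable value would have to be chosen against held-out data.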
