
Simplifying Data Processing Through Clever Use of Clementine (巧用Clementine简化数据处理)
Abstract: Using Clementine, a well-known data mining tool, for routine data processing may seem like overkill, yet it is easier to use and more efficient than Excel: there is no need to consult complex programming manuals, drag scroll bars through Excel sheets, or pick among assorted functions. Taking the upload processing of sign-in (attendance) data at the National Science and Technology Literature Center (NSTL) as a case study, which involves common tasks such as duplicate checking, standardization, filtering, mapping, comparison, and frequency counting, the paper describes how to tailor Clementine data streams to different processing requirements and the advantages of Clementine in processing massive data.
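Author: Zheng Huixia (郑慧霞)
Source: Chinese Journal of Medical Library and Information Science (《中华医学图书情报杂志》), CAS, 2011, No. 4, pp. 59-62 (4 pages)
Funding: Basic Scientific Research Fund project of the Institute of Medical Information, Chinese Academy of Medical Sciences: Reader Behavior Analysis Based on Web Mining (No. R0830)
Keywords: Clementine; data processing; mapping; comparison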
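
The six tasks named in the abstract (duplicate checking, standardization, filtering, mapping, comparison, frequency counting) can be illustrated with a small sketch. This is not a Clementine stream and not the author's method: it is a plain Python stand-in, and the sample records, the `CODE_MAP` lookup table, the `KNOWN_CODES` set, and all function names are hypothetical, chosen only to show how such steps might chain together.

```python
from collections import Counter

# Hypothetical sign-in records: (user id, raw institution name, date).
# These values are invented for illustration and do not come from the paper.
records = [
    ("u01", " Peking Univ ", "2011-03-01"),
    ("u01", " Peking Univ ", "2011-03-01"),  # exact duplicate
    ("u02", "PUMC", "2011-03-01"),
    ("u03", "peking univ", "2011-03-02"),
    ("u04", "", "2011-03-02"),               # missing institution
]

# Hypothetical mapping table from standardized names to institution codes.
CODE_MAP = {"peking univ": "PKU", "pumc": "PUMC"}
# Hypothetical set of codes already present in the target system.
KNOWN_CODES = {"PKU"}


def dedupe(rows):
    """Duplicate checking: drop exact repeats, keeping first-seen order."""
    seen = set()
    out = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            out.append(row)
    return out


def normalize(rows):
    """Standardization: trim whitespace and lowercase the institution name."""
    return [(user, name.strip().lower(), day) for user, name, day in rows]


def select(rows):
    """Filtering: keep only rows whose institution name is non-empty."""
    return [row for row in rows if row[1]]


def map_codes(rows):
    """Mapping: replace each name with its code, or "UNKNOWN" if unmapped."""
    return [(user, CODE_MAP.get(name, "UNKNOWN"), day) for user, name, day in rows]


def compare(rows):
    """Comparison: list codes in the data that the target system lacks."""
    return sorted({code for _, code, _ in rows} - KNOWN_CODES)


def frequency(rows):
    """Frequency counting: number of sign-ins per institution code."""
    return Counter(code for _, code, _ in rows)


cleaned = map_codes(select(normalize(dedupe(records))))
new_codes = compare(cleaned)
counts = frequency(cleaned)
```

Each function plays the role that a single node might play in a visual data stream, so the final composed call reads like a stream from source to output; the order of the steps follows the order in which the abstract lists the tasks.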