Intelligence System Based on Big Data of Law
Abstract As the Internet continues to expand, the volume of information is growing at an unprecedented rate, giving rise to big data. Legal data in this setting is large in volume, fast-changing, and diverse, so applying advanced methods to collect, process, and analyze it is particularly important. This paper proposes an intelligent system based on legal big data. The system uses the Scrapy web crawler to collect judgment documents and legal provisions, then extracts key elements with regular expressions and text keywords with TF-IDF. This enables multi-dimensional classification and retrieval of documents, for example by region and court, text keyword, or case type. The system also combines Word2Vec with TF-IDF to compute document similarity and recommend related documents.
Author Zhang Jiandong (张健东)
Source Industrial Control Computer (《工业控制计算机》), 2020, No. 5, pp. 69-71 (3 pages)
Keywords big data of law; data acquisition; data retrieval; Word2Vec; content recommendation
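
The abstract names three processing steps: regular-expression extraction of case elements, TF-IDF keyword extraction, and document similarity from Word2Vec combined with TF-IDF. The sketch below illustrates one plausible reading of those steps in plain Python. It is not the paper's implementation. The abstract does not say how Word2Vec and TF-IDF are combined, so the TF-IDF-weighted average of word vectors used here is an assumption. The header format, the `HEADER_RE` pattern, the sample documents, the smoothed IDF variant, and the tiny two-dimensional `EMBEDDINGS` table (standing in for a trained Word2Vec model) are likewise all hypothetical. Crawling with Scrapy is omitted.

```python
import math
import re
from collections import Counter

# Hypothetical, pre-tokenized judgment documents. A real pipeline would
# segment the crawled Chinese text into words first.
DOCS = {
    "doc_a": "court ruled contract breach damages awarded plaintiff contract".split(),
    "doc_b": "court ruled contract dispute damages denied defendant".split(),
    "doc_c": "defendant convicted theft sentenced prison theft".split(),
}

# Hypothetical toy vectors standing in for a trained Word2Vec model.
EMBEDDINGS = {
    "contract": [1.0, 0.0], "damages": [0.9, 0.1], "dispute": [0.8, 0.2],
    "theft": [0.0, 1.0], "prison": [0.1, 0.9], "convicted": [0.2, 0.8],
}

# Hypothetical header format, e.g.
# "Haidian District People's Court, Case No. (2019) 0108-1234".
HEADER_RE = re.compile(
    r"^(?P<court>.+? Court), Case No\. \((?P<year>\d{4})\) (?P<number>[\d-]+)$"
)


def extract_elements(header):
    """Return court, year and case number from a header line, or None."""
    match = HEADER_RE.match(header)
    return match.groupdict() if match else None


def tfidf(docs):
    """Return {doc_id: {term: weight}} with weight = tf * (ln(N / df) + 1)."""
    n_docs = len(docs)
    doc_freq = Counter(term for tokens in docs.values() for term in set(tokens))
    weights = {}
    for doc_id, tokens in docs.items():
        counts = Counter(tokens)
        weights[doc_id] = {
            term: (count / len(tokens)) * (math.log(n_docs / doc_freq[term]) + 1.0)
            for term, count in counts.items()
        }
    return weights


def top_keywords(term_weights, k=3):
    """Return the k highest-weighted terms, breaking ties alphabetically."""
    ranked = sorted(term_weights.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


def doc_vector(term_weights, embeddings):
    """TF-IDF-weighted average of the word vectors of in-vocabulary terms."""
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for term, weight in term_weights.items():
        vec = embeddings.get(term)
        if vec is None:
            continue  # skip terms the embedding table does not cover
        total = [t + weight * x for t, x in zip(total, vec)]
        weight_sum += weight
    return [t / weight_sum for t in total] if weight_sum else total


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0
```

To recommend documents related to `doc_a`, one would compute `doc_vector` for every document and rank the others by `cosine` against `doc_a`. With the toy data above, the two contract cases score closer to each other than either does to the theft case.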