Abstract
As the Internet continues to expand, the volume of information is growing at an unprecedented rate, giving rise to big data. Legal data in this setting is characterized by large volume, high velocity, and variety, so effective methods for collecting, processing, and analyzing massive data sets are essential. This paper proposes an intelligent system based on legal big data. The system uses the Scrapy web crawling framework to collect judgment documents and legal provisions, then applies regular expressions to extract key case elements and TF-IDF to extract text keywords. This supports multi-dimensional classification and retrieval of documents, for example by region and court, by text keyword, or by case type. The system also combines Word2Vec with TF-IDF to compute document similarity and recommend related documents.
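The keyword extraction and similarity-based recommendation steps can be illustrated with a minimal sketch. It is not the authors' implementation: it uses only the Python standard library, a smoothed IDF chosen for illustration, and plain TF-IDF cosine similarity, and it leaves out the Word2Vec component the abstract mentions. The function names (`tfidf_vectors`, `top_keywords`, `recommend`) and the English sample texts are hypothetical. Real judgment documents in Chinese would also need a word segmenter in place of the simple `tokenize` helper.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase alphabetic tokens (illustrative only)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document.

    Assumption: smoothed IDF, idf(t) = ln((1 + N) / (1 + df(t))) + 1,
    and term frequency normalized by document length.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1
        vectors.append({
            term: (count / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def top_keywords(vector, k=3):
    """Return the k highest-weighted terms, ties broken alphabetically."""
    ranked = sorted(vector.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(docs, query_index, k=1):
    """Return indices of the k documents most similar to docs[query_index]."""
    vectors = tfidf_vectors(docs)
    scores = [
        (cosine(vectors[query_index], vec), i)
        for i, vec in enumerate(vectors)
        if i != query_index
    ]
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [i for _, i in scores[:k]]


if __name__ == "__main__":
    # Hypothetical stand-ins for crawled judgment documents.
    sample_docs = [
        "contract breach damages claim",
        "contract breach compensation dispute",
        "traffic accident injury liability",
    ]
    print(top_keywords(tfidf_vectors(sample_docs)[0], k=2))
    print(recommend(sample_docs, 0))
```

In this sketch, terms that appear in fewer documents receive higher weights, so they surface as keywords, and the recommendation step ranks the remaining documents by how much weighted vocabulary they share with the query document.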
Source
Industrial Control Computer (《工业控制计算机》)
2020, No. 5, pp. 69-71 (3 pages)