Abstract: The recent informatization of Chinese courts has produced a huge volume of digitally stored law cases and judgment documents, which provides a good foundation for research on judicial big data and machine learning. In this setting, several court tasks can be automated or improved through machine learning, such as recommending similar documents, evaluating workload based on the similarity of judgment documents, and predicting the statutes that may be relevant to a case. To support these tasks, and to account for the characteristics of Chinese judgment documents, we propose a topic-model-based approach to measuring the text similarity of Chinese judgment documents, built on TF-IDF, Latent Dirichlet Allocation (LDA), Labeled Latent Dirichlet Allocation (LLDA), and other processing steps. Taking the characteristics of Chinese judgment documents into account, we focus on the specific steps of the approach, the preprocessing of the corpus, the choice of training parameters, and the evaluation of the similarity measurement results. In addition, by applying the approach to the prediction of possible statutes and using prediction accuracy as the evaluation metric, we designed experiments to show that the design decisions are reasonable and that the approach performs well on text similarity measurement. The experiments also reveal limitations of the approach that need to be addressed in future work.
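The TF-IDF stage named in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes documents are already segmented into tokens (Chinese text would need a word segmenter first), it uses cosine similarity as the comparison measure (an assumption, since the abstract does not name one), and it leaves out the LDA and LLDA stages entirely. The function names `tfidf_vectors` and `cosine` and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build one sparse {term: weight} TF-IDF vector per tokenized document."""
    n = len(docs)
    # Document frequency: number of documents in which each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Relative term frequency times log inverse document frequency.
        vectors.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy, already-tokenized documents (hypothetical data for illustration only).
docs = [
    ["contract", "breach", "damages", "payment"],
    ["contract", "breach", "payment", "interest"],
    ["theft", "sentence", "imprisonment", "fine"],
]
vecs = tfidf_vectors(docs)
sim_01 = cosine(vecs[0], vecs[1])  # two contract disputes share vocabulary
sim_02 = cosine(vecs[0], vecs[2])  # no shared terms
```

In this toy corpus the two contract documents score higher than the unrelated pair, which shares no terms. A topic-model stage would replace these term-level vectors with per-document topic distributions before comparison.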
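Abstract: With the continued development of smart government, government platforms face a range of problems in the quality and efficiency of their replies to messages from the public. Drawing on government performance evaluation theory and using algorithms such as ALBERT (A Lite BERT), this study examines the timeliness, relevance, level of detail, information intensity, interpretability, and standardization of government replies. A comprehensive evaluation model of reply quality is then built from latent-space representations extracted by an autoencoder and from representation weights determined by the entropy weight method. The model is applied to replies to customs business inquiries, where the representation weights are 0.098, 0.436, and 0.466. Replies with normalized scores between 0.2 and 0.4 are the most common, accounting for 39.7%. For 3,000 randomly selected replies, the agreement between model scores and human scores is 0.777 and the MSE is 0.035, indicating that the model reflects actual reply quality.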
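The entropy weight method mentioned in the abstract can be sketched as follows. This is the common textbook formulation, not necessarily the exact variant used in the paper: it assumes each indicator column has already been scaled to non-negative values and that there are at least two samples. The function name `entropy_weights` and the sample matrix are hypothetical.

```python
import math


def entropy_weights(rows):
    """Entropy weight method: rows are samples, columns are indicators.

    Assumes non-negative values and at least two samples.
    """
    n, m = len(rows), len(rows[0])
    divergences = []
    for j in range(m):
        col = [r[j] for r in rows]
        total = sum(col)
        # Share of each sample within the indicator; uniform if the column is all zero.
        p = [x / total for x in col] if total else [1.0 / n] * n
        # Normalized Shannon entropy in [0, 1]; zero shares contribute nothing.
        e = -sum(pi * math.log(pi) for pi in p if pi > 0) / math.log(n)
        # Indicators with lower entropy (more variation) get more weight.
        divergences.append(1.0 - e)
    s = sum(divergences)
    return [d / s for d in divergences] if s else [1.0 / m] * m


# Hypothetical three-indicator sample; the first column carries no information.
rows = [
    [0.5, 0.1, 0.9],
    [0.5, 0.8, 0.1],
    [0.5, 0.3, 0.2],
]
weights = entropy_weights(rows)
```

A constant indicator has maximal entropy and so receives a weight of essentially zero, while the weights as a whole sum to 1, as the three weights reported in the abstract do.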