Funding: Supported by the National Key Research and Development Program of China (2016YFC0800805) and the National Natural Science Foundation of China (61772014).
Abstract: With the promotion of Wisdom Court construction and the growing completeness of judicial big data, the combination of the judiciary and artificial intelligence has attracted increasing attention. Judicial documents are the most common textual information in cases. Thanks to advances in text analysis and processing techniques, more information can be mined from judicial texts and applied to judgments. In this paper, we use popular deep learning text classification algorithms to predict the term of imprisonment from the facts of a case, which is expected to assist judges and procuratorate staff in sentencing. Our experimental results show the feasibility and utility of the method.
Funding: Supported by the National Key Research and Development Program of China (2016YFC0800805) and the National Natural Science Foundation of China (61772014).
Abstract: Under the judicial responsibility system, reaching similar judgments in similar cases is vital for front-line judges in addressing problems such as the non-standard application of law and inconsistent judicial ruling standards. This paper proposes a screening method for judicial cases based on the LDA topic model. The case, penalty, and legal provisions were set. The Gibbs sampling algorithm was employed to estimate the probability distribution of each text over the latent topic set, and the similarity between texts was calculated by cosine similarity. The quality of screening was used as the final evaluation indicator. Extensive experiments show that the case screening method based on LDA and cosine similarity performs satisfactorily.
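The similarity step described in this abstract can be illustrated with a minimal sketch. It assumes each case has already been reduced to a topic-probability vector (for example by an LDA model fitted with Gibbs sampling); the function names, case identifiers, and three-topic vectors below are hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(p, q):
    """Cosine of the angle between two equal-length topic-probability vectors."""
    if len(p) != len(q):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0  # an all-zero vector has no direction to compare
    return dot / (norm_p * norm_q)


def screen_similar_cases(query, candidates, top_k=2):
    """Return the top_k (case_id, similarity) pairs, most similar first."""
    scored = [
        (case_id, cosine_similarity(query, dist))
        for case_id, dist in candidates.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical topic distributions over three latent topics.
query_case = [0.7, 0.2, 0.1]
candidate_cases = {
    "case_a": [0.6, 0.3, 0.1],
    "case_b": [0.1, 0.1, 0.8],
    "case_c": [0.7, 0.2, 0.1],
}
ranked = screen_similar_cases(query_case, candidate_cases)
```

Because topic distributions are non-negative, the similarity falls between 0 and 1, so a fixed threshold or a top-k cut can both serve as the screening rule; which one the paper uses is not stated in the abstract.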
Funding: Supported by the National Key Research and Development Program of China (2016YFC0800805) and the National Natural Science Foundation of China (61772014).
Abstract: Against the background of China's judicial reform, big data on judicial cases is widely used in judicial research. Similarity analysis of judicial cases is the basis of wisdom judicature. Because ineffective information must be discarded and useful rules and conditions extracted from descriptive documents, analyzing Chinese judicial cases, which follow a certain format, is a major challenge. Hence, we propose a method that produces recommendations based on the content of judicial cases. Considering the particularities of the Chinese language, we use "jieba" text segmentation to preprocess the cases. Because labels of user interest and behavior are lacking, the proposed method relies on content information by combining TF-IDF with the LDA topic model, as opposed to traditional methods such as collaborative filtering (CF) recommendation. Recommendations are produced by computing the cosine similarity of cases within the same topic. In the experiments, we evaluate the proposed model on a dataset of nearly 200,000 judicial cases. The experimental results show that the proposed method performs best when the number of topics is around 80.
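A rough sketch of the content-based step may help make the pipeline concrete. It assumes the cases have already been segmented into tokens (English words stand in for jieba output here) and that each case carries a dominant-topic label from a separately fitted LDA model. The TF-IDF formula (relative term frequency times log(N/df)), the function names, and the toy data are all assumptions for illustration, not the paper's exact configuration.

```python
import math
from collections import Counter


def tfidf_vectors(tokenized_docs):
    """Build sparse TF-IDF vectors (dicts) with tf = count/len and idf = log(N/df)."""
    n_docs = len(tokenized_docs)
    doc_freq = Counter()
    for tokens in tokenized_docs:
        doc_freq.update(set(tokens))  # count each term once per document
    vectors = []
    for tokens in tokenized_docs:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def sparse_cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v[term] for term, weight in u.items() if term in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(query_index, vectors, topics, top_k=1):
    """Rank the other cases that share the query's topic label by cosine similarity."""
    scored = [
        (i, sparse_cosine(vectors[query_index], vec))
        for i, vec in enumerate(vectors)
        if i != query_index and topics[i] == topics[query_index]
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy, pre-segmented cases; real input would be jieba-segmented Chinese text.
docs = [
    ["theft", "night", "store", "cash"],
    ["theft", "store", "cash", "phone"],
    ["fraud", "contract", "loan", "bank"],
    ["theft", "car", "street", "night"],
]
topics = [0, 0, 1, 0]  # hypothetical dominant-topic labels from an LDA model
vectors = tfidf_vectors(docs)
recommendations = recommend(0, vectors, topics)
```

Restricting the comparison to cases with the same topic label keeps the number of pairwise similarity computations small, which matters on a collection of the size the abstract mentions.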
Funding: This work was partially supported by the National Natural Science Foundation of China (Grant Nos. 61370144, 61722202, 61403057, and 61772107) and the Jiangsu Prospective Project of Industry-University-Research (BY2015069-03). The authors also thank the three graduate students who devoted their efforts to the data annotation.
Abstract: In crowdsourced mobile application testing, workers are often inexperienced in and unfamiliar with software testing. Moreover, workers write test reports in descriptive natural language on mobile devices. As a result, these test reports generally lack important details and make it hard for developers to understand the bugs. To improve the quality of inspected test reports, we pose a new problem, test report augmentation, which leverages the additional useful information contained in duplicate test reports. In this paper, we propose a test report augmentation framework (TRAF) to address the problem. First, natural language processing (NLP) techniques are adopted to preprocess the crowdsourced test reports. Then, three strategies are proposed to augment the environments, inputs, and descriptions of the inspected test reports, respectively. Finally, we visualize the augmented test reports to help developers distinguish the added information. To evaluate TRAF, we conduct experiments over five industrial datasets with 757 crowdsourced test reports. Experimental results show that, on average, TRAF recommends relevant inputs to augment the inspected test reports with an NDCG of 98.49% and a precision of 88.65%, and identifies valuable sentences from the descriptions of duplicates with a precision of 83.58%, a recall of 77.76%, and an F-measure of 78.72%. Empirical evaluation also demonstrates that augmented test reports help developers understand and fix bugs better.
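For readers unfamiliar with the reported metrics, the sketch below shows one common way to compute them. It assumes set-based precision, recall, and the balanced F-measure (harmonic mean), and an NDCG with a log2 rank discount; the abstract does not say which variants the paper uses, and the function names and sample data are hypothetical.

```python
import math


def precision_recall_f(predicted, relevant):
    """Set-based precision, recall, and balanced F-measure."""
    predicted, relevant = set(predicted), set(relevant)
    hits = len(predicted & relevant)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


def dcg(gains):
    """Discounted cumulative gain; the item at rank r is discounted by log2(r + 1)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(gains):
    """DCG of the given ranking divided by the DCG of the ideal (sorted) ranking."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal else 0.0


# Hypothetical example: sentences selected for augmentation vs. annotated ones.
selected = ["s1", "s2", "s3", "s4"]
annotated = ["s1", "s2", "s5"]
precision, recall, f_measure = precision_recall_f(selected, annotated)

# Hypothetical relevance (1 = relevant) of recommended inputs, in ranked order.
score = ndcg([1, 0, 1])
```

In this toy case two of the four selected sentences are annotated as valuable, so precision is 0.5 and recall is 2/3. The NDCG is below 1 because a relevant input is ranked third, behind an irrelevant one.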