Similar Case Matching Model for Lending Cases (面向借贷案件的相似案例匹配模型)  Cited by: 1
Abstract: Similar Case Matching (SCM) is a specific application of text matching in the judicial domain. Its goal is to determine whether legal documents are similar, which is important for similar-case retrieval. Compared with conventional text matching, legal texts are typically long, and SCM matches cases that share the same cause of action, so the differences between case descriptions are small and earlier text-matching methods struggle to compute their similarity. To address these problems for lending cases, this study builds an SCM model that integrates the key elements of lending cases. To obtain richer semantic features, regular expressions are constructed to extract elements specific to lending cases, such as the form of loan delivery and the basic attributes of the borrower; these elements are combined with the original case text so that the semantic features of the legal text and of the key case elements are learned jointly. In addition, a pretrained model with shared weights encodes each document separately, and the outputs of specific encoding layers of the pretrained model are fused to obtain richer semantic information. Finally, a supervised contrastive learning framework is introduced to make better use of sample information and further improve SCM performance. Experiments on the CAIL2019-SCM dataset show that the model improves accuracy on the test set by 1.05 percentage points over the LFESM model.
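The element-extraction step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual regular expressions or the way elements are joined to the case text, so the patterns, the function names `extract_elements` and `build_model_input`, the element labels, and the `[SEP]` separator below are all assumptions made for illustration only.

```python
import re

# Hypothetical patterns: the abstract does not list the real expressions.
ELEMENT_PATTERNS = {
    # Form of loan delivery (bank transfer, cash, WeChat, Alipay).
    "delivery_form": re.compile(r"银行转账|现金|微信|支付宝"),
    # Borrower attribute: is the defendant described as a company?
    "borrower_is_company": re.compile(r"被告[^。]{0,30}?公司"),
}


def extract_elements(case_text: str) -> dict:
    """Pull a few lending-case elements out of the raw case description."""
    found = ELEMENT_PATTERNS["delivery_form"].findall(case_text)
    # dict.fromkeys removes duplicates while keeping first-seen order.
    forms = list(dict.fromkeys(found))
    is_company = (
        ELEMENT_PATTERNS["borrower_is_company"].search(case_text) is not None
    )
    return {"delivery_form": forms, "borrower_is_company": is_company}


def build_model_input(case_text: str, sep: str = "[SEP]") -> str:
    """Prepend the extracted elements to the case text (assumed format)."""
    elements = extract_elements(case_text)
    forms = ",".join(elements["delivery_form"]) or "unknown"
    borrower = "company" if elements["borrower_is_company"] else "individual"
    element_text = f"delivery_form={forms}; borrower={borrower}"
    return f"{element_text}{sep}{case_text}"
```

Placing the elements in front of the case text is one plausible design: when a long document has to be truncated to fit the encoder, the short element summary is preserved.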
Authors: CAO Faxin (曹发鑫); SUN Yuanyuan (孙媛媛); WANG Zhizheng (王治政); PAN Dinghao (潘丁豪); LIN Hongfei (林鸿飞) (School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, Liaoning, China)
Source: Computer Engineering (《计算机工程》), CSCD, Peking University Core Journal, 2024, No. 1, pp. 306-312 (7 pages)
Funding: National Key Research and Development Program of China (2022YFC3301801); Fundamental Research Funds for the Central Universities (DUT22ZD205).
Keywords: Similar Case Matching (SCM); Siamese network; contrastive learning; pretrained model; key legal element
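The supervised contrastive learning mentioned in the abstract and keywords can be sketched as follows. The abstract does not state the exact loss the authors use, so this is the widely used supervised contrastive (SupCon) formulation, written in plain Python on toy vectors; the function names and the default `temperature` are assumptions. For each anchor, embeddings with the same label are pulled together and all others are pushed apart.

```python
import math


def l2_normalize(vector):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss averaged over anchors that have a positive.

    embeddings: list of equal-length, non-zero float vectors
    labels:     list of class labels, one per embedding
    """
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue  # an anchor with no same-label sample adds nothing
        # Temperature-scaled cosine similarity to every other sample.
        sims = {
            a: sum(x * y for x, y in zip(z[i], z[a])) / temperature
            for a in range(n)
            if a != i
        }
        # Log-sum-exp with the maximum subtracted for numerical stability.
        max_sim = max(sims.values())
        log_denom = max_sim + math.log(
            sum(math.exp(s - max_sim) for s in sims.values())
        )
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0
```

In a matching model, the embeddings would come from the shared-weight encoder, and this term would typically be added to the main classification loss rather than used alone.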