With the extensive application of software collaborative development technology,the processing of code data generated in programming scenes has become a research hotspot.In the collaborative programming process,differ...With the extensive application of software collaborative development technology,the processing of code data generated in programming scenes has become a research hotspot.In the collaborative programming process,different users can submit code in a distributed way.The consistency of code grammar can be achieved by syntax constraints.However,when different users work on the same code in semantic development programming practices,the development factors of different users will inevitably lead to the problem of data semantic conflict.In this paper,the characteristics of code segment data in a programming scene are considered.The code sequence can be obtained by disassembling the code segment using lexical analysis technology.Combined with a traditional solution of a data conflict problem,the code sequence can be taken as the declared value object in the data conflict resolution problem.Through the similarity analysis of code sequence objects,the concept of the deviation degree between the declared value object and the truth value object is proposed.A multi-truth discovery algorithm,called the multiple truth discovery algorithm based on deviation(MTDD),is proposed.The basic methods,such as Conflict Resolution on Heterogeneous Data,Voting-K,and MTRuths_Greedy,are compared to verify the performance and precision of the proposed MTDD algorithm.展开更多
基金supported by the National Key R&D Program of China(No.2018YFB1003905)the National Natural Science Foundation of China under Grant(No.61971032)Fundamental Research Funds for the Central Universities(No.FRF-TP-18-008A3).
文摘With the extensive application of software collaborative development technology,the processing of code data generated in programming scenes has become a research hotspot.In the collaborative programming process,different users can submit code in a distributed way.The consistency of code grammar can be achieved by syntax constraints.However,when different users work on the same code in semantic development programming practices,the development factors of different users will inevitably lead to the problem of data semantic conflict.In this paper,the characteristics of code segment data in a programming scene are considered.The code sequence can be obtained by disassembling the code segment using lexical analysis technology.Combined with a traditional solution of a data conflict problem,the code sequence can be taken as the declared value object in the data conflict resolution problem.Through the similarity analysis of code sequence objects,the concept of the deviation degree between the declared value object and the truth value object is proposed.A multi-truth discovery algorithm,called the multiple truth discovery algorithm based on deviation(MTDD),is proposed.The basic methods,such as Conflict Resolution on Heterogeneous Data,Voting-K,and MTRuths_Greedy,are compared to verify the performance and precision of the proposed MTDD algorithm.