Funding: This work is partially supported by the Natural Science Foundation of Jiangxi Province under Grant No. 0411009.
Abstract: Processing XML queries can require evaluating various structural relationships. Efficient algorithms for evaluating ancestor-descendant and parent-child relationships have been proposed, but the problems of evaluating preceding-sibling/following-sibling and preceding/following relationships are still open. In this paper, we study the structural join and the staircase join for sibling relationships. First, we introduce the idea of using the parent's structural information to filter out and minimize unnecessary reads of elements, which can accelerate structural joins for both parent-child and preceding-sibling/following-sibling relationships. Second, we propose two efficient structural join algorithms for sibling relationships. These algorithms lead to optimal join performance: nodes that do not participate in the join can be identified beforehand and then skipped using a B+-tree index. In addition, each joined element list is scanned sequentially at most once, and the join results are output in document order. We also discuss the staircase join algorithm for sibling axes. Our study shows that the staircase join for sibling axes is close to the structural join for sibling axes and shares the same high efficiency. Our experimental results demonstrate the effectiveness of our optimization techniques for sibling axes and validate the efficiency of our algorithms. As far as we know, this is the first work that specifically addresses this problem.
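To make the sibling join concrete, the following is a minimal sketch of a following-sibling join over two element lists sorted in document order. It is not the algorithm proposed in the paper: it assumes a region encoding in which each node carries hypothetical `start`, `end`, and `parent` fields (with `parent` holding the start position of the parent element), it uses an in-memory dictionary keyed by parent instead of a B+-tree index, and it does not skip non-participating nodes. It only illustrates the two properties the abstract names: each input list is scanned once, and results come out in document order of the following sibling.

```python
from collections import defaultdict
from typing import List, NamedTuple, Tuple


class Node(NamedTuple):
    tag: str
    start: int   # position of the opening tag (assumed region encoding)
    end: int     # position of the closing tag
    parent: int  # start position of the parent element (assumption)


def following_sibling_join(left: List[Node], right: List[Node]) -> List[Tuple[Node, Node]]:
    """Return (l, r) pairs where r is a following sibling of l.

    Both lists must be sorted by `start`. Each list is scanned once.
    """
    seen = defaultdict(list)  # parent start -> left nodes already passed
    pairs = []
    i = 0
    for r in right:
        # Consume every left node that starts before r in document order.
        while i < len(left) and left[i].start < r.start:
            seen[left[i].parent].append(left[i])
            i += 1
        # Siblings share a parent; earlier ones precede r.
        for l in seen.get(r.parent, ()):
            pairs.append((l, r))
    return pairs


# Hypothetical document: <a><b/><c/><b/><c/></a><x><c/><b/></x>
b_nodes = [Node("b", 2, 3, 1), Node("b", 6, 7, 1), Node("b", 14, 15, 11)]
c_nodes = [Node("c", 4, 5, 1), Node("c", 8, 9, 1), Node("c", 12, 13, 11)]

result = following_sibling_join(b_nodes, c_nodes)
print([(l.start, r.start) for l, r in result])
```

In this example the `c` element under `x` produces no pair, because its only `b` sibling comes after it; a real implementation along the lines of the abstract would use the parent's structural information to skip such nodes without reading them.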
Funding: Project (61262035) supported by the National Natural Science Foundation of China; Projects (GJJ12271, GJJ12742) supported by the Science and Technology Foundation of Education Department of Jiangxi Province, China; Project (20122BAB201033) supported by the Natural Science Foundation of Jiangxi Province, China.
Abstract: Head-driven statistical models for natural language parsing are the most representative lexicalized syntactic parsing models, but they utilize only the semantic dependency between words and do not incorporate other semantic information such as semantic collocation and semantic category. Some improvements on this parser are presented. First, "valency" is an essential semantic feature of words. Once the valency of a word is determined, the collocation of the word is clear, and the sentence structure can be directly derived. Thus, a syntactic parsing model combining valence structure with semantic dependency is proposed on the basis of head-driven statistical syntactic parsing models. Second, semantic role labeling (SRL) is necessary for deep natural language processing. An integrated parsing approach is proposed to integrate semantic parsing into the syntactic parsing process. Experiments are conducted on the refined statistical parser. The results show that 87.12% precision and 85.04% recall are obtained, and the F-measure is improved by 5.68% compared with the head-driven parsing model introduced by Collins.
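The abstract reports precision and recall but not the resulting F-measure. As a small worked example, and assuming the balanced F-measure (beta = 1, the harmonic mean of precision and recall), the value can be derived as below. The function name `f_measure` and the choice of beta are assumptions for illustration; the abstract does not say which variant was used.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F-beta)."""
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall as reported in the abstract, in percent.
f1 = f_measure(87.12, 85.04)
print(round(f1, 2))  # 86.07
```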
Abstract: At the early stage of the software lifecycle, measuring the complexity of UML class diagrams plays an important role in software development, testing, and maintenance, and provides guidance for developing high-quality software. To study whether simple or complex metrics are better, this paper analyzes and compares four typical metrics of UML class diagrams from the viewpoint of experimental software engineering. Understandability, analyzability, and maintainability were classified and predicted for 27 class diagrams related to a banking system by means of the C5.0 algorithm within the well-known Weka toolkit. The results suggest that the performance of simple metrics is not inferior to that of complex metrics, and in some cases is even better.
Funding: Supported by the Foundation of Jiangxi Provincial Department of Education under Grant No. GJJ.10696.
Abstract: Despite its success, similarity-based collaborative filtering suffers from some limitations, such as scalability, sparsity, and recommendation attacks. Prior work has shown that incorporating a trust mechanism into traditional collaborative filtering recommender systems can mitigate these limitations. We argue that trust-based recommender systems face a novel recommendation attack that differs from the profile injection attacks on traditional recommender systems. To the best of our knowledge, there has not been any prior study of recommendation attacks in a trust-based recommender system. We analyze the attack problem and find that "victim" nodes play a significant role in the attack. Furthermore, we propose a data provenance method to trace malicious users and identify the "victim" nodes as distrusted users of the recommender system. A feasibility study of the defense method is carried out with a dataset crawled from the Epinions website.