Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 61331011 and 61273320, the National High Technology Research and Development 863 Program of China under Grant No. 2012AA011102, and the Natural Science Foundation of Jiangsu Provincial Department of Education under Grant No. 10KJB520016.
Abstract: This paper explores a tree kernel-based method for semantic role labeling (SRL) of Chinese nominal predicates via a convolution tree kernel. In particular, a new parse tree representation structure, called the dependency-driven constituent parse tree (D-CPT), is proposed to combine the advantages of both constituent and dependency parse trees. This is achieved by directly representing various kinds of dependency relations in a CPT-style structure, which employs dependency relation types instead of the phrase labels of a constituent parse tree (CPT). In this way, D-CPT not only keeps the dependency relationship information of the dependency parse tree (DPT) structure but also retains the basic hierarchical structure of the CPT style. Moreover, several schemes are designed to extract various kinds of necessary information from D-CPT, such as the shortest path between the nominal predicate and the argument candidate, the support verb of the nominal predicate, and the head argument modified by the argument candidate. This largely reduces the noisy information inherent in D-CPT. Finally, a convolution tree kernel is employed to compute the similarity between two parse trees. We also implement a feature-based method based on D-CPT. Evaluation on the Chinese NomBank corpus shows that our tree kernel-based method on D-CPT performs significantly better than other tree kernel-based ones and achieves performance comparable to the state-of-the-art feature-based ones. This indicates the effectiveness of the novel D-CPT structure in representing various kinds of dependency relations in a CPT-style structure, and of our tree kernel-based method in exploring that structure. It also illustrates that kernel-based methods are competitive with, and complementary to, feature-based methods on SRL.
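To make the phrase "a convolution tree kernel is employed to compute the similarity between two parse trees" concrete, here is a minimal sketch of the classic subset-tree kernel of Collins and Duffy, which counts the tree fragments two trees share. It is an illustration only: the abstract does not say which kernel variant, decay setting, or D-CPT extraction scheme the authors use. The nested-tuple tree encoding, the names `production`, `internal_nodes`, and `tree_kernel`, and the toy node labels are all assumptions made for this example.

```python
from functools import lru_cache

# Assumed encoding: a tree is a nested tuple (label, child, child, ...),
# and a leaf word is a plain string. Labels here are toy examples, not D-CPT labels.


def production(node):
    """Return the production at a node: its label plus a tagged label per child."""
    children = tuple(
        ("word", c) if isinstance(c, str) else ("node", c[0]) for c in node[1:]
    )
    return (node[0],) + children


def internal_nodes(tree):
    """Yield every non-leaf node of the tree in pre-order."""
    if isinstance(tree, str):
        return
    yield tree
    for child in tree[1:]:
        yield from internal_nodes(child)


def tree_kernel(t1, t2, decay=1.0):
    """Sum, over all node pairs, the (decayed) count of shared fragments rooted there."""

    @lru_cache(maxsize=None)
    def delta(n1, n2):
        # Different productions share no fragment rooted at this pair.
        if production(n1) != production(n2):
            return 0.0
        score = decay
        for c1, c2 in zip(n1[1:], n2[1:]):
            # Equal productions guarantee c1 and c2 are both words or both nodes.
            if not isinstance(c1, str):
                score *= 1.0 + delta(c1, c2)
        return score

    return sum(delta(a, b) for a in internal_nodes(t1) for b in internal_nodes(t2))


the_dog = ("NP", ("D", "the"), ("N", "dog"))
a_dog = ("NP", ("D", "a"), ("N", "dog"))
print(tree_kernel(the_dog, the_dog))  # 6.0
print(tree_kernel(the_dog, a_dog))    # 3.0
```

In this sketch, a `decay` value below 1 down-weights larger fragments, which is the usual way to keep the kernel from being dominated by large identical subtrees. Pruning the trees before comparison, as the abstract describes for D-CPT, would happen before `tree_kernel` is called and is not modeled here.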
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 61331011, 61751206, 61773276, 61672366), the Jiangsu Provincial Science and Technology Plan (BK20151222), and the Project of Natural Science Research of the Universities of Jiangsu Province.
Abstract: Stance detection aims to automatically determine whether the author is in favor of or against a given target. In principle, the sentiment information of a post highly influences the stance. In this study, we aim to leverage the sentiment information of a post to improve the performance of stance detection. However, conventional discrete models with sentiment features can cause error propagation. We thus propose a joint neural network model to predict the stance and sentiment of a post simultaneously, because a neural network model can learn both the representation of, and the interaction between, stance and sentiment collectively. Specifically, we first learn a deep shared representation between stance and sentiment information, and then use a neural stacking model to leverage sentiment information for the stance detection task. Empirical studies demonstrate the effectiveness of our proposed joint neural model.
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 61273320, 61331011, and 61070123, and the National High Technology Research and Development 863 Program of China under Grant No. 2012AA011102.
Abstract: This paper puts forward and explores the problem of empty element (EE) recovery in Chinese from the syntactic parsing perspective, which has been largely ignored in the literature. First, we demonstrate why EEs play a critical role in syntactic parsing of Chinese and how re-categorizing EEs from the syntactic perspective can better benefit that parsing. Then, we propose two ways to automatically recover EEs: a joint constituent parsing approach and a chunk-based dependency parsing approach. Evaluation on the Chinese TreeBank (CTB) 5.1 corpus shows that integrating EE recovery into the Charniak parser achieves a significant performance improvement of 1.29 in F1-measure. To the best of our knowledge, this is the first close examination of EEs in syntactic parsing of Chinese, a topic that deserves more attention in the future given its importance.
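For reference, the F1-measure reported above is the harmonic mean of precision and recall. The symbols P and R are introduced here for illustration and do not appear in the abstract. Reading the 1.29 gain as points on a 0 to 100 scale follows the usual reporting convention and is an assumption.

```latex
F_1 = \frac{2PR}{P + R}
```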
Funding: We thank Dr. Jie Tang and Dr. Honglei Zhuang for providing their software and useful suggestions about probabilistic graphical models (PGM). We acknowledge Dr. Xinfang Liu, Dr. Yunxia Xue, and Dr. Yulai Shen for corpus construction and insightful comments. We also thank the anonymous reviewers for their valuable suggestions and comments. The work was supported by the National Natural Science Foundation of China (Grant Nos. 61273320, 61375073, and 61402314) and the Key Project of the National Natural Science Foundation of China (61331011).
Abstract: Personal profile information on social media like LinkedIn.com and Facebook.com is at the core of many interesting applications, such as talent recommendation and contextual advertising. However, personal profiles usually lack consistent organization given the large amount of available information. Therefore, it is always a challenge for people to quickly find desired information in them. In this paper, we address the task of personal profile summarization by leveraging both textual information and social connection information in social networks, under both unsupervised and supervised learning paradigms. Here, using social connection information is motivated by the intuition that people with similar academic, business, or social backgrounds (e.g., co-major, co-university, and co-corporation) tend to have similar experiences and should have similar summaries. For unsupervised learning, we propose a collective ranking approach, called SocialRank, to combine textual information in an individual profile and social context information from relevant profiles in generating a personal profile summary. For supervised learning, we propose a collective factor graph model, called CoFG, to summarize personal profiles with local textual attribute functions and social connection factors. Extensive evaluation on a large dataset from LinkedIn.com demonstrates the usefulness of social connection information in personal profile summarization and the effectiveness of our proposed unsupervised and supervised learning approaches.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 61672368, 61373097, 61672367, 61331011), the Research Foundation of the Ministry of Education and China Mobile (MCM20150602), and the Natural Science Foundation of Jiangsu (BK20151222).
Abstract: We study implicit discourse relation detection, which is one of the most challenging tasks in the field of discourse analysis. We focus on ambiguous implicit discourse relations, which are an imperceptible linguistic phenomenon and therefore difficult to identify and eliminate. In this paper, we first create a novel task named implicit discourse relation disambiguation (IDRD). Second, we propose a focus-sensitive relation disambiguation model that affirms a truly correct relation when it is triggered by focal sentence constituents. In addition, we develop a topic-driven focus identification method and a relation search system (RSS) to support the relation disambiguation. Finally, we improve current relation detection systems by using the disambiguation model. Experiments on the Penn Discourse Treebank (PDTB) show promising improvements.
Funding: This work was supported by the Fundamental Research Funds for the Central Universities (2018B678X14 and 2016B44414), the Postgraduate Research Practice Innovation Program of Jiangsu Province of China (KYCX18_0553 and KYLX16_0722), the National Natural Science Foundation of China (Grant Nos. 61806137 and 61976146), and the Project of Natural Science Research of the Universities of Jiangsu Province (18KJB520043).
Abstract: Survey generation aims to generate a summary of a scientific topic based on related papers. The structure of the papers deeply influences the generative process of a survey, especially the relationships between sentences and between paragraphs. In principle, the structure of a paper can influence the quality of the summary. Therefore, we employ the structure of papers to leverage contextual information among sentences in paragraphs to generate a survey for documents. In particular, we present a neural document structure model for survey generation. We take paragraphs as units and model the sentences within them; we then employ a hierarchical model to learn the structure among sentences, which can be used to select important and informative sentences to generate the survey. We evaluate our model on a scientific document data set. The experimental results show that our model is effective and that the generated survey is informative and readable.