Funding: National Natural Science Foundation of China (Grant Nos. 62376166, 62306188, 61876113); National Key R&D Program of China (No. 2022YFC3303504).
Abstract: Discourse relation classification is a fundamental task in discourse analysis and is essential for understanding the structure and connection of texts. Implicit discourse relation classification aims to determine the relationship between adjacent sentences. It is very challenging because it lacks both explicit discourse connectives as linguistic cues and sufficient annotated training data. In this paper, we propose a discriminative instance selection method to construct synthetic implicit discourse relation data from easy-to-collect explicit discourse relations. An expanded instance consists of an argument pair and its sense label. We introduce the argument pair type classification task, which aims to distinguish between implicit and explicit argument pairs and to select the explicit argument pairs that are most similar to natural implicit argument pairs for data expansion. We also propose a simple label-smoothing technique to assign robust sense labels to the selected argument pairs. We evaluate our method on PDTB 2.0 and PDTB 3.0. The results show that our method consistently improves the performance of the baseline model and achieves results competitive with state-of-the-art models.
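The abstract does not spell out its label-smoothing technique, so the following is only a minimal sketch of standard uniform label smoothing, which may differ from the authors' variant. The function names, the `epsilon` value, and the example probabilities are illustrative assumptions. The four sense names are the top-level PDTB classes.

```python
import math

# Top-level PDTB sense classes.
TOP_LEVEL_SENSES = ["Comparison", "Contingency", "Expansion", "Temporal"]


def smooth_sense_label(gold_index, num_senses, epsilon=0.1):
    """Return a soft target: 1 - epsilon on the gold sense, the rest shared evenly.

    epsilon is a hypothetical smoothing weight, not a value from the paper.
    """
    if num_senses < 2:
        raise ValueError("need at least two senses")
    if not 0 <= gold_index < num_senses:
        raise ValueError("gold_index out of range")
    if not 0.0 <= epsilon < 1.0:
        raise ValueError("epsilon must be in [0, 1)")
    off_value = epsilon / (num_senses - 1)
    target = [off_value] * num_senses
    target[gold_index] = 1.0 - epsilon
    return target


def cross_entropy(predicted_probs, target):
    """Cross-entropy between a predicted distribution and a (soft) target."""
    return -sum(t * math.log(p) for t, p in zip(target, predicted_probs) if t > 0)


if __name__ == "__main__":
    # A selected explicit pair whose connective suggests "Contingency".
    target = smooth_sense_label(TOP_LEVEL_SENSES.index("Contingency"),
                                len(TOP_LEVEL_SENSES))
    print(dict(zip(TOP_LEVEL_SENSES, target)))
    print(cross_entropy([0.1, 0.7, 0.1, 0.1], target))
```

The soft target keeps most of the probability mass on the sense inherited from the explicit connective while leaving some room for the possibility that the label does not transfer cleanly once the connective is removed.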
Funding: This work was funded by the National Natural Science Foundation of China (62001205), the National Key R&D Program of China (2021YFF1200804), and the Shenzhen Science and Technology Innovation Committee (2022410129, KCXFZ20201221173400001, and SGDX2020110309280100).
Abstract: Large language models (LLMs) have made unprecedented progress, demonstrating human-like language proficiency and an extraordinary ability to encode complex knowledge. The emergence of high-level cognitive capabilities in LLMs, such as in-context learning and complex reasoning, suggests a path toward the realization of artificial general intelligence (AGI). However, we lack scientific theories and tools to assess and interpret this emergence of advanced intelligence in LLMs. Artificial intelligence (AI) has been extensively applied in various areas of fundamental science to accelerate scientific research.
Funding: Supported by grants from the National Key Research and Development Program of China (2016YFB1000904), the National Natural Science Foundation of China (Grant Nos. 61325010 and U1605251), and the Fundamental Research Funds for the Central Universities of China (WK2350000001). Le Wu gratefully acknowledges the support of the Open Project Program of the National Laboratory of Pattern Recognition (201700017) and the Fundamental Research Funds for the Central Universities (JZ2016HGBZ0749). Yong Ge acknowledges the support of the National Natural Science Foundation of China (NSFC, Grant Nos. 61602234 and 61572032).
Abstract: Recently, many online Karaoke (KTV) platforms have been released, where music lovers sing songs and the system automatically evaluates user proficiency according to their singing behavior. Recommending appropriate songs to users can stimulate singers' participation and improve users' loyalty to these platforms. However, this is not an easy task due to the unique characteristics of these platforms. First, since users may not achieve high system-evaluated scores on their favorite songs, how to balance user preferences with singing proficiency in song recommendation is still an open question. Second, the sparsity of user-song interaction behavior may greatly impact the recommendation task. To address these two challenges, we propose an information-fused song recommendation model that considers the unique characteristics of the singing data. Specifically, we first devise a pseudo-rating matrix by combining users' singing behavior and the system evaluations, so that both users' preferences and proficiency are leveraged. Then we mitigate the data sparsity problem by fusing users' and songs' rich information in the matrix factorization process of the pseudo-rating matrix. Finally, extensive experimental results on a real-world dataset show the effectiveness of our proposed model.
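The abstract does not give the formula for the pseudo-rating matrix or the details of the fused factorization. The sketch below is therefore purely illustrative: it assumes a hypothetical weighted blend of normalized singing counts and system scores (the weight `alpha` is invented here), followed by plain matrix factorization trained with stochastic gradient descent. It omits the user and song side information that the paper fuses into the factorization.

```python
import random


def pseudo_ratings(sing_counts, system_scores, alpha=0.5):
    """Blend singing frequency (preference) with system score (proficiency).

    Both inputs map (user, song) -> number. alpha is a hypothetical
    mixing weight, not a value taken from the paper.
    """
    max_count = max(sing_counts.values())
    max_score = max(system_scores.values())
    ratings = {}
    for key, count in sing_counts.items():
        preference = count / max_count
        proficiency = system_scores.get(key, 0.0) / max_score
        ratings[key] = alpha * preference + (1 - alpha) * proficiency
    return ratings


def factorize(ratings, n_users, n_songs, rank=2, lr=0.05, reg=0.01,
              epochs=500, seed=0):
    """Plain SGD matrix factorization over the observed pseudo-ratings."""
    rng = random.Random(seed)
    P = [[rng.random() * 0.1 for _ in range(rank)] for _ in range(n_users)]
    Q = [[rng.random() * 0.1 for _ in range(rank)] for _ in range(n_songs)]
    for _ in range(epochs):
        for (u, s), r in ratings.items():
            err = r - sum(P[u][k] * Q[s][k] for k in range(rank))
            for k in range(rank):
                pu, qs = P[u][k], Q[s][k]
                P[u][k] += lr * (err * qs - reg * pu)
                Q[s][k] += lr * (err * pu - reg * qs)
    return P, Q


def squared_error(ratings, P, Q):
    """Sum of squared errors on the observed entries."""
    return sum((r - sum(a * b for a, b in zip(P[u], Q[s]))) ** 2
               for (u, s), r in ratings.items())


if __name__ == "__main__":
    counts = {(0, 0): 4, (0, 1): 1, (1, 0): 2}
    scores = {(0, 0): 60, (0, 1): 90, (1, 0): 90}
    R = pseudo_ratings(counts, scores)
    P, Q = factorize(R, n_users=2, n_songs=2)
    # Predicted pseudo-rating for the unobserved pair (user 1, song 1).
    print(sum(a * b for a, b in zip(P[1], Q[1])))
```

In this toy setup a song that a user sings often but scores poorly on, and a song sung rarely but scored well, can end up with similar pseudo-ratings, which is the kind of preference and proficiency balance the abstract describes.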
Funding: Supported by the National Key Research and Development Program of China under Grant No. 2018AAAO10190, the Natural Science Foundation of Jiangsu Province of China under Grant No. BK20191420, the National Natural Science Foundation of China under Grant No. 61632016, the Natural Science Research Project of Jiangsu Higher Education Institution under Grant No. 17KJA520003, the Priority Academic Program Development of Jiangsu Higher Education Institutions, and the Suda-Toycloud Data Intelligence Joint Laboratory.
Abstract: Entity linking (EL) is the task of determining the identity of textual entity mentions given a predefined knowledge base (KB). Many existing efforts address this task using either "local" information (contextual information of the mention in the text) or "global" information (relations among candidate entities). However, either local or global information might be insufficient, especially when the given text is short. To get richer local and global information for entity linking, we propose to enrich the context information for mentions by retrieving extra contexts from the web through web search engines (WSE). Based on this intuition, two novel attempts are made. The first adds web-searched results to an embedding-based method to expand the mention's local information, where we try two different methods to help generate high-quality web contexts: one applies the attention mechanism and the other uses an abstract extraction method. The second uses the web contexts to extend the global information, i.e., finding and utilizing additional relevant mentions from the web contexts with a graph-based model. Finally, we combine the two proposed models to use both the extended local and global information from the extra web contexts. Our empirical study on six real-world datasets shows that using extra web contexts to extend the local and global information can effectively improve the F1 score of entity linking.