Abstract: Citation Context Analysis (CCA) is a typical data-driven research field based on full-text information. It overcomes the limitations of traditional citation analysis, which uses only bibliographic data, and supports further study of various citation behaviors and the core issues behind them, such as citation motivation, citation function, and citation sentiment. A corpus for CCA is the most important foundation and support for studying these issues. This paper discusses corpus construction and mining for CCA in order to comprehensively review the research significance, the current state of research, and the existing deficiencies in this area. The paper has two main sections: 1) corpus construction for CCA, where its three building tasks (citation sentence extraction, citation-reference mapping, and citation context extraction) are discussed; 2) corpus mining and utilization for CCA, where the following topics are explored: classification of citation motivation (or behavior) and citation sentiment, citation-based indexing and retrieval, citation recommendation and evaluation, automatic citation-based abstracting and review generation, and domain knowledge metrics. Finally, some suggestions and future research directions are briefly listed.