Citations play an important role in the scientific community by supporting a variety of assessment policies, such as measuring the impact of journals, researchers, institutions, and countries. Authors cite papers for different reasons, such as extending previous work, comparing their study with the state of the art, or providing background on the field. In recent years, researchers have tried to group all citations into two broad categories: important and incidental. Such a categorization can enhance scientific output in multiple ways, for instance by (1) helping a researcher identify meaningful citations from a list of 100 to 1000 citations, (2) enhancing the impact factor calculation mechanism by weighting important citations more strongly, and (3) improving researcher, institutional, and university rankings by considering only important citations. All of these uses depend on correctly identifying the important citations among all citations in a paper. To date, researchers have used many features to classify citations into these two categories, including cue phrases, in-text citation counts, and metadata features. Contemporary approaches, however, rely on identifying in-text citation counts, mapping sections onto the Introduction, Methods, Results, and Discussion (IMRAD) structure, identifying cue phrases, and similar steps. Identifying such features accurately is a challenging task that is normally carried out manually, and the reported accuracy of citation classification is based on these manually extracted features. This research proposes to examine the content of each cited and citing pair to identify the important citing papers for each cited paper. This content similarity approach is adopted from research paper recommendation approaches. Furthermore, a novel section-based content similarity approach is also proposed. The results show that using only the abstracts of the cited and citing papers can achieve accuracy similar to that of the state-of-the-art approaches. This makes the proposed approach a viable technique that does not depend on manual identification of complex features.
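
As a rough illustration of the abstract-based content similarity idea, the sketch below scores each citing abstract against the cited abstract and ranks the citing papers by that score. The abstract does not state which similarity measure or weighting scheme is used, so TF-IDF weighting with cosine similarity is an assumption here, and the function names (`tokenize`, `tfidf_vectors`, `cosine`, `rank_citing_papers`) are hypothetical. It is a minimal sketch under those assumptions, not the authors' implementation.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (term -> weight dict) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Smoothed IDF (an assumed choice) keeps weights positive.
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_citing_papers(cited_abstract, citing_abstracts):
    """Rank citing abstracts by similarity to the cited abstract.

    Returns (index, score) pairs, highest score first.
    """
    vectors = tfidf_vectors([cited_abstract] + list(citing_abstracts))
    cited_vec = vectors[0]
    scores = [(i, cosine(cited_vec, v)) for i, v in enumerate(vectors[1:])]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    cited = "citation classification using content similarity of abstracts"
    citing = [
        "protein folding in yeast cells",
        "content similarity of abstracts for citation classification",
    ]
    for index, score in rank_citing_papers(cited, citing):
        print(index, round(score, 3))
```

Turning such scores into an important or incidental label would need a cut-off or a trained classifier, which the abstract does not specify. The same scoring could be applied per section rather than per abstract to approximate the section-based variant.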