Funding: Supported by the National Natural Science Foundation of China (61035004)
Abstract: Biographies offer a direct and extensive account of well-known people, but little comparable information exists for ordinary people. In recent years, information extraction (IE) technologies have been used to automatically generate biographies for anyone with information online. One of the key challenges is entity linking (EL), which links mentions in biography sentences to their corresponding entities. General-purpose EL systems in current use often make errors caused by entity name variation and ambiguity. Compared with general text, biography sentences carry distinctive yet rarely studied relational knowledge (RK) and temporal knowledge (TK), which can be enough to distinguish entities. This article proposes a new statistical framework, the knowledge-enhanced EL (KeEL) system, for automated biography construction. It uses commonsense knowledge such as RK and TK to enhance entity linking. KeEL was evaluated on Wikipedia data; compared with a state-of-the-art method, it significantly improves the precision and recall of entity linking.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 72074113)
Abstract: Algorithms play an increasingly important role in scientific work, especially in data-driven research. Investigating mentions of algorithms in full-text papers helps us understand the use and development of algorithms in a specific domain. Current research on algorithm mentions is limited to academic papers in a single language, which makes it hard to investigate the use of algorithms comprehensively. For example, are algorithm mentions in Chinese conference papers consistent with those in English conference papers? To answer this question, this paper takes NLP as an example and compares the mention frequency, mention location, and mention time of the top-10 data-mining algorithms between papers from a prominent international conference, the Annual Meeting of the Association for Computational Linguistics (ACL), and a Chinese conference, the China National Conference on Computational Linguistics (CCL). The results show that, compared with ACL, the mention frequency of the top-10 data-mining algorithms in CCL is slightly lower and the mention time is slightly delayed, while the distribution of mention locations is similar. This study can provide a reference for research on the mention, citation, and evaluation of knowledge entities.
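The three measures named in the abstract (mention frequency, mention location, mention time) can be illustrated with a minimal sketch. Everything concrete here is an assumption rather than the paper's method: the algorithm list is a hypothetical stand-in (the abstract does not enumerate the top-10 algorithms), papers are assumed to be pre-split into named sections, matching is simple case-insensitive string search, and "mention time" is taken to mean the earliest year in which an algorithm appears.

```python
import re
from collections import Counter

# Hypothetical stand-in list; the abstract does not name the top-10 algorithms.
ALGORITHMS = ["SVM", "k-means", "naive Bayes"]


def mention_stats(papers, algorithms=ALGORITHMS):
    """Compute simple mention measures for one corpus.

    papers: list of dicts with 'year' (int) and 'sections' (section name -> text).
    Returns (frequency, location, first_year):
      frequency[algo]           number of papers mentioning the algorithm
      location[(algo, section)] number of mentions in that section
      first_year[algo]          earliest year with a mention (assumed "mention time")
    """
    frequency, location, first_year = Counter(), Counter(), {}
    for paper in papers:
        for algo in algorithms:
            # Match the name only when it is not embedded in a longer word.
            pattern = re.compile(
                r"(?<!\w)" + re.escape(algo) + r"(?!\w)", re.IGNORECASE
            )
            mentioned = False
            for section, text in paper["sections"].items():
                hits = len(pattern.findall(text))
                if hits:
                    location[(algo, section)] += hits
                    mentioned = True
            if mentioned:
                frequency[algo] += 1
                year = paper["year"]
                first_year[algo] = min(first_year.get(algo, year), year)
    return frequency, location, first_year


# Toy corpora, invented purely for illustration (not data from the study).
acl_papers = [
    {"year": 2005, "sections": {"method": "We train an SVM classifier.",
                                "experiments": "SVM beats naive Bayes."}},
]
ccl_papers = [
    {"year": 2008, "sections": {"experiments": "An SVM baseline is used."}},
]

acl_freq, acl_loc, acl_first = mention_stats(acl_papers)
ccl_freq, ccl_loc, ccl_first = mention_stats(ccl_papers)
```

Running the same function over each corpus and comparing the three returned structures mirrors the ACL versus CCL comparison described above; a real study would also need alias handling (for example, full names and abbreviations of the same algorithm), which this sketch omits.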