In recent years, more and more foreigners begin to learn Chinese characters, but they often make typos when using Chinese. The fundamental reason is that they mainly learn Chinese characters from the glyph and pronunc...In recent years, more and more foreigners begin to learn Chinese characters, but they often make typos when using Chinese. The fundamental reason is that they mainly learn Chinese characters from the glyph and pronunciation, but do not master the semantics of Chinese characters. If they can understand the meaning of Chinese characters and form knowledge groups of the characters with relevant meanings, it can effectively improve learning efficiency. We achieve this goal by building a Chinese character semantic knowledge graph (CCSKG). In the process of building the knowledge graph, the semantic computing capacity of HowNet was utilized, and 104,187 associated edges were finally established for 6752 Chinese characters. Thanks to the development of deep learning, OpenHowNet releases the core data of HowNet and provides useful APIs for calculating the similarity between two words based on sememes. Therefore our method combines the advantages of data-driven and knowledge-driven. The proposed method treats Chinese sentences as subgraphs of the CCSKG and uses graph algorithms to correct Chinese typos and achieve good results. The experimental results show that compared with keras-bert and pycorrector + ernie, our method reduces the false acceptance rate by 38.28% and improves the recall rate by 40.91% in the field of learning Chinese as a foreign language. The CCSKG can help to promote Chinese overseas communication and international education.展开更多
文摘In recent years, more and more foreigners begin to learn Chinese characters, but they often make typos when using Chinese. The fundamental reason is that they mainly learn Chinese characters from the glyph and pronunciation, but do not master the semantics of Chinese characters. If they can understand the meaning of Chinese characters and form knowledge groups of the characters with relevant meanings, it can effectively improve learning efficiency. We achieve this goal by building a Chinese character semantic knowledge graph (CCSKG). In the process of building the knowledge graph, the semantic computing capacity of HowNet was utilized, and 104,187 associated edges were finally established for 6752 Chinese characters. Thanks to the development of deep learning, OpenHowNet releases the core data of HowNet and provides useful APIs for calculating the similarity between two words based on sememes. Therefore our method combines the advantages of data-driven and knowledge-driven. The proposed method treats Chinese sentences as subgraphs of the CCSKG and uses graph algorithms to correct Chinese typos and achieve good results. The experimental results show that compared with keras-bert and pycorrector + ernie, our method reduces the false acceptance rate by 38.28% and improves the recall rate by 40.91% in the field of learning Chinese as a foreign language. The CCSKG can help to promote Chinese overseas communication and international education.