In Chinese question answering systems, natural language questions carry more semantic relations than bare query words, so expanding the query can improve precision when questions are used to retrieve documents. This paper proposes a new approach to query expansion based on semantics and statistics. First, an automatic relevance feedback method is used to generate a set of candidate expansion words. Then the expansion words are selected from this set according to the semantic similarity and semantic relevancy between the candidate words and the original query words. Experiments show that the new approach is effective for Web retrieval and outperforms conventional expansion approaches.
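The two-stage idea in this abstract (generate candidates by relevance feedback, then filter them by semantic closeness to the query) can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the toy documents, the `TOY_SIMILARITY` table, and the single threshold are all assumptions made for illustration, and one placeholder score stands in for the paper's separate similarity and relevancy measures.

```python
from collections import Counter

def candidate_terms(query_terms, ranked_docs, top_docs=2, top_terms=5):
    """Pseudo-relevance feedback: collect frequent non-query terms
    from the top-ranked documents (assumed already tokenized)."""
    counts = Counter()
    for doc in ranked_docs[:top_docs]:
        counts.update(t for t in doc if t not in query_terms)
    return [term for term, _ in counts.most_common(top_terms)]

def select_expansions(query_terms, candidates, similarity, threshold=0.5):
    """Keep candidates whose best score against any query term
    reaches the threshold (the threshold value is hypothetical)."""
    if not query_terms:
        return []
    return [c for c in candidates
            if max(similarity(c, q) for q in query_terms) >= threshold]

# Hypothetical similarity scores; a real system would derive these
# from a semantic resource or corpus statistics.
TOY_SIMILARITY = {
    ("automobile", "car"): 0.9,
    ("cost", "price"): 0.8,
    ("dealer", "car"): 0.3,
}

def toy_similarity(candidate, query_term):
    return TOY_SIMILARITY.get((candidate, query_term), 0.0)

query = ["car", "price"]
ranked_docs = [
    ["car", "automobile", "price", "cost", "dealer"],
    ["automobile", "cost", "weather"],
    ["banana"],
]
candidates = candidate_terms(query, ranked_docs)
expansions = select_expansions(query, candidates, toy_similarity)
```

Here `candidates` holds the feedback terms from the top two documents, and `expansions` keeps only those that score at least 0.5 against some query term.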
A large semantic gap exists between content-based image retrieval (CBIR) and high-level semantics, so additional semantic information should be attached to images. This involves three aspects: the semantic representation model, semantic information building, and semantic retrieval techniques. In this paper, we introduce an associated semantic network and an automatic semantic annotation system. The system employs a semantic network model as its semantic representation model, which uses semantic keywords, a linguistic ontology, and low-level features to calculate semantic similarity. Through several rounds of user relevance feedback, the semantic network is enriched automatically. To speed up the growth of the semantic network and obtain balanced annotation, semantic seeds and semantic loners are employed.
In recent years, more and more foreigners have begun to learn Chinese characters, but they often make typos when writing Chinese. The fundamental reason is that they learn characters mainly through glyph and pronunciation and do not master their semantics. If learners understand the meaning of Chinese characters and form knowledge groups of characters with related meanings, learning efficiency can improve substantially. We pursue this goal by building a Chinese character semantic knowledge graph (CCSKG). In building the knowledge graph, the semantic computing capacity of HowNet was used, and 104,187 associated edges were established for 6752 Chinese characters. Thanks to the development of deep learning, OpenHowNet releases the core data of HowNet and provides useful APIs for calculating the similarity between two words based on sememes. Our method therefore combines the advantages of data-driven and knowledge-driven approaches. The proposed method treats Chinese sentences as subgraphs of the CCSKG and uses graph algorithms to correct Chinese typos, with good results. The experimental results show that, compared with keras-bert and pycorrector + ernie, our method reduces the false acceptance rate by 38.28% and improves the recall rate by 40.91% in the field of learning Chinese as a foreign language. The CCSKG can help to promote Chinese overseas communication and international education.
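As a rough illustration of treating a sentence as a subgraph of a character graph, the sketch below flags characters that have no edge to any nearby character in the sentence. It is an assumption-laden toy, not the CCSKG method: the edges are invented rather than computed from HowNet, the `window` parameter and function names are hypothetical, and the sketch only detects suspicious positions without proposing corrections.

```python
def build_graph(edges):
    """Build an undirected adjacency map from (char, char) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def suspicious_positions(sentence, graph, window=2):
    """Return indices of characters with no graph edge to any other
    character within `window` positions in the sentence."""
    flagged = []
    for i, ch in enumerate(sentence):
        lo = max(0, i - window)
        hi = min(len(sentence), i + window + 1)
        neighbours = [sentence[j] for j in range(lo, hi) if j != i]
        linked = graph.get(ch, ())
        if neighbours and not any(n in linked for n in neighbours):
            flagged.append(i)
    return flagged

# Invented edges for illustration only; the real graph would hold
# semantic associations derived from HowNet.
toy_graph = build_graph([("学", "习"), ("习", "汉"), ("汉", "字"), ("习", "字")])

clean = suspicious_positions("学习汉字", toy_graph)
typo = suspicious_positions("学习汗字", toy_graph)  # 汗 written in place of 汉
```

With this toy graph, the correct sentence yields no flagged positions, while the misspelled one flags the index of 汗, which is isolated from its neighbours.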
Funding: the Specialized Research Fund for the Doctoral Program of Higher Education of China (20050007023); the Natural Science Foundation of Shandong Province (Y2004G04)