Funding: Supported by the National Science and Technology Support Program of China (No. 2012BAH45B01), the National Natural Science Foundation of China (No. 61100189, 61370215, 61370211, 61402137), and the National "242" Project of China (No. 2016A104).
Abstract: To address the content deviation in community detection caused by the low accuracy of resource relevance, an algorithm based on the topology of sites and the similarity between their topics is proposed. By taking topic content fully into account, the algorithm searches for clusters of topically similar sites on the basis of inter-site topology. The experimental results show that the algorithm produces more accurate detection results on a real network.
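The idea of combining inter-site topology with topic similarity can be illustrated with a minimal sketch. This is not the algorithm from the paper, whose details the abstract does not give. It assumes each site has a topic vector, keeps only those links whose endpoints have a cosine similarity at or above a threshold, and treats the connected components of the remaining graph as communities. The names `topical_communities`, `links`, `topics`, and `threshold` are hypothetical.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity of two equal-length topic vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def topical_communities(links, topics, threshold=0.8):
    """Group sites that are linked AND topically similar.

    links:     iterable of (site_a, site_b) pairs (treated as undirected)
    topics:    dict mapping site -> topic vector (assumed representation)
    threshold: assumed similarity cut-off for keeping a link
    """
    # Keep a link only if the two sites are topically similar enough.
    adjacency = defaultdict(set)
    for a, b in links:
        if cosine(topics[a], topics[b]) >= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)

    # Connected components of the filtered graph are the communities.
    seen = set()
    communities = []
    for site in topics:
        if site in seen:
            continue
        seen.add(site)
        stack = [site]
        component = set()
        while stack:
            current = stack.pop()
            component.add(current)
            for neighbour in adjacency[current]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        communities.append(component)
    return communities


# Toy data: s1 and s2 share one topic, s3 and s4 another; s2 links to s3.
links = [("s1", "s2"), ("s2", "s3"), ("s3", "s4")]
topics = {
    "s1": [1.0, 0.0],
    "s2": [0.9, 0.1],
    "s3": [0.0, 1.0],
    "s4": [0.1, 0.9],
}
print(topical_communities(links, topics))
```

In the toy data the link between s2 and s3 exists in the topology but is dropped because the two sites differ in topic, which is the kind of content deviation a purely link-based method would not catch.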
Funding: Supported by the Shaanxi Provincial Educational Department Special-Purpose Technology and Research of China (06JK229).
Abstract: The procedure of hypertext induced topic search (HITS) is analyzed on the basis of a semantic relation model, and the cause of topic drift in the HITS algorithm is identified: Web pages are projected onto a wrong latent semantic basis. A new concept, generalized similarity, is introduced, and on this basis a new topic distillation algorithm, GSTDA (generalized similarity based topic distillation algorithm), is presented to improve the quality of topic distillation. GSTDA is applied not only to avoid topic drift but also to explore topics related to the user query. Experimental results on 10 queries show that GSTDA reduces the topic drift rate by 10% to 58% compared with the HITS algorithm and discovers several related topics for queries that have multiple meanings.
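For readers unfamiliar with the baseline, the standard HITS iteration can be sketched as follows. This is plain HITS, not GSTDA, whose generalized similarity measure is not specified in the abstract. The function names `hits` and `normalize` and the toy edge list are assumptions made for illustration.

```python
import math


def normalize(scores):
    """Scale a score dict to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {k: v / norm for k, v in scores.items()}


def hits(edges, iterations=50):
    """Return (hub, authority) score dicts for a directed edge list.

    edges: list of (source_page, target_page) hyperlinks
    """
    nodes = sorted({n for edge in edges for n in edge})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority update: sum the hub scores of pages linking in.
        new_auth = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_auth[dst] += hub[src]
        # Hub update: sum the new authority scores of pages linked to.
        new_hub = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_hub[src] += new_auth[dst]
        auth = normalize(new_auth)
        hub = normalize(new_hub)
    return hub, auth


# Toy link graph: pages a and b both point to c; a also points to d.
edges = [("a", "c"), ("b", "c"), ("a", "d")]
hub, auth = hits(edges)
print("best authority:", max(auth, key=auth.get))
print("best hub:", max(hub, key=hub.get))
```

Because the scores depend only on link structure, a densely linked but off-topic group of pages can dominate the result, which is the topic drift that the abstract sets out to reduce.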