Abstract
Existing Chinese story link detection has a high detection cost, that is, high false alarm and miss rates. To address this problem, on the basis of a multi-vector space model, feature words are extracted from the different element vectors (time, location, people, and content) and combined into correlative word pairs, and a Support Vector Machine (SVM) is used to integrate correlative word pair similarity with cosine similarity, yielding a story link detection method based on extracted element correlative word pairs. The proposed method supplements the representation of story content and provides more evidence for comparison during detection; the detection cost is reduced by nearly 11%. Experimental results verify the effectiveness of the algorithm.
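The two similarity signals named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the dictionary layout of a story, the function names, and the use of Jaccard overlap as the word-pair similarity are all assumptions made for illustration, since the abstract does not give the exact formulas. The resulting two-value feature vector is the kind of input an SVM classifier could be trained on to decide whether two stories are linked.

```python
import math
from collections import Counter
from itertools import product

# Element vectors named in the abstract; the dict layout is an assumption.
ELEMENTS = ("time", "location", "people", "content")

def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def word_pairs(story):
    """Pair feature words drawn from two different element vectors."""
    pairs = set()
    for i, e1 in enumerate(ELEMENTS):
        for e2 in ELEMENTS[i + 1:]:
            left = [(e1, w) for w in story.get(e1, ())]
            right = [(e2, w) for w in story.get(e2, ())]
            pairs.update(product(left, right))
    return pairs

def pair_similarity(s1, s2):
    """Jaccard overlap of the two pair sets (an assumed measure)."""
    p1, p2 = word_pairs(s1), word_pairs(s2)
    union = p1 | p2
    return len(p1 & p2) / len(union) if union else 0.0

def features(s1, s2):
    """Two-value feature vector: [pair similarity, cosine similarity]."""
    bag1 = Counter(w for e in ELEMENTS for w in s1.get(e, ()))
    bag2 = Counter(w for e in ELEMENTS for w in s2.get(e, ()))
    return [pair_similarity(s1, s2), cosine(bag1, bag2)]

# Hypothetical stories, already reduced to feature words per element.
story_a = {"time": ["2012"], "location": ["Beijing"],
           "people": ["Li"], "content": ["flood", "rescue"]}
story_b = {"time": ["2012"], "location": ["Beijing"],
           "people": ["Wang"], "content": ["flood", "relief"]}
print(features(story_a, story_b))
```

Pairing words across elements lets two stories that share, say, the same location and the same content word score higher than stories that merely share isolated words, which is one plausible reading of how the word pairs give the detector "more evidence".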
Source
Journal of Computer Applications (《计算机应用》)
CSCD
Peking University Core Journals (北大核心)
2013, Issue 1, pp. 182-185 (4 pages)
Funding
National Natural Science Foundation of China (61063032)
Guangxi Natural Science Foundation (2012GXNSFAA053225)