Abstract
This paper briefly reviews the background of text filtering and proposes a logical model for Chinese-English cross-language text filtering based on Latent Semantic Indexing (LSI). The basic idea is to replace the word-to-word translation used in cross-language filtering with the latent semantic structure of bilingual texts as the basis for matching user profiles against texts. Words and texts in both languages are mapped to vectors in a semantic space, and the inner product serves as the similarity measure. A profile and a text can therefore match without translating any terms, and even when the corresponding translation equivalents do not appear at all. Experiments show that the approach markedly improves the precision of cross-language filtering.
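The matching scheme summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy dual-language training documents, the function names (`train_lsi`, `project`, `similarity`), the choice of k = 2, and the unit-length normalization before the inner product are all assumptions made for illustration. The sketch builds a term-document matrix from documents that contain both Chinese and English tokens, takes a truncated SVD, and compares a profile and a text by the inner product of their projections.

```python
import numpy as np

# Hypothetical toy training set: each document is a pre-tokenized
# "dual-language" text (a Chinese passage joined with its English counterpart).
TRAIN_DOCS = [
    ["network", "computer", "网络", "计算机"],
    ["network", "网络"],
    ["music", "song", "音乐", "歌曲"],
    ["song", "歌曲"],
    ["music", "song", "音乐", "歌曲"],
]

def build_vocab(docs):
    """Map every distinct term (either language) to a row index."""
    vocab = sorted({t for d in docs for t in d})
    return {t: i for i, t in enumerate(vocab)}

def term_vector(tokens, vocab):
    """Raw term-count vector; out-of-vocabulary tokens are ignored."""
    v = np.zeros(len(vocab))
    for t in tokens:
        if t in vocab:
            v[vocab[t]] += 1.0
    return v

def train_lsi(docs, k):
    """Return the vocabulary and the top-k left singular vectors U_k."""
    vocab = build_vocab(docs)
    A = np.column_stack([term_vector(d, vocab) for d in docs])  # terms x docs
    U, _s, _Vt = np.linalg.svd(A, full_matrices=False)
    return vocab, U[:, :k]

def project(tokens, vocab, U_k):
    """Project a profile or text into the k-dimensional semantic space.

    Normalizing to unit length is an assumption of this sketch, so that
    the inner product below behaves like a cosine."""
    z = U_k.T @ term_vector(tokens, vocab)
    n = np.linalg.norm(z)
    return z / n if n > 0 else z

def similarity(profile_tokens, text_tokens, vocab, U_k):
    """Inner product of the two projections in the semantic space."""
    p = project(profile_tokens, vocab, U_k)
    t = project(text_tokens, vocab, U_k)
    return float(p @ t)

if __name__ == "__main__":
    vocab, U_k = train_lsi(TRAIN_DOCS, k=2)
    # English profile vs. Chinese texts; "计算机" is not the translation
    # of "network", yet the two co-occur in the training documents.
    print(similarity(["network"], ["计算机"], vocab, U_k))
    print(similarity(["network"], ["音乐"], vocab, U_k))
```

In this toy setting the English profile `["network"]` should score much higher against the Chinese text `["计算机"]` than against `["音乐"]`, even though neither text contains a direct translation of the profile term. A real system would need a much larger bilingual training collection, term weighting, and a tuned value of k.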
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
PKU Core Journal (北大核心)
2000, No. 8, pp. 48-50 (3 pages)
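Funding
National Natural Science Foundation of China (Grant No. 69675019)
Doctoral Program Foundation of the State Education Commission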
Keywords
Text Filtering, Chinese-English Cross-language Text Filtering, Latent Semantic Indexing, User Profiles, Logical Model, Vector Space Model