3 articles found
1. Enriching short text representation in microblog for clustering (cited by: 14)
Authors: Jiliang TANG, Xufei WANG, Huiji GAO, Xia HU, Huan LIU. Frontiers of Computer Science (SCIE, EI, CSCD), 2012, No. 1, pp. 88-101 (14 pages)
Social media websites allow users to exchange short texts such as tweets via microblogs and user status in friendship networks. Their limited length, pervasive abbreviations, and coined acronyms and words exacerbate the problems of synonymy and polysemy, and bring about new challenges to data mining applications such as text clustering and classification. To address these issues, we dissect some potential causes and devise an efficient approach that enriches data representation by employing machine translation to increase the number of features from different languages. Then we propose a novel framework which performs multi-language knowledge integration and feature reduction simultaneously through matrix factorization techniques. The proposed approach is evaluated extensively in terms of effectiveness on two social media datasets from Facebook and Twitter. With its significant performance improvement, we further investigate potential factors that contribute to the improved performance.
Keywords: short texts; text representation; multi-language knowledge; matrix factorization; social media
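2. The Turn in the Modern View of Knowledge and a Critical Reflection on Problems in Expository Text Teaching (cited by: 9)
Author: 黄耀红 (Huang Yaohong). 《课程.教材.教法》 (CSSCI, PKU Core), 2021, No. 8, pp. 77-82 (6 pages)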
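The value of expository texts comes to the fore in close connection with the applied orientation of Chinese as a school subject, the construction of its system, and the linguistic shift from classical to modern Chinese. Expository text teaching in secondary schools has long suffered from two problems, knowledge ossification and knowledge misplacement, and has clung to a representationalist view of knowledge. The shift from a representationalist to a generative view of knowledge offers a new perspective for reflecting on expository text teaching. From the generative view of knowledge, expository text teaching is expected to move from knowledge as object to knowledge as subject, from knowledge concepts to knowledge contexts, from knowledge conclusions to knowledge processes, and from knowledge reception to knowledge construction.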
Keywords: view of knowledge; expository text; representation; generation
3. Efficient representation of text with multiple perspectives (cited by: 1)
Authors: PING Yuan, ZHOU Ya-jian, XUE Chao, YANG Yi-xian. The Journal of China Universities of Posts and Telecommunications (EI, CSCD), 2012, No. 1, pp. 101-111 (11 pages)
An effective text representation scheme dominates the performance of a text categorization system. However, because they assume that terms are independent, the traditional schemes that rely on term frequency (TF) and document frequency (DF) capture too little information about a document and result in poor performance. To overcome this limitation, we investigate the relationships between different terms with the same class tendency and how to measure the importance of a repeated term in a document. In this paper, a group of novel term weighting factors is proposed to enhance the category contribution of each term. Then, based on a novel strategy for generating passages from a document, we present two schemes to improve text representation: the weighted co-contributions of different terms corresponding to the class tendency, and the weighted co-contributions of each term in different passages. The former scheme works in a dimensionality reduction mode, while the latter runs in the conventional way. Using the support vector machine (SVM) classifier, experiments on four benchmark corpora show that the proposed schemes achieve consistently better performance than the conventional methods in both efficiency and accuracy. Further analysis also confirms some promising directions for future work.
Keywords: text representation; support vector machine (SVM); class tendency; category contribution; passages
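For orientation, the TF/DF baseline that the abstract above calls "traditional" can be sketched as follows. This is only an illustration of one common TF × log(N/DF) weighting under the independent-terms assumption. It is not the weighting proposed in the paper, and the function name `tf_df_weights` and the toy corpus are hypothetical.

```python
import math
from collections import Counter


def tf_df_weights(docs):
    """Return one {term: weight} dict per document, weight = TF * log(N / DF).

    Hypothetical helper for illustration; terms are treated as independent,
    which is the assumption the abstract criticizes.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)  # raw term frequency within this document
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights


# Toy corpus (made up for this sketch).
docs = [
    "stock market falls",
    "market rally lifts stock prices",
    "team wins football match",
]
weights = tf_df_weights(docs)
```

A term that occurs in fewer documents ("falls") receives a higher weight than one shared across documents ("stock"), but nothing in this baseline reflects class tendency or passage structure, which is the gap the abstract says its schemes address.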