5 articles found
1. Using ontology semantics to improve text documents clustering (cited by: 8)
Authors: Luo Na, Zuo Wanli, Yuan Fuyu, Zhang Jingbo, Zhang Huijie. Journal of Southeast University (English Edition), EI, CAS, 2006, No. 3, pp. 370-374 (5 pages)
To improve clustering results and make it easier to select among them, ontology semantics are combined with document clustering. A new document clustering algorithm based on WordNet is proposed for the document-processing phase. First, after the documents are represented by tf-idf, every word vector is extended with new entities. Then a feature-extraction algorithm is applied to the documents. Finally, the ontology aggregation clustering (OAC) algorithm is proposed to improve the result of document clustering. Experiments use the Reuters 20 News Group data set, and the experimental results are compared with those obtained by mutual information (MI). The results indicate that the proposed ontology-based document clustering algorithm outperforms other existing clustering algorithms such as MNB, CLUTO, and co-clustering.
Keywords: ontology; text clustering; lexicon; WordNet
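The first step of the abstract above (tf-idf vectors extended with new entities from a lexicon) can be illustrated roughly as follows. This is only a sketch under stated assumptions, not the OAC algorithm from the paper: `TOY_HYPERNYMS` is a hypothetical hand-made stand-in for WordNet, and the `concept_weight` parameter and the choice of adding a single broader concept per term are illustrative.

```python
import math
from collections import Counter

# ASSUMPTION: a tiny hand-made lexicon standing in for WordNet (word -> broader concept).
TOY_HYPERNYMS = {
    "dog": "animal",
    "cat": "animal",
    "car": "vehicle",
    "truck": "vehicle",
}


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized, non-empty document."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def extend_with_concepts(vector, hypernyms, concept_weight=0.5):
    """Add each term's broader concept, carrying a fraction of the term's weight."""
    extended = dict(vector)
    for term, weight in vector.items():
        concept = hypernyms.get(term)
        if concept is not None:
            extended[concept] = extended.get(concept, 0.0) + concept_weight * weight
    return extended


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


docs = [["dog", "barks"], ["cat", "meows"], ["car", "drives"]]
plain = tf_idf(docs)
extended = [extend_with_concepts(v, TOY_HYPERNYMS) for v in plain]
```

With these toy documents, the "dog" and "cat" texts share no surface term, so their plain tf-idf vectors are orthogonal; after extension both carry the concept "animal" and become similar, while the "car" text stays unrelated to both.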
2. Fuzzy c-means text clustering based on topic concept sub-space (cited by: 3)
Authors: Ji Xianghua, Chen Chao, Shao Zhengrong, Yu Nenghai. Journal of Southeast University (English Edition), EI, CAS, 2007, No. 3, pp. 439-442 (4 pages)
To improve the accuracy of text clustering, fuzzy c-means clustering based on topic concept sub-space (TCS2FCM) is introduced for classifying texts. Five evaluation functions are combined to extract key phrases. Concept phrases, as well as the descriptions of the final clusters, are derived from the key phrases using WordNet. The initial centers and the membership matrix are the most important factors affecting clustering performance. Orthogonal topic concept sub-spaces are built from the topic concept phrases that represent the topics of the texts, and the initialization of the centers and the membership matrix depends on the concept vectors in these sub-spaces. The results show that, unlike the random initialization of traditional fuzzy c-means clustering, initialization based on text content can improve clustering precision.
Keywords: TCS2FCM; topic concept space; fuzzy c-means clustering; text clustering
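The role of initialization described in this abstract can be sketched with a plain fuzzy c-means loop that accepts caller-supplied initial centers instead of drawing them at random. This is not TCS2FCM itself: the function name `fuzzy_c_means`, the fuzzifier `m=2.0`, and the hand-picked `content_based_centers` (standing in for centers derived from concept vectors) are all assumptions made for illustration.

```python
import math


def fuzzy_c_means(points, centers, m=2.0, iterations=50):
    """Run standard fuzzy c-means starting from the given initial centers."""
    centers = [list(c) for c in centers]
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(iterations):
        # Membership update: u_ij = 1 / sum_k (d_ij / d_ik)^(2 / (m - 1)).
        memberships = []
        for p in points:
            dists = [math.dist(p, c) for c in centers]
            if 0.0 in dists:
                # The point coincides with a center: full membership in that cluster.
                hit = dists.index(0.0)
                row = [1.0 if j == hit else 0.0 for j in range(len(centers))]
            else:
                row = [
                    1.0 / sum((d_j / d_k) ** exponent for d_k in dists)
                    for d_j in dists
                ]
            memberships.append(row)
        # Center update: mean of all points weighted by u_ij^m.
        for j in range(len(centers)):
            weights = [row[j] ** m for row in memberships]
            total = sum(weights)
            if total > 0:
                centers[j] = [
                    sum(w * p[d] for w, p in zip(weights, points)) / total
                    for d in range(len(points[0]))
                ]
    return centers, memberships


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
# ASSUMPTION: hand-picked centers standing in for centers derived from text content.
content_based_centers = [(0, 0), (10, 10)]
final_centers, final_memberships = fuzzy_c_means(points, content_based_centers)
```

Because the starting centers are passed in, a content-derived initialization and a random one can be compared by running the same function twice on the same data.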
3. Ontology-based similarity measure for text clustering (cited by: 1)
Authors: Yan Duanwu, Li Xiaopeng, Wang Lei, Cheng Xiao. Journal of Southeast University (English Edition), EI, CAS, 2006, No. 3, pp. 389-393 (5 pages)
A method that combines category-based and keyword-based concepts for a better information retrieval system is introduced. To improve document clustering, a document similarity measure is proposed that is based on the cosine vector measure and keyword frequency in documents and that also takes an ontology as input. The ontology is domain specific and includes a list of keywords organized by degree of importance to the categories of the ontology. By means of this semantic knowledge, the ontology can improve the document similarity measure and the feedback of information retrieval systems. Two approaches to evaluating the performance of this similarity measure, and a comparison with the standard cosine vector similarity measure, are also described.
Keywords: similarity measure; text clustering; ontology; information retrieval system
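One way to picture a similarity measure that mixes cosine similarity over keyword frequencies with an input ontology is sketched below. The abstract does not give the paper's formula, so everything here is an assumption: the `TOY_ONTOLOGY` contents, the importance values, the `category_profile` helper, and the linear blend controlled by `alpha` are hypothetical illustrations only.

```python
import math
from collections import Counter

# ASSUMPTION: a toy domain ontology, category -> {keyword: importance in [0, 1]}.
TOY_ONTOLOGY = {
    "finance": {"bank": 1.0, "loan": 0.8, "credit": 0.6},
    "sports": {"match": 1.0, "goal": 0.8, "team": 0.6},
}


def cosine(a, b):
    """Cosine similarity between two sparse {key: value} vectors."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def category_profile(tokens, ontology):
    """Score each category by the importance-weighted frequency of its keywords."""
    counts = Counter(tokens)
    return {
        category: sum(importance * counts[kw] for kw, importance in keywords.items())
        for category, keywords in ontology.items()
    }


def combined_similarity(tokens_a, tokens_b, ontology, alpha=0.5):
    """Blend term-frequency cosine with cosine over ontology category profiles."""
    term_sim = cosine(Counter(tokens_a), Counter(tokens_b))
    category_sim = cosine(
        category_profile(tokens_a, ontology),
        category_profile(tokens_b, ontology),
    )
    return alpha * term_sim + (1.0 - alpha) * category_sim


doc_a = ["bank", "approves", "loan"]
doc_b = ["credit", "union", "lends"]
doc_c = ["team", "scores", "goal"]
sim_ab = combined_similarity(doc_a, doc_b, TOY_ONTOLOGY)
sim_ac = combined_similarity(doc_a, doc_c, TOY_ONTOLOGY)
```

In this toy setup `doc_a` and `doc_b` share no words, so plain cosine similarity is zero, yet both score only in the "finance" category and the blended measure rates them as related; `doc_a` and `doc_c` remain unrelated under both components.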
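4. Attending to "Group Reading" in Secondary School Chinese (关注中学语文的“群阅读”) (cited by: 3)
Author: Ge Dejun. Basic Education Curriculum (《基础教育课程》), PKU Core, 2019, No. 10, pp. 38-42 (5 pages)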
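Chinese reading instruction should not be confined to reading single texts in isolation. It should broaden students' exposure and reading volume by choosing a particular perspective and grouping multiple texts, moving from one text to another and from the simple to the more profound, thereby cultivating and building up students' ability to learn Chinese.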
Keywords: secondary school Chinese; group reading; text grouping; reading perspective
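5. An Improved Method for Determining the Optimal Number of Clusters for the K-means Algorithm (一种改进的K-means算法最佳聚类数确定方法) (cited by: 12)
Authors: Bian Peng, Zhao Yan, Su Yuzhao. New Technology of Library and Information Service (《现代图书情报技术》), CSSCI, PKU Core, 2011, No. 9, pp. 34-40 (7 pages)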
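The BWP method is studied. Starting from the text clustering requirements of embedded NSTL personalized recommendation, the shortcomings of the BWP method are analyzed and an improved method for determining the optimal number of clusters for the K-means algorithm is proposed. The calculation of the within-cluster distance for single-sample clusters is optimized, which extends the range of cluster numbers to which the BWP method applies, so that a number of clusters that was previously only locally optimal becomes globally optimal. Experimental results verify that the method performs well.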
Keywords: K-means clustering; number of clusters; text clustering; recommender system
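For orientation, the sketch below scores a clustering with a BWP-style index, so that candidate cluster counts can be compared. The abstract does not restate the index, so the formula used here is an assumption based on how BWP (between-within proportion) is usually described: for each sample, b is the smallest mean squared distance to any other cluster, w is the mean squared distance to the rest of its own cluster, and the score is (b - w) / (b + w). The treatment of single-sample clusters (w = 0) is a placeholder and is not the optimization proposed in the paper.

```python
import math
from collections import defaultdict


def bwp_score(points, labels):
    """Average BWP-style score over all samples for one clustering (higher is better)."""
    clusters = defaultdict(list)
    for index, label in enumerate(labels):
        clusters[label].append(index)
    if len(clusters) < 2:
        raise ValueError("the index needs at least two clusters")

    def mean_sq_dist(i, members):
        return sum(math.dist(points[i], points[j]) ** 2 for j in members) / len(members)

    total = 0.0
    for i, label in enumerate(labels):
        # b: smallest mean squared distance to the samples of any other cluster.
        b = min(
            mean_sq_dist(i, members)
            for other, members in clusters.items()
            if other != label
        )
        peers = [j for j in clusters[label] if j != i]
        # w: mean squared distance to the other samples of the same cluster.
        # ASSUMPTION: a single-sample cluster gets w = 0 (placeholder, not the paper's fix).
        w = mean_sq_dist(i, peers) if peers else 0.0
        total += (b - w) / (b + w) if (b + w) > 0 else 0.0
    return total / len(points)


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels_k2 = [0, 0, 0, 1, 1, 1]  # the two natural groups
labels_k3 = [0, 0, 2, 1, 1, 1]  # one natural group split, leaving a singleton
score_k2 = bwp_score(points, labels_k2)
score_k3 = bwp_score(points, labels_k3)
```

In practice the labels would come from running K-means for each candidate k and keeping the k with the highest score. Note that the placeholder gives a singleton sample the maximum score of 1, which hints at why single-sample clusters need special handling.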