Funding: Project (60763001) supported by the National Natural Science Foundation of China; Project (2010GZS0072) supported by the Natural Science Foundation of Jiangxi Province, China; Project (GJJ12271) supported by the Science and Technology Foundation of the Provincial Education Department of Jiangxi Province, China.
Abstract: Category-based statistical language models are an important approach to the sparse-data problem, but they face two bottlenecks: 1) Word clustering: it is hard to find a clustering method that performs well at low computational cost. 2) Class-based methods tend to lose the predictive ability needed to adapt to text from different domains. To address these problems, a definition of word similarity based on mutual information is presented, and a definition of word-set similarity is built on it. Experiments show that the similarity-based word clustering algorithm outperforms the conventional greedy clustering method in both speed and performance, reducing perplexity from 283 to 218. In addition, an absolute weighted difference method is presented and used to construct a vari-gram language model with good predictive ability. Compared with the category-based model, the vari-gram model reduces perplexity from 234.65 to 219.14 on Chinese corpora and from 195.56 to 184.25 on English corpora.
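To make the quantities in this abstract concrete, the following is a minimal Python sketch of pointwise mutual information between adjacent words, of perplexity, and of the relative perplexity reductions implied by the reported figures. The abstract does not give the paper's own similarity definition or its absolute weighted difference method, so the PMI formula here is the standard textbook one, and the function names `pointwise_mutual_information`, `perplexity`, and `relative_reduction` are illustrative assumptions rather than the authors' code.

```python
import math
from collections import Counter


def pointwise_mutual_information(tokens):
    """Standard PMI (base 2) for each adjacent word pair in a token list.

    Assumption: this is the textbook definition, not the similarity
    measure defined in the paper, which the abstract does not spell out.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_unigrams = len(tokens)
    n_bigrams = len(tokens) - 1
    pmi = {}
    for (w1, w2), count in bigrams.items():
        p_joint = count / n_bigrams
        p_w1 = unigrams[w1] / n_unigrams
        p_w2 = unigrams[w2] / n_unigrams
        pmi[(w1, w2)] = math.log2(p_joint / (p_w1 * p_w2))
    return pmi


def perplexity(probabilities):
    """Perplexity of a sequence, given the model probability of each token."""
    log_sum = sum(math.log2(p) for p in probabilities)
    return 2 ** (-log_sum / len(probabilities))


def relative_reduction(before, after):
    """Fraction by which a perplexity value dropped."""
    return (before - after) / before


# Relative reductions implied by the figures reported in the abstract.
clustering_gain = relative_reduction(283, 218)          # similarity-based clustering
chinese_gain = relative_reduction(234.65, 219.14)       # vari-gram, Chinese corpora
english_gain = relative_reduction(195.56, 184.25)       # vari-gram, English corpora
```

Under these figures the clustering change corresponds to a relative reduction of roughly 23%, and the vari-gram model to roughly 6.6% (Chinese) and 5.8% (English) relative to the category-based model.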
Abstract: Objective: This study aims to identify the research status and hotspots of economic evaluation (EE) in nursing using co-word cluster analysis. Methods: The Medical Subject Heading (MeSH) term “cost–benefit analysis” was searched in PubMed, with results limited to nursing journals using the filter function. Author, country, year, journal, and keyword information was extracted from the collected papers and exported to the Bicomb 2.0 system, where high-frequency terms and other data could be mined further. SPSS 19.0 was used for cluster analysis to generate a dendrogram. Results: In all, 3,020 articles were found and 10,573 MeSH terms were detected; of these, 1,909 were MeSH major topics, which yielded 42 high-frequency terms. The dendrogram showed seven clusters, representing seven research hotspots: skin administration, infection prevention, education program, nurse education and management, EE research, neoplasm patient, and extension of nurse function. Conclusions: Nursing EE research covers multiple aspects of nursing and is an important indicator for decision-making. Although the number of papers is increasing, the quality of the studies is not promising. Further study may therefore be needed to assess nurses’ knowledge of economic analysis methods and their attitude toward applying them in nursing research. More nursing economics courses could be offered in nursing schools or hospitals.
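The co-word step described in the Methods can be sketched as follows: keep the terms that occur often enough, then count how often each pair of them appears on the same paper. This is a minimal standard-library Python illustration, not the Bicomb 2.0 or SPSS 19.0 procedure; the function names, the `min_count` threshold, and the sample records are hypothetical and do not come from the study's data.

```python
from collections import Counter
from itertools import combinations


def high_frequency_terms(records, min_count):
    """Terms that appear in at least `min_count` records, in sorted order."""
    counts = Counter(term for terms in records for term in set(terms))
    return sorted(term for term, count in counts.items() if count >= min_count)


def coword_matrix(records, terms):
    """Symmetric co-occurrence matrix over `terms`.

    The diagonal holds each term's record frequency; off-diagonal cells
    hold the number of records in which both terms appear.
    """
    index = {term: i for i, term in enumerate(terms)}
    matrix = [[0] * len(terms) for _ in terms]
    for record in records:
        present = sorted(term for term in set(record) if term in index)
        for term in present:
            matrix[index[term]][index[term]] += 1
        for a, b in combinations(present, 2):
            matrix[index[a]][index[b]] += 1
            matrix[index[b]][index[a]] += 1
    return matrix


# Hypothetical records: each inner list stands for one paper's MeSH major topics.
sample_records = [
    ["Cost-Benefit Analysis", "Infection Control", "Education, Nursing"],
    ["Cost-Benefit Analysis", "Infection Control"],
    ["Cost-Benefit Analysis", "Education, Nursing"],
    ["Neoplasms"],
]
sample_terms = high_frequency_terms(sample_records, min_count=2)
sample_matrix = coword_matrix(sample_records, sample_terms)
```

A matrix of this kind is the usual input to hierarchical clustering, which is the step that produces a dendrogram such as the one the study generated in SPSS.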
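Abstract: This article uses WordSmith 8.0 to analyze the keywords and distinctive word clusters in Alice Walker's novel The Color Purple. It reveals the overall lexical distribution of the novel and shows that the vocabulary and sentence patterns used in the text match the protagonist's characterization as an African American woman. Searches with WordSmith 8.0 show that the keywords and cluster collocations in Walker's novel play an important role in advancing the plot and portraying the characters. The results indicate that corpus stylistics helps scholars uncover deeper textual meanings overlooked in earlier studies. The study re-validates the findings of earlier qualitative literary research on The Color Purple, productively combines qualitative and quantitative research, and responds to the academic call for "rereading the classics".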