Abstract
First, term elements are extracted from patent documents using the ICTCLAS word segmentation system and a stop-word list. An improved TF-IDF model is then used to compute the weight of each term element and to select the hot term elements. Next, the hot term elements are combined in order according to the measured distance between words, and a candidate term set is obtained through weight computation and threshold filtering. Experts then manually judge the candidates to identify the valid new technology terms. Finally, an application example is presented and analyzed to verify the effectiveness of the method.
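The pipeline described in the abstract can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from the paper: the documents are already segmented into tokens (the paper uses ICTCLAS for this step), a standard smoothed TF-IDF stands in for the paper's improved TF-IDF model, whose formula is not given here, and a simple co-occurrence count stands in for the term weight used in threshold filtering. All function names and parameters (`remove_stop_words`, `tfidf_weights`, `hot_elements`, `candidate_terms`, `top_k`, `max_distance`, `min_count`) are hypothetical.

```python
import math
from collections import Counter


def remove_stop_words(tokens, stop_words):
    """Keep only the tokens that are not in the stop-word list."""
    return [tok for tok in tokens if tok not in stop_words]


def tfidf_weights(docs):
    """Sum a smoothed TF-IDF weight per token over all documents.

    Assumption: plain TF-IDF, not the paper's improved model.
    docs is a list of token lists.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))
    weights = Counter()
    for doc in docs:
        for tok, count in Counter(doc).items():
            idf = math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1
            weights[tok] += (count / len(doc)) * idf
    return dict(weights)


def hot_elements(weights, top_k):
    """Select the top_k tokens by weight (ties broken alphabetically)."""
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return {tok for tok, _ in ranked[:top_k]}


def candidate_terms(docs, hot, max_distance=1, min_count=2):
    """Combine hot tokens, in document order, when they lie within
    max_distance positions of each other; keep pairs that occur at
    least min_count times (a stand-in for the weight threshold)."""
    counts = Counter()
    for doc in docs:
        for i, left in enumerate(doc):
            if left not in hot:
                continue
            for j in range(i + 1, min(i + 1 + max_distance, len(doc))):
                if doc[j] in hot:
                    counts[(left, doc[j])] += 1
    return {pair: c for pair, c in counts.items() if c >= min_count}
```

In this sketch the surviving pairs would form the candidate term set handed to the experts; the final manual judgment step is deliberately left outside the code.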
Source
《情报科学》
CSSCI
PKU Core Journal
2013, No. 2, pp. 144-149 (6 pages)
Information Science
Keywords
technology life cycle
term detection
hot term elements