This article proposes a new general, highly efficient algorithm for extracting domain terminologies. This domain-independent algorithm with multi-layers of filters is a hybrid of statistic-oriented and rule-oriented m...This article proposes a new general, highly efficient algorithm for extracting domain terminologies. This domain-independent algorithm with multi-layers of filters is a hybrid of statistic-oriented and rule-oriented methods. Utilizing the features of domain terminologies and the characteristics that are unique to Chinese, this algorithm extracts domain terminologies by generating multi-word unit (MWU) candidates at first and then fihering the candidates through multi-strategies. Our test resuhs show that this algorithm is feasible and effective.展开更多
基金Supported by the National Natural Science Foundation of China(Grant No. 60496326)
文摘This article proposes a new general, highly efficient algorithm for extracting domain terminologies. This domain-independent algorithm with multi-layers of filters is a hybrid of statistic-oriented and rule-oriented methods. Utilizing the features of domain terminologies and the characteristics that are unique to Chinese, this algorithm extracts domain terminologies by generating multi-word unit (MWU) candidates at first and then fihering the candidates through multi-strategies. Our test resuhs show that this algorithm is feasible and effective.