Funding: Project (60763001) supported by the National Natural Science Foundation of China; Project (2010GZS0072) supported by the Natural Science Foundation of Jiangxi Province, China; Project (GJJ12271) supported by the Science and Technology Foundation of the Provincial Education Department of Jiangxi Province, China
Abstract: Category-based statistical language models are an important way to address the problem of sparse data, but they face two bottlenecks: 1) Word clustering: it is hard to find a clustering method that performs well at low computational cost. 2) Class-based methods tend to lose predictive ability when adapting to text from different domains. To address these problems, a definition of word similarity based on mutual information is presented, and a definition of word-set similarity is built on it. Experiments show that the similarity-based word clustering algorithm outperforms the conventional greedy clustering method in both speed and performance, reducing perplexity from 283 to 218. In addition, an absolute weighted difference method is presented and used to construct a vari-gram language model with good predictive ability. Compared with the category-based model, the vari-gram model reduces perplexity from 234.65 to 219.14 on Chinese corpora and from 195.56 to 184.25 on English corpora.
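To put the reported perplexity figures on a common scale, the short sketch below computes the relative reduction for each comparison. It uses only the numbers quoted in the abstract; the function name `relative_reduction` and the dictionary labels are illustrative and do not come from the paper.

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the fractional drop from `before` to `after` (0.10 means 10%)."""
    if before <= 0:
        raise ValueError("baseline perplexity must be positive")
    return (before - after) / before


# Perplexity pairs (baseline, proposed) as quoted in the abstract.
# The labels are hypothetical shorthand, not the paper's terminology.
reported = {
    "similarity clustering vs. greedy clustering": (283.0, 218.0),
    "vari-gram vs. category-based (Chinese)": (234.65, 219.14),
    "vari-gram vs. category-based (English)": (195.56, 184.25),
}

for label, (before, after) in reported.items():
    pct = 100.0 * relative_reduction(before, after)
    print(f"{label}: {before} -> {after} ({pct:.2f}% lower)")
```

On these figures, the clustering change accounts for a drop of roughly 23%, while the vari-gram model gives a further reduction of about 6 to 7% on Chinese and just under 6% on English relative to the category-based model.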
Abstract: Twenty isolates of Fusarium oxysporum f. sp. ciceris were obtained from wilted chickpea plants collected in different districts of northern Iraq to assess variability in the pathogenicity of the populations. Each isolate was tested on 12 differential chickpea varieties. The isolates showed highly significant variation in wilt severity on the differential varieties. Based on the reaction types induced on the differential varieties, the isolates fell into four groups: the first group included isolates FocS1, FocQ7, FocQ10, FocF13, FocH17 and FocH18; the second group included isolates FocS2, FocS3, FocS4, FocQ5, FocQ8, FocQ9, FocF11, FocF12, FocF14 and FocH19; the third group included isolates FocF15, FocH16 and FocH20; and isolate FocQ6 was placed alone in the fourth group. Genetic similarity ranged from 42% to 100%: it was 42% between the first group and the other groups, and 72% among the remaining three groups. This indicates the presence of four races of the fungus, namely 0, 4, 5 and 1B/C, and represents the first record of these races in Iraq.