Abstract: As an important international language, English has received increasing attention from Chinese educational institutions at all levels for more than three decades. Against the current background of social informatization and economic globalization, English matters to people throughout the country. Cultivating ethnic minority students with strong English proficiency is therefore an urgent task for higher education. Research on college English teaching should be pursued in depth, and new methods and ideas should be summarized and explored in day-to-day teaching. At present, however, college English teaching for ethnic minority classes faces many obstacles in language instruction, textbooks, examinations, and other areas. By analyzing the current state of college English teaching for ethnic minority classes, this paper explores specific reforms and the substance of class construction for ethnic minority college students. The aim is to raise the quality of college English teaching for ethnic minority classes.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 52375256) and the Natural Science Foundation of Shanghai Municipality (Grant Nos. 21ZR1431500 and 23ZR1431600).
Abstract: Class imbalance is a common characteristic of industrial data that adversely affects industrial data mining because it leads to biased training of machine learning models. To address this issue, augmenting samples in minority classes with generative adversarial networks (GANs) has been demonstrated to be an effective approach. This study proposes a novel GAN-based minority class augmentation approach named classifier-aided minority augmentation generative adversarial network (CMAGAN). In the CMAGAN framework, an outlier elimination strategy is first applied to each class to minimize the negative impact of outliers. Subsequently, a newly designed boundary-strengthening learning GAN (BSLGAN) is employed to generate additional samples for minority classes. By incorporating a supplementary classifier and innovative training mechanisms, the BSLGAN focuses on learning the distribution of samples near classification boundaries. Consequently, it can fully capture the characteristics of the target class and generate highly realistic samples with clear boundaries. Finally, the new samples are filtered based on the Mahalanobis distance to ensure that they lie within the desired distribution. To evaluate the effectiveness of the proposed approach, CMAGAN was used to solve the class imbalance problem in eight real-world fault-prediction applications. The performance of CMAGAN was compared with that of seven other algorithms, including state-of-the-art GAN-based methods, and the results indicated that CMAGAN could provide higher-quality augmented results.
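The final filtering step described above can be illustrated with a minimal sketch. It assumes that a generated sample is kept when its Mahalanobis distance to the real minority-class samples does not exceed a threshold; the function name `mahalanobis_filter`, the threshold value, and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np

def mahalanobis_filter(real, generated, threshold):
    """Keep generated rows whose Mahalanobis distance to `real` is <= threshold.

    Hypothetical helper for illustration; the abstract does not specify
    the filtering rule or the threshold used by CMAGAN.
    """
    mean = real.mean(axis=0)
    cov = np.cov(real, rowvar=False)
    cov_inv = np.linalg.pinv(cov)  # pseudo-inverse tolerates a singular covariance
    diff = generated - mean
    # Squared distance per row: diff_i^T * cov_inv * diff_i
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    dist = np.sqrt(np.maximum(d2, 0.0))
    return generated[dist <= threshold], dist

# Toy data: 200 "real" minority samples and two candidate generated samples.
rng = np.random.default_rng(0)
real = rng.normal(size=(200, 2))
generated = np.array([[0.1, -0.2], [8.0, 8.0]])
kept, dist = mahalanobis_filter(real, generated, threshold=3.0)
```

In this toy run the first candidate lies close to the real samples and is kept, while the second lies far outside them and is discarded.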
Funding: Supported by the National Basic Research Program of China (2007CB307100, 2007CB307106).
Abstract: Cost-sensitive learning has been applied to the multi-class imbalance problem in Internet traffic classification with considerable success. However, classification performance on minority classes that carry few bytes remains unsatisfactory, because existing research focuses only on classes with large byte volumes. This paper therefore studies class-dependent misclassification costs. First, the flow-rate-based cost matrix (FCM) is investigated. Second, a new cost matrix, the weighted cost matrix (WCM), is proposed; it calculates a weight for each cost in FCM from the degree of data imbalance and the classification accuracy of each class. WCM further improves classification performance on the difficult minority class (the class with more flows but worse classification accuracy). Experimental results on twelve real traffic datasets show that FCM and WCM achieve more than 92% flow g-mean and 80% byte g-mean on average; on a test set collected one year later, WCM is more stable than FCM.
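To make the terms above concrete, here is a minimal sketch of the two generic ingredients: choosing the prediction with the lowest expected cost under a cost matrix, and computing a g-mean as the geometric mean of per-class recall. The cost values are invented, and treating "byte g-mean" as recall weighted by bytes per flow is an assumption; neither the FCM nor the WCM construction is reproduced here.

```python
import numpy as np

def min_cost_predict(proba, cost):
    """Pick the class with the lowest expected misclassification cost.

    proba: (n, k) class probabilities.
    cost[i, j]: cost of predicting class j when the true class is i.
    """
    expected = proba @ cost  # (n, k): expected cost of each possible prediction
    return expected.argmin(axis=1)

def g_mean(y_true, y_pred, weights=None):
    """Geometric mean of per-class recall.

    `weights` is an assumed per-sample weight (e.g. bytes per flow);
    with weights=None every flow counts equally.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    w = np.ones(len(y_true)) if weights is None else np.asarray(weights, dtype=float)
    recalls = []
    for c in np.unique(y_true):
        mask = y_true == c
        recalls.append(w[mask & (y_pred == c)].sum() / w[mask].sum())
    return float(np.prod(recalls) ** (1.0 / len(recalls)))

# Hypothetical costs: missing minority class 1 is five times worse than a false alarm.
cost = np.array([[0.0, 1.0],
                 [5.0, 0.0]])
proba = np.array([[0.7, 0.3],
                  [0.9, 0.1]])
pred = min_cost_predict(proba, cost)

flow_g = g_mean([0, 0, 1, 1], [0, 0, 1, 0])
byte_g = g_mean([0, 0, 1, 1], [0, 0, 1, 0], weights=[1, 1, 9, 1])
```

With these invented costs, the first sample is assigned to class 1 even though class 0 is more probable, because a missed minority-class sample is penalized more heavily.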
Abstract: Rare categories are becoming increasingly abundant, yet their characterization has received little attention thus far. Fraudulent banking transactions, network intrusions, and rare diseases are examples of rare classes whose detection and characterization are of high value. However, accurate characterization is challenging because of high skewness and non-separability from the majority classes; for example, fraudulent transactions masquerade as legitimate ones. This paper proposes the RACH algorithm, which exploits the compactness property of rare categories. The algorithm is semi-supervised in nature, since it uses both labeled and unlabeled data. It is based on an optimization framework that encloses the rare examples in a minimum-radius hyperball. The framework is then converted into a convex optimization problem, which is in turn solved effectively in its dual form by the projected subgradient method. RACH can be naturally kernelized. Experimental results validate the effectiveness of RACH.
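The idea of enclosing examples in a minimum-radius hyperball can be sketched in isolation. The code below is not RACH: it ignores unlabeled data and kernelization, and instead of the dual projected subgradient method it uses the simple Bădoiu-Clarkson iteration, which repeatedly moves the center toward the farthest point. The function name and the toy points are hypothetical.

```python
import numpy as np

def approx_min_enclosing_ball(points, n_iter=2000):
    """Approximate the smallest ball enclosing `points`.

    Badoiu-Clarkson iteration: at step i, move the center toward the
    current farthest point by a fraction 1 / (i + 1).
    """
    pts = np.asarray(points, dtype=float)
    center = pts[0].copy()
    for i in range(1, n_iter + 1):
        dists = np.linalg.norm(pts - center, axis=1)
        farthest = pts[dists.argmax()]
        center += (farthest - center) / (i + 1)
    radius = np.linalg.norm(pts - center, axis=1).max()
    return center, float(radius)

# Toy "rare" examples; the exact minimum enclosing ball has center (0, 0), radius 1.
rare = np.array([[-1.0, 0.0], [1.0, 0.0], [0.0, 0.5]])
center, radius = approx_min_enclosing_ball(rare)
```

The returned radius is always at least the optimal radius, and it approaches the optimum as `n_iter` grows.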