Abstract: In recent years, as the enrollment rate of Chinese colleges has risen year by year, identifying needy undergraduates has become increasingly important. However, the traditional way of identifying college students with financial difficulties relies mainly on manual review and collective voting, which is prone to subjectivity and randomness. To alleviate this problem, this paper establishes an automatic identification model for needy undergraduates based on 1842 questionnaires collected from undergraduates at WHUT. First, the questionnaires are preliminarily filtered using the local outlier factor algorithm. Second, mutual information, the Spearman rank correlation coefficient, and the distance correlation coefficient are combined by the rank-sum ratio to select features, eliminating noise from irrelevant features. Third, a field-aware factorization machine model is trained and compared with other models, such as Logistic Regression and SVM. Finally, the paper finds that the field-aware factorization machine performs much better than the other models in identifying needy undergraduates, and that prominent features affecting the identification include annual family income and living expenses provided by parents.
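The feature-selection step, which merges several relevance measures through a rank-sum ratio, can be illustrated with a minimal sketch. It is not the paper's implementation: the feature names and score values below are hypothetical, and the sketch assumes that each measure has already been computed per feature and that a higher score means a more relevant feature. Each measure ranks the n features from 1 (least relevant) to n, and a feature's rank-sum ratio is the sum of its ranks across the m measures divided by m * n.

```python
from typing import Dict, List


def average_ranks(values: List[float]) -> List[float]:
    """Rank values ascending from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def rank_sum_ratio(scores: Dict[str, List[float]]) -> List[float]:
    """RSR_i = (sum of feature i's ranks over the m criteria) / (m * n)."""
    m = len(scores)                       # number of relevance criteria
    n = len(next(iter(scores.values())))  # number of features
    per_criterion = [average_ranks(v) for v in scores.values()]
    return [sum(r[i] for r in per_criterion) / (m * n) for i in range(n)]


# Hypothetical feature names and relevance scores, for illustration only.
features = ["annual_family_income", "living_expenses_from_parents", "dorm_floor"]
scores = {
    "mutual_information": [0.30, 0.20, 0.01],
    "abs_spearman": [0.50, 0.40, 0.05],
    "distance_correlation": [0.45, 0.50, 0.10],
}

rsr = rank_sum_ratio(scores)
# Sort features from most to least relevant according to the combined ratio.
ranking = sorted(zip(features, rsr), key=lambda pair: pair[1], reverse=True)
```

Features with a low ratio would be candidates for removal before model training. The cut-off is a design choice that this sketch leaves open.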