Abstract: High-dimensional data pose various problems for the application of data envelopment analysis (DEA). In this study, we focus on a "big data" problem related to the considerably large dimensions of the input-output data. The four most widely used approaches to guiding dimension reduction in DEA are compared via Monte Carlo simulation: principal component analysis (PCA-DEA), which is based on the idea of aggregating inputs and outputs, and efficiency contribution measurement (ECM), average efficiency measure (AEC), and regression-based detection (RB), which are based on the idea of variable selection. We compare the performance of these methods under different scenarios, using a new comparison benchmark for the simulation test. In addition, we discuss the effect of initial variable selection in RB for the first time. Based on the results, we offer more reliable guidelines on how to choose an appropriate method.
Funding: Supported by the National Natural Science Foundation of China (No. 11502211).
Abstract: In aerodynamic optimization, global optimization methods such as genetic algorithms are preferred in many cases because of their advantage in reaching the global optimum. However, for complex problems that require a large number of design variables, the computational cost becomes prohibitive, and new global optimization strategies are required. To address this need, a data dimensionality reduction method is combined with global optimization methods to form a new global optimization system that aims to improve the efficiency of conventional global optimization. The new system applies Proper Orthogonal Decomposition (POD) to reduce the dimensionality of the design space while maintaining the generality of the original design space. In addition, an acceleration approach for computing samples in surrogate modeling is applied to reduce the computational time while providing sufficient accuracy. Optimizations of the transonic airfoil RAE2822 and the transonic wing ONERA M6 are performed to demonstrate the effectiveness of the proposed system. In the two cases, the number of design variables is reduced from 20 to 10 and from 42 to 20, respectively. The new design optimization system converges faster, taking 1/3 of the total time of traditional optimization to converge to a better design, which significantly reduces the overall optimization time and improves the efficiency of the conventional global design optimization method.
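As a rough illustration of how POD can shrink a design space, the sketch below extracts modes from a set of sampled design vectors with a singular value decomposition and maps between the full and reduced coordinates. It is not the authors' implementation: the function names `pod_basis`, `to_reduced`, and `to_full` are hypothetical, and the sampling strategy and the acceleration approach mentioned in the abstract are not modeled.

```python
import numpy as np


def pod_basis(samples, k):
    """Return the sample mean and the first k POD modes.

    samples: array of shape (n_samples, n_vars), one design vector per row.
    The modes are returned as rows of a (k, n_vars) array.
    """
    mean = samples.mean(axis=0)
    # Rows of vt are orthonormal modes ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(samples - mean, full_matrices=False)
    return mean, vt[:k]


def to_reduced(x, mean, modes):
    """Project a full design vector onto the reduced POD coordinates."""
    return (x - mean) @ modes.T


def to_full(z, mean, modes):
    """Rebuild a full design vector from reduced POD coordinates."""
    return mean + z @ modes
```

An optimizer would then search over the `k` reduced coordinates and call `to_full` before each evaluation. The 20-to-10 reduction quoted in the abstract corresponds to `k=10` for 20 original variables.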
Funding: Supported by the National Natural Science Foundation of China (No. 92371201).
Abstract: As the dimensionality of design variables and the complexity of constraints increase, the efficacy of Surrogate-Based Optimization (SBO) becomes limited. Traditional linear and nonlinear dimensionality reduction algorithms mainly decompose, in various forms, the matrix composed of design variables or objective functions. The smoothness of the design space cannot be guaranteed in this process, and additional constraint functions need to be added to the optimization, which increases the computational cost. This study presents a new parameterization method that addresses both of these problems of SBO. The new parameterization decouples affine transformations (dilation, rotation, shearing, and translation) within the Grassmannian submanifold, which enables a separate representation of the physical information of the airfoil in a high-dimensional space. Building upon this, Principal Geodesic Analysis (PGA) is employed to achieve geometric control, compress the design space, reduce the number of design variables, and enhance predictive performance during the surrogate optimization process. For comparison, a dimensionality reduction space is defined using 95% of the energy, and the RAE 2822 airfoil under transonic conditions is used as a demonstration. The method significantly enhances the optimization efficiency of the surrogate model while effectively enabling geometric constraints. In three-dimensional problems, it enables simultaneous design of the planar shapes of various aircraft components and of high-order perturbation deformations. Optimization was applied to the ONERA M6 wing, achieving a lift-drag ratio of 18.09, a 27.25% improvement over the baseline configuration. Conventional surrogate model optimization methods achieved only a 17.97% improvement, which demonstrates the superiority of this approach.
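The "95% of the energy" criterion for the comparison space can be sketched as follows. This assumes the common convention that the energy of a mode is its squared singular value. The abstract does not state the definition, so this is an assumption, and the helper name `modes_for_energy` is hypothetical.

```python
def modes_for_energy(singular_values, threshold=0.95):
    """Return the smallest number of leading modes whose cumulative
    energy reaches the given fraction of the total energy.

    singular_values: non-negative values sorted in decreasing order.
    Energy per mode is taken to be the squared singular value (assumption).
    """
    energies = [s * s for s in singular_values]
    total = sum(energies)
    if total == 0:
        raise ValueError("all singular values are zero")
    cumulative = 0.0
    for count, energy in enumerate(energies, start=1):
        cumulative += energy
        if cumulative >= threshold * total:
            return count
    return len(energies)
```

For singular values `[3.0, 1.0, 0.5]`, the first mode carries 9 of 10.25 units of energy (about 88%), so two modes are needed to reach 95%.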
Funding: Supported by the Postgraduate Education Reform Project of Yangzhou University (JGLX2021_002).
Abstract: To improve the accuracy of used car price prediction, this paper proposes a machine learning prediction model based on the retention rate. First, a random forest algorithm is used to filter the variables in the data, selecting seven main characteristic variables that affect used car prices, such as new car price, service time, and mileage. Then, a linear regression classification method is introduced to classify the test data into high and low retention rate data. After that, an extreme gradient boosting (XGBoost) regression model is built for each of the two datasets. The prediction results show that the comprehensive evaluation index of the proposed model is 0.548, a significant improvement over the 0.488 of the original XGBoost model. Finally, compared with other representative machine learning algorithms, this model shows certain advantages in terms of mean absolute percentage error (MAPE), 5% accuracy rate, and comprehensive evaluation index. As a result, the retention rate-based machine learning model established in this paper has significant advantages in the accuracy of used car price prediction.
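Two of the metrics named above can be written out in a few lines. MAPE follows its standard definition. The "5% accuracy rate" is assumed here to mean the share of predictions whose relative error is at most 5%, which the abstract does not spell out. The comprehensive evaluation index is not defined in the abstract and is left out. The function names are hypothetical.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Actual values must be nonzero."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


def within_tolerance_rate(actual, predicted, tol=0.05):
    """Fraction of predictions within tol relative error of the actual value.

    With tol=0.05 this is one plausible reading of a "5% accuracy rate"
    (assumption, not confirmed by the source).
    """
    hits = sum(1 for a, p in zip(actual, predicted) if abs(a - p) <= tol * abs(a))
    return hits / len(actual)
```

For actual prices `[100, 200, 400]` and predictions `[104, 180, 400]`, the relative errors are 4%, 10%, and 0%, so two of the three predictions fall within the 5% tolerance.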