A novel Ni-W binary hydrocracking catalyst was designed and prepared by impregnation on mixed supports of modified Y zeolite and amorphous aluminosilicate. The structure and properties of the catalyst were extensively characterized by XRD, NH3-TPD, IR, and XRF techniques. The performance of the catalyst was evaluated in a 100-ml laboratory hydrogenation test unit with two single-stage fixed-bed reactors connected in series. The characterization results showed that the catalyst has a well-developed and concentrated mesopore distribution, a suitable distribution of acid sites and acid strength, and uniformly and highly dispersed metal sites. At a high conversion rate of 73.8% with the >350 °C feedstock, a C5+ yield of 98.1 m% and a middle-distillate selectivity of 83.5% were obtained. The yield of middle distillates boiling between 140 °C and 370 °C was 68.70 m%, and their quality could meet the WWFC Category III specification, meaning that this catalyst could be used to produce more high-quality clean middle distillates from heavy-oil hydrocracking. The potential aromatic content of the 65-140 °C heavy naphtha was 37.5 m%, and the BMCI value of the >370 °C tail oil was 6.6; the heavy naphtha and tail oil are premium feedstocks for catalytic reforming and steam cracker units.
The rapid growth of digital data necessitates advanced natural language processing (NLP) models like BERT (Bidirectional Encoder Representations from Transformers), known for its superior performance in text classification. However, BERT's size and computational demands limit its practicality, especially in resource-constrained settings. This research compresses the BERT base model for Bengali emotion classification through knowledge distillation (KD), pruning, and quantization techniques. Despite Bengali being the sixth most spoken language globally, NLP research in this area is limited. Our approach addresses this gap by creating an efficient BERT-based model for Bengali text. We explored 20 combinations of KD, quantization, and pruning, resulting in improved speedup, fewer parameters, and reduced memory size. Our best results demonstrate significant improvements in both speed and efficiency. For instance, in the case of mBERT, we achieved a 3.87× speedup and a 4× compression ratio with a Distil+Prune+Quant combination that reduced the parameter count from 178 M to 46 M, while the memory size decreased from 711 MB to 178 MB. These results offer scalable solutions for NLP tasks in various languages and advance the field of model compression, making these models suitable for real-world applications in resource-limited environments.
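The distillation step of the compression pipeline described above can be illustrated with a minimal, framework-free sketch of the standard temperature-scaled soft-label loss. The function names, logits, and temperature value are illustrative assumptions, not taken from the paper:

```python
import math

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; higher T gives softer distributions."""
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on temperature-softened outputs, scaled by T^2
    as in standard knowledge distillation."""
    p = softmax(teacher_logits, T)   # soft teacher targets
    q = softmax(student_logits, T)   # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return T * T * kl

# identical logits -> zero distillation loss
print(kd_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0]))  # → 0.0
```

In practice this term is blended with the ordinary cross-entropy on hard labels, and pruning and quantization are then applied to the distilled student.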
The miscibility of flue gas with different types of light oil is investigated through slender-tube miscible displacement experiments at high temperature and high pressure. Under such conditions, miscible displacement of flue gas and light oil is possible. At the same temperature, there is a linear relationship between oil displacement efficiency and pressure. At the same pressure, the oil displacement efficiency increases gently and then rapidly to more than 90% with increasing temperature, achieving miscible displacement. The rapid increase in oil displacement efficiency is closely related to the phase transition of the light components of the oil due to distillation as the temperature rises. Moreover, at the same pressure, the lighter the oil, the lower the minimum miscibility temperature between flue gas and oil, which allows easier miscibility and ultimately better performance of thermal miscible flooding by air injection. The miscibility between flue gas and light oil at high temperature and high pressure is more typically characterized by a phase transition at high temperature in the supercritical state, and it differs from the contact-extraction miscibility of CO₂ under conventional high-pressure conditions.
In this work, the ternary azeotrope of tert-butyl alcohol/ethyl acetate/water is separated by extractive distillation (ED) to recover the available constituents and protect the environment. Based on the conductor-like screening model (COSMO) and the relative volatility method, ethylene glycol was selected as the extractant for the separation process. In addition, given that the relative volatility between components changes with pressure, a multi-objective optimization method based on the non-dominated sorting genetic algorithm II (NSGA-II) optimizes the pressure and the amount of solvent cooperatively to avoid falling into a local optimum. Based on the optimal process parameters, the proposed heat-integrated process can reduce gas emissions by 29.30%. The heat-integrated ED, further coupled with a pervaporation process, can reduce gas emissions by 42.36% and has the highest exergy efficiency of 47.56%. In addition, building on the heat-integrated process, the two proposed heat pump assisted heat-integrated ED processes show good economic and environmental performance. The double heat pump assisted heat-integrated ED can reduce the total annual cost by 28.78% and gas emissions by 55.83% compared with the base process, and thus has good application prospects. This work provides a feasible approach for the separation of ternary azeotropes.
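The core of the NSGA-II procedure mentioned above is non-dominated sorting of candidate designs. Below is a minimal sketch of Pareto dominance and first-front extraction; the objective pairs are invented placeholders (e.g. reboiler duty versus solvent flow), not values from the paper:

```python
def dominates(a, b):
    """a dominates b (minimization) if a is no worse in every objective
    and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Return the non-dominated subset: the first front of NSGA-II's sorting."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]

# hypothetical (energy duty, solvent flow) pairs for candidate ED designs
designs = [(3.2, 120), (2.8, 150), (3.5, 110), (4.0, 200)]
print(pareto_front(designs))  # (4.0, 200) is dominated and dropped
```

NSGA-II then ranks successive fronts and uses crowding distance to keep a diverse set of pressure/solvent trade-offs rather than a single scalarized optimum.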
Neural networks are often viewed as pure 'black box' models, lacking the interpretability and extrapolation capabilities of pure mechanistic models. This work proposes a new approach that, with the help of neural networks, improves the conformity of a first-principles model to the actual plant. The final result is still a first-principles model rather than a hybrid model, which maintains the advantage of the high interpretability of first-principles models. This work better simulates industrial batch distillation that separates four components: water, ethylene glycol, diethylene glycol, and triethylene glycol. GRU (gated recurrent unit) and LSTM (long short-term memory) networks were used to obtain empirical parameters of the mechanistic model that are difficult to measure directly. These were used to improve the empirical sub-models within the mechanistic model, thus correcting unreasonable model assumptions and achieving better predictability for batch distillation. The proposed method was verified using a case study from an industrial plant, and the results show its advantage in improving model predictions and its potential to extend to other similar systems.
Optimizing multistage processes, such as distillation or absorption, is a complex mixed-integer nonlinear programming (MINLP) problem. Relaxing the integer variables into continuous ones and solving the easier nonlinear programming (NLP) problem is one optimization idea for multistage processes. In this article, we propose a relaxation method based on an efficiency parameter. When the efficiency parameter is 1 or 0, the proposed model is equivalent to the complete existence or inexistence of the equilibrium stage, and a non-integer efficiency represents partial existence. A multi-component absorption case shows a natural penalty on non-integer efficiencies, which can help the efficiency parameter converge to 0 or 1. However, this penalty is weaker than that of existing relaxation models, such as the bypass efficiency model. In a simple distillation case, we show that this property can weaken the nonconvexity of the optimization problem and increase the probability of obtaining better optimization results.
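The efficiency-parameter relaxation can be pictured as a convex combination of the "stage bypassed" and "full equilibrium stage" limits. This is an illustrative sketch with invented compositions; the actual model embeds the parameter in the full stage (MESH) equations rather than a single mixing rule:

```python
def relaxed_stage(y_in, y_eq, eta):
    """Vapor composition leaving a relaxed stage: eta = 1 recovers the full
    equilibrium stage, eta = 0 bypasses the stage entirely, and intermediate
    eta represents partial existence of the stage."""
    return [eta * ye + (1.0 - eta) * yi for yi, ye in zip(y_in, y_eq)]

y_in = [0.60, 0.40]   # entering vapor mole fractions (illustrative)
y_eq = [0.80, 0.20]   # equilibrium with the stage liquid (illustrative)

print(relaxed_stage(y_in, y_eq, 1.0))  # equilibrium limit → [0.8, 0.2]
print(relaxed_stage(y_in, y_eq, 0.0))  # bypass limit → [0.6, 0.4]
print(relaxed_stage(y_in, y_eq, 0.5))  # partial stage, between the two limits
```

Because the NLP solution is physically meaningful only at the two limits, the natural penalty described in the abstract is what pushes eta toward 0 or 1 during optimization.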
Here we demonstrate the proof of concept for microchannel reactive distillation for alcohol-to-jet application: combining ethanol/water separation and ethanol dehydration in one unit operation. Ethanol is first distilled into the vapor phase, converted to ethylene and water, and the water co-product is then condensed to shift the reaction equilibrium. Process intensification is achieved through rapid mass transfer (ethanol stripping from thin wicks using novel microchannel architectures), leading to lower residence time and improved separation efficiency. Energy savings are realized through the integration of unit operations; for example, the heat of condensing water can offset that of vaporizing ethanol. Furthermore, the dehydration reaction equilibrium shifts towards completion by immediate removal of the water byproduct upon formation while the aqueous feedstock is maintained in the condensed phase. For an aqueous ethanol feedstock (40 wt%), 71% ethanol conversion with 91% selectivity to ethylene was demonstrated at 220 °C, 600 psig, and a weight hourly space velocity of 0.28 h⁻¹. Under these conditions, 2.7 stages of separation were also demonstrated using a device length of 8.3 cm. This provides a height equivalent to a theoretical plate (HETP), a measure of separation efficiency, of ~3.3 cm. By comparison, conventional distillation packing provides an HETP of ~30 cm. Thus, a 9.1× reduction in HETP was demonstrated over conventional technology, providing a means for significant energy savings and an example of process intensification. Finally, preliminary process economic analysis indicates that by using microchannel reactive distillation technology, the operating and capital costs for the ethanol separation and dehydration portion of an envisioned alcohol-to-jet process could be reduced by at least 35% and 55%, respectively, relative to the incumbent technology, provided future improvements to microchannel reactive distillation design and operability are made.
Adversarial distillation (AD) has emerged as a potential solution to the challenging optimization problem of loss with hard labels in adversarial training. However, fixed sample-agnostic and student-egocentric attack strategies are unsuitable for distillation, and the reliability of guidance from static teachers diminishes as target models become more robust. This paper proposes an AD method called Learnable Distillation Attack Strategies and Evolvable Teachers Adversarial Distillation (LDAS&ET-AD). First, a learnable mechanism for generating distillation attack strategies is developed to automatically produce sample-dependent attack strategies tailored for distillation. A strategy model is introduced to produce attack strategies that enable adversarial examples (AEs) to be created in areas where the target model significantly diverges from the teachers, by competing with the target model in minimizing or maximizing the AD loss. Second, a teacher evolution strategy is introduced to enhance the reliability and effectiveness of the knowledge used to improve the generalization performance of the target model. By calculating the experimentally updated target model's validation performance on both clean samples and AEs, the impact of distillation from each training sample and AE on the target model's generalization and robustness is assessed and used as feedback to fine-tune the standard and robust teachers accordingly. Experiments evaluate the performance of LDAS&ET-AD against different adversarial attacks on the CIFAR-10 and CIFAR-100 datasets. The results show that the proposed method achieves robust accuracies of 45.39% and 42.63% against AutoAttack (AA) on CIFAR-10 for ResNet-18 and MobileNet-V2, respectively, improvements of 2.31% and 3.49% over the baseline method. Compared with state-of-the-art adversarial defense techniques, our method surpasses Introspective Adversarial Distillation, the top-performing method in terms of robustness under AA attack on CIFAR-10, by 1.40% and 1.43% for ResNet-18 and MobileNet-V2, respectively. These findings demonstrate the effectiveness of the proposed method in enhancing the robustness of deep neural networks (DNNs) against prevalent adversarial attacks compared with other competing methods. In conclusion, LDAS&ET-AD provides reliable and informative soft labels to one of the most promising defense methods, adversarial training (AT), alleviating the limitations of untrusted teachers and unsuitable AEs in existing AD techniques. We hope this paper promotes the development of DNNs in real-world trust-sensitive fields and helps ensure a more secure and dependable future for artificial intelligence systems.
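The adversarial examples at the heart of AD are typically built by perturbing inputs along the sign of the loss gradient (FGSM). Below is a stdlib-only sketch on a toy linear model with invented weights; the paper's contribution is learning the attack strategy per sample rather than fixing it, which this sketch does not attempt:

```python
def sign(v):
    return (v > 0) - (v < 0)

def fgsm_step(x, grad, eps):
    """One FGSM perturbation: shift each coordinate by eps along the sign
    of the loss gradient, which locally maximizes the loss."""
    return [xi + eps * sign(gi) for xi, gi in zip(x, grad)]

# toy linear classifier: score = w.x, label y in {-1, +1}, loss = -y * score
w, x, y = [0.8, -0.5], [1.0, 2.0], 1
grad = [-y * wi for wi in w]          # analytic d(loss)/dx for the toy loss
x_adv = fgsm_step(x, grad, eps=0.1)

score = lambda v: sum(wi * vi for wi, vi in zip(w, v))
print(score(x), score(x_adv))         # the attack lowers the true-class score
```

A learnable strategy model, as proposed above, would replace the fixed `eps` (and attack direction) with per-sample values chosen to maximize the student-teacher divergence.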
A huge amount of energy is consumed in separating ternary azeotropic mixtures by distillation. Heterogeneous azeotropic distillation and pressure-swing distillation are two effective technologies for separating heterogeneous azeotropes without entrainer addition. To better exploit the synergistic energy-saving effect of these two processes, a novel pressure-swing-assisted ternary heterogeneous azeotropic distillation (THAD) process is first proposed. In this process, the ternary heterogeneous azeotrope is decanted into two liquid phases before being refluxed into the azeotropic distillation column to avoid remixing of the aqueous phase, and the pressures of the three columns are adjusted to decrease the flow rates of the recycle streams. Dividing-wall column and heat integration technologies are then introduced to further reduce energy consumption, yielding the pressure-swing-assisted ternary heterogeneous azeotropic dividing-wall column and its heat-integrated structure. A genetic algorithm procedure is used to optimize the proposed processes. The design results show that the proposed processes have higher energy efficiencies and lower CO₂ emissions than the published THAD process.
This study explored the synergistic interaction of sewage sludge (SS) and distillation residue (DR) during co-pyrolysis for the optimized treatment of sewage sludge in cement kiln systems, using thermogravimetric analysis (TGA) and thermogravimetric analysis coupled with mass spectrometry (TGA-MS). The results reveal coexisting synergistic and antagonistic effects in the co-pyrolysis of SS/DR. The synergistic effect arises from hydrogen free radicals in SS and catalytic components in the ash fractions, while the antagonistic effect is mainly due to the melting of DR on the surface of SS particles during pyrolysis and the reaction of SS ash with alkali metals to form inert substances. SS/DR co-pyrolysis reduces the yields of coke and gas while increasing tar production. This study will promote the reduction, recycling, and harmless treatment of hazardous solid waste.
The coal-to-ethanol process, a route for clean coal utilization, faces challenges from the energy-intensive distillation that separates multi-component effluents to obtain pure ethanol. Involving at least eight columns, the synthesis of the ethanol distillation system is impracticable to address by exhaustive comparison and difficult for conventional superstructure-based optimization when rigorous models are used. This work adopts a superstructure-based framework that combines a strategy for adaptively selecting branches of the state-equipment network with a parallel stochastic algorithm for process synthesis. High-performance computing significantly reduces time consumption, and the adaptive strategy substantially lowers the complexity of the superstructure model. Moreover, parallel computing, elite search, population redistribution, and retention strategies for irrelevant parameters are used to further improve optimization efficiency. The optimization terminated after 3000 generations, providing a flowsheet solution that applies two non-sharp splitting options in its distillation sequence. As a result, the 59-dimension superstructure-based optimization was solved efficiently via a differential evolution algorithm, and a high-quality solution with a 28.34% lower total annual cost than the benchmark was obtained. Meanwhile, the solution of the superstructure-based optimization is comparable to that obtained by optimizing each specific configuration one by one. This indicates that superstructure-based optimization combined with the adaptive strategy can be a promising approach for handling the process synthesis of large-scale and complex chemical processes.
The liquid hold-up in a reactive distillation (RD) column not only has a significant impact on the extent of reactions but also affects the pressure drop and hydraulic conditions in the column. The liquid hold-up is therefore a critical design factor for RD columns. However, existing design methods for RD columns typically neglect the influence of the considerable liquid hold-up in downcomers, owing to the difficulty of solving a large-scale nonlinear model system that accounts for downcomer hydraulics, resulting in significant deviations from the actual situation and even operational infeasibility of the designed column. In this paper, a pseudo-transient (PT) RD model based on an equilibrium model with tray hydraulics was established for the rigorous simulation and optimization of RD plate columns, considering the liquid hold-up both in downcomers and on column trays, and a steady-state optimization algorithm assisted by the PT model was adopted to solve the optimization problem robustly. The optimization results for both ethylene glycol RD and methyl acetate RD demonstrate that assuming all the liquid hold-up of a stage belongs to the tray causes significant deviations in the column diameter, weir height, and number of stages, which leads to failure to meet the separation requirements and even hydraulic infeasibility of operation. The rigorous model proposed in this study, which considers the liquid hold-up both on trays and in downcomers as well as hydraulic constraints, can be applied to systematically design industrial RD plate columns and simultaneously obtain optimal operating variables and equipment structure variables.
Time-frequency analysis is a widely used tool for analyzing the local features of seismic data. However, it suffers from several inherent limitations, such as restricted time-frequency resolution, difficulty in selecting parameters, and low computational efficiency. Inspired by deep learning, we propose a deep learning-based workflow for seismic time-frequency analysis. The sparse S transform network (SSTNet) is first built to map the relationship between synthetic traces and sparse S transform spectra, and it can easily be pre-trained using synthetic traces and training labels. Next, we introduce knowledge distillation (KD) based transfer learning to re-train SSTNet on a field data set without training labels, yielding the sparse S transform network with knowledge distillation (KD-SSTNet). In this way, we can effectively calculate the sparse time-frequency spectra of field data and avoid the use of field training labels. To test the applicability of KD-SSTNet, we apply it to field data to estimate seismic attenuation for reservoir characterization and make detailed comparisons with traditional time-frequency analysis methods.
Transformer-based stereo image super-resolution reconstruction (Stereo SR) methods have significantly improved image quality. However, existing methods pay insufficient attention to detailed features and do not consider the offset of pixels along the epipolar lines in complementary views when integrating stereo information. To address these challenges, this paper introduces a novel epipolar-line window attention stereo image super-resolution network (EWASSR). For detail feature restoration, we design a feature extractor based on Transformer and convolutional neural network (CNN) components, consisting of (shifted) window-based self-attention ((S)W-MSA) and feature distillation and enhancement blocks (FDEB). This combination effectively balances global image perception with local feature attention and captures more discriminative high-frequency features of the image. Furthermore, to address the offset of complementary pixels in stereo images, we propose an epipolar-line window attention (EWA) mechanism, which divides windows along the epipolar direction to promote efficient matching of shifted pixels, even in smooth regions. More accurate pixel matching can be achieved by using adjacent pixels in the window as a reference. Extensive experiments demonstrate that EWASSR reconstructs more realistic detailed features. Quantitative comparisons on the Middlebury and Flickr1024 data sets for 2× SR show that, compared with a recent network, EWASSR improves the peak signal-to-noise ratio (PSNR) by 0.37 dB and 0.34 dB, respectively.
Knowledge distillation (KD) enhances student network generalization by transferring dark knowledge from a complex teacher network. To optimize computational expenditure and memory utilization, self-knowledge distillation (SKD) extracts dark knowledge from the model itself rather than from an external teacher network. However, previous SKD methods performed distillation indiscriminately on full datasets, overlooking the analysis of representative samples. In this work, we present a novel two-stage approach providing targeted knowledge on specific samples, named the two-stage approach to self-knowledge distillation (TOAST). We first soften the hard targets using class medoids generated from the logit vectors of each class. Then, we iteratively distill the under-trained data using past predictions of half the batch size. The two-stage knowledge is linearly combined, efficiently enhancing model performance. Extensive experiments conducted on five backbone architectures show that our method is model-agnostic and achieves the best generalization performance. Besides, TOAST is strongly compatible with existing augmentation-based regularization methods. Our method also obtains a speedup of up to 2.95× compared with a recent state-of-the-art method.
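TOAST's first stage, softening hard targets with class medoids, can be pictured as a simple convex blend. The medoid-similarity distribution and mixing weight below are invented for illustration and are not the paper's exact construction:

```python
def soften_targets(one_hot, medoid_dist, alpha=0.3):
    """Blend a hard one-hot target with a distribution derived from class
    medoids (here an assumed, pre-normalized similarity vector), keeping
    the true class dominant while giving non-target classes some mass."""
    return [(1 - alpha) * h + alpha * m for h, m in zip(one_hot, medoid_dist)]

hard = [0.0, 1.0, 0.0]            # hard target for class 1
medoid = [0.2, 0.6, 0.2]          # illustrative medoid-similarity distribution
soft = soften_targets(hard, medoid)
print(soft)                        # still peaks at class 1, but is no longer one-hot
```

The second stage (re-distilling under-trained samples against past predictions) would then combine linearly with these softened targets.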
Research on panicle detection is one of the most important aspects of paddy phenotypic analysis. A phenotyping method that uses unmanned aerial vehicles can be an excellent alternative to field-based methods. Nevertheless, it entails many other challenges, including varying illumination, panicle sizes, shape distortions, partial occlusions, and complex backgrounds. Object detection algorithms are directly affected by these factors. This work proposes a model for detecting panicles called Border-Sensitive Knowledge Distillation (BSKD). It is designed to prioritize the preservation of knowledge in border areas through feature distillation. Our feature-based knowledge distillation method allows us to compress the model without sacrificing its effectiveness. An imitation mask is used to distinguish panicle-related foreground features from irrelevant background features. A significant improvement on Unmanned Aerial Vehicle (UAV) images is achieved when the student imitates the teacher's features. On the UAV rice imagery dataset, the proposed BSKD model shows superior performance, with 76.3% mAP, 88.3% precision, 90.1% recall, and a 92.6% F1 score.
Knowledge distillation, as a pivotal technique in the field of model compression, has been widely applied across various domains. However, the problem of student model performance being limited by inherent biases in the teacher model during distillation still persists. To address these inherent biases, we propose a de-biased knowledge distillation framework tailored for binary classification tasks. For the pre-trained teacher model, biases in the soft labels are mitigated through knowledge infusion and label de-biasing techniques. On this basis, a de-biased distillation loss is introduced, allowing the de-biased labels to replace the soft labels as the fitting target for the student model. This approach enables the student model to learn from the corrected model information, achieving high-performance deployment on lightweight student models. Experiments conducted on multiple real-world datasets demonstrate that deep learning models compressed under the de-biased knowledge distillation framework significantly outperform traditional response-based and feature-based knowledge distillation models across various evaluation metrics, highlighting the effectiveness and superiority of the framework in model compression.
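One simple way to picture label de-biasing is to subtract an estimated per-class bias from the teacher's soft labels and renormalize. This is an illustrative sketch under that assumption, with invented numbers; it is not the paper's exact knowledge-infusion procedure:

```python
def debias_soft_labels(soft, bias, gamma=1.0):
    """Remove an estimated per-class bias from teacher soft labels, clip at
    zero, and renormalize so the corrected labels remain a distribution."""
    adjusted = [max(s - gamma * b, 0.0) for s, b in zip(soft, bias)]
    total = sum(adjusted)
    return [a / total for a in adjusted]

teacher = [0.75, 0.25]            # teacher soft labels for a binary task
bias = [0.10, -0.10]              # assumed estimate of the teacher's class bias
print(debias_soft_labels(teacher, bias))  # ≈ [0.65, 0.35]
```

The de-biased distillation loss then fits the student against these corrected labels instead of the raw teacher outputs.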
Multi-modal 3D object detection has achieved remarkable progress, but it is often limited in practical industrial production because of its high cost and low efficiency. The multi-view camera-based method provides a feasible solution due to its low cost. However, camera data lack geometric depth, and achieving high accuracy using camera data alone is challenging. This paper proposes a multi-modal Bird's-Eye-View (BEV) distillation framework (MMDistill) to make a trade-off between them. MMDistill is a carefully crafted two-stage distillation framework based on teacher and student models for learning cross-modal knowledge and generating multi-modal features. It can improve the performance of unimodal detectors without introducing additional costs during inference. Specifically, our method can effectively bridge the cross-modal gap caused by the heterogeneity between data. Furthermore, we propose a Light Detection and Ranging (LiDAR)-guided geometric compensation module, which can assist the student model in obtaining effective geometric features and reduce the gap between modalities. Our proposed method generally requires fewer computational resources and offers faster inference than traditional multi-modal models. This advancement enables multi-modal technology to be applied more widely in practical scenarios. Through experiments on the nuScenes dataset, we validate the effectiveness and superiority of MMDistill, achieving an improvement of 4.1% mean Average Precision (mAP) and 4.6% NuScenes Detection Score (NDS) over the baseline detector. In addition, we present detailed ablation studies to validate our method.
Acquiring accurate molecular-level information about petroleum is crucial for refining and chemical enterprises to implement the "selection of the optimal processing route" strategy. With the development of data prediction systems represented by machine learning, it has become possible for real-time prediction systems of petroleum fraction molecular information to replace analyses such as gas chromatography and mass spectrometry. However, the biggest difficulty lies in acquiring the data required for training the neural network. To address these issues, this work proposes an innovative method that utilizes Aspen HYSYS and comprehensive two-dimensional gas chromatography-time-of-flight mass spectrometry to establish a comprehensive training database. Subsequently, a deep neural network prediction model is developed for heavy distillate oil to predict its composition in terms of molecular structure. After training, the model accurately predicts the molecular composition of catalytically cracked raw oil in a refinery. The validation and test sets exhibit R2 values of 0.99769 and 0.99807, respectively, and the average relative error of the predicted molecular composition for the feed to the catalytic cracking unit is less than 7%. Finally, the SHAP (SHapley Additive exPlanations) interpretation method is used to disclose the relationships among different variables by performing global and local weight comparisons and correlation analyses.
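SHAP approximates Shapley values efficiently; their exact definition can be shown in a few lines of stdlib Python for a tiny model, averaging each feature's marginal contribution over all feature orderings. The linear "property model" below is an invented stand-in for the paper's network, chosen because Shapley values recover its term contributions exactly:

```python
from itertools import permutations
from math import factorial

def shapley_values(predict, baseline, x):
    """Exact Shapley values: average each feature's marginal contribution
    over all orderings; features absent from a coalition take baseline values."""
    n = len(x)
    phi = [0.0] * n
    for order in permutations(range(n)):
        current = list(baseline)
        prev = predict(current)
        for i in order:
            current[i] = x[i]
            now = predict(current)
            phi[i] += now - prev
            prev = now
    return [p / factorial(n) for p in phi]

# toy linear model: each feature's attribution is its weighted deviation
model = lambda v: 2.0 * v[0] + 1.0 * v[1] - 0.5 * v[2]
phi = shapley_values(model, baseline=[0.0, 0.0, 0.0], x=[1.0, 1.0, 2.0])
print(phi)  # → [2.0, 1.0, -1.0]
```

The attributions sum to the difference between the prediction and the baseline prediction (the additivity property SHAP relies on), which is what makes the global and local weight comparisons in the abstract well-defined.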
Achieving reliable and efficient weather classification for autonomous vehicles is crucial for ensuring safety and operational effectiveness. However, accurately classifying diverse and complex weather conditions remains a significant challenge. While advanced techniques such as Vision Transformers have been developed, they face key limitations, including high computational costs and limited generalization across varying weather conditions. These challenges present a critical research gap, particularly in applications where scalable and efficient solutions are needed to handle the intricate and dynamic nature of weather phenomena in real time. To address this gap, we propose a Multi-level Knowledge Distillation (MLKD) framework, which leverages the complementary strengths of state-of-the-art pre-trained models to enhance classification performance while minimizing computational overhead. Specifically, we employ ResNet50V2 and EfficientNetV2B3 as teacher models, known for their ability to capture complex image features, and distil their knowledge into a custom lightweight Convolutional Neural Network (CNN) student model. This framework balances the trade-off between high classification accuracy and efficient resource consumption, ensuring real-time applicability in autonomous systems. Our Response-based Multi-level Knowledge Distillation (R-MLKD) approach effectively transfers rich, high-level feature representations from the teacher models to the student model, allowing the student to perform robustly with significantly fewer parameters and lower computational demands. The proposed method was evaluated on three public datasets (DAWN, BDD100K, and CITS traffic alerts), each containing seven weather classes with 2000 samples per class. The results demonstrate the effectiveness of MLKD, achieving 97.3% accuracy, which surpasses conventional deep learning models. This work improves classification accuracy and tackles the practical challenges of model complexity, resource consumption, and real-time deployment, offering a scalable solution for weather classification in autonomous driving systems.
Abstract: A novel Ni-W binary hydrocracking catalyst was designed and prepared by impregnation on mixed supports of modified Y zeolite and amorphous aluminosilicate. The structure and properties of the catalyst were extensively characterized by XRD, NH3-TPD, IR, and XRF. Its performance was evaluated in a 100-ml laboratory hydrogenation test unit with two single-stage fixed-bed reactors connected in series. The characterization results showed that the catalyst has a well-developed, concentrated mesopore distribution, suitable acid sites and acid-strength distribution, and uniformly, highly dispersed metal sites. At a high conversion of 73.8% with the >350 °C feedstock, a C5+ yield of 98.1 m% and a middle-distillate selectivity of 83.5% were obtained. The yield of middle distillates boiling between 140 °C and 370 °C was 68.70 m%, and their quality met the WWFC Category III specification, indicating that this catalyst can produce more high-quality clean middle distillates from heavy-oil hydrocracking. The potential aromatics content of the 65-140 °C heavy naphtha was 37.5 m%, and the BMCI of the >370 °C tail oil was 6.6; the heavy naphtha and tail oil are premium feedstocks for catalytic reforming and steam-cracker units.
Abstract: The rapid growth of digital data necessitates advanced natural language processing (NLP) models like BERT (Bidirectional Encoder Representations from Transformers), known for its superior performance in text classification. However, BERT's size and computational demands limit its practicality, especially in resource-constrained settings. This research compresses the BERT base model for Bengali emotion classification through knowledge distillation (KD), pruning, and quantization. Although Bengali is the sixth most spoken language globally, NLP research on it is limited; our approach addresses this gap by creating an efficient BERT-based model for Bengali text. We explored 20 combinations of KD, pruning, and quantization, yielding higher speedups, fewer parameters, and smaller memory footprints. For instance, for mBERT we achieved a 3.87× speedup and a 4× compression ratio with a Distil+Prune+Quant combination that reduced the parameter count from 178 M to 46 M and the memory size from 711 MB to 178 MB. These results offer scalable solutions for NLP tasks in various languages and advance the field of model compression, making such models suitable for real-world applications in resource-limited environments.
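As an illustration of two of the compression steps named above, here is a minimal pure-Python sketch of magnitude pruning and symmetric int8 quantization; the toy weight list is made up, not the paper's model or code:

```python
# Illustrative sketch (not the paper's implementation): magnitude pruning
# followed by symmetric linear int8 quantization of a flat weight list.

def prune_by_magnitude(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of the weights."""
    flat = sorted(abs(w) for w in weights)
    k = int(len(flat) * sparsity)
    threshold = flat[k - 1] if k > 0 else -1.0
    return [0.0 if abs(w) <= threshold else w for w in weights]

def quantize_int8(weights):
    """Symmetric quantization to int8; returns (int codes, float scale)."""
    max_abs = max(abs(w) for w in weights) or 1.0
    scale = max_abs / 127.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

w = [0.91, -0.02, 0.40, 0.005, -0.77, 0.10]   # hypothetical weights
pruned = prune_by_magnitude(w, 0.5)           # half the weights become 0
q, s = quantize_int8(pruned)                  # 8-bit codes, ~4x smaller than float32
restored = dequantize(q, s)
```

Real pipelines apply these per-layer on tensors (e.g. via a deep-learning framework's pruning and quantization utilities), but the arithmetic is the same.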
Funding: Supported by the PetroChina Science and Technology Project (2023ZG18).
Abstract: The miscibility of flue gas with different light oils is investigated through slender-tube miscible displacement experiments at high temperature and high pressure. Under such conditions, miscible displacement of flue gas and light oil is possible. At a given temperature, oil displacement efficiency varies linearly with pressure. At a given pressure, displacement efficiency increases gently and then rapidly to more than 90% as temperature rises, achieving miscible displacement; this rapid increase is closely related to the distillation-driven phase transition of the oil's light components with rising temperature. Moreover, at the same pressure, the lighter the oil, the lower the minimum miscibility temperature between flue gas and oil, which allows easier miscibility and ultimately better performance of thermal miscible flooding by air injection. Miscibility between flue gas and light oil at high temperature and high pressure is more typically characterized by the phase transition in the supercritical state at high temperature, and it differs from the contact-extraction miscibility of CO2 under conventional high-pressure conditions.
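The reported linear relationship between displacement efficiency and pressure can be illustrated with an ordinary least-squares fit; the pressure/efficiency pairs below are hypothetical, not the paper's data:

```python
# Minimal ordinary least-squares fit of efficiency (%) against pressure.
# All numbers are illustrative stand-ins for the experimental data.

def linear_fit(xs, ys):
    """Return (slope, intercept) minimizing the sum of squared errors."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

pressure = [10.0, 15.0, 20.0, 25.0, 30.0]      # MPa (hypothetical)
efficiency = [62.0, 68.5, 75.0, 81.5, 88.0]    # % (hypothetical, exactly linear)
slope, intercept = linear_fit(pressure, efficiency)
```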
Funding: Supported by the National Natural Science Foundation of China (22178188).
Abstract: In this work, the ternary azeotrope of tert-butyl alcohol/ethyl acetate/water is separated by extractive distillation (ED) to recover the usable constituents and protect the environment. Based on the conductor-like screening model and the relative volatility method, ethylene glycol was selected as the extractant for the separation process. In addition, because the relative volatility between components changes with pressure, a multi-objective optimization based on the non-dominated sorting genetic algorithm II optimizes the pressure and the amount of solvent jointly to avoid falling into a local optimum. With the optimal process parameters, the proposed heat-integrated process reduces gas emissions by 29.30%. The heat-integrated ED, further coupled with a pervaporation process, reduces gas emissions by 42.36% and has the highest exergy efficiency, 47.56%. Building on the heat-integrated process, the two proposed heat pump assisted heat-integrated ED processes show good economic and environmental performance: the double heat pump assisted variant reduces the total annual cost by 28.78% and gas emissions by 55.83% compared with the base process, giving it good application prospects. This work provides a feasible approach for the separation of ternary azeotropes.
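The core of the NSGA-II optimizer mentioned above is non-dominated sorting. A minimal sketch of extracting the Pareto front for two minimized objectives follows; the cost/emission values are hypothetical, not the paper's designs:

```python
# Illustrative Pareto-front extraction for two minimized objectives,
# e.g. (total annual cost, gas emissions) of candidate ED designs.

def dominates(a, b):
    """a dominates b if a is no worse in every objective and better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    return [p for p in points
            if not any(dominates(q, p) for q in points if q is not p)]

# (cost, emissions) for hypothetical (pressure, solvent flow) designs
candidates = [(3.0, 40.0), (2.5, 55.0), (3.5, 35.0), (4.0, 60.0), (2.8, 45.0)]
front = pareto_front(candidates)   # (4.0, 60.0) is dominated and drops out
```

NSGA-II repeatedly ranks the population into such fronts and uses crowding distance within each front to preserve diversity.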
Funding: Supported by the Beijing Natural Science Foundation (2222037) and the Fundamental Research Funds for the Central Universities.
Abstract: Neural networks are often viewed as pure 'black box' models, lacking the interpretability and extrapolation capabilities of purely mechanistic models. This work proposes a new approach that, with the help of neural networks, improves the conformity of a first-principles model to the actual plant. The final result is still a first-principles model rather than a hybrid model, retaining the advantage of high interpretability. The work simulates an industrial batch distillation that separates four components: water, ethylene glycol, diethylene glycol, and triethylene glycol. Gated recurrent unit (GRU) and long short-term memory (LSTM) networks were used to obtain empirical parameters of the mechanistic model that are difficult to measure directly. These were used to improve the empirical sub-models within the mechanistic model, correcting unreasonable model assumptions and achieving better predictability for batch distillation. The proposed method was verified in a case study from an industrial plant, and the results show its value in improving model predictions and its potential to extend to other similar systems.
Funding: Supported by the National Natural Science Foundation of China (22308251, 22178247, 22378304) and the Natural Science Foundation of Hebei Province (B2021208026).
Abstract: Optimizing multistage processes, such as distillation or absorption, is a complex mixed-integer nonlinear programming (MINLP) problem. Relaxing the integer variables into continuous ones and solving the easier nonlinear programming (NLP) problem is one optimization strategy for multistage processes. In this article, we propose a relaxation method based on an efficiency parameter. When the efficiency parameter is 1 or 0, the proposed model is equivalent to the complete existence or inexistence of the equilibrium stage, and a non-integer efficiency represents partial existence. A multi-component absorption case shows a natural penalty for non-integer efficiency, which helps the efficiency parameter converge to 0 or 1. However, this penalty is weaker than that of existing relaxation models, such as the bypass efficiency model. In a simple distillation case, we show that this property can weaken the nonconvexity of the optimization problem and increase the probability of obtaining better optimization results.
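A toy illustration of the efficiency-parameter relaxation described above (a single-composition stand-in, not the paper's full stage model): e = 1 recovers a full equilibrium stage, e = 0 bypasses the stage entirely, and fractional e interpolates between them.

```python
# Illustrative relaxation of stage existence: the outlet vapor composition
# is a convex combination of the inlet value (stage absent) and the
# equilibrium value (stage present), weighted by the efficiency e.

def stage_outlet(y_in, y_eq, e):
    """Vapor mole fraction leaving a stage with existence efficiency e in [0, 1]."""
    assert 0.0 <= e <= 1.0
    return e * y_eq + (1.0 - e) * y_in

y_in, y_eq = 0.30, 0.70                   # hypothetical inlet / equilibrium values
full = stage_outlet(y_in, y_eq, 1.0)      # equilibrium stage exists
bypass = stage_outlet(y_in, y_eq, 0.0)    # stage removed from the column
partial = stage_outlet(y_in, y_eq, 0.4)   # relaxed, non-integer stage
```

In the relaxed NLP, e is a continuous decision variable per stage; the paper's observation is that the process model itself penalizes intermediate e, nudging it toward 0 or 1.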
Funding: Financially supported by the U.S. Department of Energy, Office of Energy Efficiency and Renewable Energy, Bioenergy Technologies Office, and the Office of Technology Transitions Technology Commercialization Fund. Financial support was also provided by LanzaTech through a Cooperative Research and Development Agreement.
Abstract: Here we demonstrate proof-of-concept microchannel reactive distillation for the alcohol-to-jet application: combining ethanol/water separation and ethanol dehydration in one unit operation. Ethanol is first distilled into the vapor phase and converted to ethylene and water, and the water co-product is then condensed to shift the reaction equilibrium. Process intensification is achieved through rapid mass transfer (ethanol stripping from thin wicks using novel microchannel architectures), leading to lower residence time and improved separation efficiency. Energy savings are realized by integrating the unit operations; for example, the heat of condensing water can offset that of vaporizing ethanol. Furthermore, the dehydration equilibrium is shifted toward completion by removing the water byproduct immediately upon formation while keeping the aqueous feedstock in the condensed phase. For an aqueous ethanol feedstock (40 wt%), 71% ethanol conversion with 91% selectivity to ethylene was demonstrated at 220 °C, 600 psig, and a weight hourly space velocity of 0.28 h^-1. Under these conditions, 2.7 stages of separation were demonstrated in a device length of 8.3 cm, giving a height equivalent of a theoretical plate (HETP), a measure of separation efficiency, of about 3.3 cm. By comparison, conventional distillation packing provides an HETP of about 30 cm; the roughly 9.1× reduction in HETP over conventional technology provides a means for significant energy savings and an example of process intensification. Finally, preliminary process economic analysis indicates that microchannel reactive distillation could reduce the operating and capital costs of the ethanol separation and dehydration portion of an envisioned alcohol-to-jet process by at least 35% and 55%, respectively, relative to the incumbent technology, provided future improvements to microchannel reactive distillation design and operability are made.
Funding: Supported by the National Key Research and Development Program of China (2021YFB1006200) and the Major Science and Technology Project of Henan Province, China (221100211200). The grant was received by S. Li.
Abstract: Adversarial distillation (AD) has emerged as a potential solution to the challenging optimization of losses with hard labels in adversarial training. However, fixed sample-agnostic and student-egocentric attack strategies are unsuitable for distillation, and the reliability of guidance from static teachers diminishes as target models become more robust. This paper proposes an AD method called Learnable Distillation Attack Strategies and Evolvable Teachers Adversarial Distillation (LDAS&ET-AD). First, a mechanism for generating learnable distillation attack strategies automatically produces sample-dependent attack strategies tailored for distillation. A strategy model produces attack strategies that create adversarial examples (AEs) in regions where the target model diverges significantly from the teachers, by competing with the target model in minimizing or maximizing the AD loss. Second, a teacher-evolution strategy enhances the reliability and effectiveness of the transferred knowledge in improving the target model's generalization performance. By measuring the experimentally updated target model's validation performance on both clean samples and AEs, the impact of distilling each training sample and AE on the target model's generalization and robustness is assessed and fed back to fine-tune the standard and robust teachers accordingly. Experiments evaluate LDAS&ET-AD against different adversarial attacks on the CIFAR-10 and CIFAR-100 datasets. The proposed method achieves robust accuracies of 45.39% and 42.63% against AutoAttack (AA) on CIFAR-10 for ResNet-18 and MobileNet-V2, respectively, improvements of 2.31% and 3.49% over the baseline method. Compared with state-of-the-art adversarial defense techniques, our method surpasses Introspective Adversarial Distillation, the top-performing method in terms of robustness under AA attack on CIFAR-10, by 1.40% and 1.43% for ResNet-18 and MobileNet-V2, respectively. These findings demonstrate the effectiveness of the proposed method in enhancing the robustness of deep neural networks (DNNs) against prevalent adversarial attacks compared with other competing methods. In conclusion, LDAS&ET-AD provides reliable and informative soft labels to one of the most promising defense methods, adversarial training, alleviating the limitations of untrusted teachers and unsuitable AEs in existing AD techniques. We hope this paper promotes the development of DNNs in trust-sensitive real-world fields and helps ensure a more secure and dependable future for artificial intelligence systems.
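As background on how adversarial examples (AEs) are crafted, here is a one-step FGSM-style sketch on a linear classifier. This is a generic illustration under made-up weights, not the LDAS&ET-AD strategy model:

```python
# Illustrative FGSM step: perturb the input along the sign of the input
# gradient of the loss. For hinge loss on a linear model, the input
# gradient is -y * w whenever the margin is violated (which it is here).

def hinge_loss(w, x, y):
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    return max(0.0, 1.0 - margin)

def fgsm(w, x, y, eps):
    """x_adv = x + eps * sign(d loss / d x), with d loss / d x = -y * w."""
    sign = lambda v: (v > 0) - (v < 0)
    return [xi + eps * sign(-y * wi) for wi, xi in zip(w, x)]

w = [0.8, -0.5]            # hypothetical classifier weights
x, y = [1.0, 1.0], 1       # clean sample with label +1 (margin-violating)
x_adv = fgsm(w, x, y, eps=0.3)
```

Multi-step attacks (e.g. PGD) iterate this update with projection; the paper's contribution is learning which attack configuration to use per sample during distillation.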
Funding: Supported by the National Natural Science Foundation of China (21406170) and the State Key Laboratory of Chemical Engineering (SKL-ChE-22B02).
Abstract: A huge amount of energy is consumed to separate ternary azeotropic mixtures by distillation. Heterogeneous azeotropic distillation and pressure-swing distillation are two effective technologies for separating heterogeneous azeotropes without adding an entrainer. To better exploit the synergistic energy-saving effect of these two processes, a novel pressure-swing-assisted ternary heterogeneous azeotropic distillation (THAD) process is first proposed. In this process, the ternary heterogeneous azeotrope is decanted into two liquid phases before being refluxed to the azeotropic distillation column, avoiding remixing of the aqueous phase, and the pressures of the three columns are adjusted to decrease the flow rates of the recycle streams. Dividing-wall column and heat-integration technologies are then introduced to further reduce energy consumption, yielding the pressure-swing-assisted ternary heterogeneous azeotropic dividing-wall column and its heat-integrated structure. A genetic algorithm procedure is used to optimize the proposed processes. The design results show that the proposed processes have higher energy efficiencies and lower CO2 emissions than the published THAD process.
Funding: Funded by the National College Student Innovation and Entrepreneurship Training Program (No. CY202036).
Abstract: This study explored the synergistic interaction of sewage sludge (SS) and distillation residue (DR) during co-pyrolysis for the optimized treatment of sewage sludge in cement kiln systems, using thermogravimetric analysis (TGA) and thermogravimetric analysis coupled with mass spectrometry (TGA-MS). The results reveal coexisting synergistic and antagonistic effects in SS/DR co-pyrolysis. The synergistic effect arises from hydrogen free radicals in SS and catalytic components in the ash fractions, while the antagonistic effect is mainly due to DR melting onto the surface of SS particles during pyrolysis and the reaction of SS ash with alkali metals to form inert substances. SS/DR co-pyrolysis reduces the yields of coke and gas while increasing tar production. This study will promote the reduction, recycling, and harmless treatment of hazardous solid waste.
Abstract: The coal-to-ethanol process, a form of clean coal utilization, faces challenges from the energy-intensive distillation that separates multi-component effluents to obtain pure ethanol. Involving at least eight columns, the synthesis of the ethanol distillation system is impractical to handle by exhaustive comparison and difficult for conventional superstructure-based optimization when rigorous models are used. This work adopts a superstructure-based framework that combines a strategy for adaptively selecting branches of the state-equipment network with a parallel stochastic algorithm for process synthesis. High-performance computing significantly reduces time consumption, and the adaptive strategy substantially lowers the complexity of the superstructure model. Moreover, parallel computing, elite search, population redistribution, and retention strategies for irrelevant parameters further improve optimization efficiency. The optimization terminates after 3000 generations, providing a flowsheet solution that applies two non-sharp splitting options in its distillation sequence. As a result, the 59-dimensional superstructure-based optimization was solved efficiently via a differential evolution algorithm, and a high-quality solution with a 28.34% lower total annual cost than the benchmark was obtained. Meanwhile, the solution is comparable to that obtained by optimizing single specific configurations one by one, indicating that superstructure-based optimization combined with the adaptive strategy is a promising approach for the process synthesis of large-scale, complex chemical processes.
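A minimal differential evolution loop of the kind used for the superstructure optimization can be sketched as follows (DE/rand/1/bin on a stand-in quadratic objective; the real objective would be the rigorous flowsheet model):

```python
# Illustrative DE/rand/1/bin minimizer. The "total annual cost" below is a
# toy quadratic bowl, not the paper's 59-dimensional flowsheet objective.

import random

def de_minimize(f, bounds, pop_size=20, f_weight=0.7, cr=0.9, gens=200, seed=1):
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    cost = [f(x) for x in pop]
    for _ in range(gens):
        for i in range(pop_size):
            # mutate: three distinct partners, rand/1 difference vector
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            jrand = rng.randrange(dim)
            trial = []
            for j in range(dim):
                if rng.random() < cr or j == jrand:   # binomial crossover
                    v = pop[a][j] + f_weight * (pop[b][j] - pop[c][j])
                else:
                    v = pop[i][j]
                lo, hi = bounds[j]
                trial.append(min(max(v, lo), hi))     # clip to bounds
            tc = f(trial)
            if tc <= cost[i]:                         # greedy selection
                pop[i], cost[i] = trial, tc
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]

tac = lambda x: (x[0] - 3.0) ** 2 + (x[1] + 2.0) ** 2   # minimum at (3, -2)
x_best, c_best = de_minimize(tac, [(-10, 10), (-10, 10)])
```

The paper's parallel variant evaluates population members concurrently and adds elite-search and redistribution heuristics on top of this basic loop.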
Funding: Supported by the National Natural Science Foundation of China (22378304).
Abstract: The liquid hold-up in a reactive distillation (RD) column not only has a significant impact on the extent of reaction, but also affects the pressure drop and hydraulic conditions in the column; it is therefore a critical design factor for RD columns. However, existing design methods for RD columns typically neglect the considerable liquid hold-up in downcomers, owing to the difficulty of solving a large-scale nonlinear model system that includes downcomer hydraulics, resulting in significant deviations from actual behavior and even operational infeasibility of the designed column. In this paper, a pseudo-transient (PT) RD model based on an equilibrium model with tray hydraulics was established for the rigorous simulation and optimization of RD plate columns, considering the liquid hold-up both in the downcomers and on the column trays, and a steady-state optimization algorithm assisted by the PT model was adopted to solve the optimization problem robustly. The optimization results for both ethylene glycol RD and methyl acetate RD demonstrate that assigning all the liquid hold-up of a stage to the tray causes significant deviations in column diameter, weir height, and number of stages, leading to unmet separation requirements and even hydraulic infeasibility of operation. The rigorous model proposed in this study, which considers liquid hold-up both on trays and in downcomers together with hydraulic constraints, can be applied to systematically design industrial RD plate columns and simultaneously obtain optimal operating variables and equipment structure variables.
Funding: Supported by the National Natural Science Foundation of China (42274144, 42304122, and 41974155), the Key Research and Development Program of Shaanxi (2023-YBGY-076), the National Key R&D Program of China (2020YFA0713404), and the China Uranium Industry and East China University of Technology Joint Innovation Fund (NRE202107).
Abstract: Time-frequency analysis is a widely used tool for analyzing the local features of seismic data. However, it suffers from several inherent limitations, such as restricted time-frequency resolution, difficulty in selecting parameters, and low computational efficiency. Inspired by deep learning, we suggest a deep learning-based workflow for seismic time-frequency analysis. The sparse S transform network (SSTNet) is first built to map the relationship between synthetic traces and sparse S transform spectra; it can easily be pre-trained using synthetic traces and training labels. Next, we introduce knowledge distillation (KD) based transfer learning to re-train SSTNet on a field data set without training labels, yielding the sparse S transform network with knowledge distillation (KD-SSTNet). In this way, we can effectively calculate the sparse time-frequency spectra of field data while avoiding the use of field training labels. To test the suggested KD-SSTNet, we apply it to field data to estimate seismic attenuation for reservoir characterization and make detailed comparisons with traditional time-frequency analysis methods.
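For context on what a time-frequency spectrum is, a sliding-window discrete Fourier transform (a plain spectrogram, far simpler than the sparse S transform the paper computes) can be sketched on a toy two-tone trace; all signal parameters are made up:

```python
# Illustrative time-frequency map: magnitude spectra of overlapping windows.
# The trace switches from 4 Hz to 8 Hz halfway through (fs = 64 Hz), so the
# spectral peak should move from bin 2 to bin 4 (bin spacing = fs/win = 2 Hz).

import cmath, math

def window_dft(signal, win, hop):
    """List of per-window magnitude spectra: frames[frame][frequency_bin]."""
    frames = []
    for start in range(0, len(signal) - win + 1, hop):
        seg = signal[start:start + win]
        frame = []
        for k in range(win // 2 + 1):
            s = sum(seg[n] * cmath.exp(-2j * math.pi * k * n / win)
                    for n in range(win))
            frame.append(abs(s))
        frames.append(frame)
    return frames

fs, win = 64, 32
trace = [math.sin(2 * math.pi * (4 if n < 64 else 8) * n / fs)
         for n in range(128)]
tf = window_dft(trace, win, hop=32)
```

The S transform differs by using a frequency-dependent Gaussian window, which gives the multi-resolution behavior the abstract refers to.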
Funding: This work was supported by the Sichuan Science and Technology Program (2023YFG0262).
Abstract: Transformer-based stereo image super-resolution (Stereo SR) methods have significantly improved image quality. However, existing methods pay insufficient attention to detailed features and do not consider the offset of pixels along the epipolar lines in complementary views when integrating stereo information. To address these challenges, this paper introduces a novel epipolar line window attention stereo image super-resolution network (EWASSR). For detail-feature restoration, we design a feature extractor based on a Transformer and a convolutional neural network (CNN), consisting of (shifted) window-based self-attention ((S)W-MSA) and feature distillation and enhancement blocks (FDEB). This combination effectively balances global image perception with local feature attention and captures more discriminative high-frequency image features. Furthermore, to address the offset of complementary pixels in stereo images, we propose an epipolar line window attention (EWA) mechanism, which divides windows along the epipolar direction to promote efficient matching of shifted pixels, even in smooth regions; more accurate pixel matching can be achieved by using adjacent pixels in the window as references. Extensive experiments demonstrate that EWASSR reconstructs more realistic detailed features. In quantitative comparisons for 2× SR on the Middlebury and Flickr1024 data sets, EWASSR improves the peak signal-to-noise ratio (PSNR) by 0.37 dB and 0.34 dB, respectively, over a recent network.
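The PSNR figures quoted above follow the standard definition, sketched here for 8-bit images flattened to pixel lists (the pixel values are illustrative, not the paper's evaluation data):

```python
# Illustrative PSNR: 10 * log10(peak^2 / MSE) between a reference image
# and a reconstruction, with peak = 255 for 8-bit images.

import math

def psnr(ref, test, peak=255.0):
    mse = sum((a - b) ** 2 for a, b in zip(ref, test)) / len(ref)
    return float("inf") if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)

ref = [50, 120, 200, 255]      # hypothetical ground-truth pixels
out = [52, 118, 205, 250]      # hypothetical reconstruction
value = psnr(ref, out)
```

Because the scale is logarithmic, the paper's 0.37 dB gain corresponds to a roughly 8% reduction in mean squared error.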
Funding: Supported by the National Natural Science Foundation of China (62176061).
Abstract: Knowledge distillation (KD) enhances student network generalization by transferring dark knowledge from a complex teacher network. To optimize computational expenditure and memory utilization, self-knowledge distillation (SKD) extracts dark knowledge from the model itself rather than from an external teacher network. However, previous SKD methods performed distillation indiscriminately on full datasets, overlooking the analysis of representative samples. In this work, we present a novel two-stage approach to providing targeted knowledge on specific samples, named two-stage approach self-knowledge distillation (TOAST). We first soften the hard targets using class medoids generated from the per-class logit vectors. Then, we iteratively distill the under-trained data with past predictions of half the batch size. The two stages of knowledge are linearly combined, efficiently enhancing model performance. Extensive experiments on five backbone architectures show that our method is model-agnostic and achieves the best generalization performance. Moreover, TOAST is strongly compatible with existing augmentation-based regularization methods, and it obtains a speedup of up to 2.95× compared with a recent state-of-the-art method.
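The first TOAST stage, softening hard targets with class medoids, might be sketched as follows. The mixing weight alpha and all logit values are assumptions for illustration, not the paper's settings:

```python
# Illustrative label softening: mix a one-hot target with the softmax of a
# class-medoid logit vector, yielding a soft distribution for distillation.

import math

def softmax(logits):
    m = max(logits)                      # subtract max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

def soften(label, medoid_logits, alpha=0.7):
    """alpha * one-hot + (1 - alpha) * softmax(class-medoid logits)."""
    n = len(medoid_logits)
    one_hot = [1.0 if i == label else 0.0 for i in range(n)]
    soft = softmax(medoid_logits)
    return [alpha * h + (1 - alpha) * s for h, s in zip(one_hot, soft)]

target = soften(label=0, medoid_logits=[2.0, 0.5, -1.0])
```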
Abstract: Panicle detection is one of the most important aspects of paddy phenotypic analysis. Phenotyping with unmanned aerial vehicles (UAVs) can be an excellent alternative to field-based methods, but it entails many other challenges, including varying illumination, panicle sizes, shape distortions, partial occlusions, and complex backgrounds, all of which directly affect object detection algorithms. This work proposes a panicle detection model called Border Sensitive Knowledge Distillation (BSKD), designed to prioritize the preservation of knowledge in border areas through feature distillation. Our feature-based knowledge distillation method compresses the model without sacrificing effectiveness. An imitation mask is used to distinguish panicle-related foreground features from irrelevant background features, and a significant improvement on UAV images is achieved when the student imitates the teacher's features. On the UAV rice imagery dataset, the proposed BSKD model shows superior performance with 76.3% mAP, 88.3% precision, 90.1% recall, and a 92.6% F1 score.
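The imitation-mask idea can be sketched as a masked feature-distillation loss; the feature values and the mask below are made up, and real feature maps would be multi-channel tensors rather than flat lists:

```python
# Illustrative masked feature distillation: the student imitates teacher
# features only at foreground (panicle) positions selected by the mask.

def masked_feature_loss(teacher, student, mask):
    """Mean squared error over positions where mask == 1."""
    num = sum(m * (t - s) ** 2 for t, s, m in zip(teacher, student, mask))
    den = sum(mask) or 1
    return num / den

teacher = [1.0, 2.0, 3.0, 4.0]
student = [1.1, 0.0, 2.8, 0.0]   # far off only at background positions
mask    = [1,   0,   1,   0]     # 1 = panicle foreground, 0 = background
loss = masked_feature_loss(teacher, student, mask)
```

With the mask, the large background errors contribute nothing, so the student is free to ignore irrelevant regions while matching the teacher where it matters.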
Funding: Supported by the National Natural Science Foundation of China under Grant No. 62172056 and the Young Elite Scientists Sponsorship Program by CAST under Grant No. 2022QNRC001.
Abstract: Knowledge distillation, a pivotal technique in model compression, has been widely applied across various domains. However, the problem of student-model performance being limited by inherent biases in the teacher model during distillation still persists. To address these inherent biases, we propose a de-biased knowledge distillation framework tailored for binary classification tasks. For the pre-trained teacher model, biases in the soft labels are mitigated through knowledge infusion and label de-biasing techniques. On this basis, a de-biased distillation loss is introduced, allowing the de-biased labels to replace the soft labels as the fitting target for the student model. This approach enables the student model to learn from the corrected model information, achieving high-performance deployment on lightweight student models. Experiments on multiple real-world datasets demonstrate that deep learning models compressed under the de-biased knowledge distillation framework significantly outperform traditional response-based and feature-based knowledge distillation models across various evaluation metrics, highlighting the effectiveness and superiority of the framework in model compression.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 62302086), the Natural Science Foundation of Liaoning Province (Grant No. 2023-MSBA-070), and the Fundamental Research Funds for the Central Universities (Grant No. N2317005).
Abstract: Multi-modal 3D object detection has achieved remarkable progress, but its high cost and low efficiency often limit its use in practical industrial production. Multi-view camera-based methods provide a feasible low-cost solution; however, camera data lacks geometric depth, so achieving high accuracy with camera data alone is challenging. This paper proposes a multi-modal bird's-eye-view (BEV) distillation framework (MMDistill) to make a trade-off between the two. MMDistill is a carefully crafted two-stage distillation framework based on teacher and student models for learning cross-modal knowledge and generating multi-modal features; it improves the performance of unimodal detectors without introducing additional costs during inference. Specifically, our method effectively bridges the cross-modal gap caused by data heterogeneity. We further propose a Light Detection and Ranging (LiDAR)-guided geometric compensation module, which helps the student model obtain effective geometric features and reduces the gap between modalities. Our method generally requires fewer computational resources and offers faster inference than traditional multi-modal models, enabling multi-modal technology to be applied more widely in practical scenarios. Through experiments on the nuScenes dataset, we validate the effectiveness and superiority of MMDistill, achieving improvements of 4.1% mean Average Precision (mAP) and 4.6% nuScenes Detection Score (NDS) over the baseline detector. We also present detailed ablation studies to validate our method.
Funding: Supported by the National Natural Science Foundation of China (22108307), the Natural Science Foundation of Shandong Province (ZR2020KB006), and the Outstanding Youth Fund of the Shandong Provincial Natural Science Foundation (ZR2020YQ17).
Abstract: Acquiring accurate molecular-level information about petroleum is crucial for refining and chemical enterprises to implement the "selection of the optimal processing route" strategy. With the development of data prediction systems represented by machine learning, it has become possible for real-time prediction systems of petroleum-fraction molecular information to replace analyses such as gas chromatography and mass spectrometry. However, the biggest difficulty lies in acquiring the data required for training the neural network. To address this, this work proposes an innovative method that uses Aspen HYSYS and comprehensive two-dimensional gas chromatography-time-of-flight mass spectrometry to establish a comprehensive training database. A deep neural network prediction model is then developed for heavy distillate oil to predict its composition in terms of molecular structure. After training, the model accurately predicts the molecular composition of catalytically cracked raw oil in a refinery: the validation and test sets exhibit R2 values of 0.99769 and 0.99807, respectively, and the average relative error of the molecular composition prediction for the feed to the catalytic cracking unit is less than 7%. Finally, the SHAP (SHapley Additive exPlanations) interpretation method is used to disclose the relationships among different variables through global and local weight comparisons and correlation analyses.
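What SHAP computes can be illustrated with exact Shapley values for a tiny model by enumerating feature orderings (the real SHAP library uses far more efficient estimators; the toy predictor and inputs below are made up):

```python
# Illustrative exact Shapley values: average each feature's marginal
# contribution to the prediction over all orderings in which features
# are switched from a baseline value to their actual value.

import math
from itertools import permutations

def shapley(f, x, baseline):
    n = len(x)
    phi = [0.0] * n
    for order in permutations(range(n)):
        current = list(baseline)
        prev = f(current)
        for i in order:
            current[i] = x[i]       # turn feature i "on"
            now = f(current)
            phi[i] += now - prev    # marginal contribution in this ordering
            prev = now
    return [p / math.factorial(n) for p in phi]

# toy predictor with an interaction term between features 0 and 2
f = lambda v: 2.0 * v[0] + 1.0 * v[1] + 0.5 * v[0] * v[2]
x, base = [1.0, 1.0, 1.0], [0.0, 0.0, 0.0]
phi = shapley(f, x, base)
```

The efficiency property holds by construction: the attributions sum exactly to f(x) - f(baseline), and the interaction term's credit is split evenly between the two features involved.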
Abstract: Achieving reliable and efficient weather classification for autonomous vehicles is crucial for safety and operational effectiveness, yet accurately classifying diverse and complex weather conditions remains a significant challenge. While advanced techniques such as Vision Transformers have been developed, they face key limitations, including high computational costs and limited generalization across varying weather conditions. These challenges present a critical research gap, particularly in applications where scalable and efficient solutions are needed to handle the intricate and dynamic nature of weather phenomena in real time. To address this gap, we propose a Multi-level Knowledge Distillation (MLKD) framework that leverages the complementary strengths of state-of-the-art pre-trained models to enhance classification performance while minimizing computational overhead. Specifically, we employ ResNet50V2 and EfficientNetV2B3, known for their ability to capture complex image features, as teacher models and distill their knowledge into a custom lightweight convolutional neural network (CNN) student model. This framework balances the trade-off between high classification accuracy and efficient resource consumption, ensuring real-time applicability in autonomous systems. Our Response-based Multi-level Knowledge Distillation (R-MLKD) approach effectively transfers rich, high-level feature representations from the teacher models to the student model, allowing the student to perform robustly with significantly fewer parameters and lower computational demands. The proposed method was evaluated on three public datasets (DAWN, BDD100K, and CITS traffic alerts), each containing seven weather classes with 2000 samples per class. The results demonstrate the effectiveness of MLKD, achieving 97.3% accuracy, which surpasses conventional deep learning models. This work improves classification accuracy while tackling the practical challenges of model complexity, resource consumption, and real-time deployment, offering a scalable solution for weather classification in autonomous driving systems.
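Response-based distillation, as used above, reduces to matching the teacher's temperature-softened class probabilities. A minimal sketch of the loss follows; the logits are illustrative, and T = 4 is a common but assumed choice, not the paper's setting:

```python
# Illustrative response-based KD loss: KL divergence between the teacher's
# and student's temperature-softened softmax outputs, scaled by T^2 as in
# Hinton et al.'s original formulation.

import math

def softmax_t(logits, T):
    m = max(logits)
    exps = [math.exp((v - m) / T) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

def kd_loss(teacher_logits, student_logits, T=4.0):
    """T^2 * KL(teacher_T || student_T)."""
    p = softmax_t(teacher_logits, T)
    q = softmax_t(student_logits, T)
    return T * T * sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

teacher = [5.0, 1.0, -2.0]    # e.g. combined output of the two teacher models
aligned = [4.8, 1.1, -1.9]    # student logits close to the teacher
off     = [-2.0, 5.0, 1.0]    # student logits far from the teacher
```

In training, this term is typically mixed with the ordinary cross-entropy on hard labels; the multi-level part of MLKD applies such guidance from more than one teacher.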