Time-frequency analysis is a widely used tool for analyzing the local features of seismic data. However, it suffers from several inherent limitations, such as restricted time-frequency resolution, difficulty in selecting parameters, and low computational efficiency. Inspired by deep learning, we suggest a deep learning-based workflow for seismic time-frequency analysis. The sparse S transform network (SSTNet) is first built to map the relationship between synthetic traces and sparse S transform spectra, and it can easily be pre-trained using synthetic traces and training labels. Next, we introduce knowledge distillation (KD) based transfer learning to re-train SSTNet using a field data set without training labels, yielding the sparse S transform network with knowledge distillation (KD-SSTNet). In this way, we can effectively calculate the sparse time-frequency spectra of field data while avoiding the use of field training labels. To test the applicability of the suggested KD-SSTNet, we apply it to field data to estimate seismic attenuation for reservoir characterization and make detailed comparisons with traditional time-frequency analysis methods.
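The abstract refers throughout to sparse S transform spectra without stating the transform itself. For reference, the standard Stockwell (S) transform of a trace x(t), on which a sparse variant would impose an additional sparsity constraint (the paper's exact formulation is not given here), is

```latex
S(\tau, f) = \int_{-\infty}^{\infty} x(t)\,\frac{|f|}{\sqrt{2\pi}}\,
             \exp\!\left(-\frac{(\tau - t)^2 f^2}{2}\right)
             e^{-i 2\pi f t}\, dt
```

where tau is the time of spectral localisation and f the frequency; the frequency-dependent Gaussian window is what gives the S transform its multi-resolution character.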
Knowledge distillation, as a pivotal technique in the field of model compression, has been widely applied across various domains. However, the problem of student model performance being limited due to inherent biases in the teacher model during the distillation process still persists. To address the inherent biases in knowledge distillation, we propose a de-biased knowledge distillation framework tailored for binary classification tasks. For the pre-trained teacher model, biases in the soft labels are mitigated through knowledge infusion and label de-biasing techniques. Based on this, a de-biased distillation loss is introduced, allowing the de-biased labels to replace the soft labels as the fitting target for the student model. This approach enables the student model to learn from the corrected model information, achieving high-performance deployment on lightweight student models. Experiments conducted on multiple real-world datasets demonstrate that deep learning models compressed under the de-biased knowledge distillation framework significantly outperform traditional response-based and feature-based knowledge distillation models across various evaluation metrics, highlighting the effectiveness and superiority of the de-biased knowledge distillation framework in model compression.
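The abstract does not spell out the de-biased loss. As background, here is a minimal NumPy sketch of the standard response-based (soft-label) distillation loss that such a framework modifies by swapping de-biased labels in for the teacher's soft labels; function names and the temperature value are illustrative assumptions, not the paper's.

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax; higher T yields softer probabilities.
    z = np.asarray(z, dtype=float) / T
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits, T=4.0):
    # KL(teacher || student) on temperature-softened distributions, scaled by
    # T^2 as in classic response-based distillation. A de-biased framework
    # would replace the teacher distribution p with corrected labels.
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float((T * T) * np.sum(p * (np.log(p) - np.log(q))))
```

The loss is zero when student and teacher logits agree and positive otherwise, which is what makes it usable as a fitting target.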
Research on panicle detection is one of the most important aspects of paddy phenotypic analysis. A phenotyping method that uses unmanned aerial vehicles can be an excellent alternative to field-based methods. Nevertheless, it entails many other challenges, including different illuminations, panicle sizes, shape distortions, partial occlusions, and complex backgrounds. Object detection algorithms are directly affected by these factors. This work proposes a model for detecting panicles called Border Sensitive Knowledge Distillation (BSKD). It is designed to prioritize the preservation of knowledge in border areas through the use of feature distillation. Our feature-based knowledge distillation method allows us to compress the model without sacrificing its effectiveness. An imitation mask is used to distinguish panicle-related foreground features from irrelevant background features. A significant improvement on Unmanned Aerial Vehicle (UAV) images is achieved when students imitate the teacher's features. On the UAV rice imagery dataset, the proposed BSKD model shows superior performance with 76.3% mAP, 88.3% precision, 90.1% recall, and 92.6% F1 score.
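The imitation-mask idea can be illustrated with a minimal sketch: a binary mask restricts the feature-mimicking loss to foreground (panicle/border) positions so background features are not distilled. This is an assumption-level illustration of masked feature distillation in general, not BSKD's actual loss; shapes and normalisation are mine.

```python
import numpy as np

def masked_feature_distillation_loss(student_feat, teacher_feat, mask):
    # Mean-squared error between student and teacher feature maps of shape
    # (C, H, W), restricted by a binary imitation mask of shape (H, W)
    # (1 = foreground/border position, 0 = background).
    s = np.asarray(student_feat, dtype=float)
    t = np.asarray(teacher_feat, dtype=float)
    m = np.asarray(mask, dtype=float)
    diff = (s - t) ** 2 * m                 # mask broadcasts over channels
    denom = max(m.sum() * s.shape[0], 1.0)  # normalise by masked area x channels
    return float(diff.sum() / denom)
```

With an all-zero mask the loss vanishes, so only foreground positions pull the student toward the teacher.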
Strabismus significantly impacts human health as a prevalent ophthalmic condition. Early detection of strabismus is crucial for effective treatment and prognosis. Traditional deep learning models for strabismus detection often fail to estimate prediction certainty precisely. This paper employed a Bayesian deep learning algorithm with knowledge distillation, improving the model's performance and uncertainty estimation ability. Trained on 6807 images from two tertiary hospitals, the model showed significantly higher diagnostic accuracy than traditional deep learning models. Experimental results revealed that knowledge distillation enhanced the Bayesian model's performance and uncertainty estimation ability. These findings underscore the combined benefits of using Bayesian deep learning algorithms and knowledge distillation, which improve the reliability and accuracy of strabismus diagnostic predictions.
Unmanned aerial vehicles (UAVs) have gained significant attention in practical applications, especially in low-altitude aerial (LAA) object detection, which imposes stringent requirements on recognition accuracy and computational resources. In this paper, the LAA images-oriented tensor decomposition and knowledge distillation-based network (TDKD-Net) is proposed, where TT-format tensor decomposition (TD) and equal-weighted response-based knowledge distillation (KD) methods are designed to minimize redundant parameters while ensuring comparable performance. Moreover, some robust network structures are developed, including a small object detection head and a dual-domain attention mechanism, which enable the model to leverage the knowledge learned from small-scale targets and selectively focus on salient features. Considering the imbalance of bounding box regression samples and the inaccuracy of regression geometric factors, a focal and efficient IoU (intersection over union) loss with optimal transport assignment (F-EIoU-OTA) mechanism is proposed to improve detection accuracy. The proposed TDKD-Net is comprehensively evaluated through extensive experiments, and the results demonstrate the effectiveness and superiority of the developed methods in comparison to other advanced detection algorithms, along with high generalization and strong robustness. As a resource-efficient precise network, TDKD-Net also handles the complex detection of small and occluded LAA objects well, providing useful insights on handling imbalanced issues and realizing domain adaptation.
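As an illustration of the EIoU ingredient of the F-EIoU-OTA loss, here is a hedged sketch of plain IoU plus a commonly used EIoU formulation (centre-distance and width/height penalties normalised by the smallest enclosing box). The paper's exact focal weighting and OTA assignment are not reproduced, and the function names are mine.

```python
def iou(box_a, box_b):
    # Axis-aligned boxes as (x1, y1, x2, y2); returns intersection over union.
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def eiou_loss(pred, gt):
    # EIoU: 1 - IoU plus centre-distance, width, and height penalties, each
    # normalised by the smallest enclosing box (a common EIoU formulation).
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt
    cw = max(px2, gx2) - min(px1, gx1)   # enclosing box width
    ch = max(py2, gy2) - min(py1, gy1)   # enclosing box height
    dx = (px1 + px2 - gx1 - gx2) / 2.0   # centre offsets
    dy = (py1 + py2 - gy1 - gy2) / 2.0
    dw = (px2 - px1) - (gx2 - gx1)       # width difference
    dh = (py2 - py1) - (gy2 - gy1)       # height difference
    eps = 1e-9
    return (1.0 - iou(pred, gt)
            + (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)
            + dw * dw / (cw * cw + eps)
            + dh * dh / (ch * ch + eps))
```

For identical boxes every penalty term vanishes and the loss is zero, which is the sanity check any IoU-family loss must pass.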
In this paper, to deal with the heterogeneity in federated learning (FL) systems, a knowledge distillation (KD) driven training framework for FL is proposed, where each user can select its neural network model on demand and distill knowledge from a big teacher model using its own private dataset. To overcome the challenge of training the big teacher model on resource-limited user devices, a digital twin (DT) is exploited so that the teacher model can be trained at the DT located in the server with sufficient computing resources. Then, during model distillation, each user can update the parameters of its model at either the physical entity or the digital agent. The joint problem of model selection, training offloading, and resource allocation for users is formulated as a mixed integer programming (MIP) problem. To solve the problem, Q-learning and optimization are jointly used: Q-learning selects models for users and determines whether to train locally or on the server, and optimization allocates resources for users based on the output of Q-learning. Simulation results show that the proposed DT-assisted KD framework and joint optimization method can significantly improve the average accuracy of users while reducing the total delay.
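The Q-learning half of the joint solver can be sketched generically: the agent picks a (model, train-location) action, observes a reward (e.g. accuracy minus a delay penalty), and bootstraps from the best action value in the next state. The state, action, and reward encodings below are placeholders, not the paper's formulation.

```python
def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    # One tabular Q-learning step:
    #   Q(s, a) += alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    # q is a dict keyed by (state, action); unseen entries default to 0.
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]
```

In the paper's setting the action set would enumerate candidate student models crossed with {train locally, train at the DT on the server}, with the optimization stage allocating resources given each Q-learning decision.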
Waste pollution is a significant environmental problem worldwide. With the continuous improvement in living standards and the increasing richness of the consumption structure, the amount of domestic waste generated has increased dramatically, and there is an urgent need for further treatment. The rapid development of artificial intelligence has provided an effective solution for automated waste classification. However, the high computational power and complexity of such algorithms make conventional convolutional neural networks unsuitable for real-time embedded applications. In this paper, we propose a lightweight network architecture called Focus-RCNet, designed with reference to the sandglass structure of MobileNetV2, which uses depthwise separable convolution to extract features from images. The Focus module is introduced to the field of recyclable waste image classification to reduce the dimensionality of features while retaining relevant information. To make the model focus more on waste image features while keeping the number of parameters small, we introduce the SimAM attention mechanism. In addition, knowledge distillation is used to further compress the number of parameters in the model. By training and testing on the TrashNet dataset, the Focus-RCNet model not only achieves an accuracy of 92% but also shows high deployment mobility.
With the rapid development of the Internet of Things (IoT), the automation of edge-side equipment has emerged as a significant trend. The existing fault diagnosis methods have the characteristics of heavy computing and storage load, and most of them have computational redundancy, which is not suitable for deployment on edge devices with limited resources and capabilities. This paper proposes a novel two-stage edge-side fault diagnosis method based on double knowledge distillation. First, we offer a clustering-based self-knowledge distillation approach (Cluster KD), which takes the mean value of the sample diagnosis results, clusters them, and takes the clustering results as the terms of the loss function. It utilizes the correlations between faults of the same type to improve the accuracy of the teacher model, especially for fault categories with high similarity. Then, the double knowledge distillation framework uses ordinary knowledge distillation to build a lightweight model for edge-side deployment. We propose a two-stage edge-side fault diagnosis method (TSM) that separates fault detection and fault diagnosis into different stages: in the first stage, a fault detection model based on a denoising auto-encoder (DAE) is adopted to achieve fast fault responses; in the second stage, a diverse convolution model with variance weighting (DCMVW) is used to diagnose faults in detail, extracting features from micro and macro perspectives. Through comparison experiments conducted on two fault datasets, it is proven that the proposed method has high accuracy, low delays, and small computation, which is suitable for intelligent edge-side fault diagnosis. In addition, experiments show that our approach has a smooth training process and good balance.
Recently, deep convolutional neural networks (DCNNs) have achieved remarkable results in image classification tasks. Despite convolutional networks' great successes, their training process relies on a large amount of data prepared in advance, which is often challenging in real-world applications, such as streaming data and concept drift. For this reason, incremental learning (continual learning) has attracted increasing attention from scholars. However, incremental learning is associated with the challenge of catastrophic forgetting: the performance on previous tasks drastically degrades after learning a new task. In this paper, we propose a new strategy to alleviate catastrophic forgetting when neural networks are trained in continual domains. Specifically, two components are applied: data translation based on transfer learning and knowledge distillation. The former translates a portion of new data to reconstruct the partial data distribution of the old domain. The latter uses an old model as a teacher to guide a new model. The experimental results on three datasets show that our work can effectively alleviate catastrophic forgetting by combining the two aforementioned methods.
In this paper, a novel method of ultra-lightweight convolutional neural network (CNN) design based on neural architecture search (NAS) and knowledge distillation (KD) is proposed. It can realize the automatic construction of a space target inverse synthetic aperture radar (ISAR) image recognition model with ultra-light weight and high accuracy. This method introduces NAS into radar image recognition for the first time, which solves the time-consuming and labor-intensive problems in the manual design of the space target ISAR image automatic recognition model (STIIARM). On this basis, the NAS model's knowledge is transferred to a student model with lower computational complexity by the flow of the solution procedure (FSP) distillation method. Thus, the decline in recognition accuracy caused by directly compressing the model's structural parameters can be effectively avoided, and an ultra-lightweight STIIARM can be obtained. In the method, the Inverted Linear Bottleneck (ILB) and Inverted Residual Block (IRB) are first taken as each block's basic structure in the CNN, and the expansion ratio, output filter size, number of IRBs, and convolution kernel size are set as the search parameters to construct a hierarchical decomposition search space. Then, the recognition accuracy and computational complexity are taken as the objective function and constraint conditions, respectively, and the global optimization model of the CNN architecture search is established. Next, the simulated annealing (SA) algorithm is used as the search strategy to directly search for a lightweight and high-accuracy STIIARM. After that, based on the three principles of similar block structure, the same corresponding channel number, and minimum computational complexity, the more lightweight student model is designed, and the FSP matrix pairing between the NAS model and the student model is completed. Finally, by minimizing the loss between the FSP matrix pairs of the NAS model and the student model, the student model's weight adjustment is completed. Thus, the ultra-lightweight and high-accuracy STIIARM is obtained. The proposed method's effectiveness is verified by simulation experiments on an ISAR image dataset of five types of space targets.
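FSP distillation pairs Gram matrices computed between two layers' feature maps and penalises the teacher-student mismatch. A minimal NumPy sketch following the standard FSP definition (shapes and function names are assumptions):

```python
import numpy as np

def fsp_matrix(feat_a, feat_b):
    # FSP (flow of solution procedure) matrix between two layers' feature maps
    # of shape (C, H, W): a (C_a, C_b) Gram matrix averaged over spatial positions.
    a = np.asarray(feat_a, dtype=float)
    b = np.asarray(feat_b, dtype=float)
    a = a.reshape(a.shape[0], -1)   # (C_a, H*W)
    b = b.reshape(b.shape[0], -1)   # (C_b, H*W)
    return a @ b.T / a.shape[1]

def fsp_loss(teacher_pair, student_pair):
    # Mean squared distance between matched teacher/student FSP matrices;
    # minimising it aligns the student's "flow" with the teacher's.
    g_t = fsp_matrix(*teacher_pair)
    g_s = fsp_matrix(*student_pair)
    return float(((g_t - g_s) ** 2).mean())
```

Matching FSP matrices rather than raw features is what lets the NAS model and the smaller student differ in layer widths while still transferring how features evolve between layers.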
Deep learning technologies are increasingly used in the field of geophysics, and a variety of algorithms based on shallow convolutional neural networks are widely used in fault recognition, but these methods are usually unable to accurately identify complex faults. In this study, exploiting the ability of deep residual networks to capture strong learning features, we introduce residual blocks to replace all convolutional layers of the three-dimensional (3D) UNet to build a new 3D Res-UNet, select appropriate parameters through experiments, and train it on a large amount of synthesized seismic data. After training is completed, we introduce the mechanism of knowledge distillation: first, we treat the 3D Res-UNet as a teacher network and then train a 3D Res-UNet as a student network; in this process, the teacher network is in evaluation mode. Finally, we calculate a mixed loss function combining the teacher model and the student network to learn more fault information, improve the performance of the network, and optimize the fault recognition effect. The quantitative evaluation of the synthetic model test proves that knowledge distillation can considerably improve the fault recognition accuracy of the 3D Res-UNet from 0.956 to 0.993, and the effectiveness and feasibility of our method are verified by application to actual seismic data.
Fish behavior analysis for recognizing stress is very important for fish welfare and production management in aquaculture. Recent advances have been made in fish behavior analysis based on deep learning. However, most existing methods with top performance rely on considerable memory and computational resources, which is impractical in real-world scenarios. To overcome the limitations of these methods, a new method based on knowledge distillation is proposed to identify the stress states of fish schools. The knowledge distillation architecture transfers additional inter-class information via a mixed relative loss function, and it forces a lightweight network (GhostNet) to mimic the soft probability output of a well-trained fish stress state recognition network (ResNeXt101). The fish school stress state recognition model's accuracy is improved from 94.17% to 98.12% by this method. The proposed model has about 5.18 M parameters and requires 0.15 G FLOPs (floating-point operations) to process an image of size 224×224. Furthermore, fish behavior images are collected in a land-based factory, and a dataset is constructed and extended through flip, rotation, and color jitter augmentation techniques. The proposed method is also compared with other state-of-the-art methods. The experimental results show that the proposed model is more suitable for deployment on resource-constrained devices or real-time applications and is conducive to real-time monitoring of fish behavior.
Soybean leaf morphology is one of the most important morphological and biological characteristics of soybean. Germplasm gene differences among soybeans can lead to different phenotypic traits, among which soybean leaf morphology is an important parameter that directly reflects differences in soybean germplasm. To realize the morphological classification of soybean leaves, a method was proposed based on deep learning to automatically detect soybean leaves and classify leaf morphology. The morphologies of soybean leaves included lanceolate, oval, ellipse, and round. First, an image collection platform was designed to collect images of soybean leaves. Then, the feature pyramid network-single shot multibox detector (FPN-SSD) model was proposed to detect the top leaflets of soybean leaves in the collected images. Finally, a classification model based on knowledge distillation was proposed to classify the different morphologies of soybean leaves. The obtained results indicated an overall classification accuracy of 0.956 over a private dataset of 3200 soybean leaf images, and the classification accuracies for the four morphologies were 1.00, 0.97, 0.93, and 0.94. The results showed that this method could effectively classify soybean leaf morphology and has great application potential in analyzing other phenotypic traits of soybean.
Edge computation offloading allows mobile end devices to execute compute-intensive tasks on edge servers. End devices can decide whether the tasks are offloaded to edge servers, cloud servers or executed locally according to current network condition and devices' profiles in an online manner. In this paper, we propose an edge computation offloading framework based on deep imitation learning (DIL) and knowledge distillation (KD), which assists end devices to quickly make fine-grained decisions to optimize the delay of computation tasks online. We formalize a computation offloading problem into a multi-label classification problem. Training samples for our DIL model are generated in an offline manner. After the model is trained, we leverage KD to obtain a lightweight DIL model, by which we further reduce the model's inference delay. Numerical experiments show that the offloading decisions made by our model not only outperform those made by other related policies in the latency metric, but also have the shortest inference delay among all policies.
It is significant for agricultural intelligent knowledge services using knowledge graph technology to integrate multi-source heterogeneous crop and pest data and fully mine the knowledge hidden in text. However, only limited labeled data are available for training in the agricultural knowledge graph domain, and labeling is costly due to insufficient data openness and standardization. This paper proposes a novel model using knowledge distillation for weakly supervised entity recognition in ontology construction. Knowledge distillation between the target and source data domains is performed, where Bi-LSTM and CRF models are constructed for entity recognition. The experimental results show that we only need to label less than one-tenth of the data for model training. Furthermore, the agricultural domain ontology is constructed by the Bi-LSTM-CRF named entity recognition model and a relationship extraction model, and a total of 13,983 entities and 26,498 relationships are built in the Neo4j graph database.
With the advancement of deep learning techniques, the number of model parameters has been increasing, leading to significant memory consumption and limiting the deployment of such models in real-time applications. To reduce the number of model parameters and enhance the generalization capability of neural networks, we propose a method called Decoupled MetaDistil, which involves decoupled meta-distillation. This method utilizes meta-learning to guide the teacher model and dynamically adjusts the knowledge transfer strategy based on feedback from the student model, thereby improving generalization ability. Furthermore, we introduce a decoupled loss to explicitly transfer positive-sample knowledge and explore the potential of negative-sample knowledge. Extensive experiments demonstrate the effectiveness of our method.
Amid the backdrop of carbon neutrality, traditional energy production is transitioning towards integrated energy systems (IES), where model-based scheduling is key in scenarios with multiple uncertainties on both the supply and demand sides. The development of artificial intelligence algorithms has resolved issues related to model accuracy. However, under conditions of high-proportion renewable energy integration, component load adjustments require increased flexibility, so the mathematical model of a component must adapt to constantly changing operating conditions. Therefore, the identification of operating condition changes and rapid model updating are pressing issues. This study proposes a modeling and updating method for IES components based on knowledge distillation. The core of this modeling method is model lightweighting, achieved through knowledge distillation in a teacher-student mode to compress complex neural network models. Model updates are triggered through principal component analysis. The study also analyzes the impact of model errors caused by delayed model updates on the overall scheduling of the IES. Case studies are conducted on critical components in the IES, including coal-fired boilers and turbines. The results show that the time consumption for model updating is reduced by 76.67% using the proposed method. Under changing conditions, compared with two traditional models, the average deviation of this method is reduced by 12.61% and 3.49%, respectively, thereby improving the model's adaptability. The necessity of updating the component model is further analyzed, as a 1.00% mean squared error in the component model may lead to a power deviation of 0.075 MW. This method provides real-time, adaptable support for IES data modeling and updates.
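A PCA-based update trigger of the kind described above can be sketched as reconstruction error in the retained principal subspace: when a new operating point reconstructs poorly from the components fitted on training-condition data, the operating condition has likely drifted and the component model should be updated. The threshold and shapes below are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def fit_pca(X, k):
    # Principal axes of training-condition samples (rows of X) via SVD
    # on the centred data; returns the mean and the top-k components.
    mu = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, vt[:k]

def drift_score(x, mu, components):
    # Reconstruction error of a new operating point in the retained subspace.
    z = (x - mu) @ components.T      # project onto principal axes
    recon = z @ components + mu      # map back to the original space
    return float(np.linalg.norm(x - recon))

def needs_update(x, mu, components, threshold):
    # Trigger a model update when the point leaves the training distribution.
    return drift_score(x, mu, components) > threshold
```

Points lying along the training manifold score near zero, while off-manifold operating points exceed the threshold and trigger re-distillation of the component model.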
In various fields, knowledge distillation (KD) techniques that combine vision transformers (ViTs) and convolutional neural networks (CNNs) as a hybrid teacher have shown remarkable results in classification. However, in the realm of remote sensing images (RSIs), existing KD research is not only scarce but also lacks competitiveness. This issue significantly impedes the deployment of the notable advantages of ViTs and CNNs. To tackle this, the authors introduce a novel hybrid-model KD approach named HMKD-Net, which comprises a CNN-ViT ensemble teacher and a CNN student. Contrary to popular opinion, the authors posit that the sparsity in RSI data distribution limits the effectiveness and efficiency of hybrid-model knowledge transfer. As a solution, a simple yet innovative method to handle variances during the KD phase is suggested, leading to substantial enhancements in the effectiveness and efficiency of hybrid knowledge transfer. The authors assessed the performance of HMKD-Net on three RSI datasets. The findings indicate that HMKD-Net significantly outperforms other cutting-edge methods while maintaining a significantly smaller size. Specifically, HMKD-Net exceeds other KD-based methods with a maximum accuracy improvement of 22.8% across the datasets. As ablation experiments indicated, HMKD-Net cuts time expenses in the KD process by about 80%. This research validates that the hybrid-model KD technique can be more effective and efficient if the data distribution sparsity in RSIs is well handled.
An object-oriented prototype expert system, ORDEES, for off-line trouble-shooting of refinery distillation columns is developed. It is found that a highly modular knowledge base can be designed, and different types of data (e.g., graphs, numerical data, and algorithms) may be manipulated, by using object-oriented knowledge representation. In addition, a method termed Object-Oriented Multifunction Switcher is proposed for building multifunction expert systems. The results of the study are expected to be useful for designing multifunction expert systems for complex petroleum refining and petrochemical processes with many kinds of equipment.
Although few-shot learning (FSL) has achieved great progress, it remains an enormous challenge, especially when the source and target sets are from different domains, a setting known as cross-domain few-shot learning (CD-FSL). Utilizing more source domain data is an effective way to improve the performance of CD-FSL. However, knowledge from different source domains may entangle and confuse with each other, which hurts performance on the target domain. Therefore, we propose team-knowledge distillation networks (TKD-Net) to tackle this problem, exploring a strategy to help multiple teachers cooperate. Specifically, we distill knowledge from the cooperation of teacher networks to a single student network in a meta-learning framework. It incorporates task-oriented knowledge distillation and multiple cooperation among teachers to train an efficient student with better generalization ability on unseen tasks. Moreover, our TKD-Net employs both response-based knowledge and relation-based knowledge to transfer more comprehensive and effective knowledge. Extensive experimental results on four fine-grained datasets demonstrate the effectiveness and superiority of our proposed TKD-Net approach.
基金supported by the National Natural Science Foundation of China (42274144,42304122,and 41974155)the Key Research and Development Program of Shaanxi (2023-YBGY-076)+1 种基金the National Key R&D Program of China (2020YFA0713404)the China Uranium Industry and East China University of Technology Joint Innovation Fund (NRE202107)。
文摘Time-frequency analysis is a successfully used tool for analyzing the local features of seismic data.However,it suffers from several inevitable limitations,such as the restricted time-frequency resolution,the difficulty in selecting parameters,and the low computational efficiency.Inspired by deep learning,we suggest a deep learning-based workflow for seismic time-frequency analysis.The sparse S transform network(SSTNet)is first built to map the relationship between synthetic traces and sparse S transform spectra,which can be easily pre-trained by using synthetic traces and training labels.Next,we introduce knowledge distillation(KD)based transfer learning to re-train SSTNet by using a field data set without training labels,which is named the sparse S transform network with knowledge distillation(KD-SSTNet).In this way,we can effectively calculate the sparse time-frequency spectra of field data and avoid the use of field training labels.To test the availability of the suggested KD-SSTNet,we apply it to field data to estimate seismic attenuation for reservoir characterization and make detailed comparisons with the traditional time-frequency analysis methods.
Funding: supported by the National Natural Science Foundation of China under Grant No. 62172056 and the Young Elite Scientists Sponsorship Program by CAST under Grant No. 2022QNRC001.
Abstract: Knowledge distillation, as a pivotal technique in the field of model compression, has been widely applied across various domains. However, the performance of student models is still limited by inherent biases in the teacher model during the distillation process. To address these inherent biases, we propose a de-biased knowledge distillation framework tailored for binary classification tasks. For the pre-trained teacher model, biases in the soft labels are mitigated through knowledge infusion and label de-biasing techniques. On this basis, a de-biased distillation loss is introduced, allowing the de-biased labels to replace the soft labels as the fitting target for the student model. This approach enables the student model to learn from the corrected model information, achieving high-performance deployment on lightweight student models. Experiments conducted on multiple real-world datasets demonstrate that deep learning models compressed under the de-biased knowledge distillation framework significantly outperform traditional response-based and feature-based knowledge distillation models across various evaluation metrics, highlighting the effectiveness and superiority of the framework in model compression.
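The exact de-biasing procedure is not given in the abstract; one plausible reading, sketched below, is that an estimated bias is subtracted from the teacher's soft label and the clipped result replaces the soft label in a binary cross-entropy distillation loss (function and parameter names are hypothetical):

```python
import math

def debias_soft_label(p_teacher, bias):
    """Shift the teacher's soft probability for the positive class by an
    estimated bias, then clip back into [0, 1]."""
    return min(1.0, max(0.0, p_teacher - bias))

def debiased_distillation_loss(p_student, p_teacher, bias):
    """Binary cross-entropy of the student against the de-biased label."""
    q = debias_soft_label(p_teacher, bias)
    eps = 1e-7
    p = min(1.0 - eps, max(eps, p_student))
    return -(q * math.log(p) + (1.0 - q) * math.log(1.0 - p))
```

The loss is minimized when the student's output approaches the de-biased label rather than the teacher's original soft label.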
Abstract: Research on panicle detection is one of the most important aspects of paddy phenotypic analysis. A phenotyping method that uses unmanned aerial vehicles (UAVs) can be an excellent alternative to field-based methods. Nevertheless, it entails many other challenges, including varying illumination, panicle sizes, shape distortions, partial occlusions, and complex backgrounds. Object detection algorithms are directly affected by these factors. This work proposes a panicle detection model called Border Sensitive Knowledge Distillation (BSKD). It is designed to prioritize the preservation of knowledge in border areas through feature distillation. Our feature-based knowledge distillation method allows us to compress the model without sacrificing its effectiveness. An imitation mask is used to distinguish panicle-related foreground features from irrelevant background features. A significant improvement on UAV images is achieved when the student imitates the teacher's features. On the UAV rice imagery dataset, the proposed BSKD model shows superior performance with 76.3% mAP, 88.3% precision, 90.1% recall, and 92.6% F1 score.
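The imitation mask described above restricts feature distillation to foreground positions. A simplified sketch over flattened feature maps, with a binary mask per position (names are illustrative):

```python
def masked_feature_distillation(student_feat, teacher_feat, mask):
    """Squared L2 distance between student and teacher features, averaged
    only over positions the imitation mask marks as panicle foreground."""
    num = sum(m * (s - t) ** 2
              for s, t, m in zip(student_feat, teacher_feat, mask))
    return num / max(1, sum(mask))
```

Background positions (mask 0) contribute nothing, so the student is pushed to imitate the teacher only where panicles are.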
Funding: supported in part by the Guangdong Natural Science Foundation (No. 2022A1515011396), in part by the National Key R&D Program of China (No. 2021ZD0111502), and in part by the Science Research Startup Foundation of Shantou University (No. NTF20021).
Abstract: Strabismus significantly impacts human health as a prevalent ophthalmic condition. Early detection of strabismus is crucial for effective treatment and prognosis. Traditional deep learning models for strabismus detection often fail to estimate prediction certainty precisely. This paper employs a Bayesian deep learning algorithm with knowledge distillation, improving the model's performance and uncertainty estimation ability. Trained on 6807 images from two tertiary hospitals, the model showed significantly higher diagnostic accuracy than traditional deep learning models. Experimental results revealed that knowledge distillation enhanced the Bayesian model's performance and uncertainty estimation ability. These findings underscore the combined benefits of Bayesian deep learning algorithms and knowledge distillation, which improve the reliability and accuracy of strabismus diagnostic predictions.
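The abstract does not name the Bayesian approximation used; Monte Carlo dropout is one common choice, sketched below with a toy one-unit stochastic "network" standing in for the real CNN (which would keep dropout active at test time):

```python
import random

def toy_stochastic_net(x, rng, p_drop):
    """A one-unit stand-in network with inverted dropout on its only weight."""
    if rng.random() < p_drop:
        return 0.0
    return x / (1.0 - p_drop)

def mc_dropout_predict(net, x, n_samples=1000, p_drop=0.5, seed=1):
    """Approximate Bayesian prediction: run the network n_samples times with
    dropout active; the spread of the outputs estimates uncertainty."""
    rng = random.Random(seed)
    outs = [net(x, rng, p_drop) for _ in range(n_samples)]
    mean = sum(outs) / len(outs)
    var = sum((o - mean) ** 2 for o in outs) / len(outs)
    return mean, var
```

The mean serves as the diagnostic prediction and the variance as its uncertainty, which is what lets such a model flag low-confidence strabismus cases.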
Funding: supported in part by the National Natural Science Foundation of China (62073271), the Natural Science Foundation for Distinguished Young Scholars of the Fujian Province of China (2023J06010), and the Fundamental Research Funds for the Central Universities of China (20720220076).
Abstract: Unmanned aerial vehicles (UAVs) have gained significant attention in practical applications, and low-altitude aerial (LAA) object detection in particular imposes stringent requirements on recognition accuracy and computational resources. In this paper, the LAA-image-oriented tensor decomposition and knowledge distillation-based network (TDKD-Net) is proposed, where tensor-train (TT)-format tensor decomposition (TD) and equal-weighted response-based knowledge distillation (KD) methods are designed to minimize redundant parameters while ensuring comparable performance. Moreover, some robust network structures are developed, including a small object detection head and a dual-domain attention mechanism, which enable the model to leverage the learned knowledge from small-scale targets and selectively focus on salient features. Considering the imbalance of bounding box regression samples and the inaccuracy of regression geometric factors, the focal and efficient IoU (intersection over union) loss with optimal transport assignment (F-EIoU-OTA) mechanism is proposed to improve detection accuracy. The proposed TDKD-Net is comprehensively evaluated through extensive experiments, and the results demonstrate the effectiveness and superiority of the developed methods in comparison to other advanced detection algorithms, along with high generalization and strong robustness. As a resource-efficient, precise network, TDKD-Net also handles the complex detection of small and occluded LAA objects well, providing useful insights on handling imbalanced issues and realizing domain adaptation.
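To illustrate why TT-format tensor decomposition removes redundant parameters, the sketch below counts the weights of a dense layer against its TT factorization; the mode shapes and ranks are made-up examples, not the paper's settings:

```python
def dense_param_count(in_modes, out_modes):
    """Weights in a dense layer whose matrix is (prod in_modes) x (prod out_modes)."""
    n = m = 1
    for v in in_modes:
        n *= v
    for v in out_modes:
        m *= v
    return n * m

def tt_param_count(in_modes, out_modes, ranks):
    """Parameters in the TT factorization: one core of shape
    r[k] x in_modes[k] x out_modes[k] x r[k+1] per mode."""
    assert len(in_modes) == len(out_modes) == len(ranks) - 1
    assert ranks[0] == ranks[-1] == 1
    return sum(ranks[k] * in_modes[k] * out_modes[k] * ranks[k + 1]
               for k in range(len(in_modes)))
```

For example, a 1024×256 layer (262,144 weights) reshaped to modes (4, 8, 8, 4)×(4, 4, 4, 4) with ranks (1, 8, 8, 8, 1) needs only 4,352 parameters.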
Funding: supported by the National Key Research and Development Program of China (2020YFB1807700), the National Natural Science Foundation of China (NSFC) under Grant No. 62071356, and the Chongqing Key Laboratory of Mobile Communications Technology under Grant cqupt-mct202202.
Abstract: In this paper, to deal with the heterogeneity in federated learning (FL) systems, a knowledge distillation (KD)-driven training framework for FL is proposed, where each user can select its neural network model on demand and distill knowledge from a big teacher model using its own private dataset. To overcome the challenge of training the big teacher model on resource-limited user devices, a digital twin (DT) is exploited so that the teacher model can be trained at the DT located at the server, which has sufficient computing resources. Then, during model distillation, each user can update the parameters of its model at either the physical entity or the digital agent. The joint problem of model selection, training offloading, and resource allocation for users is formulated as a mixed integer programming (MIP) problem. To solve it, Q-learning and optimization are used jointly: Q-learning selects models for users and determines whether to train locally or on the server, and optimization allocates resources for users based on the output of Q-learning. Simulation results show that the proposed DT-assisted KD framework and joint optimization method can significantly improve the average accuracy of users while reducing the total delay.
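The Q-learning side of the joint solution can be pictured with the standard tabular update, scoring each (model, train-locally-or-at-DT) action per state; state and action encodings below are hypothetical:

```python
def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step: q[state][action] moves toward the
    reward plus the discounted best value of the next state."""
    best_next = max(q[next_state].values())
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q
```

In the paper's setting, the reward would reflect accuracy and delay after the optimization stage allocates resources for the chosen actions.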
Abstract: Waste pollution is a significant environmental problem worldwide. With the continuous improvement in living standards and the increasing richness of the consumption structure, the amount of domestic waste generated has increased dramatically, and there is an urgent need for further treatment. The rapid development of artificial intelligence has provided an effective solution for automated waste classification. However, the high computational power and complexity of algorithms make conventional convolutional neural networks unsuitable for real-time embedded applications. In this paper, we propose a lightweight network architecture called Focus-RCNet, designed with reference to the sandglass structure of MobileNetV2, which uses depthwise separable convolution to extract features from images. The Focus module is introduced to the field of recyclable waste image classification to reduce the dimensionality of features while retaining relevant information. To make the model focus more on waste image features while keeping the number of parameters small, we introduce the SimAM attention mechanism. In addition, knowledge distillation is used to further compress the number of parameters in the model. Trained and tested on the TrashNet dataset, the Focus-RCNet model not only achieves an accuracy of 92% but also shows high deployment mobility.
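Depthwise separable convolution is the main source of such parameter savings; the textbook comparison below contrasts its weight count with a standard convolution (the channel and kernel sizes are illustrative, not Focus-RCNet's):

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k

def depthwise_separable_params(c_in, c_out, k):
    """Depthwise k x k filter per input channel, then a 1 x 1 pointwise mix."""
    return c_in * k * k + c_in * c_out
```

For 32 input channels, 64 output channels, and a 3×3 kernel, this is 2,336 weights instead of 18,432, roughly an 8× reduction.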
Funding: supported by the National Key R&D Program of China (2019YFB2103202).
Abstract: With the rapid development of the Internet of Things (IoT), the automation of edge-side equipment has emerged as a significant trend. Existing fault diagnosis methods impose heavy computing and storage loads, and most of them involve computational redundancy, making them unsuitable for deployment on edge devices with limited resources and capabilities. This paper proposes a novel two-stage edge-side fault diagnosis method based on double knowledge distillation. First, we offer a clustering-based self-knowledge distillation approach (Cluster KD), which takes the mean value of the sample diagnosis results, clusters them, and takes the clustering results as terms of the loss function. It utilizes the correlations between faults of the same type to improve the accuracy of the teacher model, especially for fault categories with high similarity. Then, the double knowledge distillation framework uses ordinary knowledge distillation to build a lightweight model for edge-side deployment. We propose a two-stage edge-side fault diagnosis method (TSM) that separates fault detection and fault diagnosis into different stages: in the first stage, a fault detection model based on a denoising auto-encoder (DAE) is adopted to achieve fast fault responses; in the second stage, a diverse convolution model with variance weighting (DCMVW) is used to diagnose faults in detail, extracting features from micro and macro perspectives. Comparison experiments on two fault datasets prove that the proposed method has high accuracy, low delay, and a small computational footprint, making it suitable for intelligent edge-side fault diagnosis. In addition, experiments show that our approach has a smooth training process and good balance.
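The two-stage structure of TSM can be pictured as a simple dispatcher: a fast detector (e.g. DAE reconstruction error) gates the more expensive detailed classifier. The functions below are placeholders for the paper's DAE and DCMVW models:

```python
def two_stage_diagnose(x, detect_fn, diagnose_fn, threshold):
    """Stage 1: a fast detector (e.g. DAE reconstruction error) flags a
    possible fault; stage 2 runs the detailed classifier only when flagged."""
    if detect_fn(x) < threshold:
        return "healthy", None
    return "fault", diagnose_fn(x)
```

Because healthy samples never reach stage 2, the average response time on the edge device stays low.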
Abstract: Recently, deep convolutional neural networks (DCNNs) have achieved remarkable results in image classification tasks. Despite convolutional networks' great successes, their training process relies on a large amount of data prepared in advance, which is often challenging in real-world applications such as streaming data and concept drift. For this reason, incremental learning (continual learning) has attracted increasing attention from scholars. However, incremental learning is associated with the challenge of catastrophic forgetting: performance on previous tasks drastically degrades after learning a new task. In this paper, we propose a new strategy to alleviate catastrophic forgetting when neural networks are trained in continual domains. Specifically, two components are applied: data translation based on transfer learning and knowledge distillation. The former translates a portion of new data to reconstruct the partial data distribution of the old domain. The latter uses the old model as a teacher to guide the new model. Experimental results on three datasets show that our work can effectively alleviate catastrophic forgetting through the combination of these two methods.
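The distillation component of such a strategy is commonly realized as a cross-entropy term on the new task plus a softened-output distillation term against the old model; a sketch under that assumption (the temperature and weight values are illustrative):

```python
import math

def softmax(logits, t=1.0):
    """Temperature-scaled softmax; t > 1 softens the distribution."""
    m = max(logits)
    exps = [math.exp((v - m) / t) for v in logits]
    z = sum(exps)
    return [e / z for e in exps]

def continual_loss(new_logits, label, old_logits, t=2.0, lam=1.0):
    """Cross-entropy on the new task plus a distillation term that keeps the
    new model's softened outputs close to the old (teacher) model's."""
    ce = -math.log(max(softmax(new_logits)[label], 1e-12))
    p_old = softmax(old_logits, t)
    p_new = softmax(new_logits, t)
    kd = -sum(po * math.log(max(pn, 1e-12)) for po, pn in zip(p_old, p_new))
    return ce + lam * kd
```

The distillation term penalizes drift away from the old model's behavior, which is what preserves performance on previous tasks.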
Abstract: In this paper, a novel method of ultra-lightweight convolutional neural network (CNN) design based on neural architecture search (NAS) and knowledge distillation (KD) is proposed. It can realize the automatic construction of a space target inverse synthetic aperture radar (ISAR) image recognition model that is ultra-lightweight and highly accurate. This method introduces NAS into radar image recognition for the first time, which solves the time-consuming and labor-intensive problems in the manual design of the space target ISAR image automatic recognition model (STIIARM). On this basis, the NAS model's knowledge is transferred to a student model with lower computational complexity by the flow of the solution procedure (FSP) distillation method. Thus, the decline in recognition accuracy caused by directly compressing model structural parameters can be effectively avoided, and an ultra-lightweight STIIARM can be obtained. In the method, the inverted linear bottleneck (ILB) and inverted residual block (IRB) are first taken as the basic structure of each block in the CNN, and the expansion ratio, output filter size, number of IRBs, and convolution kernel size are set as the search parameters to construct a hierarchical decomposition search space. Then, the recognition accuracy and computational complexity are taken as the objective function and constraint conditions, respectively, and a global optimization model for the CNN architecture search is established. Next, the simulated annealing (SA) algorithm is used as the search strategy to directly find a lightweight and highly accurate STIIARM. After that, based on the three principles of similar block structure, the same corresponding channel number, and minimum computational complexity, the more lightweight student model is designed, and FSP matrix pairing between the NAS model and the student model is completed. Finally, by minimizing the loss between the FSP matrix pairs of the NAS model and the student model, the student model's weight adjustment is completed. Thus the ultra-lightweight and highly accurate STIIARM is obtained. The proposed method's effectiveness is verified by simulation experiments on an ISAR image dataset of five types of space targets.
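The SA search strategy can be sketched generically: propose a neighboring architecture, always accept improvements, and accept worse candidates with a probability that decays as the temperature cools. The toy objective in the usage below stands in for the paper's accuracy-under-complexity-constraint objective:

```python
import math
import random

def sa_search(init, neighbor_fn, score_fn, steps=300, t0=1.0, seed=0):
    """Simulated annealing over a discrete search space: always accept
    improvements; accept worse candidates with probability exp(dS / t)."""
    rng = random.Random(seed)
    cur, cur_s = init, score_fn(init)
    best, best_s = cur, cur_s
    for i in range(1, steps + 1):
        t = t0 / i  # simple cooling schedule
        cand = neighbor_fn(cur, rng)
        s = score_fn(cand)
        if s >= cur_s or rng.random() < math.exp((s - cur_s) / t):
            cur, cur_s = cand, s
        if cur_s > best_s:
            best, best_s = cur, cur_s
    return best, best_s
```

In the paper's setting, `neighbor_fn` would perturb one search parameter (expansion ratio, filter size, IRB count, or kernel size) and `score_fn` would evaluate recognition accuracy subject to the complexity constraint.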
Funding: supported by the National Natural Science Foundation of China (No. 42072169).
Abstract: Deep learning technologies are increasingly used in the field of geophysics, and a variety of algorithms based on shallow convolutional neural networks are widely used in fault recognition, but these methods are usually unable to accurately identify complex faults. In this study, exploiting the ability of deep residual networks to capture strong learning features, we introduce residual blocks to replace all convolutional layers of the three-dimensional (3D) UNet to build a new 3D Res-UNet, select appropriate parameters through experiments, and train on a large amount of synthesized seismic data. After training is completed, we introduce the mechanism of knowledge distillation. First, we treat the trained 3D Res-UNet as a teacher network and then train another 3D Res-UNet as a student network; in this process, the teacher network is in evaluation mode. Finally, we calculate a mixed loss function combining the teacher model and the student network to learn more fault information, improve the performance of the network, and optimize the fault recognition effect. The quantitative evaluation on a synthetic model test proves that the 3D Res-UNet can considerably improve the accuracy of fault recognition from 0.956 to 0.993 after knowledge distillation, and the effectiveness and feasibility of our method are verified by application to actual seismic data.
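The mixed loss is not written out in the abstract; one plausible per-voxel form, sketched below for the binary fault probability, combines supervised cross-entropy on the synthetic label with a consistency term toward the frozen teacher (the weighting scheme is an assumption):

```python
import math

def mixed_fault_loss(student_prob, teacher_prob, label, alpha=0.7):
    """Supervised BCE on the synthetic fault label plus a consistency term
    pulling the student toward the frozen teacher's prediction."""
    eps = 1e-7
    p = min(1.0 - eps, max(eps, student_prob))
    bce = -(label * math.log(p) + (1 - label) * math.log(1.0 - p))
    consistency = (student_prob - teacher_prob) ** 2
    return alpha * bce + (1.0 - alpha) * consistency
```

Averaged over all voxels of the 3D volume, a loss of this shape lets the student learn both from the labels and from the teacher's softer fault evidence.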
Funding: supported by the National Science Foundation of China project "Analysis and feature recognition on feeding behavior of fish school in facility farming based on machine vision" (No. 62076244) and the National Key R&D Program of China project "Next generation precision aquaculture: R&D on intelligent measurement, control and equipment technologies" (Grant No. 2017YFE0122100).
Abstract: Fish behavior analysis for recognizing stress is very important for fish welfare and production management in aquaculture. Recent advances have been made in fish behavior analysis based on deep learning. However, most existing methods with top performance rely on considerable memory and computational resources, which is impractical in real-world scenarios. To overcome the limitations of these methods, a new method based on knowledge distillation is proposed to identify the stress states of fish schools. The knowledge distillation architecture transfers additional inter-class information via a mixed relative loss function, and it forces a lightweight network (GhostNet) to mimic the soft probability outputs of a well-trained fish stress state recognition network (ResNeXt101). The fish school stress state recognition model's accuracy is improved from 94.17% to 98.12% thanks to this method. The proposed model has about 5.18 M parameters and requires 0.15 G FLOPs (floating-point operations) to process an image of size 224×224. Furthermore, fish behavior images were collected in a land-based factory, and a dataset was constructed and extended through flip, rotation, and color jitter augmentation techniques. The proposed method is also compared with other state-of-the-art methods. The experimental results show that the proposed model is more suitable for deployment on resource-constrained devices or real-time applications, and it is conducive to real-time monitoring of fish behavior.
Funding: supported by the Heilongjiang Province Philosophy and Social Science Research Planning Project (17TQB059).
Abstract: Soybean leaf morphology is one of the most important morphological and biological characteristics of soybean. Germplasm gene differences in soybeans can lead to different phenotypic traits, among which soybean leaf morphology is an important parameter that directly reflects differences in soybean germplasm. To realize the morphological classification of soybean leaves, a method based on deep learning is proposed to automatically detect soybean leaves and classify leaf morphology. The morphologies of soybean leaves include lanceolate, oval, ellipse, and round. First, an image collection platform was designed to collect images of soybean leaves. Then, the feature pyramid network-single shot multibox detector (FPN-SSD) model was proposed to detect the top leaflets of soybean leaves in the collected images. Finally, a classification model based on knowledge distillation was proposed to classify the different morphologies of soybean leaves. The obtained results indicated an overall classification accuracy of 0.956 over a private dataset of 3200 soybean leaf images, and the accuracies of classification for each morphology were 1.00, 0.97, 0.93, and 0.94. The results show that this method can effectively classify soybean leaf morphology and has great application potential for analyzing other phenotypic traits of soybean.
Funding: This work was supported in part by the National Science Foundation of China under Grant No. 61972432 and the Program for Guangdong Introducing Innovative and Entrepreneurial Teams under Grant No. 2017ZT07X355.
Abstract: Edge computation offloading allows mobile end devices to execute compute-intensive tasks on edge servers. End devices can decide whether tasks are offloaded to edge servers, offloaded to cloud servers, or executed locally according to the current network condition and the devices' profiles in an online manner. In this paper, we propose an edge computation offloading framework based on deep imitation learning (DIL) and knowledge distillation (KD), which assists end devices in quickly making fine-grained decisions to optimize the delay of computation tasks online. We formalize the computation offloading problem as a multi-label classification problem. Training samples for our DIL model are generated in an offline manner. After the model is trained, we leverage KD to obtain a lightweight DIL model, by which we further reduce the model's inference delay. Numerical experiments show that the offloading decisions made by our model not only outperform those made by other related policies in the latency metric, but also have the shortest inference delay among all policies.
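One way to picture the decision step is a score vector per task over the three execution choices, with each task's decision taken independently; the scores below are made up, and in the framework they would come from the trained lightweight DIL model:

```python
CHOICES = ("local", "edge", "cloud")

def offload_decisions(per_task_scores):
    """Map each task's score vector over (local, edge, cloud) to the
    highest-scoring execution choice."""
    decisions = []
    for scores in per_task_scores:
        best = max(range(len(CHOICES)), key=lambda i: scores[i])
        decisions.append(CHOICES[best])
    return decisions
```

Because inference is a single forward pass plus an argmax per task, the decision itself adds negligible delay.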
Funding: supported by Heilongjiang NSF funding (No. LH202F022), Heilongjiang research and application of key technologies (No. 2021ZXJ05A03), and the New Generation Artificial Intelligence Program (No. 21ZD0110900), China.
Abstract: It is significant for agricultural intelligent knowledge services using knowledge graph technology to integrate multi-source heterogeneous crop and pest data and fully mine the knowledge hidden in text. However, only limited labeled data are available for training in the agricultural knowledge graph domain, and labeling is costly due to the lack of data openness and standardization. This paper proposes a novel model using knowledge distillation for weakly supervised entity recognition in ontology construction. Knowledge distillation between the target and source data domains is performed, where Bi-LSTM and CRF models are constructed for entity recognition. The experimental results show that we only need to label less than one-tenth of the data for model training. Furthermore, the agricultural domain ontology is constructed with the Bi-LSTM-CRF named entity recognition model and a relationship extraction model, and a total of 13,983 entities and 26,498 relationships are built in the Neo4j graph database.
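Within a Bi-LSTM-CRF tagger, decoding picks the best tag sequence from per-token emission scores and tag-transition scores via the Viterbi algorithm; a compact sketch with toy log-scores (not the paper's trained parameters):

```python
def viterbi(emissions, transitions, tags):
    """Most-likely tag sequence under a linear-chain CRF: emissions[i][t] is
    the per-token score of tag t; transitions[(s, t)] scores tag s -> t."""
    score = dict(emissions[0])
    back = []
    for step in range(1, len(emissions)):
        new_score, ptr = {}, {}
        for t in tags:
            prev = max(tags, key=lambda s: score[s] + transitions[(s, t)])
            new_score[t] = score[prev] + transitions[(prev, t)] + emissions[step][t]
            ptr[t] = prev
        score = new_score
        back.append(ptr)
    last = max(tags, key=lambda t: score[t])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return list(reversed(path))
```

In the paper's pipeline, the emission scores would come from the Bi-LSTM encoder and the transitions from the learned CRF layer.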
Funding: supported by the Key R&D Program of Shandong Province, China (2022CXGC20106), the Pilot Project for Integrated Innovation of Science, Education, and Industry of Qilu University of Technology (Shandong Academy of Sciences) (2022JBZ01-01), the Joint Fund of Shandong Natural Science Foundation (ZR2022LZH010), and the Shandong Provincial Natural Science Foundation (ZR2021LZH008).
Abstract: With the advancement of deep learning techniques, the number of model parameters has been increasing, leading to significant memory consumption and limiting the deployment of such models in real-time applications. To reduce the number of model parameters and enhance the generalization capability of neural networks, we propose a method called Decoupled MetaDistil, which involves decoupled meta-distillation. This method utilizes meta-learning to guide the teacher model and dynamically adjusts the knowledge transfer strategy based on feedback from the student model, thereby improving generalization ability. Furthermore, we introduce a decoupled loss to explicitly transfer positive-sample knowledge and explore the potential of negative-sample knowledge. Extensive experiments demonstrate the effectiveness of our method.
Funding: supported by the National Key R&D Program of China (Grant No. 2023YFE0108600), the National Natural Science Foundation of China (Grant No. 51806190), the National Key R&D Program of China (Grant No. 2022YFB3304502), and a self-directed project of the State Key Laboratory of Clean Energy Utilization.
Abstract: Amid the backdrop of carbon neutrality, traditional energy production is transitioning towards integrated energy systems (IES), where model-based scheduling is key in scenarios with multiple uncertainties on both the supply and demand sides. The development of artificial intelligence algorithms has resolved issues related to model accuracy. However, under conditions of high-proportion renewable energy integration, component load adjustments require increased flexibility, so the mathematical model of each component must adapt to constantly changing operating conditions. Therefore, the identification of operating condition changes and rapid model updating are pressing issues. This study proposes a modeling and updating method for IES components based on knowledge distillation. The core of this modeling method is light-weighting of the model, achieved through knowledge distillation in a teacher-student mode that compresses complex neural network models. The triggering of model updates is achieved through principal component analysis. The study also analyzes the impact of model errors caused by delayed model updates on the overall scheduling of the IES. Case studies are conducted on critical components in the IES, including coal-fired boilers and turbines. The results show that the time consumption for model updating is reduced by 76.67% using the proposed method. Under changing conditions, compared with two traditional models, the average deviation of this method is reduced by 12.61% and 3.49%, respectively, thereby improving the model's adaptability. The necessity of updating the component model is further analyzed, as a 1.00% mean squared error in the component model may lead to a power deviation of 0.075 MW. This method provides real-time, adaptable support for IES data modeling and updates.
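The paper triggers model updates via principal component analysis; as a simplified stand-in for that trigger, the sketch below flags an operating-condition change when new samples drift too far, in z-score terms, from the training statistics (the threshold is illustrative):

```python
def drift_statistic(sample, train_mean, train_std):
    """Mean squared z-score of one operating sample against the training
    statistics; large values suggest a new operating condition."""
    zs = [((x - m) / s) ** 2 for x, m, s in zip(sample, train_mean, train_std)]
    return sum(zs) / len(zs)

def should_update(sample, train_mean, train_std, threshold=9.0):
    """Trigger a component-model update when the drift statistic is large."""
    return drift_statistic(sample, train_mean, train_std) > threshold
```

A PCA-based trigger would instead monitor the projection of new operating data onto the principal subspace of the training data, but the update logic is the same: retrain (via distillation) only when the statistic crosses the threshold.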
Funding: Hunan University of Arts and Science, Grant/Award Number: JGYB2302; Geography Subject [2022] 351.
Abstract: In various fields, knowledge distillation (KD) techniques that combine vision transformers (ViTs) and convolutional neural networks (CNNs) as a hybrid teacher have shown remarkable results in classification. However, in the realm of remote sensing images (RSIs), existing KD research is not only scarce but also lacks competitiveness. This issue significantly impedes deployment of the notable advantages of ViTs and CNNs. To tackle this, the authors introduce a novel hybrid-model KD approach named HMKD-Net, which comprises a CNN-ViT ensemble teacher and a CNN student. Contrary to popular opinion, the authors posit that the sparsity in RSI data distribution limits the effectiveness and efficiency of hybrid-model knowledge transfer. As a solution, a simple yet innovative method to handle variances during the KD phase is suggested, leading to substantial enhancements in the effectiveness and efficiency of hybrid knowledge transfer. The authors assessed the performance of HMKD-Net on three RSI datasets. The findings indicate that HMKD-Net significantly outperforms other cutting-edge methods while maintaining a significantly smaller size. Specifically, HMKD-Net exceeds other KD-based methods with a maximum accuracy improvement of 22.8% across various datasets. As ablation experiments indicated, HMKD-Net cuts the time expense of the KD process by about 80%. This research validates that the hybrid-model KD technique can be more effective and efficient if the data distribution sparsity in RSIs is well handled.
Abstract: An object-oriented prototype expert system, ORDEES, for off-line troubleshooting of refinery distillation columns is developed. It is found that a highly modular knowledge base can be designed, and different types of data (e.g., graphs, numerical data, and algorithms) may be manipulated, by using object-oriented knowledge representation. In addition, a method termed the Object-Oriented Multifunction Switcher is proposed for building multifunction expert systems. The results of the study are expected to be useful for designing multifunction expert systems for complex petroleum refining and petrochemical processes with many kinds of equipment.
Funding: supported by the National Natural Science Foundation of China (NSFC) (Grant No. 62176178) and the Central Funds Guiding the Local Science and Technology Development (206Z5001G).
Abstract: Although few-shot learning (FSL) has achieved great progress, it remains an enormous challenge, especially when the source and target sets are from different domains, a setting known as cross-domain few-shot learning (CD-FSL). Utilizing more source domain data is an effective way to improve the performance of CD-FSL. However, knowledge from different source domains may entangle and confuse with each other, which hurts performance on the target domain. Therefore, we propose team-knowledge distillation networks (TKD-Net) to tackle this problem, exploring a strategy to help multiple teachers cooperate. Specifically, we distill knowledge from the cooperation of teacher networks to a single student network in a meta-learning framework. TKD-Net incorporates task-oriented knowledge distillation and multiple cooperation among teachers to train an efficient student with better generalization ability on unseen tasks. Moreover, our TKD-Net employs both response-based knowledge and relation-based knowledge to transfer more comprehensive and effective knowledge. Extensive experimental results on four fine-grained datasets demonstrate the effectiveness and superiority of our proposed TKD-Net approach.