Deep neural networks excel at image identification and computer vision applications such as visual product search, facial recognition, medical image analysis, object detection, semantic segmentation, instance segmentation, and many others. In image and video recognition applications, convolutional neural networks (CNNs) are widely employed. These networks provide better performance but at a higher computational cost. With the advent of big data, the growing scale of datasets has made data processing and model training time-consuming. Moreover, these large-scale datasets contain redundant data points that have minimal impact on the final outcome of the model. To address these issues, an accelerated CNN system is proposed that speeds up training by eliminating noncritical data points during training, along with a model compression method. Furthermore, the critical input data are identified by aggregating the data points at two levels of granularity, which are used to evaluate their impact on the model output. Extensive experiments with the proposed method on the CIFAR-10 dataset with ResNet models show a 40% reduction in the number of FLOPs with an accuracy degradation of just 0.11%.
The massive computational complexity and memory requirements of artificial intelligence models impede their deployability on edge computing devices of the Internet of Things (IoT). While Power-of-Two (PoT) quantization has been proposed to improve the efficiency of edge inference for Deep Neural Networks (DNNs), existing PoT schemes require a huge amount of bit-wise manipulation and have large memory overhead, and their efficiency is bounded by computation latency and memory footprint. To tackle this challenge, we present an efficient inference approach based on PoT quantization and model compression. An integer-only scalar PoT quantization (IOS-PoT) is designed jointly with a distribution loss regularizer, wherein the regularizer minimizes quantization errors and training disturbances. Additionally, a two-stage model compression is developed to effectively reduce memory requirements and alleviate bandwidth usage in communications of networked heterogeneous learning systems. The product look-up table (P-LUT) inference scheme is leveraged to replace bit-shifting with only indexing and addition operations, achieving low-latency computation and enabling efficient edge accelerators. Finally, comprehensive experiments on Residual Networks (ResNets) and efficient architectures with the Canadian Institute for Advanced Research (CIFAR), ImageNet, and Real-world Affective Faces Database (RAF-DB) datasets indicate that our approach achieves a 2× to 10× improvement in the reduction of both weight size and computation cost in comparison to state-of-the-art methods. A P-LUT accelerator prototype is implemented on the Xilinx KV260 Field Programmable Gate Array (FPGA) platform for accelerating convolution operations, with performance results showing that P-LUT reduces memory footprint by 1.45× and achieves more than 3× power efficiency and 2× resource efficiency compared to the conventional bit-shifting scheme.
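As a rough illustration of the conventional PoT baseline that the abstract contrasts with P-LUT, the sketch below rounds a weight to the nearest signed power of two and shows that multiplying an integer activation by such a weight reduces to a bit shift. It is a generic sketch, not the IOS-PoT or P-LUT scheme of the paper; the function names and the exponent range `min_exp`/`max_exp` are assumptions chosen for the example.

```python
import math


def pot_quantize(w, min_exp=-7, max_exp=0):
    """Round w to the nearest signed power of two (nearest in the log2 domain).

    The exponent is clamped to [min_exp, max_exp] (hypothetical range);
    magnitudes below half of the smallest level are flushed to zero.
    """
    mag = abs(w)
    if mag < 2.0 ** (min_exp - 1):
        return 0.0
    sign = 1.0 if w > 0 else -1.0
    exponent = round(math.log2(mag))
    exponent = max(min_exp, min(max_exp, exponent))
    return sign * 2.0 ** exponent


def shift_multiply(x, exponent):
    """Multiply a non-negative integer x by 2**exponent using only shifts."""
    return x << exponent if exponent >= 0 else x >> -exponent


if __name__ == "__main__":
    for w in (0.3, -0.7, 0.001, 5.0):
        print(w, "->", pot_quantize(w))
    # 48 * 2**-2 computed as a right shift by two bits
    print(shift_multiply(48, -2))
```

The shift in `shift_multiply` is the bit-wise manipulation that, according to the abstract, P-LUT replaces with table indexing and additions.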
Recent advancements in natural language processing have given rise to numerous pre-trained language models for question-answering systems. However, with the constant evolution of algorithms, data, and computing power, the increasing size and complexity of these models have led to increased training costs and reduced efficiency. This study aims to minimize the inference time of such models while maintaining computational performance. It proposes a novel distillation model for PAL-BERT (DPAL-BERT) that employs knowledge distillation, using the PAL-BERT model as the teacher to train two student models: DPAL-BERT-Bi and DPAL-BERTC. The dataset is augmented through techniques such as masking, replacement, and n-gram sampling to optimize knowledge transfer. The experimental results show that the distilled models greatly outperform models trained from scratch. In addition, although the distilled models exhibit a slight decrease in performance compared to PAL-BERT, they reduce inference time to just 0.25% of the original. This demonstrates the effectiveness of the proposed approach in balancing model performance and efficiency.
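The teacher-student training mentioned above can be illustrated with the standard temperature-scaled distillation loss. This is a minimal sketch of generic knowledge distillation, not the DPAL-BERT training objective, which the abstract does not specify; the function names and the default `temperature` are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T**2."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        pt * math.log(pt / ps)
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0.0
    )
    return temperature ** 2 * kl


if __name__ == "__main__":
    teacher = [4.0, 1.0, -2.0]  # hypothetical teacher logits
    student = [2.5, 1.5, -1.0]  # hypothetical student logits
    print(distillation_loss(student, teacher))
```

In practice this term is usually combined with the ordinary supervised loss on the labels; the weighting between the two is a design choice.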
Deep neural networks (DNNs) have achieved great success in many data processing applications. However, high computational complexity and storage cost make deep learning difficult to use on resource-constrained devices, and its high power consumption is not environmentally friendly. In this paper, we focus on low-rank optimization for efficient deep learning techniques. In the space domain, DNNs are compressed by low-rank approximation of the network parameters, which directly reduces the storage requirement with a smaller number of network parameters. In the time domain, the network parameters can be trained in a few subspaces, which enables efficient training with fast convergence. Model compression in the spatial domain is summarized into three categories: pre-train, pre-set, and compression-aware methods. With a series of integrable techniques discussed, such as sparse pruning, quantization, and entropy coding, we can combine them in an integration framework with lower computational complexity and storage. In addition to a summary of recent technical advances, we have two findings to motivate future work. One is that the effective rank, derived from the Shannon entropy of the normalized singular values, outperforms other conventional sparse measures such as the ℓ_1 norm for network compression. The other is a spatial and temporal balance for tensorized neural networks. For accelerating the training of tensorized neural networks, it is crucial to leverage redundancy for both model compression and subspace training.
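The effective rank mentioned above has a compact closed form: normalize the singular values so they sum to one, take their Shannon entropy, and exponentiate. The sketch below assumes the singular values have already been computed (for example by an SVD routine) and are passed in as a plain list; the function name is an assumption.

```python
import math


def effective_rank(singular_values):
    """Effective rank: exp of the Shannon entropy of normalized singular values."""
    total = sum(singular_values)
    if total <= 0.0:
        raise ValueError("at least one singular value must be positive")
    # Zero singular values contribute nothing to the entropy.
    probs = [s / total for s in singular_values if s > 0.0]
    entropy = -sum(p * math.log(p) for p in probs)
    return math.exp(entropy)


if __name__ == "__main__":
    print(effective_rank([1.0, 1.0, 1.0, 1.0]))  # flat spectrum: close to 4
    print(effective_rank([10.0, 1.0, 0.1]))      # dominated by one value
```

A flat spectrum yields an effective rank close to the number of singular values, while a spectrum dominated by one value yields a result close to 1, which is what makes it usable as a continuous measure of how compressible a weight matrix is.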
Asynchronous federated learning (AsynFL) can effectively mitigate the impact of edge node heterogeneity on joint training while preserving participant privacy and data security. However, the frequent exchange of massive data can lead to excess communication overhead between edge and central nodes, regardless of whether the federated learning (FL) algorithm uses synchronous or asynchronous aggregation. Therefore, there is an urgent need for a method that simultaneously accounts for device heterogeneity and reduces edge node energy consumption. This paper proposes a novel Fixed-point Asynchronous Federated Learning (FixedAsynFL) algorithm, which mitigates the resource consumption caused by frequent data communication while alleviating the effect of device heterogeneity. FixedAsynFL uses fixed-point quantization to compress the local and global models in AsynFL. To balance energy consumption and learning accuracy, this paper proposes a quantization scale selection mechanism. It examines the mathematical relationship between the quantization scale and the energy consumption of the computation and communication processes in FixedAsynFL. Taking the upper bound of the quantization noise into account, it optimizes the quantization scale by minimizing communication and computation consumption. Experiments are performed on the MNIST dataset with several edge nodes of different computing efficiency. The results show that the FixedAsynFL algorithm with 8-bit quantization reduces the communication data size by 81.3% and saves 74.9% of the computation energy in the training phase without significant loss of accuracy. The experimental results indicate that the proposed FixedAsynFL algorithm can effectively address device heterogeneity and the energy consumption limits of edge nodes.
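To make the fixed-point compression step concrete, the sketch below quantizes model parameters to signed fixed-point codes and maps them back. It is a generic illustration, not the FixedAsynFL algorithm or its scale selection mechanism; the function names and the split between `num_bits` and `frac_bits` are assumptions.

```python
def quantize_fixed_point(values, num_bits=8, frac_bits=6):
    """Map floats to signed fixed-point integer codes with saturation."""
    scale = 1 << frac_bits
    q_min = -(1 << (num_bits - 1))
    q_max = (1 << (num_bits - 1)) - 1
    return [max(q_min, min(q_max, round(v * scale))) for v in values]


def dequantize_fixed_point(codes, frac_bits=6):
    """Recover approximate floats from fixed-point integer codes."""
    scale = 1 << frac_bits
    return [c / scale for c in codes]


if __name__ == "__main__":
    weights = [0.5, -0.25, 0.013, 3.0]  # hypothetical model parameters
    codes = quantize_fixed_point(weights)
    print(codes)
    print(dequantize_fixed_point(codes))
```

For values inside the representable range, the rounding error is at most half of one quantization step, `1 / 2**(frac_bits + 1)`; values outside the range saturate. Choosing `frac_bits` trades range against resolution, which is the kind of trade-off a quantization scale selection mechanism has to resolve.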
[Objectives] To observe the effect of Xianlinggubao Capsule on osteoporotic vertebral compression fracture (OVCF) in rabbits and the mechanism by which it influences fracture repair. [Methods] Thirty 6-month-old female rabbits were randomly divided into a blank control group, a model control group, and a Xianlinggubao group. After bilateral ovariectomy, the model control group and the Xianlinggubao group were injected with dexamethasone continuously for 4 weeks, and then the OVCF compound model was established by surgery. The Xianlinggubao group was treated with Xianlinggubao at a dose of 300 mg/(kg·d) for 60 d, while the blank control group and the model control group were treated with the same amount of normal saline for 60 d. The number of blood vessels and the expression of bone morphogenetic protein-2 (BMP-2) were detected by immunohistochemical staining, and the bone mineral density (BMD) in the callus of the third lumbar fracture area was measured. The contents of serum phosphorus (P), alkaline phosphatase (ALP), and total calcium (TCa) in venous blood were measured with an automatic biochemical analyzer. The contents of vascular endothelial growth factor (VEGF) and platelet-derived growth factor (PDGF) in venous blood were measured with ELISA kits. [Results] In the Xianlinggubao group, the number of blood vessels and the expression of BMP-2 in the callus of the third lumbar fracture area were high; the contents of serum P, ALP, TCa, VEGF, and PDGF were obviously increased; BMD was obviously increased; and the bone microstructure of the third lumbar vertebra fracture area was basically restored. The differences from the model control group were statistically significant (P<0.05). [Conclusions] Xianlinggubao Capsule can increase calcium and phosphorus deposition, promote the formation of blood vessels in the fracture area of OVCF in rabbits, and has a strong repair effect on OVCF in rabbits.
BACKGROUND: Varying degrees of inflammatory responses occur during lumbar nerve root compression. Studies have shown that nitric oxide synthase (NOS) and calcitonin gene-related peptide (CGRP) are involved in secondary disc inflammation. OBJECTIVE: To observe the effects of warm acupuncture on the ultrastructure of compressed nerve roots and on inflammatory mediators, including NOS and CGRP contents, in a rat model of lumbar nerve root compression. DESIGN, TIME AND SETTING: A randomized, controlled study with molecular biological analysis was performed at the Experimental Center, Sixth People's Hospital Affiliated to Shanghai Jiao Tong University, between September 2006 and April 2007. MATERIALS: Acupuncture needles and refined Moxa grains were purchased from Shanghai Taicheng Technology Development Co., Ltd., China; Mobic tablets were purchased from Shanghai Boehringer Ingelheim Pharmaceuticals Co., Ltd., China; enzyme-linked immunosorbent assay (ELISA) kits for NOS and CGRP were purchased from ADL Biotechnology, Inc., USA. METHODS: A total of 50 healthy adult Sprague-Dawley rats were randomly divided into five groups: normal, model, warm acupuncture, acupuncture, and drug, with 10 rats in each group. Rats in the four groups other than the normal group were used to establish models of lumbar nerve root compression. After 3 days, Jiaji points were needled using reinforcing-reducing manipulation in the warm acupuncture group. Moxa grains were burned on each needle, 2 grains per needle daily. The acupuncture group received the same treatment as the warm acupuncture group, but without moxibustion. Mobic suspension (3.75 mg/kg) was administered orally in the drug group once a day. Treatment in each group lasted for 14 consecutive days. Modeling and medication were not performed in the normal group. MAIN OUTCOME MEASURES: The ultrastructure of damaged nerve roots was observed with transmission electron microscopy; NOS and CGRP contents were measured using ELISA. RESULTS: The changes in the radicular ultrastructure were characterized by Wallerian degeneration; nerve fibers were clearly demyelinated; axons collapsed or degenerated; the outer Schwann cell cytoplasm was swollen and its nucleus was compacted. Compared with the normal group, NOS and CGRP contents in the nerve root compression zone in the model group were significantly increased (P < 0.01). Nerve root edema was improved in the drug, acupuncture, and warm acupuncture groups compared with the model group. NOS and CGRP expression also decreased, with the warm acupuncture group having the lowest concentration (P < 0.01). CONCLUSION: In comparison to the known effects of Mobic and acupuncture treatments, warm acupuncture significantly decreased NOS and CGRP expression, which helped improve the ultrastructure of the compressed nerve root.
In this article, we consider the blowup criterion for the local strong solution to the compressible fluid-particle interaction model in dimension three with vacuum. We establish a BKM type criterion for possible breakdown of such solutions at critical time in terms of both the L^∞(0, T; L^6)-norm of the density of particles and the L^1(0, T; L^∞)-norm of the deformation tensor of velocity gradient.
Two-phase flow models are commonly used in industrial applications such as nuclear, power, chemical-process, oil-and-gas, cryogenics, bio-medical, and micro-technology. This is a survey paper on the study of the compressible nonconservative two-fluid model, the drift-flux model, and the viscous liquid-gas two-phase flow model. We review the research developments for each of these three two-phase flow models. In the last part, we give some open problems about the above models.
The effect of various process variables on the law of metal flow in semi-solid rolling of 60Si2Mn was studied by the finite element method. Semi-solid 60Si2Mn can be described as a compressible rigid visco-plastic porous material saturated with liquid. Under thermo-mechanical coupling conditions, the distributions of stress, velocity, and temperature were studied using the software MARC. The simulation results show that the rigid visco-plastic model can accurately describe the semi-solid 60Si2Mn rolling process. Large deformation can be fully achieved owing to the low flow stress of the semi-solid slurry.
In this article, we focus on the short-time strong solution to a compressible quantum hydrodynamic model. We establish a blow-up criterion for the solutions of the compressible quantum hydrodynamic model in terms of the gradient of the velocity, the second spatial derivative of the square root of the density, and the first-order time derivative and first-order spatial derivative of the square root of the density.
Recently, many regression models have been presented for predicting the mechanical parameters of rocks from rock index properties. Although statistical analysis is a common method for developing regression models, selecting a suitable transformation of the independent variables in a regression model is still difficult. In this paper, a genetic algorithm (GA) is employed as a heuristic search method for selecting the best transformation of the independent variables (some index properties of rocks) in regression models for predicting uniaxial compressive strength (UCS) and modulus of elasticity (E). First, multiple linear regression (MLR) analysis was performed on a data set to establish predictive models. Then, two GA models were developed in which the root mean squared error (RMSE) was defined as the fitness function. The results show that the GA models are more precise than the MLR models and are able to explain the relation between the intrinsic strength/elasticity properties and the index properties of rocks with a simple formulation and acceptable accuracy.
Academic and industrial communities have been paying significant attention to the 6th Generation (6G) wireless communication systems after the commercial deployment of 5G cellular communications. Among the emerging technologies, Vehicular Edge Computing (VEC) can provide essential assurance for the robustness of Artificial Intelligence (AI) algorithms to be used in the 6G systems. Therefore, in this paper, a strategy for enhancing the robustness of AI model deployment using 6G-VEC is proposed, taking the object detection task as an example. This strategy includes two stages: model stabilization and model adaptation. In the former, the state-of-the-art methods are appended to the model to improve its robustness. In the latter, two targeted compression methods are implemented, namely model parameter pruning and knowledge distillation, which result in a trade-off between model performance and runtime resources. Numerical results indicate that the proposed strategy can be smoothly deployed in the onboard edge terminals, where the introduced trade-off outperforms the other strategies available.
Deep learning technology has been widely used in computer vision, speech recognition, natural language processing, and other related fields. Deep learning algorithms offer high precision and high reliability. However, the lack of resources in edge terminal equipment makes it difficult to run deep learning algorithms that require more memory and computing power. In this paper, we propose MoTransFrame, a general model processing framework for deep learning models. Instead of designing a model compression algorithm with a high compression ratio, MoTransFrame can transplant popular convolutional neural network models to resource-starved edge devices promptly and accurately. Through its integration method, deep learning models can be converted into portable projects for Arduino, a typical edge device with limited resources. Our experiments show that MoTransFrame adapts well to edge devices with limited memory. It is more flexible than other model transplantation methods. It keeps the loss of model accuracy small when the number of parameters is compressed by tens of times. At the same time, the computational resources needed in the inference process are less than what the edge node can handle.
Isothermal compression tests were carried out on a Gleeble-3500 thermal-mechanical simulation machine in a temperature range of 298-473 K and a strain rate range of 0.001-10 s^-1. The experimental results show that the flow stress is negatively correlated with temperature because of temperature softening, and that the strain-rate sensitivity of this composite increases with increasing temperature. Based on the experimental data, Johnson-Cook, modified Johnson-Cook, and Arrhenius constitutive models were established. The accuracy of these three constitutive models was analyzed and compared. The results show that the values predicted by the Johnson-Cook model do not agree well with the experimental values. The prediction accuracy of the Arrhenius model is higher than that of the Johnson-Cook model but lower than that of the modified Johnson-Cook model.
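For reference, the original Johnson-Cook model expresses flow stress as the product of a strain hardening term, a strain-rate term, and a thermal softening term: σ = (A + B ε^n)(1 + C ln(ε̇/ε̇_0))(1 − T*^m), with T* = (T − T_ref)/(T_melt − T_ref). The sketch below evaluates this standard form. The abstract reports no fitted constants, so every parameter value in the example is hypothetical and chosen only to make the code runnable.

```python
import math


def johnson_cook_stress(strain, strain_rate, temperature, A, B, n, C, m,
                        ref_strain_rate, ref_temperature, melt_temperature):
    """Flow stress from the original Johnson-Cook constitutive model."""
    if temperature < ref_temperature:
        raise ValueError("temperature must not be below the reference temperature")
    hardening = A + B * strain ** n
    rate_term = 1.0 + C * math.log(strain_rate / ref_strain_rate)
    t_star = (temperature - ref_temperature) / (melt_temperature - ref_temperature)
    softening = 1.0 - t_star ** m
    return hardening * rate_term * softening


if __name__ == "__main__":
    # Hypothetical material constants, not taken from the paper.
    params = dict(A=100.0, B=200.0, n=0.5, C=0.02, m=1.0,
                  ref_strain_rate=0.001, ref_temperature=298.0,
                  melt_temperature=900.0)
    for temp in (298.0, 373.0, 473.0):
        print(temp, johnson_cook_stress(0.1, 1.0, temp, **params))
```

Because the three terms multiply independently, the original model cannot represent coupling between temperature and strain rate, which is one common motivation for the modified variants compared in the abstract.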
Modeling of a centrifugal compressor is of great significance to the study of surge characteristics and fluid dynamics in Altitude Ground Test Facilities (AGTF). Real-time Modular Dynamic System Greitzer (MDSG) modeling for the dynamic response and simulation of the compression system is introduced. The centrifugal compressor, pipeline network, and valve are divided into pressure output type and mass flow output type for module modeling, and the two types of components alternate when the system is assembled. The pressure loss and thermodynamics of the system are considered. An air supply compression system of AGTF is modeled and simulated by the MDSG model. The simulation results of mass flow, pressure, and temperature are compared with the experimental results, and the error is less than 5%, which demonstrates the reliability, practicability, and universality of the MDSG model.
In this study, the potential implementation of three different low-GWP refrigerants (R32, R452B, and R454B) as replacements for R410A was investigated. The study was performed using RACHP-Lab, a physics-based vapor compression system simulation tool developed by the authors for typical mini-split air conditioners. The simulation study was carried out and validated using experimental performance data from 10 different air conditioning units available in the Egyptian market. The units included fixed-speed or variable-speed compressors and operated in cooling or heating modes. Drop-in replacement with the new refrigerants was carried out. For R32, the capacity increased by between 4.9% and 13% for cooling cases, and between 6.3% and 12.4% for heating cases. However, COP did not improve in all cases. For R452B and R454B with direct replacement, the capacity remained nearly the same, with an increase in COP of between 1.6% and 8.0%. Soft optimization was also conducted on cooling cases, where compressor suction superheat, condenser subcooling, and compressor volumetric speed were optimized to maximize COP while maintaining the original capacity of R410A. R32 showed an improvement in COP over R410A of between 4.6% and 15.5%, while R452B and R454B showed an improvement of between 2.2% and 13.2%.
3D modeling and coding of real objects are hot issues in the field of virtual reality. In this paper, we propose a method for automatically registering two range images and a cycle-based automatic global registration algorithm for rapidly and automatically registering all range images and constructing a realistic 3D model. In addition, to meet the requirement of transmitting huge amounts of data over the Internet, we present a 3D mesh encoding/decoding method that encodes geometry, topology, and attribute data with a high compression ratio and supports progressive transmission. The research results have already been applied successfully in a digital museum.
Warm rotary draw bending provides a feasible method to form large-diameter thin-walled (LDTW) TC4 bent tubes, which are widely used in the pneumatic systems of aircraft. An accurate prediction of the flow behavior of TC4 tubes that considers the coupled effects of temperature, strain rate, and strain is critical for understanding the deformation behavior of the metal and optimizing the processing parameters in warm rotary draw bending of TC4 tubes. In this study, isothermal compression tests of TC4 tube alloy were performed from 573 to 873 K at intervals of 100 K and at strain rates of 0.001, 0.010, and 0.100 s^(-1). The flow behavior was predicted using two constitutive models, namely a modified Arrhenius model and an artificial neural network (ANN) model. The predictions of these constitutive models were compared using statistical measures such as the correlation coefficient (R), the average absolute relative error (AARE), and its variation with the deformation parameters (temperature, strain rate, and strain). Analysis of the statistical measures reveals that both models show high prediction accuracy in terms of R and AARE. Comparatively, the ANN model presents higher prediction accuracy than the modified Arrhenius model. In addition, the prediction accuracy of the ANN model is highly stable over the whole range of deformation parameters, whereas the predictability of the modified Arrhenius model fluctuates across deformation conditions. The modified Arrhenius model presents higher prediction accuracy at temperatures of 573-773 K, strain rates of 0.010-0.100 s^(-1), and strains of 0.04-0.32, but low accuracy at a temperature of 873 K, a strain rate of 0.001 s^(-1), and strains of 0.36-0.48. Thus, the application of the modified Arrhenius model is limited by its relatively low prediction accuracy at some deformation conditions, while the ANN model presents very high prediction accuracy at all deformation conditions and can be used to study the compression behavior of TC4 tubes in the temperature range of 573-873 K and the strain rate range of 0.001-0.100 s^(-1). It can provide guidelines for the design of processing parameters in warm rotary draw bending of LDTW TC4 tubes.
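The two statistical measures named above have standard definitions: R is the Pearson correlation between experimental and predicted flow stress, and AARE is the mean of the absolute relative errors expressed as a percentage. The sketch below computes both; the function names and the sample values are assumptions made for illustration and are not data from the paper.

```python
import math


def correlation_coefficient(experimental, predicted):
    """Pearson correlation coefficient R between two equal-length sequences."""
    n = len(experimental)
    mean_e = sum(experimental) / n
    mean_p = sum(predicted) / n
    cov = sum((e - mean_e) * (p - mean_p) for e, p in zip(experimental, predicted))
    var_e = sum((e - mean_e) ** 2 for e in experimental)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_e * var_p)


def average_absolute_relative_error(experimental, predicted):
    """AARE in percent: mean of |(E_i - P_i) / E_i| times 100."""
    n = len(experimental)
    return 100.0 * sum(abs((e - p) / e) for e, p in zip(experimental, predicted)) / n


if __name__ == "__main__":
    measured = [100.0, 200.0, 400.0]  # hypothetical flow stresses, MPa
    modeled = [110.0, 190.0, 400.0]   # hypothetical model predictions, MPa
    print(correlation_coefficient(measured, modeled))
    print(average_absolute_relative_error(measured, modeled))
```

R can be high even when predictions are systematically biased, because it only measures linear association; AARE is an unbiased term-by-term measure, which is why studies of this kind typically report both.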
To better understand the failure behaviours and strength of bolt-reinforced blocky rocks, extensive large-scale laboratory experiments are carried out on blocky rock-like specimens with and without rockbolt reinforcement. The results show that both shear failure and tensile failure along joint surfaces are observed, but shear failure is the main controlling factor for the peak strength of the rock mass with and without rockbolts. In bolt-reinforced rock specimens, the rockbolts neck and undergo shear deformation simultaneously. As the joint dip angle increases, joint shear failure becomes more dominant. The number of rockbolts has a significant impact on the peak strain and uniaxial compressive strength (UCS), but little influence on the deformation modulus of the rock mass. Using the Winkler beam model to represent the rockbolt behaviour, an analytical model for predicting the strength of bolt-reinforced blocky rocks is proposed. Good agreement between the UCS values predicted by the proposed model and those obtained from experiments suggests an encouraging performance of the proposed model. In addition, the performance of the proposed model is further assessed using published results in the literature, indicating that it can be used effectively to predict the UCS of bolt-reinforced blocky rocks.
Funding: This work was supported by the Open Fund Project of the State Key Laboratory of Intelligent Vehicle Safety Technology (Grant No. IVSTSKL-202311); the Key Projects of the Science and Technology Research Programme of Chongqing Municipal Education Commission (Grant No. KJZD-K202301505); the Cooperation Project between Chongqing Municipal Undergraduate Universities and Institutes Affiliated to the Chinese Academy of Sciences in 2021 (Grant No. HZ2021015); and the Chongqing Graduate Student Research Innovation Program (Grant No. CYS240801).
Funding: Supported by the Sichuan Science and Technology Program (2023YFSY0026, 2023YFH0004).
Abstract: Recent advancements in natural language processing have given rise to numerous pre-trained language models in question-answering systems. However, with the constant evolution of algorithms, data, and computing power, the increasing size and complexity of these models have led to increased training costs and reduced efficiency. This study aims to minimize the inference time of such models while maintaining computational performance. It proposes a novel distillation model for PAL-BERT (DPAL-BERT). Specifically, it employs knowledge distillation, using the PAL-BERT model as the teacher to train two student models: DPAL-BERT-Bi and DPAL-BERTC. The dataset is augmented through techniques such as masking, replacement, and n-gram sampling to optimize knowledge transfer. The experimental results show that the distilled models greatly outperform models trained from scratch. In addition, although the distilled models exhibit a slight decrease in performance compared to PAL-BERT, they reduce inference time to just 0.25% of the original. This demonstrates the effectiveness of the proposed approach in balancing model performance and efficiency.
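The abstract does not state which distillation objective DPAL-BERT uses. As an illustration of the general teacher-student idea only, the sketch below computes a common soft-target loss: the KL divergence between temperature-softened teacher and student distributions, scaled by the squared temperature. The function names and the default temperature are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T**2."""
    p = softmax(teacher_logits, temperature)  # soft targets from the teacher
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


teacher = [4.0, 1.0, 0.5]
loss_same = distillation_loss(teacher, teacher)
loss_diff = distillation_loss([0.5, 1.0, 4.0], teacher)
```

In practice this term is usually combined with the ordinary supervised loss on the true labels; the weighting between the two is a tuning choice not covered here.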
Funding: Supported by the National Natural Science Foundation of China (62171088, U19A2052, 62020106011) and the Medico-Engineering Cooperation Funds from the University of Electronic Science and Technology of China (ZYGX2021YGLH215, ZYGX2022YGRH005).
Abstract: Deep neural networks (DNNs) have achieved great success in many data processing applications. However, high computational complexity and storage cost make deep learning difficult to use on resource-constrained devices, and its high power consumption is not environmentally friendly. In this paper, we focus on low-rank optimization for efficient deep learning techniques. In the space domain, DNNs are compressed by low-rank approximation of the network parameters, which directly reduces the storage requirement with a smaller number of network parameters. In the time domain, the network parameters can be trained in a few subspaces, which enables efficient training with fast convergence. Model compression in the spatial domain is summarized into three categories: pre-train, pre-set, and compression-aware methods. With a series of integrable techniques discussed, such as sparse pruning, quantization, and entropy coding, we can combine them in an integration framework with lower computational complexity and storage. In addition to a summary of recent technical advances, we have two findings to motivate future work. One is that the effective rank, derived from the Shannon entropy of the normalized singular values, outperforms other conventional sparse measures such as the ℓ_1 norm for network compression. The other is a spatial and temporal balance for tensorized neural networks. For accelerating the training of tensorized neural networks, it is crucial to leverage redundancy for both model compression and subspace training.
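The effective rank mentioned above can be illustrated with a short sketch. It assumes the commonly used definition in which the singular values are normalized to sum to one and the effective rank is the exponential of their Shannon entropy; the abstract only says the measure is "derived from" that entropy, so the exponential form and the function name are assumptions.

```python
import math


def effective_rank(singular_values):
    """exp(H(p)), where p_i = s_i / sum(s) and H is the Shannon entropy in nats."""
    total = sum(singular_values)
    if total <= 0:
        raise ValueError("need at least one positive singular value")
    # Zero singular values contribute nothing to the entropy (0 * log 0 := 0).
    p = [s / total for s in singular_values if s > 0]
    entropy = -sum(pi * math.log(pi) for pi in p)
    return math.exp(entropy)


flat = effective_rank([1.0, 1.0, 1.0, 1.0])    # energy spread evenly over 4 directions
peaked = effective_rank([5.0, 0.0, 0.0])       # all energy in a single direction
mixed = effective_rank([10.0, 1.0, 0.1])       # dominated by one direction
```

Unlike the integer matrix rank, this measure varies continuously: a spectrum dominated by one singular value yields a value only slightly above 1 even when all singular values are nonzero.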
Funding: This work was funded by the National Key R&D Program of China (Grant No. 2020YFB0906003).
Abstract: Asynchronous federated learning (AsynFL) can effectively mitigate the impact of edge-node heterogeneity on joint training while preserving participant privacy and data security. However, the frequent exchange of massive data can lead to excess communication overhead between edge and central nodes, regardless of whether the federated learning (FL) algorithm uses synchronous or asynchronous aggregation. Therefore, there is an urgent need for a method that simultaneously accounts for device heterogeneity and reduces edge-node energy consumption. This paper proposes a novel Fixed-point Asynchronous Federated Learning (FixedAsynFL) algorithm, which mitigates the resource consumption caused by frequent data communication while alleviating the effect of device heterogeneity. FixedAsynFL uses fixed-point quantization to compress the local and global models in AsynFL. To balance energy consumption and learning accuracy, this paper proposes a quantization scale selection mechanism. It examines the mathematical relationship between the quantization scale and the energy consumption of the computation and communication processes in FixedAsynFL. Taking the upper bound of the quantization noise into account, it optimizes the quantization scale by minimizing communication and computation consumption. Experiments are performed on the MNIST dataset with several edge nodes of different computing efficiency. The results show that the FixedAsynFL algorithm with 8-bit quantization reduces the communication data size by 81.3% and saves 74.9% of the computation energy in the training phase without significant loss of accuracy. According to the experimental results, the proposed FixedAsynFL algorithm can effectively address device heterogeneity and the energy consumption limitation of edge nodes.
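As a rough illustration of fixed-point quantization of model parameters, the sketch below maps floats to signed 8-bit integer codes with a fixed number of fractional bits and back. It is a generic scheme under stated assumptions, not the FixedAsynFL quantizer or its scale selection mechanism; the function names and the choice of 5 fractional bits are hypothetical.

```python
def quantize_fixed_point(values, total_bits=8, frac_bits=5):
    """Map floats to signed integer codes with frac_bits fractional bits.

    Values outside the representable range are saturated (clamped).
    """
    scale = 1 << frac_bits
    qmin = -(1 << (total_bits - 1))
    qmax = (1 << (total_bits - 1)) - 1
    return [max(qmin, min(qmax, round(v * scale))) for v in values]


def dequantize_fixed_point(codes, frac_bits=5):
    """Recover approximate floats from integer codes."""
    scale = 1 << frac_bits
    return [c / scale for c in codes]


params = [0.5, -1.25, 0.3, 10.0]
codes = quantize_fixed_point(params)
restored = dequantize_fixed_point(codes)
```

For in-range values the round-trip error is at most half a quantization step, here 1/64; the last value saturates because 10.0 exceeds the largest representable value, 127/32. More fractional bits shrink the step but also shrink the representable range, which is the kind of trade-off a scale selection mechanism has to resolve.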
Funding: Supported by the Shiyan Taihe Hospital Project (2021JJXM084) and the General Project of Hubei Provincial Health and Health Commission (ZY2021M006).
Abstract: [Objectives] To observe the effect of Xianlinggubao Capsule on osteoporotic vertebral compression fracture (OVCF) in rabbits and the mechanism by which it influences fracture repair. [Methods] Thirty 6-month-old female rabbits were randomly divided into a blank control group, a model control group and a Xianlinggubao group. After bilateral ovariectomy, the model control group and the Xianlinggubao group were injected with dexamethasone continuously for 4 weeks, and then the OVCF compound model was established by surgery. The Xianlinggubao group was treated with Xianlinggubao at a dose of 300 mg/(kg·d) for 60 d, while the blank control group and the model control group were treated with the same amount of normal saline for 60 d. The number of blood vessels and the expression of bone morphogenetic protein-2 (BMP-2) were detected by immunohistochemical staining, and the bone mineral density (BMD) in the callus of the third lumbar fracture area was measured. The contents of serum phosphorus (P), alkaline phosphatase (ALP) and total calcium (TCa) in rabbit venous blood were measured with an automatic biochemical analyzer. The contents of vascular endothelial growth factor (VEGF) and platelet-derived growth factor (PDGF) in rabbit venous blood were measured with ELISA kits. [Results] In the Xianlinggubao group, the number of blood vessels and the expression of BMP-2 in the callus of the third lumbar fracture area were high; the contents of serum P, ALP, TCa, VEGF and PDGF and the BMD were markedly increased; and the bone microstructure of the third lumbar vertebra fracture area was basically restored. The differences compared with the model control group were statistically significant (P<0.05). [Conclusions] Xianlinggubao Capsule can increase calcium and phosphorus deposition, promote the formation of blood vessels in the fracture area, and has a strong repair effect on OVCF in rabbits.
Funding: Modern Projects of Traditional Chinese Medicine of Shanghai Science and Technology Commission, No. 08DZ1973200; Research Projects of Shanghai Bureau of Public Health, No. 2006Q004L.
Abstract: BACKGROUND: Varying degrees of inflammatory responses occur during lumbar nerve root compression. Studies have shown that nitric oxide synthase (NOS) and calcitonin gene-related peptide (CGRP) are involved in secondary disc inflammation. OBJECTIVE: To observe the effects of warm acupuncture on the ultrastructure of the compressed nerve root and on inflammatory mediators, including NOS and CGRP contents, in a rat model of lumbar nerve root compression. DESIGN, TIME AND SETTING: A randomized, controlled study with molecular biological analysis was performed at the Experimental Center, Sixth People's Hospital Affiliated to Shanghai Jiao Tong University, between September 2006 and April 2007. MATERIALS: Acupuncture needles and refined Moxa grains were purchased from Shanghai Taicheng Technology Development Co., Ltd., China; Mobic tablets were purchased from Shanghai Boehringer Ingelheim Pharmaceuticals Co., Ltd., China; enzyme linked immunosorbent assay (ELISA) kits for NOS and CGRP were purchased from ADL Biotechnology, Inc., USA. METHODS: A total of 50 healthy adult Sprague-Dawley rats were randomly divided into five groups (normal, model, warm acupuncture, acupuncture, and drug), with 10 rats in each group. Rats in the four groups other than the normal group were used to establish models of lumbar nerve root compression. After 3 days, Jiaji points were needled using reinforcing-reducing manipulation in the warm acupuncture group. Moxa grains were burned on each needle, 2 grains per needle daily. The acupuncture group received the same treatment as the warm acupuncture group, but without moxibustion. Mobic suspension (3.75 mg/kg) was given orally in the drug group once a day. Treatment in each group lasted for 14 consecutive days. Neither modeling nor medication was performed in the normal group. MAIN OUTCOME MEASURES: The ultrastructure of damaged nerve roots was observed with transmission electron microscopy; NOS and CGRP contents were measured using ELISA.
RESULTS: The changes in the radicular ultrastructure were characterized by Wallerian degeneration; nerve fibers were clearly demyelinated; axons collapsed or degenerated; the outer Schwann cell cytoplasm was swollen and its nucleus was compacted. Compared with the normal group, NOS and CGRP contents in the nerve root compression zone in the model group were significantly increased (P < 0.01). Nerve root edema was improved in the drug, acupuncture and warm acupuncture groups compared with the model group. NOS and CGRP expression was also decreased, with the warm acupuncture group having the lowest concentration (P < 0.01). CONCLUSION: In comparison to the known effects of Mobic and acupuncture treatments, warm acupuncture significantly decreased NOS and CGRP expression, which helped improve the ultrastructure of the compressed nerve root.
Funding: Supported by the National Basic Research Program of China (973 Program) (2011CB808002); the National Natural Science Foundation of China (11371152, 11128102, 11071086, and 11571117); the Natural Science Foundation of Guangdong Province (S2012010010408); the Foundation for Distinguished Young Talents in Higher Education of Guangdong (2015KQNCX095); the Major Foundation of Hanshan Normal University (LZ201403); and the Scientific Research Foundation of Graduate School of South China Normal University (2014ssxm04).
Abstract: In this article, we consider the blowup criterion for the local strong solution to the compressible fluid-particle interaction model in dimension three with vacuum. We establish a BKM type criterion for possible breakdown of such solutions at a critical time in terms of both the L^∞(0, T; L^6)-norm of the density of particles and the L^1(0, T; L^∞)-norm of the deformation tensor of the velocity gradient.
Funding: Supported by the National Natural Science Foundation of China (11722104, 11671150); the National Natural Science Foundation of China (11571280, 11331005); the National Natural Science Foundation of China (11331005, 11771150); GDUPS (2016); the Fundamental Research Funds for the Central Universities of China (D2172260); and FANEDD No. 201315.
Abstract: Two-phase flow models are commonly used in industrial applications, such as nuclear, power, chemical-process, oil-and-gas, cryogenics, bio-medical, and micro-technology applications. This is a survey paper on the study of the compressible nonconservative two-fluid model, the drift-flux model and the viscous liquid-gas two-phase flow model. We review the research developments for each of these three two-phase flow models. In the last part, we give some open problems about the above models.
Funding: The National Natural Science Foundation of China (No. 59995440).
Abstract: The effect of various process variables on the law of metal flow in semi-solid rolling of 60Si2Mn was studied by the finite element method. Semi-solid 60Si2Mn can be described as a compressible rigid visco-plastic porous material saturated with liquid. Under thermo-mechanical coupling conditions, the distributions of stress, velocity and temperature were studied using the software MARC. The simulation results show that the rigid visco-plastic model can accurately describe the semi-solid 60Si2Mn rolling process. Large deformation can be fully achieved owing to the low flow stress of the semi-solid slurry.
Funding: The first author is supported by the National Natural Science Foundation of China (11801107); the second author is supported by the National Natural Science Foundation of China (11731014).
Abstract: In this article, we focus on the short-time strong solution to a compressible quantum hydrodynamic model. We establish a blow-up criterion for solutions of the compressible quantum hydrodynamic model in terms of the gradient of the velocity, the second spatial derivative of the square root of the density, and the first-order time derivative and first-order spatial derivative of the square root of the density.
Abstract: Recently, many regression models have been presented for predicting mechanical parameters of rocks from rock index properties. Although statistical analysis is a common method for developing regression models, selecting a suitable transformation of the independent variables in a regression model is still difficult. In this paper, a genetic algorithm (GA) is employed as a heuristic search method for selecting the best transformation of the independent variables (some index properties of rocks) in regression models for predicting uniaxial compressive strength (UCS) and modulus of elasticity (E). First, multiple linear regression (MLR) analysis was performed on a data set to establish predictive models. Then, two GA models were developed in which the root mean squared error (RMSE) was defined as the fitness function. The results show that the GA models are more precise than the MLR models and are able to explain the relation between the intrinsic strength/elasticity properties and the index properties of rocks with a simple formulation and acceptable accuracy.
Funding: Supported by the National Key Research and Development Program of China (2020YFB1807500); the National Natural Science Foundation of China (62072360, 62001357, 62172438, 61901367); the Key Research and Development Plan of Shaanxi Province (2021ZDLGY02-09, 2020JQ-844); the Natural Science Foundation of Guangdong Province of China (2022A1515010988); the Key Project on Artificial Intelligence of Xi'an Science and Technology Plan (2022JH-RGZN-0003); the Xi'an Science and Technology Plan (20RGZN0005); and the Xi'an Key Laboratory of Mobile Edge Computing and Security (201805052-ZD3CG36).
Abstract: Academic and industrial communities have been paying significant attention to 6th Generation (6G) wireless communication systems since the commercial deployment of 5G cellular communications. Among the emerging technologies, Vehicular Edge Computing (VEC) can provide essential assurance for the robustness of Artificial Intelligence (AI) algorithms to be used in 6G systems. Therefore, in this paper, a strategy for enhancing the robustness of AI model deployment using 6G-VEC is proposed, taking the object detection task as an example. This strategy includes two stages: model stabilization and model adaptation. In the former, state-of-the-art methods are appended to the model to improve its robustness. In the latter, two targeted compression methods are implemented, namely model parameter pruning and knowledge distillation, which result in a trade-off between model performance and runtime resources. Numerical results indicate that the proposed strategy can be smoothly deployed on onboard edge terminals, where the resulting trade-off outperforms the other available strategies.
Funding: Supported by the National Key Research and Development Program of China (2018YFB1800202, 2016YFB1000302, SQ2019ZD090149, 2018YFB0204301); the CETC Joint Advanced Research Foundation (6141B08080101); the Major Special Science and Technology Project of Hainan Province (ZDKJ2019008); and the New Generation of Artificial Intelligence Special Action Project (AI20191125008).
Abstract: Deep learning technology has been widely used in computer vision, speech recognition, natural language processing, and other related fields. Deep learning algorithms offer high precision and high reliability. However, the lack of resources in edge terminal equipment makes it difficult to run deep learning algorithms that require more memory and computing power. In this paper, we propose MoTransFrame, a general model processing framework for deep learning models. Instead of designing a model compression algorithm with a high compression ratio, MoTransFrame can transplant popular convolutional neural network models to resource-starved edge devices promptly and accurately. Through the integration method, deep learning models can be converted into portable projects for Arduino, a typical edge device with limited resources. Our experiments show that MoTransFrame adapts well to edge devices with limited memory. It is more flexible than other model transplantation methods. It keeps the loss of model accuracy small when the number of parameters is compressed by tens of times. At the same time, the computational resources needed during inference are less than what the edge node can handle.
Funding: Funded by the Program of International S&T Cooperation (No. 2013DFA51230) and the Opening Subject Fund of Ningbo University (No. zj1226).
Abstract: Isothermal compression tests were carried out on a Gleeble-3500 thermal-mechanical simulation machine in a temperature range of 298-473 K and a strain rate range of 0.001-10 s^-1. The experimental results show that the flow stress is negatively correlated with temperature because of thermal softening, and that the strain rate sensitivity of this composite increases with increasing temperature. Based on the experimental data, Johnson-Cook, modified Johnson-Cook and Arrhenius constitutive models were established. The accuracy of these three constitutive models was analyzed and compared. The results show that the values predicted by the Johnson-Cook model do not agree well with the experimental values. The prediction accuracy of the Arrhenius model is higher than that of the Johnson-Cook model but lower than that of the modified Johnson-Cook model.
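For orientation, the standard textbook forms of two of the models named above are given below. The abstract does not report the fitted expressions or the modified Johnson-Cook variant, so the symbols here are the conventional ones and should be read as assumptions: A, B, n, C, m are Johnson-Cook material constants, T_r and T_m are the reference and melting temperatures, and A', α, n', Q are the Arrhenius-type constants with R the gas constant.

```latex
% Original Johnson-Cook flow stress model
\sigma = \left(A + B\,\varepsilon^{n}\right)
         \left(1 + C \ln\frac{\dot{\varepsilon}}{\dot{\varepsilon}_{0}}\right)
         \left(1 - T^{*\,m}\right),
\qquad
T^{*} = \frac{T - T_{r}}{T_{m} - T_{r}}

% Arrhenius-type (hyperbolic sine) relation and Zener-Hollomon parameter
\dot{\varepsilon} = A' \left[\sinh(\alpha\sigma)\right]^{n'}
                    \exp\!\left(-\frac{Q}{RT}\right),
\qquad
Z = \dot{\varepsilon}\,\exp\!\left(\frac{Q}{RT}\right)
```

The original Johnson-Cook form treats strain hardening, strain rate and temperature as independent multiplicative factors, which is one common explanation for why it fits poorly when rate sensitivity changes with temperature, as the abstract reports for this composite.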
Funding: Supported in part by the Stable Support Research Project of AECC Sichuan Gas Turbine Establishment, China (No. GJCZ-0013-19) and the Open Foundation of the State Key Laboratory of Compressor Technology, China (Compressor Technology Laboratory of Anhui Province) (No. SKL-YSJ2020007).
Abstract: Modeling of a centrifugal compressor is of great significance for studying surge characteristics and fluid dynamics in Altitude Ground Test Facilities (AGTF). Real-time Modular Dynamic System Greitzer (MDSG) modeling for dynamic response and simulation of the compression system is introduced. The centrifugal compressor, pipeline network, and valve are divided into pressure-output and mass-flow-output types for module modeling, and the two types of components alternate when the system is assembled. The pressure loss and thermodynamics of the system are considered. An air supply compression system of AGTF is modeled and simulated with the MDSG model. The simulated mass flow, pressure, and temperature are compared with the experimental results, and the error is less than 5%, which demonstrates the reliability, practicability, and universality of the MDSG model.
Abstract: In this study, the potential of three low-GWP refrigerants (R32, R452B, and R454B) as replacements for R410A was investigated. The study was performed using RACHP-Lab, a physics-based vapor compression system simulation tool developed by the authors for typical mini-split air conditioners. The simulation was validated using experimental performance data from 10 different air conditioning units available in the Egyptian market. The units included fixed-speed or variable-speed compressors and operated in cooling or heating modes. Drop-in replacement with the new refrigerants was carried out. For R32, the capacity increased by between 4.9% and 13% for cooling cases, and by between 6.3% and 12.4% for heating cases. However, COP did not improve in all cases. For R452B and R454B with direct replacement, the capacity remained nearly the same, with an increase in COP of between 1.6% and 8.0%. Soft optimization was also conducted on cooling cases, where compressor suction superheat, condenser subcooling, and compressor volumetric speed were optimized to maximize COP while maintaining the original capacity of R410A. R32 showed a COP improvement over R410A of between 4.6% and 15.5%, while R452B and R454B showed improvements of between 2.2% and 13.2%.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 60533070, 60773153); the Key Grant Project of Chinese Ministry of Education (Grant No. 308004); the Project of Chinese Ministry of Science and Technology (Grant No. 2006BAK12B09); and the Project of Beijing Municipal Science and Technology Commission (Grant No. Z07000100560714).
Abstract: 3D modeling and coding of real objects are hot issues in the field of virtual reality. In this paper, we propose a method for automatically registering two range images and a cycle-based automatic global registration algorithm for rapidly and automatically registering all range images and constructing a realistic 3D model. In addition, to meet the requirement of transmitting huge amounts of data over the Internet, we present a 3D mesh encoding/decoding method that encodes geometry, topology and attribute data with a high compression ratio and supports progressive transmission. The research results have already been applied successfully in a digital museum.
Funding: Financially supported by the National Natural Science Foundation of China (Nos. 51275415 and 50905144); the Natural Science Basic Research Plan in Shanxi Province (No. 2011JQ6004); and the Program of the Ministry of Education of China for Introducing Talents of Discipline to Universities (No. B08040).
Abstract: Warm rotary draw bending provides a feasible method to form large-diameter thin-walled (LDTW) TC4 bent tubes, which are widely used in the pneumatic systems of aircraft. An accurate prediction of the flow behavior of TC4 tubes considering the coupled effects of temperature, strain rate and strain is critical for understanding the deformation behavior of metals and optimizing the processing parameters in warm rotary draw bending of TC4 tubes. In this study, isothermal compression tests of TC4 tube alloy were performed from 573 to 873 K at intervals of 100 K and at strain rates of 0.001, 0.010 and 0.100 s^(-1). The flow behavior was predicted using two constitutive models, namely a modified Arrhenius model and an artificial neural network (ANN) model. The predictions of these constitutive models were compared using statistical measures such as the correlation coefficient (R) and the average absolute relative error (AARE), and the variation of these measures with the deformation parameters (temperature, strain rate and strain). Analysis of the statistical measures reveals that both models show high prediction accuracy in terms of R and AARE. Comparatively speaking, the ANN model presents higher prediction accuracy than the modified Arrhenius model. In addition, the prediction accuracy of the ANN model is highly stable over the whole range of deformation parameters, whereas the predictability of the modified Arrhenius model fluctuates across deformation conditions.
The modified Arrhenius model presents higher prediction accuracy at temperatures of 573-773 K, strain rates of 0.010-0.100 s^(-1) and strains of 0.04-0.32, but low accuracy at a temperature of 873 K, a strain rate of 0.001 s^(-1) and strains of 0.36-0.48. Thus, the application of the modified Arrhenius model is limited by its relatively low prediction accuracy under some deformation conditions, while the ANN model presents very high prediction accuracy under all deformation conditions and can be used to study the compression behavior of TC4 tube in the temperature range of 573-873 K and the strain rate range of 0.001-0.100 s^(-1). It can provide a guideline for the design of processing parameters in warm rotary draw bending of LDTW TC4 tubes.
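The two statistical measures used above to compare the constitutive models can be computed as in the following sketch. It assumes the usual definitions: R is the Pearson correlation between measured and predicted flow stress, and AARE is the mean of the absolute relative errors expressed in percent. The function names and the sample values are hypothetical and are not data from the study.

```python
import math


def correlation_coefficient(measured, predicted):
    """Pearson correlation coefficient R between measured and predicted values."""
    n = len(measured)
    mean_e = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((e - mean_e) * (p - mean_p) for e, p in zip(measured, predicted))
    var_e = sum((e - mean_e) ** 2 for e in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_e * var_p)


def aare_percent(measured, predicted):
    """Average absolute relative error, in percent (measured values must be nonzero)."""
    n = len(measured)
    return 100.0 * sum(abs((e - p) / e) for e, p in zip(measured, predicted)) / n


# Illustrative flow stress values in MPa (made up for this example).
measured = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
r_value = correlation_coefficient(measured, predicted)
aare_value = aare_percent(measured, predicted)
```

R alone can be misleading because a model with a systematic bias can still correlate strongly with the measurements, which is why a term-by-term error measure such as AARE is usually reported alongside it.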
Funding: Supported by the National Key Research and Development Projects of China (No. 2021YFB2600402); the National Natural Science Foundation of China (Nos. 52209148 and 52374119); the opening fund of the State Key Laboratory of Geomechanics and Geotechnical Engineering, Institute of Rock and Soil Mechanics, Chinese Academy of Sciences (No. SKLGME023023); and the opening fund of the Key Laboratory of Water Management and Water Security for Yellow River Basin, Ministry of Water Resources (No. 2023-SYSJJ-02).
Abstract: To better understand the failure behaviours and strength of bolt-reinforced blocky rocks, extensive large-scale laboratory experiments were carried out on blocky rock-like specimens with and without rockbolt reinforcement. The results show that both shear failure and tensile failure along joint surfaces are observed, but shear failure is the main controlling factor for the peak strength of the rock mass with and without rockbolts. In bolt-reinforced rock specimens, the rockbolts neck and undergo shear deformation simultaneously. As the joint dip angle increases, joint shear failure becomes more dominant. The number of rockbolts has a significant impact on the peak strain and uniaxial compressive strength (UCS), but little influence on the deformation modulus of the rock mass. Using the Winkler beam model to represent the rockbolt behaviour, an analytical model for predicting the strength of bolt-reinforced blocky rocks is proposed. Good agreement between the UCS values predicted by the proposed model and those obtained from experiments suggests an encouraging performance of the proposed model. In addition, the performance of the proposed model is further assessed using published results in the literature, indicating that it can be used effectively to predict the UCS of bolt-reinforced blocky rocks.