The membrane fouling phenomenon in the membrane bioreactor (MBR) process is reflected in many fouling characteristics and is therefore difficult to identify. This paper proposes a multivariable identification model (MIM) based on a compact cascade neural network to identify membrane fouling accurately. First, a multivariable model is proposed that calculates multiple indicators of membrane fouling using a cascade neural network, which avoids interference from overlapping inputs. Second, an unsupervised pretraining algorithm that uses periodic information about membrane fouling is developed to obtain a compact structure for the MIM. Third, a hierarchical learning algorithm is proposed to update the parameters of the MIM online and improve identification accuracy. Finally, the proposed model was tested in real plants to evaluate its efficiency and effectiveness. The experimental results verify the benefits of the proposed method.
This paper proposes a cascade deep convolutional neural network to detect loosening of bolts on axlebox covers. First, an SSD network based on ResNet50 and a CBAM module, which improves the bolt image features, is proposed for locating bolts on axlebox covers. Then, the A2-PFN is proposed, designed around the slender shape of the marker lines, to extract more accurate marker line regions of the bolts. Finally, a rectangular approximation method is proposed to regularize the marker line regions so that the angle of each marker line can be calculated. All angle values are entered into an angle table, and the criteria of the angle table determine whether a bolt with a marker line is in danger of loosening. In the object localization stage, the improved algorithm is compared with the algorithm before improvement. The results show that the proposed method significantly improves both detection accuracy and detection speed: mAP (IoU=0.75) reaches 0.77 and the frame rate reaches 16.6 fps. In the saliency detection stage, qualitative and quantitative comparisons show that the method significantly outperforms other state-of-the-art methods, with an MAE of 0.092, an F-measure of 0.948 and an AUC of 0.943. According to the angle table, out of 676 bolt samples, 60 bolts are loose, 69 bolts are at risk of loosening, and 547 bolts are tightened.
A hybrid algorithm is presented for nonuniform lossy multiconductor transmission lines (MTL) connected by arbitrary linear load networks. The networks are characterized by a state-variable equation, which allows a general characterization of dynamic elements in the cascade networks. The MTL is solved with the finite-difference time-domain (FDTD) algorithm, and the skin effect is taken into account using a more accurate computation method. This method is combined with the computation of the nonuniform transmission lines. Finally, several numerical examples are given. The results indicate that the current of the lossy MTL is smaller than that of the lossless MTL, and that when the load networks contain dynamic elements, the transition time of the current is longer than for an MTL terminated by resistances only.
A 3D laser scanning strategy based on a cascaded deep neural network is proposed for a scanning system converted from a 2D Lidar with a pitching motion device. The strategy is aimed at moving target detection and monitoring. Taking the device characteristics into account, the strategy first proposes a cascaded deep neural network whose inputs are the 2D point cloud, the color image and the pitching angle, and whose outputs are the target distance and speed classifications. The cross-entropy loss function of the network is modified using focal loss and a uniform distribution to improve recognition accuracy. Then pitching range and speed models are proposed to determine the pitching motion parameters. Finally, adaptive scanning is realized by an integral-separated speed PID controller. The experimental results show that the accuracies of the improved network for the target detection box, distance classification and speed classification are 90.17%, 96.87% and 96.97%, respectively. The average speed error of the improved PID is 0.4239°/s, and the average strategy execution time is 0.1521 s. The range and speed models effectively reduce the collection of useless information and the deformation of the target point cloud. Experiments on the overall scanning strategy show that it improves the integrity and density of the target point cloud while ensuring that the target is captured.
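The integral-separated PID mentioned in the abstract above can be illustrated with a minimal sketch. It assumes one common form of integral separation, in which the integral term is accumulated and applied only while the absolute error is within a threshold. The class name, gains and threshold are hypothetical and are not taken from the paper.

```python
class IntegralSeparatedPID:
    """Discrete PID whose integral term is active only for small errors.

    Hypothetical illustration: gains, threshold and update rule are
    assumptions, not the controller described in the paper.
    """

    def __init__(self, kp, ki, kd, threshold, dt):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.threshold = threshold  # |error| above this disables the integral
        self.dt = dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, setpoint, measured):
        error = setpoint - measured
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error

        # Integral separation: skip integration while the error is large,
        # which limits overshoot caused by integral wind-up.
        use_integral = abs(error) <= self.threshold
        if use_integral:
            self.integral += error * self.dt
        i_term = self.ki * self.integral if use_integral else 0.0

        return self.kp * error + i_term + self.kd * derivative


pid = IntegralSeparatedPID(kp=1.0, ki=1.0, kd=0.0, threshold=1.0, dt=1.0)
far = pid.step(setpoint=10.0, measured=0.0)   # large error: P term only
near = pid.step(setpoint=0.5, measured=0.0)   # small error: P + I terms
```

With a large error the controller behaves like a PD controller; the integral term only takes over close to the setpoint to remove steady-state error.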
Cloud environments currently face numerous issues, such as synchronizing information before switching over during data migration. The need for a centralized Internet of Things (IoT)-based system has been restricted to some extent. Owing to its low scalability with respect to security considerations, the cloud appears unattractive. Because healthcare networks require computation on large amounts of data, their sensitivity to device latency is a challenging issue. In comparison to cloud domains, the new paradigm of fog computing offers fresh alternatives by bringing resources closer to users, providing low-latency and energy-efficient data processing. Previous fog computing frameworks have various flaws, such as overemphasizing response time or ignoring the accuracy of the result, yet handling both at the same time compromises the network community. In this work, Health Fog is integrated with an Optimized Cascaded Convolution Neural Network framework for diagnosing heart disease. First, the data are collected and pre-processed by Linear Discriminant Analysis. Then the features are extracted and optimized using Galactic Swarm Optimization. The optimized features are fed into the Health Fog framework for diagnosing heart disease patients. The framework uses ensemble-based deep learning on edge computing devices, which automatically monitors real-life health networks for tasks such as heart disease analysis. Finally, classifiers such as bagging, boosting, XGBoost, Multi-Layer Perceptron (MLP), and Partitions (PART) are used to classify the data, and a majority voting classifier predicts the result. This work uses the FogBus architecture and evaluates power usage, network bandwidth, latency, execution time, and accuracy.
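The final majority-voting step described above can be sketched in a few lines. This is only an illustration under the assumption that each base classifier outputs one hard label per sample; the function name and the example labels are hypothetical, and ties (which cannot occur with five voters and two classes) fall back to the label seen first.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine hard labels from several classifiers.

    predictions: list with one entry per classifier, each a list of
    predicted labels (same length and sample order for every classifier).
    """
    n_samples = len(predictions[0])
    voted = []
    for i in range(n_samples):
        votes = Counter(clf_labels[i] for clf_labels in predictions)
        # most_common(1) returns the (label, count) pair with the top count.
        voted.append(votes.most_common(1)[0][0])
    return voted


# Five hypothetical base classifiers, three patients (1 = disease, 0 = healthy).
labels = majority_vote([
    [1, 0, 1],
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 1],
    [0, 1, 1],
])
```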
The background of patterned fabrics is complex and strongly interferes with the extraction of defect features. Traditional machine vision algorithms rely on hand-designed features, which are strongly affected by background patterns and struggle to extract flaw features effectively. Therefore, a convolutional neural network (CNN) with automatic feature extraction is proposed. Building on the two-stage detection model Faster R-CNN, ResNet-50 is used as the backbone network. Flaws with extreme aspect ratios are handled by improving the initialization algorithm for the aspect ratios of the prior frames, and an improved multi-scale model is designed to improve the detection of small defects. Cascade R-CNN is introduced to improve the accuracy of defect detection, the online hard example mining (OHEM) algorithm is used to strengthen the learning of hard samples and reduce the interference of complex backgrounds, and focal loss is used as the loss function to reduce the impact of sample imbalance. To verify the effectiveness of the improved algorithm, a comparative defect detection experiment was set up. The experimental results show that the accuracy of the proposed defect detection algorithm for patterned fabrics reaches 95.7%, and that it can accurately locate defects and meet the practical needs of the factory.
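To make the role of focal loss in the abstract above concrete, here is a minimal NumPy sketch of the binary form. It assumes the commonly used defaults gamma = 2 and alpha = 0.25, since the abstract does not state the values used; the function name is hypothetical.

```python
import numpy as np


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss.

    p: predicted probability of the positive (defect) class.
    y: ground-truth labels in {0, 1}.
    gamma, alpha: assumed defaults, not values from the paper.
    """
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0 - eps)
    y = np.asarray(y, dtype=float)
    # p_t is the probability assigned to the true class of each sample.
    p_t = np.where(y == 1, p, 1.0 - p)
    alpha_t = np.where(y == 1, alpha, 1.0 - alpha)
    # (1 - p_t)^gamma shrinks the loss of well-classified samples.
    return float(np.mean(-alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)))


easy = binary_focal_loss([0.9], [1])  # confident and correct
hard = binary_focal_loss([0.1], [1])  # confident and wrong
```

Setting gamma to 0 reduces the expression to an alpha-weighted cross-entropy, which is a quick way to check an implementation.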
Capturing real facial expressions is the primary task in generating realistic three-dimensional animation of virtual characters. Because of diverse facial expressions and complex backgrounds, facial landmarks recognized by existing strategies suffer from deviations and low accuracy. Therefore, this paper proposes a method for facial expression capture based on a two-stage neural network, which combines improved multi-task cascaded convolutional networks (MTCNN) with a high-resolution network. First, the convolution operation of the traditional MTCNN is improved. In the first stage, the face information in the input image is quickly filtered by feature fusion; in the second stage, Octave Convolution replaces the original convolutions to enhance the feature extraction ability of the network, which further rejects a large number of false candidates. The model outputs more accurate facial candidate windows for better landmark recognition and locates the faces. Then the images cropped after face detection are input into the high-resolution network. Multi-scale feature fusion is realized by connecting multi-resolution streams in parallel, and rich high-resolution heatmaps of facial landmarks are obtained. Finally, changes in the recognized facial landmarks are tracked in real time. The expression parameters are extracted and transmitted to the Unity3D engine to drive the virtual character's face, producing synchronized facial expression animation. Extensive experimental results on the WFLW database demonstrate the superiority of the proposed method in terms of accuracy and robustness, especially for diverse expressions and complex backgrounds. The method can accurately capture facial expressions and generate three-dimensional animation effects, making online entertainment and social interaction more immersive in shared virtual space.
Because emotional expression is complex, recognizing emotions from speech is a critical and challenging task. In most studies, certain emotions are easily misclassified. In this paper, we propose a new framework that integrates a cascade attention mechanism and a joint loss for speech emotion recognition (SER), aiming to resolve feature confusion for emotions that are difficult to classify correctly. First, we extract the mel-frequency cepstral coefficients (MFCCs) and compute their deltas and delta-deltas to form 3-dimensional (3D) features, which effectively reduces the interference of external factors. Second, we employ spatiotemporal attention to selectively discover target emotion regions in the input features, where self-attention with head fusion captures the long-range dependencies of temporal features. Finally, the joint loss function is employed to distinguish highly similar emotional embeddings and enhance overall performance. Experiments on the interactive emotional dyadic motion capture (IEMOCAP) database indicate that the method improves weighted accuracy (WA) and unweighted accuracy (UA) by 2.49% and 1.13%, respectively, compared with state-of-the-art strategies.
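The 3D feature construction described above (MFCCs, deltas and delta-deltas stacked as channels) can be sketched as follows. The sketch assumes the MFCC matrix has already been computed and approximates the deltas with simple finite differences along the time axis; speech toolkits usually use a regression window instead, and the paper's exact settings are not given in the abstract. The function name is hypothetical.

```python
import numpy as np


def stack_3d_features(mfcc):
    """Stack MFCCs with their first and second time differences.

    mfcc: array of shape (n_coefficients, n_frames).
    Returns an array of shape (3, n_coefficients, n_frames).
    """
    mfcc = np.asarray(mfcc, dtype=float)
    delta = np.gradient(mfcc, axis=1)    # first-order difference over frames
    delta2 = np.gradient(delta, axis=1)  # second-order difference over frames
    return np.stack([mfcc, delta, delta2], axis=0)


# Toy input: 4 coefficients, 5 frames, each coefficient rising linearly in time.
toy_mfcc = np.arange(20).reshape(4, 5)
features = stack_3d_features(toy_mfcc)
```

The three channels can then be fed to a convolutional front end in the same way as a three-channel image.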
The macula is an essential part of the human visual system and is chiefly responsible for clear and colour vision. In people with diabetes, various parts of the body are affected, including the retina of the eye. This retinal damage causes swelling and other abnormalities near the macula. The pathologies of the macula caused by diabetes are called Diabetic Macular Oedema (DME). DME affects patients' vision and may lead to vision loss. It can be addressed by identifying the causes of the swelling early. The major causes of the swelling are neovascularization and other abnormalities in the blood vessels near the macula. The aim of this work is to avoid vision loss by detecting abnormalities in the macula in advance. The pathologies present in abnormal images are detected by an image segmentation technique, namely the Fuzzy K-means algorithm. Classification is performed by two different classifiers, a Cascade Neural Network and Partial Least Squares, which identify whether an image is normal or abnormal. The results of the two classifiers are compared with respect to accuracy, sensitivity and specificity. The classification accuracies of the Cascade Neural Network and Partial Least Squares are 96.84% and 94.36%, respectively. Information about the severity of the disease and the location of pathologies is very useful to the ophthalmologist for diagnosing the disease and applying proper treatment to avoid the formation of lesions and prevent vision loss.
This study proposes a design and optimization strategy for a tandem cascade using the Non-dominated Sorting Genetic Algorithm (NSGA) II multi-objective optimization algorithm and a Back Propagation (BP) neural network. NASA Stage 35 was employed as the initial benchmark, and five geometric control parameters served as the optimization parameters, with the aim of enhancing aerodynamic performance in terms of total pressure rise and efficiency. The results confirmed the feasibility and capability of the proposed optimization strategy. Because the initial tandem cascade (scaled down directly from NASA Stage 35) could not guarantee aerodynamic performance, a first optimization trial was conducted to optimize the initial design. The results showed that the optimum improves the flow quality, with separation on the blade reduced or even eliminated, particularly in the tip and root regions. However, compared with the initial tandem design, the improvements in total pressure ratio (0.47%) and efficiency (1%) are too small to be significant. A second investigation focused on a high-turning tandem compressor with the turning increased by 28°. The pressure rise and efficiency were increased by 1.44% and 2.34%, respectively, compared with the initial tandem design. An important conclusion is that the optimization strategy is worth using for high-turning compressors, where it yields a considerable performance improvement.
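NSGA-II ranks candidate designs by Pareto dominance. The sketch below shows only that core idea, extracting the first non-dominated front for objectives that are all maximized (here standing in for total pressure ratio and efficiency). It is not the full NSGA-II algorithm, which also involves crowding distance, selection, crossover and mutation; the function name and the sample values are hypothetical.

```python
def pareto_front(points):
    """Return the indices of non-dominated points (all objectives maximized).

    A point q dominates p if q is at least as good in every objective
    and strictly better in at least one.
    """
    front = []
    for i, p in enumerate(points):
        dominated = any(
            all(q[k] >= p[k] for k in range(len(p)))
            and any(q[k] > p[k] for k in range(len(p)))
            for j, q in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append(i)
    return front


# Hypothetical (pressure ratio, efficiency) pairs for four candidate designs.
designs = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0), (1.0, 1.0)]
front = pareto_front(designs)
```

The last design is worse than the second in both objectives, so it is excluded, while the first three are mutually non-dominated trade-offs.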
Automatic modulation classification (AMC) aims to identify the modulation format of received signals corrupted by noise, and plays a major role in radio monitoring. In this paper, we propose a novel hierarchical digital modulation classification scheme based on a cascaded convolutional neural network (CasCNN), in which M-ary phase shift keying (PSK) and M-ary quadrature amplitude modulation (QAM) formats are classified. In CasCNN, two convolutional neural network blocks are cascaded. The first block classifies the modulation class, namely PSK or QAM. The second block identifies the index of the modulation within the same PSK or QAM class. The grid constellation diagram extracted from the received signal is used as the input to the CasCNN. Extensive simulations demonstrate that CasCNN yields a performance gain and is more robust to frequency offset than other recent methods. Specifically, CasCNN achieves 90% classification accuracy at a signal-to-noise ratio of 4 dB when the symbol length is set to 256.
Recent learning-based approaches show promising performance improvements for the scene text removal task, but they usually leave remnants of text and produce visually unpleasant results. In this work, a novel end-to-end framework is proposed based on accurate text stroke detection. Specifically, the text removal problem is decoupled into text stroke detection and stroke removal; we design separate networks to solve these two subproblems, the latter being a generative network. These two networks are combined into a processing unit, which is cascaded to obtain our final model for text removal. Experimental results demonstrate that the proposed method substantially outperforms the state of the art for locating and erasing scene text. A new large-scale real-world dataset with 12,120 images has been constructed and is being made available to facilitate research, because current publicly available datasets are mainly synthetic and so cannot properly measure the performance of different methods.
Thaw slumping is a periglacial process that occurs on slopes in cold environments, where the ground becomes unstable and the surface slides downhill because it is saturated with water during thawing. In this study, GaoFen-1 remote sensing data and fused multi-source feature data were used to automatically map thaw slumping landforms in the Beilu River Basin of the Qinghai–Tibet Plateau. A bi-directional cascade network structure was used to extract edges at different scales, where each individual layer was supervised by labeled edges at its specific scale, rather than applying the same supervision directly to all convolutional neural network outputs. Additionally, we conducted a 5-year multi-scale feature analysis of small baseline subset interferometric synthetic aperture radar deformation, normalized difference vegetation index, and slope, among other features. Our study analyzed the performance and accuracy of three methods based on edge object supervised learning and three preconfigured neural networks: ResNet101, VGG16, and ResNet152. Through verification using site surveys and multi-data fusion results, the best model, ResNet101, obtained an intersection over union of 0.85 (overall accuracy of 84.59%). The intersection over union values of VGG16 and ResNet152 are 0.569 and 0.773, respectively. This work provides new insight into the feasibility of applying the designed edge detection method to map diverse thaw slumping landforms over larger areas with high-resolution images.
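The intersection over union (IoU) scores reported above compare a predicted map with a reference map. A minimal sketch for binary masks is given below; it assumes pixel-wise masks of equal shape and treats two empty masks as a perfect match, which is a convention chosen here rather than something stated in the abstract. The function name is hypothetical.

```python
import numpy as np


def mask_iou(pred, truth):
    """Intersection over union of two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    union = np.logical_or(pred, truth).sum()
    if union == 0:
        return 1.0  # both masks empty: treated as perfect agreement
    return float(np.logical_and(pred, truth).sum() / union)


# 2x2 toy masks: one shared pixel, three pixels in the union.
score = mask_iou([[1, 1], [0, 0]], [[1, 0], [1, 0]])
```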
The manifold matrix of the received signals can be corrupted when the array has gain and phase errors, which degrades the performance of traditional direction of arrival (DOA) estimation approaches. In this paper, a novel active array calibration method for gain and phase errors based on a cascaded neural network (GPECNN) is proposed. The cascaded neural network contains two parts: a signal-to-noise ratio (SNR) classification network and two sets of error estimation subnetworks. The error estimation subnetworks are activated according to the output of the SNR classification network, and each set consists of a gain error estimation network (GEEN) and a phase error estimation network (PEEN). The proposed group calibration strategy addresses the drawback that the neural network topology must change when the number of array elements varies. Moreover, owing to the data characteristics of the input vector, the cascaded neural network can be applied to arrays with arbitrary geometry without repeated training. Simulation results demonstrate that the GPECNN not only achieves a better balance between calibration performance and calibration complexity than other methods but can also be applied to arrays with different numbers of sensors or different shapes without repeated training.
Transient stability batch assessment (TSBA) is essential for dynamic security checks in both power system planning and day-ahead dispatch. It is also a necessary technique for generating sufficient training data for data-driven online transient stability assessment (TSA). However, most existing work suffers from various problems, including high computational burden, low model adaptability, and low performance robustness. TSBA therefore remains a significant challenge in modern power systems, where numerous scenarios (e.g., operating conditions and "N-k" contingencies) must be assessed at the same time. The purpose of this work is to construct a data-driven method that terminates time-domain simulation (TDS) early and dynamically schedules the TSBA task queue a priori, in order to reduce the computational burden without compromising accuracy. To achieve this goal, a time-adaptive cascaded convolutional neural network (CNN) model is developed to predict stability and terminate TDS early. Additionally, an information-entropy-based prioritization strategy is designed to distinguish informative samples, dynamically schedule the TSBA task queue and update the model in a timely manner, thus further reducing simulation time. A case study on the IEEE 39-bus system validates the effectiveness of the proposed method.
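One way to read the entropy-based prioritization described above is to rank pending scenarios by the uncertainty of the model's stability prediction. The sketch below assumes a binary stable/unstable prediction and assumes that higher entropy means a more informative sample that should be scheduled first; the abstract does not confirm either detail, and the function names are hypothetical.

```python
import numpy as np


def binary_entropy(p, eps=1e-12):
    """Shannon entropy (in bits) of a Bernoulli prediction with P(stable) = p."""
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0 - eps)
    return -(p * np.log2(p) + (1.0 - p) * np.log2(1.0 - p))


def prioritize(p_stable):
    """Return scenario indices ordered from most to least uncertain."""
    order = np.argsort(-binary_entropy(p_stable), kind="stable")
    return [int(i) for i in order]


# Hypothetical predicted stability probabilities for three scenarios.
queue = prioritize([0.99, 0.5, 0.8])
```

In this toy queue the scenario predicted at 0.5 is simulated first, and the near-certain scenario at 0.99 is left for last.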
Funding (membrane fouling identification paper): supported by the National Key Research and Development Project (2018YFC1900800-5), the National Natural Science Foundation of China (61890930-5, 62021003, 61903010 and 62103012), the Beijing Outstanding Young Scientist Program (BJJWZYJH01201910005020), and the Beijing Natural Science Foundation (KZ202110005009 and 4214068).
Funding (3D laser scanning strategy paper): funded by the National Natural Science Foundation of China (Grant No. 51805146), the Fundamental Research Funds for the Central Universities (Grant No. B200202221), the Jiangsu Key R&D Program (Grant Nos. BE2018004-1, BE2018004), and the College Students' Innovative Entrepreneurial Training Plan Program (Grant No. 2020102941513).
Funding (Health Fog heart disease diagnosis paper): supported by the Taif University Researchers Supporting Project (TURSP) under number TURSP-2020/73, Taif University, Taif, Saudi Arabia.
Funding (patterned fabric defect detection paper): National Key Research and Development Project, China (No. 2018YFB1308800).
Funding: This research was funded by the College Student Innovation and Entrepreneurship Training Program, grant numbers 2021055Z and S202110082031, and the Special Project for Cultivating Scientific and Technological Innovation Ability of College and Middle School Students in Hebei Province, grant number 2021H011404.
Abstract: To generate realistic three-dimensional animation of a virtual character, capturing real facial expressions is the primary task. Because facial expressions are diverse and backgrounds are complex, facial landmarks recognized by existing strategies suffer from deviations and low accuracy. This paper therefore proposes a facial expression capture method based on a two-stage neural network, which combines improved multi-task cascaded convolutional networks (MTCNN) with a high-resolution network. First, the convolution operation of the traditional MTCNN is improved. Face information in the input image is quickly filtered by feature fusion in the first stage, and Octave Convolution replaces the original convolutions in the second stage to enhance the feature extraction ability of the network, which further rejects a large number of false candidates. The model locates the faces and outputs more accurate facial candidate windows for better landmark recognition. The images cropped after face detection are then input into the high-resolution network. Multi-scale feature fusion is realized by connecting multi-resolution streams in parallel, and rich high-resolution heatmaps of facial landmarks are obtained. Finally, changes in the recognized facial landmarks are tracked in real time. The expression parameters are extracted and transmitted to the Unity3D engine to drive the virtual character's face, realizing synchronous facial expression animation. Extensive experimental results on the WFLW database demonstrate the superiority of the proposed method in terms of accuracy and robustness, especially for diverse expressions and complex backgrounds. The method can accurately capture facial expressions and generate three-dimensional animation effects, making online entertainment and social interaction in shared virtual spaces more immersive.
Funding: Supported by the Natural Science Foundation of Shandong Province, China (No. ZR2020QF007).
Abstract: Due to the complexity of emotional expression, recognizing emotions from speech is a critical and challenging task. In most studies, some specific emotions are easily misclassified. In this paper, we propose a new framework that integrates a cascade attention mechanism and a joint loss for speech emotion recognition (SER), aiming to resolve feature confusion for emotions that are difficult to classify correctly. First, we extract the mel frequency cepstrum coefficients (MFCCs) and compute their deltas and delta-deltas to form 3-dimensional (3D) features, thus effectively reducing the interference of external factors. Second, we employ spatiotemporal attention to selectively discover target emotion regions from the input features, where self-attention with head fusion captures the long-range dependency of temporal features. Finally, the joint loss function is employed to distinguish emotional embeddings with high similarity and enhance overall performance. Experiments on the interactive emotional dyadic motion capture (IEMOCAP) database indicate that the method achieves improvements of 2.49% and 1.13% in weighted accuracy (WA) and unweighted accuracy (UA), respectively, compared to state-of-the-art strategies.
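The 3D feature construction in the first step (static MFCCs plus deltas and delta-deltas) can be sketched as follows. This is a simplified illustration, not the authors' pipeline: it assumes the MFCC matrix is already available as a list of frames, and it uses a plain central difference with clamped edges rather than the regression-based delta formula that speech toolkits typically apply.

```python
def delta(frames):
    """Central-difference delta along the time axis, with edge frames clamped."""
    n = len(frames)
    out = []
    for t in range(n):
        prev = frames[max(t - 1, 0)]
        nxt = frames[min(t + 1, n - 1)]
        out.append([(b - a) / 2.0 for a, b in zip(prev, nxt)])
    return out


def stack_3d(mfcc):
    """Stack static, delta, and delta-delta features as three channels."""
    d1 = delta(mfcc)
    d2 = delta(d1)
    return [mfcc, d1, d2]


# Hypothetical MFCC matrix: 4 frames with 2 coefficients each.
mfcc = [[0.0, 1.0], [1.0, 1.5], [2.0, 2.5], [3.0, 4.0]]
features = stack_3d(mfcc)
print(len(features), len(features[0]), len(features[0][0]))
```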
Abstract: The macula is an essential part of the human visual system and is chiefly responsible for clear and colour vision. In people with diabetes, various parts of the body, including the retina of the eye, are affected. This retinal damage causes swelling and other abnormalities near the macula. The pathologies in the macula caused by diabetes are called Diabetic Macular Oedema (DME). DME affects patients' vision and may lead to vision loss. It can be managed by identifying the causes of swelling early. The major causes of swelling are neovascularization and other abnormalities in the blood vessels near the macula. The aim of this work is to prevent vision loss by detecting abnormalities in the macula in advance. The pathologies present in abnormal images are detected by an image segmentation technique, namely the Fuzzy K-means algorithm. Classification is performed by two different classifiers, Cascade Neural Network and Partial Least Square, which identify whether an image is normal or abnormal. The results of the two classifiers are compared with respect to accuracy, sensitivity and specificity. The classification accuracies of Cascade Neural Network and Partial Least Square are 96.84% and 94.36%, respectively. Information about the severity of the disease and the localization of pathologies is very useful to the ophthalmologist for diagnosing the disease and applying proper treatment to avoid the formation of lesions and prevent vision loss.
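The three comparison measures named above follow directly from the confusion-matrix counts. The sketch below uses a small hypothetical label set (1 = abnormal, 0 = normal) purely for illustration; it does not reproduce the study's data, and it assumes both classes occur in the ground truth.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for binary labels (1 = abnormal).

    Assumes y_true contains at least one positive and one negative sample.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Hypothetical ground truth and classifier output for eight images.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
print(binary_metrics(y_true, y_pred))
```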
Funding: Financially supported by the National Natural Science Foundation of China (No. 51376150).
Abstract: This study proposes a design and optimization strategy for a tandem-arranged cascade using the Non-dominated Sorting Genetic Algorithm (NSGA) II multi-objective optimization algorithm and a Back Propagation (BP) neural network. NASA Stage 35 was employed as the initial benchmark, and five geometric control parameters served as the optimization parameters, with the aim of enhancing aerodynamic performance in terms of total pressure rise and efficiency. The results confirmed the feasibility and capability of the proposed optimization strategy. Because the initial tandem cascade (scaled down directly from NASA Stage 35) cannot guarantee the aerodynamic performance, a first optimization trial was conducted to optimize the initial design. The results showed that the optimum improves the flow quality, with separation on the blade reduced or even eliminated, particularly in the tip and root regions. However, compared with the initial tandem design, the gains in total pressure ratio (0.47%) and efficiency (1%) are too small to be noticeable. A second investigation focused on a high-turning tandem compressor with an increment of 28°. The pressure rise and efficiency were increased by 1.44% and 2.34%, respectively, compared with the initial tandem design. An important conclusion is that the optimization strategy is well suited to high-turning compressors, where it yields a considerable performance improvement.
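The core idea behind NSGA-II's non-dominated sorting is Pareto dominance between candidate designs. The sketch below extracts only the first non-dominated front for two objectives that are both maximized (total pressure ratio and efficiency). The candidate values are hypothetical, and the full algorithm additionally ranks later fronts and applies crowding-distance selection, which is omitted here.

```python
def dominates(a, b):
    """True if design a is no worse than b in every objective and strictly
    better in at least one (all objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points):
    """Return the designs that no other design dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (total pressure ratio, efficiency) pairs for four candidates.
candidates = [(1.90, 0.88), (1.92, 0.86), (1.89, 0.85), (1.91, 0.86)]
print(pareto_front(candidates))
```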
Funding: National Key Research and Development Program of China (2019YFB1804404); Beijing Natural Science Foundation (4202046); National Natural Science Foundation of China (61801052); Guangdong Key Field R&D Program (2018B010124001).
Abstract: Automatic modulation classification (AMC) aims to identify the modulation format of received signals corrupted by noise and plays a major role in radio monitoring. In this paper, we propose a novel cascaded convolutional neural network (CasCNN)-based hierarchical digital modulation classification scheme, in which M-ary phase shift keying (PSK) and M-ary quadrature amplitude modulation (QAM) formats are classified. In CasCNN, two convolutional neural network blocks are cascaded. The first block classifies the modulation class, namely PSK or QAM. The second block identifies the index of the modulation within the same PSK or QAM class. The grid constellation diagram extracted from the received signal is used as the input to the CasCNN. Extensive simulations demonstrate that CasCNN yields a performance gain and exhibits stronger robustness to frequency offset than other recent methods. Specifically, CasCNN achieves 90% classification accuracy at a signal-to-noise ratio of 4 dB when the symbol length is set to 256.
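The two-stage control flow described here (first decide the modulation class, then hand the sample to a class-specific second stage) can be sketched independently of the networks themselves. In the sketch below the two stages are toy rule-based stand-ins operating on ideal, noise-free constellation points; they are assumptions for illustration only and are not the paper's CNN blocks.

```python
import cmath


def toy_family(points):
    """Stage-1 stand-in: constant-magnitude constellations are treated as PSK."""
    mags = [abs(p) for p in points]
    return "PSK" if max(mags) - min(mags) < 1e-6 else "QAM"


def toy_order(points):
    """Stage-2 stand-in: count the distinct constellation points."""
    return len({(round(p.real, 6), round(p.imag, 6)) for p in points})


def cascade_classify(points, family_clf, order_clfs):
    """Run stage 1, then dispatch to the stage-2 classifier for that class."""
    family = family_clf(points)
    return family, order_clfs[family](points)


stage2 = {"PSK": toy_order, "QAM": toy_order}
qpsk = [cmath.exp(1j * (cmath.pi / 4 + k * cmath.pi / 2)) for k in range(4)]
qam16 = [complex(i, q) for i in (-3, -1, 1, 3) for q in (-3, -1, 1, 3)]
print(cascade_classify(qpsk, toy_family, stage2))
print(cascade_classify(qam16, toy_family, stage2))
```

Keeping the second stage as a mapping from class to classifier mirrors the hierarchical design: each second-stage model only has to separate orders within one class.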
Funding: Supported by the National Natural Science Foundation of China (62102418 and 62172415), the National Key R&D Program of China (2019YFB2204104), and the Open Research Fund Program of the State Key Laboratory of Hydroscience and Engineering, Tsinghua University (sklhse-2020-D-07).
Abstract: Recent learning-based approaches show promising performance improvements for the scene text removal task but usually leave remnants of text and produce visually unpleasant results. In this work, a novel end-to-end framework is proposed based on accurate text stroke detection. Specifically, the text removal problem is decoupled into text stroke detection and stroke removal; we design separate networks to solve these two subproblems, the latter being a generative network. The two networks are combined into a processing unit, which is cascaded to obtain our final model for text removal. Experimental results demonstrate that the proposed method substantially outperforms the state of the art in locating and erasing scene text. A new large-scale real-world dataset with 12,120 images has been constructed and is being made available to facilitate research, as current publicly available datasets are mainly synthetic and so cannot properly measure the performance of different methods.
Funding: Supported by the Second Tibetan Plateau Scientific Expedition and Research Program (STEP) (Grant No. 2019QZKK0905), the National Science Foundation of China (Grant No. 42071097), the foundation of the State Key Laboratory of Frozen Soil Engineering (Grant No. SKLFSE202003), and the 14th Graduate Education Innovation Fund of Wuhan Institute of Technology (Grant No. CX2022164).
Abstract: Thaw slumping is a periglacial process that occurs on slopes in cold environments, where the ground becomes unstable and the surface slides downhill due to saturation with water during thawing. In this study, GaoFen-1 remote sensing data and fused multi-source feature data were used to automatically map thaw slumping landforms in the Beilu River Basin of the Qinghai–Tibet Plateau. A bi-directional cascade network structure was used to extract edges at different scales, where each individual layer was supervised by labeled edges at its specific scale, rather than applying the same supervision directly to all convolutional neural network outputs. Additionally, we conducted a 5-year multi-scale feature analysis of small baseline subset interferometric synthetic aperture radar deformation, normalized difference vegetation index, and slope, among other features. Our study analyzed the performance and accuracy of three methods based on edge object supervised learning and three preconfigured neural networks: ResNet101, VGG16, and ResNet152. Through verification using site surveys and multi-data fusion results, the best model, ResNet101, obtained an intersection over union of 0.85 (overall accuracy of 84.59%). The intersection over union values of VGG16 and ResNet152 are 0.569 and 0.773, respectively. This work provides new insight into the feasibility of applying the designed edge detection method to map diverse thaw slumping landforms over larger areas with high-resolution images.
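Intersection over union, the score used to compare the three networks, is the ratio of the overlap between the predicted and reference masks to their combined area. The sketch below computes it for two small hypothetical binary masks; treating two empty masks as a perfect match is a convention chosen here, not something stated in the abstract.

```python
def iou(pred, truth):
    """Intersection over union of two equally sized binary masks (lists of rows)."""
    inter = 0
    union = 0
    for prow, trow in zip(pred, truth):
        for p, t in zip(prow, trow):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Assumed convention: two empty masks count as a perfect match.
    return inter / union if union else 1.0


# Hypothetical 2x3 masks: 1 marks a thaw-slump pixel.
pred = [[1, 1, 0], [0, 1, 0]]
truth = [[1, 0, 0], [0, 1, 1]]
print(iou(pred, truth))
```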
Funding: Supported by the Key R&D Program of Shandong Province (2020CXGC010109) and the Beijing Municipal Science and Technology Project (Z181100003218015).
Abstract: The manifold matrix of the received signals can be corrupted when the array has gain and phase errors, which degrades the performance of traditional direction of arrival (DOA) estimation approaches. In this paper, a novel active array calibration method for gain and phase errors based on a cascaded neural network (GPECNN) is proposed. The cascaded neural network contains two parts: a signal-to-noise ratio (SNR) classification network and two sets of error estimation subnetworks. The error estimation subnetworks are activated according to the output of the SNR classification network, and each set consists of a gain error estimation network (GEEN) and a phase error estimation network (PEEN). The proposed group calibration strategy addresses the drawback that the neural network topology must change when the number of array elements varies. Moreover, owing to the data characteristics of the input vector, the cascaded neural network can be applied to arrays with arbitrary geometry without repeated training. Simulation results demonstrate that the GPECNN not only achieves a better balance between calibration performance and calibration complexity than other methods but can also be applied to arrays with different numbers of sensors or different shapes without repeated training.
Funding: This work was supported by the China Scholarship Council under Grant 201906320221.
Abstract: Transient stability batch assessment (TSBA) is essential for dynamic security checks in both power system planning and day-ahead dispatch. It is also a necessary technique for generating sufficient training data for data-driven online transient stability assessment (TSA). However, most existing work suffers from various problems, including high computational burden, low model adaptability, and low performance robustness. TSBA therefore remains a significant challenge in modern power systems, where numerous scenarios (e.g., operating conditions and "N-k" contingencies) must be assessed at the same time. The purpose of this work is to construct a data-driven method that terminates time-domain simulation (TDS) early and dynamically schedules the TSBA task queue a priori, in order to reduce the computational burden without compromising accuracy. To achieve this goal, a time-adaptive cascaded convolutional neural networks (CNNs) model is developed to predict stability and terminate TDS early. Additionally, an information entropy based prioritization strategy is designed to distinguish informative samples, dynamically schedule the TSBA task queue, and update the model in a timely manner, thus further reducing simulation time. A case study on the IEEE 39-bus system validates the effectiveness of the proposed method.
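One way to read the entropy-based prioritization is as sorting pending simulation tasks by the uncertainty of the model's stability prediction. The sketch below ranks tasks by binary entropy, highest first. The task identifiers, the probabilities, and the choice to run the most uncertain cases first are assumptions for illustration; the abstract does not give the exact scheduling rule.

```python
import math


def binary_entropy(p):
    """Shannon entropy (in bits) of a stable/unstable prediction with P(stable) = p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def prioritize(tasks):
    """Order (task_id, p_stable) pairs so the most uncertain predictions come first.

    Running high-entropy tasks first is an assumed policy for this sketch.
    """
    return sorted(tasks, key=lambda task: binary_entropy(task[1]), reverse=True)


# Hypothetical queue of contingency scenarios with predicted stability probabilities.
queue = [("A", 0.99), ("B", 0.50), ("C", 0.80)]
print([task_id for task_id, _ in prioritize(queue)])
```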