Hypoxia is a typical feature of the tumor microenvironment and one of the most critical factors affecting cell behavior and tumor progression. However, the lack of tumor models able to precisely emulate natural brain tumor tissue has impeded the study of the effects of hypoxia on the progression and growth of tumor cells. This study reports a three-dimensional (3D) brain tumor model obtained by encapsulating U87MG (U87) cells in a hydrogel containing type I collagen. It also documents the effect of various oxygen concentrations (1%, 7%, and 21%) in the culture environment on U87 cell morphology, proliferation, viability, cell cycle, apoptosis rate, and migration. Finally, it compares two-dimensional (2D) and 3D cultures, with cells cultured in flat culture dishes serving as the control (2D model). Cells cultured in the 3D model proliferated more slowly but had a higher apoptosis rate and a higher proportion of cells in the resting phase (G0 phase)/gap I phase (G1 phase) than those cultured in the 2D model. The two models also yielded significantly different cell morphologies. Hypoxia (e.g., 1% O2) affected cell morphology, slowed cell growth, reduced cell viability, and increased the apoptosis rate in the 3D model. These results indicate that the constructed 3D model is effective for investigating the effects of biological and chemical factors on cell morphology and function, and can be more representative of the tumor microenvironment than 2D culture systems. The developed 3D glioblastoma tumor model is equally applicable to other studies in pharmacology and pathology.
Cyberspace is extremely dynamic, with new attacks arising daily. Protecting cybersecurity controls is vital for network security. Deep Learning (DL) models find widespread use across various fields, with cybersecurity being one of the most crucial because of their ability to rapidly detect cyberattacks on networks and hosts. The capabilities of DL in feature learning and in analyzing extensive data volumes enable the recognition of network traffic patterns. This study presents novel lightweight DL models, known as Cybernet models, for the detection and recognition of various Distributed Denial of Service (DDoS) attacks. These models were constructed to have a reasonable number of learnable parameters, i.e., fewer than 225,000, hence the name "lightweight." This not only reduces the number of computations required but also results in faster training and inference times. Additionally, these models were designed to extract features in parallel from 1D Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM), which distinguishes them from existing architectures and results in better performance measures. To validate their robustness and effectiveness, they were tested on the CIC-DDoS2019 dataset, a large, imbalanced dataset that contains different types of DDoS attacks. Experimental results revealed that both models yielded promising results, with 99.99% for the detection model and 99.76% for the recognition model in terms of accuracy, precision, recall, and F1 score. Furthermore, they outperformed the existing state-of-the-art models proposed for the same task. Thus, the proposed models can be used in cybersecurity research domains to identify different types of attacks with a high detection and recognition rate.
Named Entity Recognition (NER) is a fundamental task in biomedical text mining. It aims to extract specific types of entities, such as genes, proteins, and diseases, from complex biomedical texts and categorize them into predefined entity types. This process can provide basic support for the automatic construction of knowledge bases. In contrast to general texts, biomedical texts frequently contain numerous nested entities and local dependencies among these entities, presenting significant challenges to prevailing NER models. To address these issues, we propose a novel Chinese nested biomedical NER model based on RoBERTa and Global Pointer (RoBGP). Our model first uses the RoBERTa-wwm-ext-large pretrained language model to dynamically generate word-level initial vectors. It then incorporates a Bidirectional Long Short-Term Memory network to capture bidirectional semantic information, effectively addressing the issue of long-distance dependencies. Furthermore, the Global Pointer model is employed to comprehensively recognize all nested entities in the text. We conduct extensive experiments on the Chinese medical dataset CMeEE, and the results demonstrate the superior performance of RoBGP over several baseline models. This research confirms the effectiveness of RoBGP in Chinese biomedical NER, providing reliable technical support for biomedical information extraction and knowledge base construction.
The Internet of Vehicles (IoV) is a new system that enables individual vehicles to connect with nearby vehicles, people, transportation infrastructure, and networks, thereby realizing a more intelligent and efficient transportation system. The movement of vehicles and the three-dimensional (3D) nature of the road network give the topological structure of IoV high space and time complexity. Network modeling and structure recognition for 3D roads can benefit the description of topological changes in IoV. This paper proposes a general 3D road model based on discrete points of roads obtained from GIS. First, the constraints imposed by 3D roads on moving vehicles are analyzed. Then the effects of road curvature radius (Ra), longitudinal slope (Slo), and length (Len) on speed and acceleration are studied. Finally, a general 3D road network model based on road section features is established. This paper also presents intersection and road section recognition methods based on the structural features of the 3D road network model and the road features. Real GIS data from a specific region of Beijing are used to create the simulation scenario, and the simulation results validate the general 3D road network model and the recognition method. This work therefore contributes to the field of intelligent transportation by providing a comprehensive approach to modeling the 3D road network and its topological changes, in support of efficient traffic flow and improved road safety.
Expanding photovoltaic (PV) resources in rural-grid areas is an essential means to augment the share of solar energy in the energy landscape, aligning with the "carbon peaking and carbon neutrality" objectives. However, rural power grids often lack digitalization; thus, the load distribution within these areas is not fully known. This hinders the calculation of the available PV capacity and the deduction of node voltages. This study proposes a load-distribution modeling approach based on remote-sensing image recognition in pursuit of a scientific framework for developing distributed PV resources in rural grid areas. First, houses in remote-sensing images are accurately recognized using deep-learning techniques based on the YOLOv5 model. The distribution of the houses is then used to estimate the load distribution in the grid area. Next, equally spaced and clustered distribution models are used to adaptively determine the location of the nodes and the load power in the distribution lines. Finally, by calculating the connectivity matrix of the nodes, a minimum spanning tree is extracted, the topology of the network is constructed, and the node parameters of the load-distribution model are calculated. The proposed scheme is implemented in a software package, and its efficacy is demonstrated by analyzing typical remote-sensing images of rural grid areas. The results underscore the ability of the proposed approach to effectively discern the distribution-line structure and compute the node parameters, thereby offering vital support for determining PV access capability.
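The minimum-spanning-tree step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the connectivity matrix holds pairwise Euclidean distances between node coordinates and that Prim's algorithm is used; the abstract does not state which weights or which algorithm the authors chose, and the names `connectivity_matrix` and `prim_mst` are hypothetical.

```python
import math


def connectivity_matrix(points):
    """Dense pairwise Euclidean distance matrix (assumed edge weight)."""
    return [[math.dist(p, q) for q in points] for p in points]


def prim_mst(weights):
    """Return MST edges (parent, node, weight) for a dense symmetric matrix."""
    n = len(weights)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known edge linking each node to the tree
    parent = [-1] * n
    best[0] = 0.0  # start growing the tree from node 0
    edges = []
    for _ in range(n):
        # Pick the node outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] != -1:
            edges.append((parent[u], u, weights[parent[u]][u]))
        # Relax the attachment cost of every node still outside the tree.
        for v in range(n):
            if not in_tree[v] and weights[u][v] < best[v]:
                best[v] = weights[u][v]
                parent[v] = u
    return edges


# Hypothetical node coordinates (e.g., load nodes derived from detected houses).
nodes = [(0, 0), (1, 0), (2, 0), (0, 5)]
tree = prim_mst(connectivity_matrix(nodes))
```

For a radial distribution line, the resulting edge list can serve as a first guess at the network topology; real feeders would need additional constraints that this sketch ignores.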
Based on drilling and seismic data for the transtensional (strike-slip) fault system in the Ziyang area of the central Sichuan Basin, SW China, and using plane-section integrated structural interpretation, 3-D fault framework modeling, fault throw analysis, and balanced profile restoration, this study shows that the transtensional fault system in the Ziyang 3-D seismic survey consists of the northeast-trending F_(I)19 and F_(I)20 fault zones, dominated by extensional deformation, and three sets of northwest-trending en echelon normal faults that experienced dextral shear deformation. The F_(I)19 and F_(I)20 fault zones cut through the Neoproterozoic to the Lower Triassic Jialingjiang Formation, presenting a 3-D structure of an "S"-shaped ribbon. Before the Permian and during the Early Triassic, the F_(I)19 and F_(I)20 fault zones underwent at least two periods of structural superimposition. The three sets of northwest-trending en echelon normal faults are composed of small normal faults arranged in pairs, with opposite dip directions and a partially left-stepped arrangement. They had largely formed before the Permian, restricting the eastward growth and propagation of the F_(I)19 fault zone. The F_(I)19 and F_(I)20 fault zones connect multiple sets of source rocks and reservoirs from deep to shallow, and the timing of fault activity matches well with the peaks of oil and gas generation. If favorable Cambrian-Triassic sedimentary facies and reservoirs are developed on the local anticlinal belts on both sides of the F_(I)19 and F_(I)20 fault zones, the major reservoirs in this area are expected to yield breakthroughs in oil and gas exploration.
Handwritten character recognition (HCR) involves identifying characters in images, documents, and various sources such as forms, surveys, questionnaires, and signatures, and transforming them into a machine-readable format for subsequent processing. Successfully recognizing complex and intricately shaped handwritten characters remains a significant obstacle. The recent use of convolutional neural networks (CNNs) has notably advanced HCR by leveraging their ability to extract discriminative features from extensive sets of raw data. Because no pre-existing datasets were available for the Kurdish language, we created a Kurdish handwritten dataset called KurdSet. The dataset consists of Kurdish characters, digits, texts, and symbols; it was gathered from 1560 participants and contains 45,240 characters. In this study, we used only the characters from our dataset for handwritten character recognition. The study uses various models, including InceptionV3, Xception, DenseNet121, and a custom CNN model. To show the performance of the KurdSet dataset, we compared it with the Arabic Handwritten Characters Dataset (AHCD) by applying the models to both datasets. The performance of the models is evaluated using test accuracy, which measures the percentage of correctly classified characters in the evaluation phase. All models performed well in the training phase. DenseNet121 exhibited the highest accuracy among the models, achieving 99.80% on the Kurdish dataset, and the Xception model achieved 98.66% on the Arabic dataset.
In recent years, deep learning-based signal recognition technology has gained attention and emerged as an important approach for safeguarding the electromagnetic environment. However, training deep learning-based classifiers on large signal datasets with redundant samples requires significant memory and incurs high costs. This paper proposes a support data-based core-set selection method (SD) for signal recognition, aiming to screen a representative subset that approximates the large signal dataset. Specifically, this subset can be identified by employing the labeled information during the early stages of model training, as some training samples are frequently labeled as supporting data. This support data is crucial for model training and can be found using a border sample selector. Simulation results demonstrate that the SD method minimizes the impact on model recognition performance while reducing the dataset size, and outperforms five other state-of-the-art core-set selection methods when the fraction of training samples kept is less than or equal to 0.3 on the RML2016.04C dataset or 0.5 on the RML22 dataset. The SD method is particularly helpful for signal recognition tasks with limited memory and computing resources.
Research on Chinese Sign Language (CSL) provides convenience and support for individuals with hearing impairments to communicate and integrate into society. This article reviews the relevant literature on Chinese Sign Language Recognition (CSLR) from the past 20 years. Hidden Markov Models (HMM), Support Vector Machines (SVM), and Dynamic Time Warping (DTW) were found to be the most commonly employed technologies among traditional identification methods. Benefiting from the rapid development of computer vision and artificial intelligence technology, Convolutional Neural Networks (CNN), 3D-CNN, YOLO, Capsule Network (CapsNet), and various other deep neural networks have sprung up. Deep Neural Networks (DNNs) and their derived models are integral to modern artificial intelligence recognition methods. In addition, technologies that were widely used in the early days have been integrated into specific hybrid models and customized identification methods. Sign language data collection includes acquiring data from data gloves, data sensors (such as Kinect and LeapMotion), and high-definition photography. Meanwhile, facial expression recognition, complex background processing, and 3D sign language recognition have also attracted research interest. Because of the uniqueness and complexity of Chinese sign language, accuracy, robustness, real-time performance, and user independence are significant challenges for future sign language recognition research. Suitable datasets and evaluation criteria are also worth pursuing.
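Dynamic Time Warping, named in the review above as one of the traditional techniques, aligns two sequences of different lengths by allowing elements to be stretched in time. The following is a minimal sketch of the textbook dynamic-programming form for one-dimensional sequences with absolute difference as the local cost; it is not taken from any of the reviewed systems, and the function name `dtw_distance` is hypothetical.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW cost between two numeric sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: stretch a, stretch b, or advance both.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


# A gesture trace and a slower repetition of it align at zero cost.
same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
```

In a sign recognition setting, each element would typically be a feature vector per frame with a vector distance as the local cost, and a query would be assigned the label of the template with the smallest DTW cost.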
Dynamic signature is a biometric modality that recognizes an individual's anatomic and behavioural characteristics when signing their name. The rampant cases of signature falsification (identity theft) were the key motivating factor for this study, which was necessitated by the damage and dangers posed by signature forgery and by the intractable nature of the problem. The aim of this study is to design a proactive and responsive system that can compare two signature samples and distinguish the genuine signature from the forged one. Dynamic signature verification is an important biometric technique that aims to detect whether a given signature is genuine or forged. In this research work, Convolutional Neural Networks (CNNs or ConvNets), a class of deep, feed-forward artificial neural networks that has been applied successfully to analysing visual imagery, were used to train the model. The signature images are stored in a file directory structure that the Keras Python library can work with. The CNN was then implemented in Python using Keras with the TensorFlow backend to learn the patterns associated with the signature. The results showed that, for the same CNN-based network, the larger the training dataset, the higher the average test accuracy. However, when the training dataset is insufficient, better results can be obtained. The paper concluded that training on the datasets with the CNN yielded 98% accuracy; in the experimental part, the model achieved a high degree of accuracy in the classification of the biometric parameters used.
BACKGROUND Esophageal cancer is one of the most common malignant tumors. The three-dimensional quality structure model is a quality assessment theory that includes three dimensions: structure, process, and results. AIM To investigate the effects of nursing interventions with three-dimensional quality assessment on the self-efficacy and disease management ability of patients undergoing esophageal cancer surgery. METHODS In this prospective study, the control group received routine nursing, and the intervention group additionally received a three-dimensional quality assessment intervention on top of the routine care. Self-efficacy and patient disease management abilities were evaluated using the General Self-Efficacy Scale (GSES) and the Exercise of Self-Care Agency scale, respectively. IBM SPSS Statistics for Windows, version 17.0, was used for data processing. RESULTS This study recruited 112 patients, who were assigned to the control and experimental groups (n = 56 per group). Before the intervention, there was no significant difference in GSES scores between the two groups (P > 0.05). After the intervention, the GSES scores of both groups increased, with the experimental group showing higher values (P < 0.05). At the time of discharge and three months after discharge, the scores for positive attitudes, self-stress reduction, and total health promotion in the experimental group were higher than those in the control group (P < 0.05). CONCLUSION Implementing a three-dimensional quality structure model for postoperative patients with esophageal cancer can effectively improve their self-management ability and self-efficacy.
The hands and face are the most important parts for expressing sign language morphemes in sign language videos. However, we find that existing Continuous Sign Language Recognition (CSLR) methods either do not mine hand and face information in their visual backbones or use expensive and time-consuming external extractors to explore this information. In addition, signs have different lengths, whereas previous CSLR methods typically use a fixed-length window to segment the video to capture sequential features and then perform global temporal modeling, which disturbs the perception of complete signs. In this study, we propose a Multi-Scale Context-Aware network (MSCA-Net) to solve these problems. Our MSCA-Net contains two main modules: (1) Multi-Scale Motion Attention (MSMA), which uses the differences among frames to perceive information about the hands and face at multiple spatial scales, replacing the heavy feature extractors; and (2) Multi-Scale Temporal Modeling (MSTM), which explores crucial temporal information in the sign language video at different temporal scales. We conduct extensive experiments on three widely used sign language datasets, i.e., RWTH-PHOENIX-Weather-2014, RWTH-PHOENIX-Weather-2014T, and CSL-Daily. The proposed MSCA-Net achieves state-of-the-art performance, demonstrating the effectiveness of our approach.
To address the problem that simulation accuracy is reduced by simplification of the model, a three-dimensional simulation method based on solid modeling is proposed. By analyzing the motion relationship and positional relationship between the caries knife and the workpiece, the coordinate system of the caries machining was established. Using MATLAB, the cutting edge model and the blade sweeping surface model of the boring cutter are established in sequence. A Boolean operation is performed between the blade swept surface formed by the tooth cutter teeth over time t and the workpiece tooth geometry, and the undeformed three-dimensional chip geometry model and the instantaneous cogging geometry model are obtained at different times. By comparing the simulated tooth profile on the gear end face with the theoretical inner arc tooth profile, we verified the accuracy and rationality of the proposed method.
We combined domestic ground-based and satellite magnetic measurements to create a regional three-dimensional surface Spline (3DSS) gradient model of the main geomagnetic field over the Chinese continent. To improve the precision of the model, we considered the data gap between the ground and satellite data. We compared and analyzed the results of the Taylor polynomial, surface Spline, and CHAOS-6 (the CHAMP, Ørsted, and SAC-C model of Earth's magnetic field) gradient models. The results showed that the gradients in the south-north and east-west directions of the four models were consistent. The 3DSS model was able to express not only gradients at different altitudes but also average gradients inside the research area. The two Spline models captured more information on gradient anomalies than the fitted models did. Strong local anomalies were observed in northern Xinjiang, Beijing, and the junction area between Jiangsu and Zhejiang, and the total intensity F decreased as the altitude increased. The gradient decreased by 21.69% in the south-north direction and increased by 11.78% in the east-west direction. In addition, the altitude gradient turned from negative to positive as the altitude increased. The Spline model and the two fitted models differed mainly in the field sources they expressed and in the modeling theory.
Objective To evaluate the predictive validity of IRIS™ (Intuitive Surgical®, Sunnyvale, CA, USA) as a planning tool for robot-assisted partial nephrectomy (RAPN) by assessing the degree of overlap with intraoperative execution. Methods Thirty-one patients scheduled for RAPN by four experienced urologists were enrolled in a prospective study. Prior to surgery, urologists reviewed the IRIS™ three-dimensional model on an iPhone Operating System (iOS) app and completed a questionnaire outlining their surgical plan, including surgical approach and ischemia technique, as well as their confidence in executing this plan. Postoperatively, questionnaires assessing the procedural approach, clinical utility, efficiency, and effectiveness of IRIS™ were completed. The degree of overlap between the preoperative and intraoperative questionnaires and between the planned approach and actual execution of the procedure was analyzed. Questionnaires were answered on a 5-point Likert scale, and scores of 4 or greater were considered positive. Results Mean age was 65.1 years, with a mean tumor size of 27.7 mm (interquartile range 17.5-44.0 mm). Hilar tumors accounted for 32.3% of cases; 48.4% of patients had R.E.N.A.L. nephrometry scores of 7-9. On preoperative questionnaires, the surgeons reported that in 67.7% of cases they were confident that they could perform the procedure successfully, and on intraoperative questionnaires, the surgeons reported that in 96.8% of cases IRIS™ helped achieve good spatial sensation of the anatomy. There was a high degree of overlap between preoperative and intraoperative questionnaires for the surgical approach, interpreting anatomical details, and clinical utility. When comparing plans for selective or off-clamp, the preoperative plan was executed in 90.0% of cases intraoperatively. Conclusion A high degree of overlap between the preoperative surgical approach and intraoperative RAPN execution was found using IRIS™. This is the first study to evaluate the predictive accuracy of IRIS™ during RAPN by comparing the preoperative plan and intraoperative execution.
In this paper, the axial-flux permanent magnet driver is modeled and analyzed in a simple and novel way in three-dimensional cylindrical coordinates. The inherent three-dimensional characteristics of the device are comprehensively considered, and the governing equations are solved by simplifying the boundary conditions. The axial magnetization of the sector-shaped permanent magnets is accurately described in an algebraic form by the parameters, which makes the physical meaning more explicit than the purely mathematical expression in general series forms. The parameters of the Bessel function are determined simply, and the magnetic field distribution of the permanent magnets and the air gap is solved. Furthermore, the field solutions are completely analytical, which provides convenience and satisfactory accuracy for modeling a series of electromagnetic performance parameters, such as the axial electromagnetic force density, axial electromagnetic force, and electromagnetic torque. The correctness and accuracy of the analytical models are fully verified by three-dimensional finite element simulations and a 15 kW prototype, and the results of the three methods (calculation, simulation, and experiment) are highly consistent. The influence of several design parameters on magnetic field distribution and performance is studied and discussed. The results indicate that the modeling method proposed in this paper can calculate the magnetic field distribution and performance accurately and rapidly, which affords an important reference for the design and optimization of axial-flux permanent magnet drivers.
This study aims to address the deviation in downstream tasks caused by inaccurate recognition results when applying Automatic Speech Recognition (ASR) technology in the Air Traffic Control (ATC) field. This paper presents a novel cascaded model architecture, namely Conformer-CTC/Attention-T5 (CCAT), to build a highly accurate and robust ATC speech recognition model. To tackle the challenges posed by noise and fast speech rate in ATC, the Conformer model is employed to extract robust and discriminative speech representations from raw waveforms. On the decoding side, the Attention mechanism is integrated to facilitate precise alignment between input features and output characters. The Text-To-Text Transfer Transformer (T5) language model is also introduced to handle particular pronunciations and code-mixing issues, providing more accurate and concise textual output for downstream tasks. To enhance the model's robustness, transfer learning and data augmentation techniques are used in the training strategy. The model's performance is optimized by hyperparameter tuning, such as adjusting the number of attention heads, the number of encoder layers, and the weights of the loss function. The experimental results demonstrate the significant contributions of data augmentation, hyperparameter tuning, and error correction models to the overall model performance. On the Our ATC Corpus dataset, the proposed model achieves a Character Error Rate (CER) of 3.44%, representing a 3.64% improvement compared to the baseline model. Moreover, the effectiveness of the proposed model is validated on two publicly available datasets. On the AISHELL-1 dataset, the CCAT model achieves a CER of 3.42%, a 1.23% improvement over the baseline model. Similarly, on the LibriSpeech dataset, the CCAT model achieves a Word Error Rate (WER) of 5.27%, a performance improvement of 7.67% compared to the baseline model. Additionally, this paper proposes an evaluation criterion for assessing the robustness of ATC speech recognition systems. In robustness evaluation experiments based on this criterion, the proposed model demonstrates a performance improvement of 22% compared to the baseline model.
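The CER and WER figures quoted in the abstract above are conventionally computed as the Levenshtein (edit) distance between the reference and the recognized output, divided by the reference length. The sketch below shows that standard definition; it is an illustration rather than the authors' evaluation script, the function names are hypothetical, and the example phrases are made up.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or token lists)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,             # deletion of r
                cur[j - 1] + 1,          # insertion of h
                prev[j - 1] + (r != h),  # substitution, or match at no cost
            ))
        prev = cur
    return prev[-1]


def error_rate(ref, hyp):
    """Edits per reference unit: CER for strings, WER for word lists."""
    if len(ref) == 0:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Character level: one substitution in five characters.
cer = error_rate("climb", "clime")
# Word level: one deleted word out of four.
wer = error_rate("turn left heading two".split(), "turn left heading".split())
```

Passing a string gives a character-level rate and passing a list of words gives a word-level rate, so the same function covers both metrics.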
Automatic speech recognition (ASR) systems have emerged as indispensable tools across a wide spectrum of applications, ranging from transcription services to voice-activated assistants. To enhance the performance of these systems, it is important to deploy efficient models capable of adapting to diverse deployment conditions. In recent years, on-demand pruning methods have attracted significant attention within the ASR domain because of their adaptability to various deployment scenarios. However, these methods often face substantial trade-offs, particularly unstable accuracy when the model size is reduced. To address these challenges, this study introduces two crucial empirical findings. First, it proposes incorporating an online distillation mechanism during on-demand pruning training, which holds the promise of maintaining more consistent accuracy levels. Second, it proposes using the Mogrifier long short-term memory (LSTM) language model (LM), an advanced iteration of the conventional LSTM LM, as an effective alternative pruning target within the ASR framework. Experiments on an ASR system that employs the Mogrifier LSTM LM, trained with the suggested joint on-demand pruning and online distillation method, show that the proposed methods significantly outperform a benchmark model trained solely with on-demand pruning. The proposed configuration reduces the parameter count by approximately 39% while minimizing trade-offs.
Congenital heart defects, accounting for about 30% of congenital defects, are the most common type. Data show that congenital heart defects have seriously affected the birth rate of healthy newborns. In fetal and neonatal cardiology, medical imaging technology (2D ultrasound, MRI) has proved helpful in detecting congenital defects of the fetal heart and assists sonographers in prenatal diagnosis. Manually recognizing the 2D fetal heart ultrasonic standard plane (FHUSP) is a highly complex task. Compared with manual identification, automatic identification through artificial intelligence can save a lot of time, ensure the efficiency of diagnosis, and improve its accuracy. In this study, a feature extraction method based on texture features (Local Binary Pattern, LBP, and Histogram of Oriented Gradient, HOG) combined with the Bag of Words (BOW) model is applied, and feature fusion is then performed. Finally, a Support Vector Machine (SVM) is adopted to realize automatic recognition and classification of FHUSP. The data include 788 standard plane data sets and 448 normal and abnormal plane data sets. Compared with other methods and single-method models, the classification accuracy of our model is clearly improved, with the highest accuracy reaching 87.35%. We also verify the performance of the model on normal and abnormal planes, where the average classification accuracy is 84.92%. The experimental results show that this method can effectively classify and predict different FHUSP and can provide certain assistance for sonographers in diagnosing fetal congenital heart disease.
Micro-expressions are spontaneous, unconscious movements that reveal true emotions. Accurate facial movement information and network training methods are crucial for micro-expression recognition. However, most existing micro-expression recognition technologies focus on modeling a single category of micro-expression images and on neural network structure. To address the low recognition rate and weak model generalization ability in micro-expression recognition, a micro-expression recognition algorithm based on a graph convolution network (GCN) and a Transformer model is proposed. First, action unit (AU) features are extracted, and facial muscle nodes in the neighborhood are divided into three subsets for recognition. Then, a graph convolution layer is used to find the layout of dependencies between AU nodes for micro-expression classification. Finally, multiple attentional features of each facial action are enriched with the Transformer model to include more sequence information before calculating the overall correlation of each region. The proposed method is validated on the CASME II and CAS(ME)^2 datasets, and the recognition rate reached 69.85%.
Funding: Supported by the National Natural Science Foundation of China (No. 52275291), the Fundamental Research Funds for the Central Universities, and the Program for Innovation Team of Shaanxi Province, China (No. 2023-CX-TD-17).
Abstract: Hypoxia is a typical feature of the tumor microenvironment, one of the most critical factors affecting cell behavior and tumor progression. However, the lack of tumor models able to precisely emulate natural brain tumor tissue has impeded the study of the effects of hypoxia on the progression and growth of tumor cells. This study reports a three-dimensional (3D) brain tumor model obtained by encapsulating U87MG (U87) cells in a hydrogel containing type I collagen. It also documents the effect of various oxygen concentrations (1%, 7%, and 21%) in the culture environment on U87 cell morphology, proliferation, viability, cell cycle, apoptosis rate, and migration. Finally, it compares two-dimensional (2D) and 3D cultures. For comparison purposes, cells cultured in flat culture dishes were used as the control (2D model). Cells cultured in the 3D model proliferated more slowly but had a higher apoptosis rate and proportion of cells in the resting phase (G0 phase)/gap I phase (G1 phase) than those cultured in the 2D model. In addition, the two models yielded significantly different cell morphologies. Finally, hypoxia (e.g., 1% O2) affected cell morphology, slowed cell growth, reduced cell viability, and increased the apoptosis rate in the 3D model. These results indicate that the constructed 3D model is effective for investigating the effects of biological and chemical factors on cell morphology and function, and can be more representative of the tumor microenvironment than 2D culture systems. The developed 3D glioblastoma tumor model is equally applicable to other studies in pharmacology and pathology.
Abstract: Cyberspace is extremely dynamic, with new attacks arising daily. Protecting cybersecurity controls is vital for network security. Deep Learning (DL) models find widespread use across various fields, with cybersecurity being one of the most crucial due to their rapid cyberattack detection capabilities on networks and hosts. The capabilities of DL in feature learning and analyzing extensive data volumes lead to the recognition of network traffic patterns. This study presents novel lightweight DL models, known as Cybernet models, for the detection and recognition of various cyber Distributed Denial of Service (DDoS) attacks. These models were constructed to have a reasonable number of learnable parameters, i.e., fewer than 225,000, hence the name “lightweight.” This not only helps reduce the number of computations required but also results in faster training and inference times. Additionally, these models were designed to extract features in parallel from 1D Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM), which makes them unique compared to earlier architectures and results in better performance measures. To validate their robustness and effectiveness, they were tested on the CIC-DDoS2019 dataset, which is a large, imbalanced dataset that contains different types of DDoS attacks. Experimental results revealed that both models yielded promising results, with 99.99% for the detection model and 99.76% for the recognition model in terms of accuracy, precision, recall, and F1 score. Furthermore, they outperformed the existing state-of-the-art models proposed for the same task. Thus, the proposed models can be used in cybersecurity research domains to successfully identify different types of attacks with a high detection and recognition rate.
Funding: Supported by the Outstanding Youth Team Project of Central Universities (QNTD202308) and the Ant Group through CCF-Ant Research Fund (CCF-AFSG 769498 RF20220214).
Abstract: Named Entity Recognition (NER) stands as a fundamental task within the field of biomedical text mining, aiming to extract specific types of entities such as genes, proteins, and diseases from complex biomedical texts and categorize them into predefined entity types. This process can provide basic support for the automatic construction of knowledge bases. In contrast to general texts, biomedical texts frequently contain numerous nested entities and local dependencies among these entities, presenting significant challenges to prevailing NER models. To address these issues, we propose a novel Chinese nested biomedical NER model based on RoBERTa and Global Pointer (RoBGP). Our model initially utilizes the RoBERTa-wwm-ext-large pretrained language model to dynamically generate word-level initial vectors. It then incorporates a Bidirectional Long Short-Term Memory network for capturing bidirectional semantic information, effectively addressing the issue of long-distance dependencies. Furthermore, the Global Pointer model is employed to comprehensively recognize all nested entities in the text. We conduct extensive experiments on the Chinese medical dataset CMeEE, and the results demonstrate the superior performance of RoBGP over several baseline models. This research confirms the effectiveness of RoBGP in Chinese biomedical NER, providing reliable technical support for biomedical information extraction and knowledge base construction.
Funding: Supported by the National Natural Science Foundation of China (Nos. 62272063, 62072056 and 61902041), the Natural Science Foundation of Hunan Province (Nos. 2022JJ30617 and 2020JJ2029), the Open Research Fund of Key Lab of Broadband Wireless Communication and Sensor Network Technology, Nanjing University of Posts and Telecommunications (No. JZNY202102), the Traffic Science and Technology Project of Hunan Province, China (No. 202042), and the Hunan Provincial Key Research and Development Program (No. 2022GK2019). This work was also funded by the Researchers Supporting Project (No. RSPD2023R681), King Saud University, Riyadh, Saudi Arabia.
Abstract: Internet of Vehicles (IoV) is a new system that enables individual vehicles to connect with nearby vehicles, people, transportation infrastructure, and networks, thereby realizing a more intelligent and efficient transportation system. The movement of vehicles and the three-dimensional (3D) nature of the road network cause the topological structure of IoV to have high space and time complexity. Network modeling and structure recognition for 3D roads can benefit the description of topological changes for IoV. This paper proposes a general 3D road model based on discrete points of roads obtained from GIS. First, the constraints imposed by 3D roads on moving vehicles are analyzed. Then the effects of road curvature radius (Ra), longitudinal slope (Slo), and length (Len) on speed and acceleration are studied. Finally, a general 3D road network model based on road section features is established. This paper also presents intersection and road section recognition methods based on the structural features of the 3D road network model and the road features. Real GIS data from a specific region of Beijing is adopted to create the simulation scenario, and the simulation results validate the general 3D road network model and the recognition method. Therefore, this work contributes to the field of intelligent transportation by providing a comprehensive approach to modeling the 3D road network and its topological changes in achieving efficient traffic flow and improved road safety.
Funding: Supported by the State Grid Science & Technology Project of China (5400-202224153A-1-1-ZN).
Abstract: Expanding photovoltaic (PV) resources in rural-grid areas is an essential means to augment the share of solar energy in the energy landscape, aligning with the “carbon peaking and carbon neutrality” objectives. However, rural power grids often lack digitalization; thus, the load distribution within these areas is not fully known. This hinders the calculation of the available PV capacity and deduction of node voltages. This study proposes a load-distribution modeling approach based on remote-sensing image recognition in pursuit of a scientific framework for developing distributed PV resources in rural grid areas. First, houses in remote-sensing images are accurately recognized using deep-learning techniques based on the YOLOv5 model. The distribution of the houses is then used to estimate the load distribution in the grid area. Next, equally spaced and clustered distribution models are used to adaptively determine the location of the nodes and load power in the distribution lines. Finally, by calculating the connectivity matrix of the nodes, a minimum spanning tree is extracted, the topology of the network is constructed, and the node parameters of the load-distribution model are calculated. The proposed scheme is implemented in a software package and its efficacy is demonstrated by analyzing typical remote-sensing images of rural grid areas. The results underscore the ability of the proposed approach to effectively discern the distribution-line structure and compute the node parameters, thereby offering vital support for determining PV access capability.
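The abstract above mentions extracting a minimum spanning tree from the node connectivity matrix but does not say which algorithm or edge weights are used. The following is a minimal sketch, assuming Euclidean distances between node coordinates as weights and Prim's algorithm for the tree; the function names `distance_matrix` and `minimum_spanning_tree` and the sample coordinates are hypothetical and not taken from the paper.

```python
import math


def distance_matrix(points):
    """Dense symmetric matrix of Euclidean distances (assumed edge weights)."""
    return [[math.dist(p, q) for q in points] for p in points]


def minimum_spanning_tree(weights):
    """Prim's algorithm on a dense weight matrix.

    math.inf marks a missing connection. Returns a list of
    (parent, child, weight) edges; raises ValueError if disconnected.
    """
    n = len(weights)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known edge into each node
    parent = [-1] * n
    best[0] = 0.0           # grow the tree from node 0
    edges = []
    for _ in range(n):
        # Pick the cheapest node not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        if best[u] == math.inf:
            raise ValueError("graph is not connected")
        in_tree[u] = True
        if parent[u] != -1:
            edges.append((parent[u], u, weights[parent[u]][u]))
        # Relax the edges leaving the newly added node.
        for v in range(n):
            if not in_tree[v] and weights[u][v] < best[v]:
                best[v] = weights[u][v]
                parent[v] = u
    return edges


if __name__ == "__main__":
    # Hypothetical node positions, e.g. load nodes placed along a line.
    nodes = [(0, 0), (1, 0), (1, 1), (5, 1)]
    for a, b, w in minimum_spanning_tree(distance_matrix(nodes)):
        print(f"{a} -> {b}: {w:.3f}")
```

The resulting edge list gives one possible radial topology for the nodes; a real distribution network would likely need additional constraints (feeder roots, line ratings) that this sketch does not model.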
Funding: Supported by the Key Project of National Natural Science Foundation of China (42330810).
Abstract: Based on drilling and seismic data from the transtensional (strike-slip) fault system in the Ziyang area of the central Sichuan Basin, SW China, and using integrated plane-section structural interpretation, 3-D fault framework modeling, fault throw analysis, and balanced profile restoration, this study shows that the transtensional fault system in the Ziyang 3-D seismic survey consists of the northeast-trending F_(I)19 and F_(I)20 fault zones, dominated by extensional deformation, and 3 sets of northwest-trending en echelon normal faults that experienced dextral shear deformation. The F_(I)19 and F_(I)20 fault zones cut through the Neoproterozoic to Lower Triassic Jialingjiang Formation, presenting a 3-D structure of an “S”-shaped ribbon. Before the Permian and during the Early Triassic, the F_(I)19 and F_(I)20 fault zones underwent at least two periods of structural superimposition. The 3 sets of northwest-trending en echelon normal faults are composed of small normal faults arranged in pairs, with opposite dip directions and a partially left-stepped arrangement. They had essentially formed before the Permian, restricting the eastward growth and propagation of the F_(I)19 fault zone. The F_(I)19 and F_(I)20 fault zones connect multiple sets of source rocks and reservoirs from deep to shallow, and the timing of fault activity matches well with oil and gas generation peaks. If favorable Cambrian-Triassic sedimentary facies and reservoirs are developed on the local anticlinal belts on both sides of the F_(I)19 and F_(I)20 fault zones, breakthroughs in oil and gas exploration are expected in the major reservoirs of this area.
Abstract: Handwritten character recognition (HCR) involves identifying characters in images, documents, and various sources such as forms, surveys, questionnaires, and signatures, and transforming them into a machine-readable format for subsequent processing. Successfully recognizing complex and intricately shaped handwritten characters remains a significant obstacle. The use of convolutional neural networks (CNN) in recent developments has notably advanced HCR, leveraging the ability to extract discriminative features from extensive sets of raw data. Because of the absence of pre-existing datasets in the Kurdish language, we created a Kurdish handwritten dataset called KurdSet. The dataset consists of Kurdish characters, digits, texts, and symbols. It was gathered from 1560 participants and contains 45,240 characters. In this study, we used only the characters from our dataset for handwritten character recognition. The study utilizes various models, including InceptionV3, Xception, DenseNet121, and a custom CNN model. To show the performance of the KurdSet dataset, we compared it to the Arabic handwritten character recognition dataset (AHCD), applying the models to both datasets. The performance of the models is evaluated using test accuracy, which measures the percentage of correctly classified characters in the evaluation phase. All models performed well in the training phase. DenseNet121 exhibited the highest accuracy among the models, achieving 99.80% on the Kurdish dataset, and the Xception model achieved 98.66% on the Arabic dataset.
Funding: Supported by the National Natural Science Foundation of China (62371098), the Natural Science Foundation of Sichuan Province (2023NSFSC1422), the National Key Research and Development Program of China (2021YFB2900404), and Central Universities of Southwest Minzu University (ZYN2022032).
Abstract: In recent years, deep learning-based signal recognition technology has gained attention and emerged as an important approach for safeguarding the electromagnetic environment. However, training deep learning-based classifiers on large signal datasets with redundant samples requires significant memory and high costs. This paper proposes a support data-based core-set selection method (SD) for signal recognition, aiming to screen a representative subset that approximates the large signal dataset. Specifically, this subset can be identified by employing the labeled information during the early stages of model training, since some training samples are frequently labeled as support data. This support data is crucial for model training and can be found using a border sample selector. Simulation results demonstrate that the SD method minimizes the impact on model recognition performance while reducing the dataset size, and outperforms five other state-of-the-art core-set selection methods when the fraction of training samples kept is less than or equal to 0.3 on the RML2016.04C dataset or 0.5 on the RML22 dataset. The SD method is particularly helpful for signal recognition tasks with limited memory and computing resources.
Funding: Supported by the National Social Science Foundation Annual Project “Research on Evaluation and Improvement Paths of Integrated Development of Disabled Persons” (Grant No. 20BRK029), the National Language Commission’s “14th Five-Year Plan” Scientific Research Plan 2023 Project “Domain Digital Language Service Resource Construction and Key Technology Research” (YB145-72), and the National Philosophy and Social Sciences Foundation (Grant No. 20BTQ065).
Abstract: Research on Chinese Sign Language (CSL) provides convenience and support for individuals with hearing impairments to communicate and integrate into society. This article reviews the relevant literature on Chinese Sign Language Recognition (CSLR) in the past 20 years. Hidden Markov Models (HMM), Support Vector Machines (SVM), and Dynamic Time Warping (DTW) were found to be the most commonly employed technologies among traditional identification methods. Benefiting from the rapid development of computer vision and artificial intelligence technology, Convolutional Neural Networks (CNN), 3D-CNN, YOLO, Capsule Network (CapsNet), and various deep neural networks have sprung up. Deep Neural Networks (DNNs) and their derived models are integral to modern artificial intelligence recognition methods. In addition, technologies that were widely used in the early days have also been integrated and applied to specific hybrid models and customized identification methods. Sign language data collection includes acquiring data from data gloves, data sensors (such as Kinect and Leap Motion), and high-definition photography. Meanwhile, facial expression recognition, complex background processing, and 3D sign language recognition have also attracted research interest among scholars. Due to the uniqueness and complexity of Chinese sign language, accuracy, robustness, real-time performance, and user independence are significant challenges for future sign language recognition research. Additionally, suitable datasets and evaluation criteria are also worth pursuing.
Abstract: Dynamic signature is a biometric modality that recognizes an individual’s anatomic and behavioural characteristics when signing their name. The rampant cases of signature falsification (identity theft) were the key motivating factor for this study, which was necessitated by the damage and dangers posed by signature forgery and by the intractable nature of the problem. The aim of this study is to design a proactive and responsive system that can compare two signature samples and distinguish the correct signature from the forged one. Dynamic signature verification is an important biometric technique that aims to detect whether a given signature is genuine or forged. In this research work, Convolutional Neural Networks (CNNs or ConvNets), a class of deep, feed-forward artificial neural networks that has been successfully applied to analysing visual imagery, were used to train the model. The signature images are stored in a file directory structure that the Keras Python library can work with. The CNN was then implemented in Python using Keras with the TensorFlow backend to learn the patterns associated with the signature. The results showed that, for the same CNN-based network, the larger the training dataset, the higher the test accuracy. However, when the training dataset is insufficient, better results can be obtained. The paper concludes that training on the datasets with the CNN yielded 98% accuracy; in the experiments, the model achieved a high degree of accuracy in classifying the biometric parameters used.
Abstract: BACKGROUND: Esophageal cancer is one of the most common malignant tumors. The three-dimensional quality structure model is a quality assessment theory that includes three dimensions: structure, process, and results. AIM: To investigate the effects of nursing interventions with three-dimensional quality assessment on the efficacy and disease management ability of patients undergoing esophageal cancer surgery. METHODS: In this prospective study, the control group received routine nursing, and the intervention group received a three-dimensional quality assessment intervention in addition to routine care. Self-efficacy and patient disease management abilities were evaluated using the General Self-Efficacy Scale (GSES) and the Exercise of Self-Care Agency scale, respectively. IBM SPSS Statistics for Windows, version 17.0, was used for the data processing. RESULTS: This study recruited 112 patients who were assigned to the control and experimental groups (n=56 per group). Before the intervention, there was no significant difference in GSES scores between the two groups (P>0.05). After the intervention, the GSES scores of both groups increased, with the experimental group showing higher values (P<0.05). At the time of discharge and three months after discharge, the scores for positive attitudes, self-stress reduction, and total score of health promotion in the experimental group were higher than those in the control group (P<0.05). CONCLUSION: The implementation of a three-dimensional quality structure model for postoperative patients with esophageal cancer can effectively improve their self-management ability and self-efficacy.
Funding: Supported by the National Natural Science Foundation of China (62072334).
Abstract: The hands and face are the most important parts for expressing sign language morphemes in sign language videos. However, we find that existing Continuous Sign Language Recognition (CSLR) methods lack the mining of hand and face information in visual backbones or use expensive and time-consuming external extractors to explore this information. In addition, signs have different lengths, whereas previous CSLR methods typically use a fixed-length window to segment the video to capture sequential features and then perform global temporal modeling, which disturbs the perception of complete signs. In this study, we propose a Multi-Scale Context-Aware network (MSCA-Net) to solve the aforementioned problems. Our MSCA-Net contains two main modules: (1) Multi-Scale Motion Attention (MSMA), which uses the differences among frames to perceive information of the hands and face in multiple spatial scales, replacing the heavy feature extractors; and (2) Multi-Scale Temporal Modeling (MSTM), which explores crucial temporal information in the sign language video from different temporal scales. We conduct extensive experiments using three widely used sign language datasets, i.e., RWTH-PHOENIX-Weather-2014, RWTH-PHOENIX-Weather-2014T, and CSL-Daily. The proposed MSCA-Net achieves state-of-the-art performance, demonstrating the effectiveness of our approach.
Funding: Supported by the National Natural Science Foundation of China (Nos. 52165060, 12272189), the Program for Young Talents of Science and Technology in Universities of Inner Mongolia Autonomous Region (NJYT23022), Science and Technology Projects of Inner Mongolia Autonomous Region (2021GG0432), the Central Guiding Local Science and Technology Development Plan (2022ZY0013), and the Basic research business fee project for universities directly under Inner Mongolia Autonomous Region (GXKY22046).
Abstract: To address the loss of simulation accuracy caused by model simplification, a three-dimensional simulation method based on solid modeling is proposed. By analyzing the motion relationship and positional relationship between the caries knife and the workpiece, the coordinate system of the caries machining is established. Using MATLAB, the cutting edge model and the blade swept surface model of the boring cutter are established in sequence. A Boolean operation is performed between the blade swept surface formed by the tooth cutter teeth over time t and the workpiece tooth geometry, yielding the undeformed three-dimensional chip geometry model and the instantaneous cogging geometry model at different times. By comparing the simulated gear end-face tooth profile with the theoretical inner arc tooth profile, the accuracy and rationality of the proposed method are verified.
Funding: Supported by the National Natural Science Foundation of China (Nos. 41974073, 41404053), the Macao Foundation and the pre-research project of Civil Aerospace Technologies (Nos. D020308 and D020303) funded by the National Space Administration of China, the opening fund of the State Key Laboratory of Lunar and Planetary Sciences (Macao University of Science and Technology, Macao Science and Technology Development Fund No. 119/2017/A3), the Specialized Research Fund for State Key Laboratories, and the NUIST-UoR International Research Institute.
Abstract: We combined domestic ground-based and satellite magnetic measurements to create a regional three-dimensional surface Spline (3DSS) gradient model of the main geomagnetic field over the Chinese continent. To improve the precision of the model, we considered the data gap between the ground and satellite data. We compared and analyzed the results of the Taylor polynomial, surface Spline, and CHAOS-6 (the CHAMP, Ørsted and SAC-C model of Earth’s magnetic field) gradient models. Results showed that the gradients in the south-north and east-west directions of the four models were consistent. The 3DSS model was able to express not only gradients at different altitudes, but also average gradients inside the research area. The two Spline models were able to capture more information on gradient anomalies than the fitted models were. Strong local anomalies were observed in northern Xinjiang, Beijing, and the junction area between Jiangsu and Zhejiang, and the total intensity F decreased as the altitude increased. The gradient decreased by 21.69% in the south-north direction and increased by 11.78% in the east-west direction. In addition, the altitude gradient turned from negative to positive as the altitude increased. The Spline model and the two fitted models differed mainly in the field sources they expressed and in the modeling theory.
Abstract: Objective: To evaluate the predictive validity of IRIS™ (Intuitive Surgical®, Sunnyvale, CA, USA) as a planning tool for robot-assisted partial nephrectomy (RAPN) by assessing the degree of overlap with intraoperative execution. Methods: Thirty-one patients scheduled for RAPN by four experienced urologists were enrolled in a prospective study. Prior to surgery, urologists reviewed the IRIS™ three-dimensional model on an iPhone Operating System (iOS) app and completed a questionnaire outlining their surgical plan, including surgical approach and ischemia technique, as well as their confidence in executing this plan. Postoperatively, questionnaires assessing the procedural approach, clinical utility, efficiency, and effectiveness of IRIS™ were completed. The degree of overlap between the preoperative and intraoperative questionnaires and between the planned approach and actual execution of the procedure was analyzed. Questionnaires were answered on a 5-point Likert scale, and scores of 4 or greater were considered positive. Results: Mean age was 65.1 years, with a mean tumor size of 27.7 mm (interquartile range 17.5-44.0 mm). Hilar tumors accounted for 32.3% of cases; 48.4% of patients had R.E.N.A.L. nephrometry scores of 7-9. On preoperative questionnaires, the surgeons reported confidence that they could perform the procedure successfully in 67.7% of cases, and on intraoperative questionnaires, they reported that IRIS™ helped achieve good spatial sensation of the anatomy in 96.8% of cases. There was a high degree of overlap between preoperative and intraoperative questionnaires for the surgical approach, interpretation of anatomical details, and clinical utility. For selective or off-clamp plans, the preoperative plan was executed intraoperatively in 90.0% of cases. Conclusion: A high degree of overlap between the preoperative surgical approach and intraoperative RAPN execution was found using IRIS™. This is the first study to evaluate the predictive accuracy of IRIS™ during RAPN by comparing the preoperative plan and intraoperative execution.
Funding: Supported by the National Natural Science Foundation of China under Grant 52077027 and the Liaoning Province Science and Technology Major Project (No. 2020JH1/10100020).
Abstract: In this paper, the axial-flux permanent magnet driver is modeled and analyzed in a simple and novel way under three-dimensional cylindrical coordinates. The inherent three-dimensional characteristics of the device are comprehensively considered, and the governing equations are solved by simplifying the boundary conditions. The axial magnetization of the sector-shaped permanent magnets is accurately described in an algebraic form by the parameters, which makes the physical meaning more explicit than the purely mathematical expression in general series forms. The parameters of the Bessel function are determined simply, and the magnetic field distribution of the permanent magnets and the air gap is solved. Furthermore, the field solutions are completely analytical, which provides convenience and satisfactory accuracy for modeling a series of electromagnetic performance parameters, such as the axial electromagnetic force density, axial electromagnetic force, and electromagnetic torque. The correctness and accuracy of the analytical models are fully verified by three-dimensional finite element simulations and a 15 kW prototype, and the results of the three methods (calculation, simulation, and experiment) are highly consistent. The influence of several design parameters on magnetic field distribution and performance is studied and discussed. The results indicate that the modeling method proposed in this paper can calculate the magnetic field distribution and performance accurately and rapidly, which affords an important reference for the design and optimization of axial-flux permanent magnet drivers.
Funding: This study was co-supported by the National Key R&D Program of China (No. 2021YFF0603904), the National Natural Science Foundation of China (U1733203), and the Safety Capacity Building Project of Civil Aviation Administration of China (TM2019-16-1/3).
Abstract: This study aims to address the deviation in downstream tasks caused by inaccurate recognition results when applying Automatic Speech Recognition (ASR) technology in the Air Traffic Control (ATC) field. This paper presents a novel cascaded model architecture, namely Conformer-CTC/Attention-T5 (CCAT), to build a highly accurate and robust ATC speech recognition model. To tackle the challenges posed by noise and fast speech rate in ATC, the Conformer model is employed to extract robust and discriminative speech representations from raw waveforms. On the decoding side, the Attention mechanism is integrated to facilitate precise alignment between input features and output characters. The Text-To-Text Transfer Transformer (T5) language model is also introduced to handle particular pronunciations and code-mixing issues, providing more accurate and concise textual output for downstream tasks. To enhance the model’s robustness, transfer learning and data augmentation techniques are utilized in the training strategy. The model’s performance is optimized through hyperparameter tuning, such as adjusting the number of attention heads, the number of encoder layers, and the weights of the loss function. The experimental results demonstrate the significant contributions of data augmentation, hyperparameter tuning, and error correction models to the overall model performance. On the Our ATC Corpus dataset, the proposed model achieves a Character Error Rate (CER) of 3.44%, representing a 3.64% improvement compared to the baseline model. Moreover, the effectiveness of the proposed model is validated on two publicly available datasets. On the AISHELL-1 dataset, the CCAT model achieves a CER of 3.42%, showcasing a 1.23% improvement over the baseline model. Similarly, on the LibriSpeech dataset, the CCAT model achieves a Word Error Rate (WER) of 5.27%, demonstrating a performance improvement of 7.67% compared to the baseline model. Additionally, this paper proposes an evaluation criterion for assessing the robustness of ATC speech recognition systems. In robustness evaluation experiments based on this criterion, the proposed model demonstrates a performance improvement of 22% compared to the baseline model.
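The abstract above reports results as CER and WER but does not say how the scores were computed. As a minimal sketch, assuming the standard definition (Levenshtein edit distance between reference and hypothesis, divided by the reference length), the two metrics can be computed as follows. The function names are illustrative and are not taken from the paper.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions, insertions, deletions)."""
    # prev[j] holds the distance between the first i-1 items of ref and the first j items of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref_tokens, hyp_tokens):
    """Edit distance normalised by the reference length."""
    if len(ref_tokens) == 0:
        raise ValueError("reference must not be empty")
    return edit_distance(ref_tokens, hyp_tokens) / len(ref_tokens)


def cer(reference, hypothesis):
    """Character Error Rate: tokens are single characters (assumed definition)."""
    return error_rate(list(reference), list(hypothesis))


def wer(reference, hypothesis):
    """Word Error Rate: tokens are whitespace-separated words (assumed definition)."""
    return error_rate(reference.split(), hypothesis.split())
```

Under this definition a rate can exceed 1.0 when the hypothesis contains many insertions, which is one reason reported improvements should state whether they are absolute or relative.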
Funding: Supported by an Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2022-0-00377, Development of Intelligent Analysis and Classification Based Contents Class Categorization Technique to Prevent Imprudent Harmful Media Distribution).
Abstract: Automatic speech recognition (ASR) systems have emerged as indispensable tools across a wide spectrum of applications, ranging from transcription services to voice-activated assistants. To enhance the performance of these systems, it is important to deploy efficient models capable of adapting to diverse deployment conditions. In recent years, on-demand pruning methods have attracted significant attention within the ASR domain due to their adaptability in various deployment scenarios. However, these methods often confront substantial trade-offs, particularly unstable accuracy when the model size is reduced. To address these challenges, this study introduces two crucial empirical findings. Firstly, it proposes the incorporation of an online distillation mechanism during on-demand pruning training, which holds the promise of maintaining more consistent accuracy levels. Secondly, it proposes the utilization of the Mogrifier long short-term memory (LSTM) language model (LM), an advanced iteration of the conventional LSTM LM, as an effective alternative for pruning targets within the ASR framework. The study supports these findings through experiments on an ASR system that employs the Mogrifier LSTM LM, trained with the suggested joint on-demand pruning and online distillation method. The results show that the proposed methods significantly outperform a benchmark model trained solely with on-demand pruning methods. The proposed configuration reduces the parameter count by approximately 39% while minimizing trade-offs.
Funding: Supported by the Fujian Provincial Science and Technology Major Project (No. 2020HZ02014), by grants from the National Natural Science Foundation of Fujian (2021J01133, 2021J011404), and by the Quanzhou Scientific and Technological Planning Projects (Nos. 2018C113R, 2019C028R, 2019C029R, 2019C076R and 2019C099R).
Abstract: Congenital heart defect, accounting for about 30% of congenital defects, is the most common one. Data show that congenital heart defects have seriously affected the birth rate of healthy newborns. In Fetal and Neonatal Cardiology, medical imaging technology (2D ultrasound, MRI) has proved helpful in detecting congenital defects of the fetal heart and assists sonographers in prenatal diagnosis. Recognizing the 2D fetal heart ultrasonic standard plane (FHUSP) manually is a highly complex task. Compared with manual identification, automatic identification through artificial intelligence can save a lot of time, ensure the efficiency of diagnosis, and improve the accuracy of diagnosis. In this study, a feature extraction method based on texture features (Local Binary Pattern, LBP, and Histogram of Oriented Gradient, HOG) combined with the Bag of Words (BOW) model is carried out, and then feature fusion is performed. Finally, a Support Vector Machine (SVM) is adopted to realize automatic recognition and classification of FHUSP. The data include 788 standard plane data sets and 448 normal and abnormal plane data sets. Compared with some other methods and with single-method models, the classification accuracy of the proposed model is clearly improved, with the highest accuracy reaching 87.35%. The performance of the model is also verified on normal and abnormal planes, where the average classification accuracy is 84.92%. The experimental results show that this method can effectively classify and predict different FHUSP and can provide certain assistance for sonographers in diagnosing fetal congenital heart disease.
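The abstract above names LBP as one of its texture features without stating which variant is used. As a rough illustration only, assuming the basic 3x3, 8-neighbour form, the sketch below computes an LBP code per interior pixel and a 256-bin histogram that could serve as a texture descriptor. The bit ordering, the function names, and the plain list-of-lists image format are all assumptions made for this example, not details from the paper.

```python
# Clockwise neighbour offsets starting at the top-left pixel (an assumed ordering;
# any fixed ordering gives an equivalent descriptor as long as it is used consistently).
NEIGHBOUR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                     (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, r, c):
    """Basic 8-neighbour LBP code of the interior pixel at row r, column c."""
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOUR_OFFSETS):
        # A neighbour at least as bright as the centre contributes a 1 bit.
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels of a grayscale image."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist
```

In a pipeline like the one described, such histograms would be fused with other descriptors (for example HOG) before classification; how the paper combines them with the BOW model is not specified in the abstract.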
Funding: Supported by the Shaanxi Province Key Research and Development Project (2021GY-280) and the National Natural Science Foundation of China (Nos. 61834005, 61772417, 61802304).
Abstract: Micro-expressions are spontaneous, unconscious movements that reveal true emotions. Accurate facial movement information and network training methods are crucial for micro-expression recognition. However, most existing micro-expression recognition technologies focus on modeling a single category of micro-expression images and on neural network structure. To address the problems of low recognition rate and weak model generalization ability in micro-expression recognition, a micro-expression recognition algorithm based on a graph convolution network (GCN) and a Transformer model is proposed. Firstly, action unit (AU) features are detected and extracted, and the facial muscle nodes in the neighborhood are divided into three subsets for recognition. Then, a graph convolution layer is used to find the layout of dependencies between the AU nodes for micro-expression classification. Finally, the multiple attentional features of each facial action are enriched with the Transformer model to include more sequence information before the overall correlation of each region is calculated. The proposed method is validated on the CASME II and CAS(ME)^2 datasets, and the recognition rate reaches 69.85%.