The concept of semantic communication provides a novel approach for applications in scenarios with limited communication resources. In this paper, we propose an end-to-end (E2E) semantic molecular communication system that aims to improve the efficiency of molecular communication systems by reducing the amount of transmitted information. Specifically, following the joint source-channel coding paradigm, the network is designed to encode task-relevant information into the concentration of the information molecules in a way that is robust to degradation by the molecular communication channel. Furthermore, we propose a channel network to enable E2E learning over the non-differentiable molecular channel. Experimental results demonstrate that the semantic molecular communication system outperforms conventional methods in classification tasks.
Person search mainly consists of two subtasks, namely person detection and person re-identification (re-ID). Existing approaches are primarily based on Faster R-CNN and convolutional neural networks (CNNs) such as ResNet. While these structures may detect high-quality bounding boxes, they seem to degrade re-ID performance. To address this issue, this paper proposes a Dual-Transformer Head Network (DTHN) for end-to-end person search, which contains two independent Transformer heads: a box head for detecting the bounding box and extracting efficient bounding-box features, and a re-ID head for capturing high-quality features for the re-ID task. Specifically, after the image passes through the ResNet backbone to extract features, the Region Proposal Network (RPN) proposes candidate bounding boxes. The box head then extracts more efficient features within these bounding boxes for detection. Following this, the re-ID head computes the occluded attention of the features in these bounding boxes and distinguishes them from other persons or backgrounds. Extensive experiments on two widely used benchmark datasets, CUHK-SYSU and PRW, show state-of-the-art performance: 94.9 mAP and 95.3 top-1 on CUHK-SYSU, and 51.6 mAP and 87.6 top-1 on PRW, which demonstrates the advantages of the proposed approach. The efficiency comparison also shows that the method is highly efficient in both time and space.
Environment perception is one of the most critical technologies of intelligent transportation systems (ITS). Motion interaction between multiple vehicles in ITS makes it important to perform multi-object tracking (MOT). However, most existing MOT algorithms follow the tracking-by-detection framework, which separates detection and tracking into two independent segments and limits the global efficiency. Recently, a few algorithms have combined feature extraction into one network; however, the tracking portion continues to rely on data association and requires complex post-processing for life cycle management. Those methods do not combine detection and tracking efficiently. This paper presents a novel network, named the global correlation network (GCNet), that realizes joint multi-object detection and tracking in an end-to-end manner for ITS. Unlike most object detection methods, GCNet introduces a global correlation layer for regression of the absolute size and coordinates of bounding boxes, instead of predicting offsets. The detection and tracking pipeline in GCNet is conceptually simple and does not require complicated tracking strategies such as non-maximum suppression and data association. GCNet was evaluated on a multi-vehicle tracking dataset, UA-DETRAC, demonstrating promising performance compared to state-of-the-art detectors and trackers.
Interdisciplinary applications between information technology and geriatrics have been accelerated in recent years by advances in artificial intelligence, cloud computing, and 5G technology, among others. Applications built on these technologies make it possible to predict the risk of age-related diseases early, which can give caregivers time to intervene and reduce the risk, potentially improving the health span of the elderly. However, the popularity of these applications is still limited for several reasons. For example, many older people are unable or unwilling to use mobile applications or devices (e.g., smartphones) because they find them relatively complex or time-consuming to operate. In this work, we design and implement an end-to-end framework and integrate it with the WeChat platform to make it easily accessible to elders. Within this framework, multifactorial geriatric assessment data can be collected. Then, stacked machine learning models are trained to assess and predict the incidence of common diseases in the elderly. Experimental results show that, compared to some existing similar applications, our framework not only provides more accurate prediction (precision: 0.8713, recall: 0.8212) for several common elderly diseases, but also has a low time cost (28.6 s) per workflow.
End-to-end separation algorithms, which perform very well in speech separation, have not yet been used effectively in music separation. Moreover, since music signals are often dual-channel data with a high sampling rate, how to model long-sequence data and make rational use of the information shared between channels is also an urgent problem. To address these problems, we enhance the performance of the end-to-end music separation algorithm by improving the network structure. Our main contributions are the following: (1) A more reasonable densely connected U-Net is designed to capture the long-term characteristics of music, such as main melody and tone. (2) On this basis, multi-head attention and a dual-path transformer are introduced in the separation module. Channel attention units are applied recursively on the feature map of each layer of the network, enabling the network to perform long-sequence separation. Experimental results show that after the introduction of channel attention, the proposed algorithm improves consistently over the baseline system. On the MUSDB18 dataset, the average score of the separated audio exceeds that of the current best-performing music separation algorithm based on the time-frequency (T-F) domain.
With the rapid development of deep learning methods, the data-driven approach has shown powerful advantages over the model-driven one. In this paper, we propose an end-to-end autoencoder communication system based on Deep Residual Shrinkage Networks (DRSNs), where deep neural networks (DNNs) are used to implement the coding, decoding, modulation, and demodulation functions of the communication system. By adding "attention mechanism" and "soft thresholding" modules, the proposed autoencoder communication system reduces signal noise more effectively and performs better at various signal-to-noise ratios (SNRs). Comparative experiments also show that the system can operate at moderate block lengths and support different throughputs, and that it works efficiently over the AWGN channel. Simulation results show that our model achieves a higher bit-error-rate (BER) gain and greatly improved decoding performance compared to conventional modulation and classical autoencoder systems at various SNRs.
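The "soft thresholding" operation named in this abstract can be illustrated with a minimal sketch. It assumes the standard shrinkage rule y = sign(x) · max(|x| − τ, 0). In DRSNs the threshold τ is typically produced by a learned attention sub-network; here it is replaced by a fixed, hypothetical value, and the function names are illustrative rather than taken from the paper.

```python
def soft_threshold(x, tau):
    """Shrink x toward zero by tau; inputs with |x| <= tau are set to 0."""
    if tau < 0:
        raise ValueError("tau must be non-negative")
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0


def denoise(features, tau):
    """Apply soft thresholding element-wise to a list of feature values."""
    return [soft_threshold(v, tau) for v in features]


if __name__ == "__main__":
    # Hypothetical feature values: small entries are treated as noise and zeroed.
    features = [0.9, -0.05, 0.2, -1.3, 0.02]
    print(denoise(features, tau=0.1))
```

Small-magnitude entries are removed entirely, while larger ones are kept but shrunk, which is why the operation is often described as a learnable denoising step.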
In smart classrooms, conducting multi-face expression recognition on existing hardware devices to assess students' group emotions can provide educators with a more comprehensive and intuitive analysis of classroom effectiveness, thereby continuously promoting the improvement of teaching quality. However, most existing multi-face expression recognition methods adopt a multi-stage approach, with an overall complex process, poor real-time performance, and insufficient generalization ability. In addition, existing facial expression datasets consist mostly of single-face images, which are of low quality and lack specificity, further restricting the development of this research. This paper aims to propose an end-to-end, high-performance multi-face expression recognition model suitable for smart classrooms, construct a high-quality multi-face expression dataset to support algorithm research, and apply the model to group emotion assessment to expand its application value. To this end, we propose an end-to-end multi-face expression recognition algorithm model for smart classrooms (E2E-MFERC). To provide high-quality and highly targeted data support for model research, we constructed a multi-face expression dataset in real classrooms (MFED), containing 2,385 images and a total of 18,712 expression labels, collected from smart classrooms. In constructing E2E-MFERC, we introduce the re-parameterization visual geometry group (RepVGG) block and symmetric positive definite convolution (SPD-Conv) modules to enhance representational capability; combine them with the cross stage partial network fusion module optimized by an attention mechanism (C2f_Attention) to strengthen the extraction of key information; adopt asymptotic feature pyramid network (AFPN) feature fusion tailored to classroom scenes; and optimize the head prediction output size, thereby achieving high-performance end-to-end multi-face expression detection. Finally, we apply the model to smart classroom group emotion assessment and provide design references for classroom effect analysis and evaluation metrics. Experiments on MFED show that the mAP and F1-score of E2E-MFERC on classroom evaluation data reach 83.6% and 0.77, respectively, improving the mAP of same-scale You Only Look Once version 5 (YOLOv5) and version 8 (YOLOv8) by 6.8% and 2.5%, respectively, and the F1-score by 0.06 and 0.04, respectively. The E2E-MFERC model has clear advantages in both detection speed and accuracy, which can meet the practical needs of real-time multi-face expression analysis in classrooms and serve teaching effect assessment well.
6G is envisioned as the next generation of wireless communication technology, promising unprecedented data speeds, ultra-low latency, and ubiquitous connectivity. In tandem with these advancements, blockchain technology is leveraged to enhance the security, trustworthiness, and transparency of computer vision applications. With the widespread use of mobile devices equipped with cameras, the ability to capture and recognize Chinese characters in natural scenes has become increasingly important. Blockchain can facilitate privacy-preserving mechanisms in applications where privacy is paramount, such as facial recognition or personal healthcare monitoring. Users can control their visual data and grant or revoke access as needed. Recognizing Chinese characters from images can provide convenience in various aspects of people's lives. However, traditional Chinese character text recognition methods often lack sufficient accuracy, leading to recognition failures or incorrect character identification. In contrast, computer vision technologies have significantly improved image recognition accuracy. This paper proposes a secure end-to-end recognition system (SE2ERS) for Chinese characters in natural scenes based on convolutional neural networks (CNNs) using 6G technology. The proposed SE2ERS model uses the Weighted Hyperbolic Curve Cryptograph (WHCC) for secure data transmission in the 6G network with the blockchain model. For data transmission within the computer vision system, a 6G gradient directional histogram (GDH) is employed for character estimation. With WHCC and GDH deployed in the SE2ERS model, secure communication is achieved for data transmission over the 6G network. The proposed SE2ERS is compared against traditional Chinese text recognition methods and data transmission environments under 6G communication. Experimental results demonstrate that SE2ERS achieves an average recognition accuracy of 88% for simple Chinese characters, compared to 81.2% with traditional methods. For complex Chinese characters, the average recognition accuracy improves to 84.4% with our system, compared to 72.8% with traditional methods. Additionally, deploying the WHCC model improves data security, with a data encryption rate complexity approximately 12% higher than that of traditional techniques.
Distinct brain remodeling has been found after different nerve reconstruction strategies, including changes in the motor representation of the affected limb. However, differences among reconstruction strategies at the brain network level have not been elucidated. This study aimed to explore intra-network changes related to altered peripheral neural pathways after different nerve reconstruction surgeries, including nerve repair, end-to-end nerve transfer, and end-to-side nerve transfer. Sprague–Dawley rats underwent complete left brachial plexus transection and were divided into four equal groups of eight: no nerve repair, grafted nerve repair, phrenic nerve end-to-end transfer, and end-to-side transfer with a graft sutured to the anterior upper trunk. Resting-state brain functional magnetic resonance imaging was performed 7 months after surgery. The independent component analysis algorithm was used to identify group-level network components of interest and extract resting-state functional connectivity values of each voxel within the component. Alterations in intra-network resting-state functional connectivity were compared among the groups. Target muscle reinnervation was assessed by behavioral observation (elbow flexion) and electromyography. The results showed that alterations in the sensorimotor and interoception networks were mostly related to changes in the peripheral neural pathway. Nerve repair was related to enhanced connectivity within the sensorimotor network, while end-to-side nerve transfer might be more beneficial for restoring control over the affected limb by the original motor representation. The thalamic-cortical pathway was enhanced within the interoception network after nerve repair and end-to-end nerve transfer. Connectivity of brain areas related to cognition and emotion was enhanced after end-to-side nerve transfer. Our study revealed important brain networks related to different nerve reconstructions. These networks may be potential targets for enhancing motor recovery.
With the advent of deep learning, self-driving schemes based on deep learning are becoming more and more popular. Robust perception-action models should learn from data covering different scenarios and real behaviors, while current end-to-end model learning is generally limited to training on massive data, innovating deep network architectures, and learning in-situ models in a simulation environment. Therefore, we introduce a new image style transfer method into data augmentation. It improves the diversity of limited data by changing the texture, contrast ratio, and color of the image, and thereby extends the data to scenarios that the model has not observed before. Inspired by rapid style transfer and artistic style neural algorithms, we propose an arbitrary style generation network architecture that includes a style transfer network, a style learning network, a style loss network, and a multivariate Gaussian distribution function. The style embedding vector is randomly sampled from the multivariate Gaussian distribution and linearly interpolated with the embedding vector predicted by the style learning network for the input image. The result provides a set of normalization constants for the style transfer network and ultimately diversifies the image style. To verify the effectiveness of the method, image classification and simulation experiments were performed separately. Finally, we built a small-sized smart car experiment platform and, for the first time, applied data augmentation based on image style transfer to an automatic driving experiment. The experimental results show that: (1) the proposed scheme can improve the prediction accuracy of the end-to-end model and reduce the model's error accumulation; (2) the method based on image style transfer provides a new scheme for data augmentation and also offers a solution to the high cost incurred by deep models that rely heavily on large amounts of labeled data.
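The sampling-and-interpolation step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a Gaussian with diagonal covariance (so each dimension is sampled independently), and the names `sample_style`, `interpolate_style`, and the mixing weight `alpha` are hypothetical.

```python
import random


def sample_style(mean, std, rng):
    """Draw a style vector from a Gaussian with diagonal covariance (assumption)."""
    return [rng.gauss(m, s) for m, s in zip(mean, std)]


def interpolate_style(sampled, predicted, alpha):
    """Linearly mix two embeddings: alpha=0 keeps the predicted style, alpha=1 the sampled one."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(sampled) != len(predicted):
        raise ValueError("embeddings must have the same length")
    return [alpha * s + (1.0 - alpha) * p for s, p in zip(sampled, predicted)]


if __name__ == "__main__":
    rng = random.Random(0)            # fixed seed for repeatability
    predicted = [0.2, -0.4, 1.1]      # hypothetical embedding of the input image
    sampled = sample_style([0.0] * 3, [1.0] * 3, rng)
    print(interpolate_style(sampled, predicted, alpha=0.3))
```

A small `alpha` keeps the augmented image close to its original style, while a larger value pushes it toward a randomly drawn style, which is one way to control how aggressive the augmentation is.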
Capacity reduction is a major problem faced by wireless mesh networks. An efficient way to alleviate this problem is proper channel assignment. Current end-to-end channel assignment schemes usually focus on the case where channels in distinct frequency bands are assigned to mesh access and backbone, although in practice the backbone network and access network can use the same IEEE 802.11 technology. In addition, these schemes use only orthogonal channels for channel assignment, and the resulting network interference dramatically degrades network performance. Moreover, they consider only Internet-oriented traffic and omit peer-to-peer traffic, or vice versa, so the traffic type does not match that of practical networks. In this paper, we explore how to exploit partially overlapped channels to perform end-to-end channel assignment in order to achieve effective end-to-end flow transmissions. The proposed flow-based end-to-end channel assignment schemes overcome the aforementioned limitations. Simulations reveal that load-aware channel assignment can be applied to networks with stable traffic load and can achieve near-optimal performance, whereas traffic-irrelevant channel assignment is suitable for networks with frequently changing traffic load and achieves a good balance between performance and overhead. Also, the ability of partially overlapped channels to improve network performance is situation-dependent, so they should be used carefully.
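To make the idea of partially overlapped channels concrete, here is a small sketch for the 2.4 GHz band. It relies on two well-known facts (channel centers are spaced 5 MHz apart starting at 2412 MHz, and a nominal 802.11b channel is about 22 MHz wide) and on one simplifying assumption: interference is approximated by the fraction of spectral overlap between two rectangular channels. The function names and the greedy selection rule are hypothetical and are not the schemes proposed in the paper.

```python
CHANNEL_SPACING_MHZ = 5.0
CHANNEL_WIDTH_MHZ = 22.0  # nominal 802.11b width; a simplification for this sketch


def center_mhz(channel):
    """Center frequency of a 2.4 GHz channel (channels 1-13)."""
    if not 1 <= channel <= 13:
        raise ValueError("channel must be in 1..13")
    return 2412.0 + CHANNEL_SPACING_MHZ * (channel - 1)


def overlap_factor(a, b):
    """Fraction of spectral overlap in [0, 1] under a rectangular-spectrum assumption."""
    separation = abs(center_mhz(a) - center_mhz(b))
    return max(0.0, CHANNEL_WIDTH_MHZ - separation) / CHANNEL_WIDTH_MHZ


def least_interfered_channel(neighbor_channels, candidates=range(1, 12)):
    """Pick the candidate whose summed overlap with neighboring links is smallest."""
    def cost(channel):
        return sum(overlap_factor(channel, n) for n in neighbor_channels)
    return min(candidates, key=cost)


if __name__ == "__main__":
    print(overlap_factor(1, 3))               # partial overlap
    print(least_interfered_channel([1, 11]))  # neighbors on channels 1 and 11
```

Under this model, channels five or more apart do not overlap at all, while closer channels overlap partially; an assignment scheme that admits such channels has more choices than one restricted to the orthogonal set, at the cost of some residual interference.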
End-to-end repair under no or low tension leads to improved outcomes for transected nerves with short gaps, compared to repairs with a graft. However, grafts are typically used to enable a tension-free repair for moderate to large gaps, as excessive tension can cause repairs to fail and catastrophically impede recovery. In this study, we tested the hypothesis that unloading the repair interface by redistributing tension away from the site of repair is a safe and feasible strategy for end-to-end repair of larger nerve gaps. Further, we tested the hypothesis that such an approach does not adversely affect structural and functional regeneration. We used a rat sciatic nerve injury model to compare the integrity of repair and several regenerative outcomes following end-to-end repairs of nerve gaps of increasing size. In addition, we proposed the use of a novel implantable device to safely perform end-to-end repair of larger nerve gaps by redistributing tension away from the repair interface. Our data suggest that redistribution of tension away from the site of repair enables safe end-to-end repair of larger gap sizes. In addition, structural and functional measures of regeneration were equal or enhanced in nerves repaired under tension, with or without a tension redistribution device, compared to tension-free repairs. Provided that repair integrity is maintained, end-to-end repairs under tension should be considered a reasonable surgical strategy. All animal experiments were performed under the approval of the Institutional Animal Care and Use Committee of the University of California, San Diego (Protocol S11274).
End-to-end delay is one of the most important characteristics of Internet end-to-end packet dynamics and can be applied to quality of service (QoS) management, service level agreement (SLA) management, congestion control algorithm development, etc. Analysis of various delay series measured from different links reveals nonstationarity and nonlinearity, and shows that different types of links have different degrees of self-similarity. By constructing an appropriate network architecture and neural functions, functional networks can be used to model the nonlinear time series of Internet end-to-end delay. Furthermore, by using an adaptive parameter learning algorithm, the nonstationarity can also be modeled well. The numerical results show that the proposed functional network architecture and adaptive algorithm can precisely characterize Internet end-to-end delay dynamics.
With the emergence of large-scale knowledge bases, using triple information to generate natural questions has become a key technology in question answering systems. Traditional question generation requires a lot of manual intervention and produces a lot of noise. To solve these problems, we propose a joint model, based on a semi-automated model and an end-to-end neural network, to generate questions automatically. The semi-automated model generates question templates and real questions by combining the knowledge base and the center graph. The end-to-end neural network feeds the knowledge base and the real questions directly to a BiLSTM network. Meanwhile, an attention mechanism is used in the decoding layer, which makes the generated questions more relevant to the triples. Finally, experimental results on SimpleQuestions demonstrate the effectiveness of the proposed approach.
This paper presents the multi-step Q-learning (MQL) algorithm as an autonomic approach to joint radio resource management (JRRM) among heterogeneous radio access technologies (RATs) in the B3G environment. Through a "trial-and-error" online learning process, the JRRM controller can converge to the optimized admission control policy. The JRRM controller learns to give the best allocation for each session in terms of both the access RAT and the service bandwidth. Simulation results show that the proposed algorithm realizes the autonomy of JRRM and achieves a good trade-off between spectrum utility and blocking probability compared to the load-balancing algorithm and the utility-maximizing algorithm. In addition, the proposed algorithm has better online performance and convergence speed than the one-step Q-learning (QL) algorithm. Therefore, the degree of user satisfaction can also be improved.
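For readers unfamiliar with the baseline mentioned here, the following is a minimal sketch of the standard one-step tabular Q-learning update, which the abstract's multi-step variant is compared against. The state labels, the (RAT, bandwidth) action tuples, and the reward value are hypothetical placeholders, not the paper's formulation of the JRRM problem.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """One-step Q-learning: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


def greedy_action(q, state, actions):
    """Return the action with the highest current Q-value in the given state."""
    return max(actions, key=lambda a: q[(state, a)])


if __name__ == "__main__":
    # Hypothetical actions: choose an access RAT and a bandwidth, or reject the session.
    actions = [("WLAN", 64), ("UMTS", 64), ("reject", 0)]
    q = defaultdict(float)  # unseen (state, action) pairs start at 0
    q_update(q, "low_load", ("WLAN", 64), reward=1.0,
             next_state="low_load", actions=actions)
    print(greedy_action(q, "low_load", actions))
```

A multi-step method replaces the single-reward target with a return accumulated over several steps, which is generally the reason such variants can converge faster than this one-step rule.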
An adaptive joint source-channel bit allocation method for video communications over error-prone channels is proposed. To protect the bit-streams from channel bit errors, a rate-compatible punctured convolutional (RCPC) code is used to produce coding rates varying from 4/5 to 1/2 using the same encoder and Viterbi decoder. An expected end-to-end distortion model is presented to jointly estimate the distortion introduced in compressed source coding by quantization and by channel bit errors. Based on this end-to-end distortion model, an adaptive joint source-channel bit allocation method is proposed for time-varying error-prone channel conditions. Simulation results show that the proposed method uses the available channel capacity more efficiently and achieves better video quality than fixed-coding bit allocation methods when transmitting over error-prone channels.
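The trade-off behind joint source-channel bit allocation can be shown with simple arithmetic: at code rate r, a budget of B channel bits carries B·r source bits and B·(1 − r) protection bits. The sketch below assumes an illustrative RCPC family with puncturing period 4 on a rate-1/2 mother code (rates 4/5, 2/3, 4/7, 1/2); the abstract only states the range 4/5 to 1/2. The distortion model is a caller-supplied placeholder, since the paper's expected-distortion model is not given here.

```python
from fractions import Fraction

# Illustrative rate set (assumption): period-4 puncturing of a rate-1/2 mother code.
RCPC_RATES = [Fraction(4, 5), Fraction(2, 3), Fraction(4, 7), Fraction(1, 2)]


def split_budget(total_bits, rate):
    """Split a channel-bit budget into (source_bits, protection_bits) at a code rate."""
    source_bits = int(total_bits * rate)  # floor: whole source bits only
    return source_bits, total_bits - source_bits


def choose_rate(total_bits, expected_distortion):
    """Pick the rate minimizing a caller-supplied distortion estimate.

    expected_distortion(source_bits, rate) stands in for an end-to-end
    distortion model; its form is entirely up to the caller.
    """
    return min(
        RCPC_RATES,
        key=lambda r: expected_distortion(split_budget(total_bits, r)[0], r),
    )


if __name__ == "__main__":
    for r in RCPC_RATES:
        print(r, split_budget(1000, r))
```

In an adaptive scheme, `choose_rate` would be re-evaluated as the channel estimate changes, so that a worse channel shifts bits from source coding to protection.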
The ubiquity of instant messaging services on mobile devices and their use of end-to-end encryption to safeguard the privacy of their users have become a concern for some governments. WhatsApp has emerged as the most popular messaging app on mobile devices today. It uses end-to-end encryption (E2EE), which makes it technically impossible for governments and secret services to access messages in their efforts to combat organized crime, terrorists, and child pornographers. Governments would like a "backdoor" into such apps for accessing messages and have emphasized that they will use it only if there is a credible threat to national security. Users of WhatsApp have, however, argued against a "backdoor"; they claim it would not only infringe on their privacy but could also be exploited by hackers. In light of this conflict between the security and privacy of WhatsApp's end users and governments' need to access messages in order to thwart potential terror attacks, this paper presents the advantages of maintaining E2EE in WhatsApp and explains why governments should not be allowed a "backdoor" to access users' messages. This research presents the benefits that encryption brings to consumer security and privacy, as well as the challenges it poses to public safety and national security.
We propose a method that uses a latent regression Bayesian network (LRBN) to extract shared speech features as the input of an end-to-end speech recognition model. The structure of the LRBN is compact and its parameter learning is fast. Compared with a convolutional neural network, it has a simpler and more understandable structure and fewer parameters to learn. Experimental results show the advantage of the hybrid LRBN/Bidirectional Long Short-Term Memory-Connectionist Temporal Classification architecture for Tibetan multi-dialect speech recognition and demonstrate that the LRBN helps to differentiate among multiple language speech sets.
Saccular extended obstruction occurs when the anastomotic site of a functional end-to-end anastomosis extends saccularly and is blocked by intestinal contents. This is a specific complication of functional end-to-end anastomosis and causes postoperative intestinal obstruction. Saccular extended obstruction places a heavy burden on patients because surgery is necessary to treat the resulting intestinal obstruction. However, it is not a commonly recognized complication. The greatest factor contributing to the development of saccular extended obstruction is an acute angle between the portions of the intestinal tract oral and aboral to the anastomotic site. When this angle is obtuse, and preferably close to a straight line, the intestinal contents do not stagnate at the anastomotic site and saccular extended obstruction is avoided. To make the angle between the anastomosed intestinal tracts obtuse or straight, it may be effective to close the entry hole of the stapling instrument that creates the anastomotic stoma perpendicular to the intestinal axis.
Network calculus provides new tools for performance analysis of networks, but analyzing networks with complex topologies using statistical network calculus remains a challenging research issue. A service model is proposed to characterize the service process of a network with a complex topology. To obtain closed-form expressions of statistical end-to-end performance bounds for a wide range of traffic source models, the traffic model and service model are expanded according to the error function. Based on the proposed models, the explicit end-to-end delay bound of Fractional Brownian Motion (FBM) traffic is derived, the factors that affect the delay bound are analyzed, and a comparison between theoretical and simulation results is performed. The results illustrate that the proposed models not only fit the network behaviors well, but also facilitate network performance analysis.
Funding: supported by the Beijing Natural Science Foundation (L211012), the Natural Science Foundation of China (62122012, 62221001), and the Fundamental Research Funds for the Central Universities (2022JBQY004).
Funding: Supported by the Natural Science Foundation of Shanghai under Grant 21ZR1426500 and the National Natural Science Foundation of China under Grant 61873160.
Abstract: Person search mainly consists of two subtasks, namely person detection and person re-identification (re-ID). Existing approaches are primarily based on Faster R-CNN and convolutional neural networks (CNNs) (e.g., ResNet). While these structures may detect high-quality bounding boxes, they seem to degrade the performance of re-ID. To address this issue, this paper proposes a Dual-Transformer Head Network (DTHN) for end-to-end person search, which contains two independent Transformer heads: a box head for detecting the bounding box and extracting efficient bounding-box features, and a re-ID head for capturing high-quality features for the re-ID task. Specifically, after the image goes through the ResNet backbone network to extract features, the Region Proposal Network (RPN) proposes possible bounding boxes. The box head then extracts more efficient features within these bounding boxes for detection. Following this, the re-ID head computes the occluded attention of the features in these bounding boxes and distinguishes them from other persons or backgrounds. Extensive experiments on two widely used benchmark datasets, CUHK-SYSU and PRW, achieve state-of-the-art performance: 94.9 mAP and 95.3 top-1 on CUHK-SYSU, and 51.6 mAP and 87.6 top-1 on PRW, which demonstrates the advantages of this paper's approach. The efficiency comparison also shows that our method is highly efficient in both time and space.
Funding: Supported by the National Key Research and Development Program of China (Grant No. 2021YFB1600402), the National Natural Science Foundation of China (Grant No. 52072212), Dongfeng USharing Technology Co., Ltd., China Intelligent and Connected Vehicles (Beijing) Research Institute Co., Ltd., and the "Shuimu Tsinghua Scholarship" of Tsinghua University of China.
Abstract: Environment perception is one of the most critical technologies of intelligent transportation systems (ITS). Motion interaction between multiple vehicles in ITS makes it important to perform multi-object tracking (MOT). However, most existing MOT algorithms follow the tracking-by-detection framework, which separates detection and tracking into two independent segments and limits global efficiency. Recently, a few algorithms have combined feature extraction into one network; however, the tracking portion continues to rely on data association and requires complex post-processing for life cycle management. Those methods do not combine detection and tracking efficiently. This paper presents a novel network, named the global correlation network (GCNet), to realize joint multi-object detection and tracking in an end-to-end manner for ITS. Unlike most object detection methods, GCNet introduces a global correlation layer for regression of the absolute size and coordinates of bounding boxes, instead of predicting offsets. The pipeline of detection and tracking in GCNet is conceptually simple and does not require complicated tracking strategies such as non-maximum suppression and data association. GCNet was evaluated on a multi-vehicle tracking dataset, UA-DETRAC, demonstrating promising performance compared to state-of-the-art detectors and trackers.
Funding: Supported by the Xi'an University of Finance and Economics Scientific Research Support Program (No. 21FCZD03), the Shaanxi Education Department Research Program (No. 22JK0077), and the National Statistical Science Research Project (Nos. 2021LZ40, 2022LZ38).
Abstract: Interdisciplinary applications between information technology and geriatrics have been accelerated in recent years by the advancement of artificial intelligence, cloud computing, and 5G technology, among others. Applications developed using these technologies make it possible to predict the risk of age-related diseases early, which can give caregivers time to intervene and reduce the risk, potentially improving the health span of the elderly. However, the popularity of these applications is still limited for several reasons. For example, many older people are unable or unwilling to use mobile applications or devices (e.g., smartphones) because operating them is relatively complex or time-consuming. In this work, we design and implement an end-to-end framework and integrate it with the WeChat platform to make it easily accessible to elders. With this framework, multifactorial geriatric assessment data can be collected. Then, stacked machine learning models are trained to assess and predict the incidence of common diseases in the elderly. Experimental results show that our framework not only provides more accurate prediction (precision: 0.8713, recall: 0.8212) for several common elderly diseases, but also takes very little time (28.6 s) per workflow compared to some existing similar applications.
Funding: National Natural Science Foundation of China, Grant/Award Number: 62071039; Beijing Natural Science Foundation, Grant/Award Number: L223033.
Abstract: The end-to-end separation algorithm with superior performance in the field of speech separation has not been effectively used in music separation. Moreover, since music signals are often dual-channel data with a high sampling rate, how to model long-sequence data and make rational use of the relevant information between channels is also an urgent problem to be solved. To solve these problems, the performance of the end-to-end music separation algorithm is enhanced by improving the network structure. Our main contributions include the following: (1) A more reasonable densely connected U-Net is designed to capture the long-term characteristics of music, such as main melody and tone. (2) On this basis, multi-head attention and a dual-path transformer are introduced in the separation module. Channel attention units are applied recursively on the feature map of each layer of the network, enabling the network to perform long-sequence separation. Experimental results show that after the introduction of channel attention, the performance of the proposed algorithm improves steadily over the baseline system. On the MUSDB18 dataset, the average score of the separated audio exceeds that of the current best-performing music separation algorithm based on the time-frequency domain (T-F domain).
Abstract: With the rapid development of deep learning methods, the data-driven approach has shown powerful advantages over the model-driven one. In this paper, we propose an end-to-end autoencoder communication system based on Deep Residual Shrinkage Networks (DRSNs), where deep neural networks (DNNs) are used to implement the coding, decoding, modulation and demodulation functions of the communication system. Our proposed autoencoder communication system can better reduce signal noise by adding "attention mechanism" and "soft thresholding" modules and performs better at various signal-to-noise ratios (SNRs). We have also shown through comparative experiments that the system can operate at moderate block lengths and support different throughputs. It has been shown to work efficiently in the AWGN channel. Simulation results show that our model has a higher Bit-Error-Rate (BER) gain and greatly improved decoding performance compared to conventional modulation and classical autoencoder systems at various SNRs.
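The "soft thresholding" operation mentioned in the abstract above can be illustrated with a minimal Python sketch. This is the textbook shrinkage function, not the authors' implementation: the function names and the fixed threshold `tau` are assumptions made for illustration, and the abstract does not say how the threshold is chosen.

```python
def soft_threshold(x, tau):
    """Shrink x toward zero by tau; values with |x| <= tau become 0."""
    if tau < 0:
        raise ValueError("tau must be non-negative")
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0


def shrink_signal(signal, tau):
    """Apply soft thresholding to every sample of a sequence."""
    return [soft_threshold(v, tau) for v in signal]


# Hypothetical feature values: small entries are treated as noise and zeroed,
# large entries are kept but reduced in magnitude by tau.
noisy = [0.05, -0.02, 1.30, -0.90, 0.08]
denoised = shrink_signal(noisy, tau=0.1)
```

Small-magnitude entries are removed entirely, while larger ones survive with reduced amplitude, which is why the operation is commonly used for noise suppression.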
Funding: Supported by the Science and Technology Project of State Grid Corporation of China under Grant No. 5700-202318292A-1-1-ZN.
Abstract: In smart classrooms, conducting multi-face expression recognition based on existing hardware devices to assess students' group emotions can provide educators with more comprehensive and intuitive classroom effect analysis, thereby continuously promoting the improvement of teaching quality. However, most existing multi-face expression recognition methods adopt a multi-stage approach, with an overall complex process, poor real-time performance, and insufficient generalization ability. In addition, the existing facial expression datasets are mostly single-face images, which are of low quality and lack specificity, also restricting the development of this research. This paper aims to propose an end-to-end high-performance multi-face expression recognition algorithm model suitable for smart classrooms, construct a high-quality multi-face expression dataset to support algorithm research, and apply the model to group emotion assessment to expand its application value. To this end, we propose an end-to-end multi-face expression recognition algorithm model for smart classrooms (E2E-MFERC). To provide high-quality and highly targeted data support for model research, we constructed a multi-face expression dataset in real classrooms (MFED), containing 2,385 images and a total of 18,712 expression labels, collected from smart classrooms. In constructing E2E-MFERC, we introduce Re-parameterization visual geometry group (RepVGG) blocks and symmetric positive definite convolution (SPD-Conv) modules to enhance representational capability; combine them with the cross stage partial network fusion module optimized by an attention mechanism (C2f_Attention) to strengthen the ability to extract key information; and adopt asymptotic feature pyramid network (AFPN) feature fusion tailored to classroom scenes while optimizing the head prediction output size, achieving high-performance end-to-end multi-face expression detection. Finally, we apply the model to smart classroom group emotion assessment and provide design references for classroom effect analysis evaluation metrics. Experiments based on MFED show that the mAP and F1-score of E2E-MFERC on classroom evaluation data reach 83.6% and 0.77, respectively, improving the mAP of same-scale You Only Look Once version 5 (YOLOv5) and You Only Look Once version 8 (YOLOv8) by 6.8% and 2.5%, respectively, and the F1-score by 0.06 and 0.04, respectively. The E2E-MFERC model has obvious advantages in both detection speed and accuracy, can meet the practical needs of real-time multi-face expression analysis in classrooms, and serves the application of teaching effect assessment very well.
Funding: Supported by the Inner Mongolia Natural Science Fund Project (2019MS06013), the Ordos Science and Technology Plan Project (2022YY041), and the Hunan Enterprise Science and Technology Commissioner Program (2021GK5042).
Abstract: 6G is envisioned as the next generation of wireless communication technology, promising unprecedented data speeds, ultra-low latency, and ubiquitous connectivity. In tandem with these advancements, blockchain technology is leveraged to enhance the security, trustworthiness, and transparency of computer vision applications. With the widespread use of mobile devices equipped with cameras, the ability to capture and recognize Chinese characters in natural scenes has become increasingly important. Blockchain can facilitate privacy-preserving mechanisms in applications where privacy is paramount, such as facial recognition or personal healthcare monitoring. Users can control their visual data and grant or revoke access as needed. Recognizing Chinese characters from images can provide convenience in various aspects of people's lives. However, traditional Chinese character text recognition methods often lack sufficient accuracy, leading to recognition failures or incorrect character identification. In contrast, computer vision technologies have significantly improved image recognition accuracy. This paper proposes a secure end-to-end recognition system (SE2ERS) for Chinese characters in natural scenes based on convolutional neural networks (CNNs) using 6G technology. The proposed SE2ERS model uses the Weighted Hyperbolic Curve Cryptograph (WHCC) for secure data transmission in the 6G network with the blockchain model. Within the computer vision system, a 6G gradient directional histogram (GDH) is employed for character estimation. With the deployment of WHCC and GDH in the constructed SE2ERS model, secure communication is achieved for data transmission over the 6G network. The proposed SE2ERS is compared with traditional Chinese text recognition methods and data transmission environments under 6G communication. Experimental results demonstrate that SE2ERS achieves an average recognition accuracy of 88% for simple Chinese characters, compared to 81.2% with traditional methods. For complex Chinese characters, the average recognition accuracy improves to 84.4% with our system, compared to 72.8% with traditional methods. Additionally, deploying the WHCC model improves data security, with a data encryption rate complexity about 12% higher than that of traditional techniques.
Funding: Supported by the National Natural Science Foundation of China, Nos. 81871836 (to MZ), 82172554 (to XH), 81802249 (to XH), and 81902301 (to JW); the National Key R&D Program of China, Nos. 2018YFC2001600 (to JX) and 2018YFC2001604 (to JX); the Shanghai Rising Star Program, No. 19QA1409000 (to MZ); the Shanghai Municipal Commission of Health and Family Planning, No. 2018YQ02 (to MZ); the Shanghai Youth Top Talent Development Plan; and the Shanghai "Rising Stars of Medical Talent" Youth Development Program, No. RY411.19.01.10 (to XH).
Abstract: Distinct brain remodeling has been found after different nerve reconstruction strategies, including motor representation of the affected limb. However, differences among reconstruction strategies at the brain network level have not been elucidated. This study aimed to explore intra-network changes related to altered peripheral neural pathways after different nerve reconstruction surgeries, including nerve repair, end-to-end nerve transfer, and end-to-side nerve transfer. Sprague-Dawley rats underwent complete left brachial plexus transection and were divided into four equal groups of eight: no nerve repair, grafted nerve repair, phrenic nerve end-to-end transfer, and end-to-side transfer with a graft sutured to the anterior upper trunk. Resting-state brain functional magnetic resonance imaging was obtained 7 months after surgery. The independent component analysis algorithm was utilized to identify group-level network components of interest and extract resting-state functional connectivity values of each voxel within the component. Alterations in intra-network resting-state functional connectivity were compared among the groups. Target muscle reinnervation was assessed by behavioral observation (elbow flexion) and electromyography. The results showed that alterations in the sensorimotor and interoception networks were mostly related to changes in the peripheral neural pathway. Nerve repair was related to enhanced connectivity within the sensorimotor network, while end-to-side nerve transfer might be more beneficial for restoring control over the affected limb by the original motor representation. The thalamic-cortical pathway was enhanced within the interoception network after nerve repair and end-to-end nerve transfer. Brain areas related to cognition and emotion were enhanced after end-to-side nerve transfer. Our study revealed important brain networks related to different nerve reconstructions. These networks may be potential targets for enhancing motor recovery.
Funding: Supported by the National Natural Science Foundation of China (51965008), the Science and Technology Projects of Guizhou ([2018]2168), and the Excellent Young Researcher Project of Guizhou ([2017]5630).
Abstract: With the advent of deep learning, self-driving schemes based on deep learning are becoming more and more popular. Robust perception-action models should learn from data with different scenarios and real behaviors, while current end-to-end model learning is generally limited to training on massive data, innovation of deep network architectures, and learning in-situ models in a simulation environment. Therefore, we introduce a new image style transfer method into data augmentation and improve the diversity of limited data by changing the texture, contrast ratio and color of the image, thereby extending the data to scenarios that the model has not observed before. Inspired by rapid style transfer and artistic style neural algorithms, we propose an arbitrary style generation network architecture, including a style transfer network, a style learning network, a style loss network and a multivariate Gaussian distribution function. The style embedding vector is randomly sampled from the multivariate Gaussian distribution and linearly interpolated with the embedding vector predicted from the input image by the style learning network, which provides a set of normalization constants for the style transfer network and finally realizes the diversity of the image style. To verify the effectiveness of the method, image classification and simulation experiments were performed separately. Finally, we built a small-sized smart car experiment platform and, for the first time, applied data augmentation based on image style transfer to an autonomous driving experiment. The experimental results show that: (1) the proposed scheme can improve the prediction accuracy of the end-to-end model and reduce the model's error accumulation; (2) the method based on image style transfer provides a new scheme for data augmentation and also offers a solution to the high cost arising from the heavy reliance of many deep models on large amounts of labeled data.
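The sampling-and-interpolation step described in the abstract above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Gaussian is taken to have a diagonal covariance, and the function names, the blending weight `alpha`, and all numeric values are hypothetical and do not come from the paper.

```python
import random


def sample_style_embedding(mean, std, rng):
    """Draw one style vector from a Gaussian with diagonal covariance (assumed)."""
    return [rng.gauss(m, s) for m, s in zip(mean, std)]


def interpolate(sampled, predicted, alpha):
    """Blend linearly: alpha=1 keeps the sampled style, alpha=0 keeps the predicted one."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(sampled) != len(predicted):
        raise ValueError("embeddings must have the same length")
    return [alpha * s + (1.0 - alpha) * p for s, p in zip(sampled, predicted)]


rng = random.Random(0)                 # fixed seed so the sketch is repeatable
predicted = [0.2, -0.1, 0.7]           # hypothetical embedding predicted from an input image
sampled = sample_style_embedding([0.0, 0.0, 0.0], [1.0, 1.0, 1.0], rng)
style = interpolate(sampled, predicted, alpha=0.5)
```

Varying `alpha` controls how far the augmented image style departs from the style of the original input.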
Funding: Supported by the National Natural Science Foundation of China under Grant No. 61373124.
Abstract: Capacity reduction is a major problem faced by wireless mesh networks. An efficient way to alleviate this problem is proper channel assignment. Current end-to-end channel assignment schemes usually focus on the case where channels in distinct frequency bands are assigned to mesh access and backbone, but in practice the backbone network and the access network can use the same IEEE 802.11 technology. In addition, these channel assignment schemes utilize only orthogonal channels, and the resulting network interference dramatically degrades network performance. Moreover, they consider only Internet-oriented traffic and omit peer-to-peer traffic, or vice versa, so the traffic type does not match practical networks. In this paper, we explore how to exploit partially overlapped channels to perform end-to-end channel assignment in order to achieve effective end-to-end flow transmissions. The proposed flow-based end-to-end channel assignment schemes can overcome the aforementioned limitations. Simulations reveal that load-aware channel assignment can be applied to networks with stable traffic load and can achieve near-optimal performance, whereas traffic-irrelevant channel assignment is suitable for networks with frequently changing traffic load and can achieve a good balance between performance and overhead. Also, the capability of partially overlapped channels to improve network performance is situation-dependent, so they should be used carefully.
Funding: Supported by the Department of Veterans Affairs (VA MERIT IRX001471A to SBS).
Abstract: End-to-end repair under no or low tension leads to improved outcomes for transected nerves with short gaps, compared to repairs with a graft. However, grafts are typically used to enable a tension-free repair for moderate to large gaps, as excessive tension can cause repairs to fail and catastrophically impede recovery. In this study, we tested the hypothesis that unloading the repair interface by redistributing tension away from the site of repair is a safe and feasible strategy for end-to-end repair of larger nerve gaps. Further, we tested the hypothesis that such an approach does not adversely affect structural and functional regeneration. We used a rat sciatic nerve injury model to compare the integrity of repair and several regenerative outcomes following end-to-end repairs of nerve gaps of increasing size. In addition, we proposed the use of a novel implantable device to safely perform end-to-end repair of larger nerve gaps by redistributing tension away from the repair interface. Our data suggest that redistribution of tension away from the site of repair enables safe end-to-end repair of larger gap sizes. In addition, structural and functional measures of regeneration were equal or enhanced in nerves repaired under tension, with or without a tension redistribution device, compared to tension-free repairs. Provided that repair integrity is maintained, end-to-end repairs under tension should be considered a reasonable surgical strategy. All animal experiments were performed under the approval of the Institutional Animal Care and Use Committee of University of California, San Diego (Protocol S11274).
Funding: This project was supported by the National Natural Science Foundation of China (60132030, 60572147).
Abstract: End-to-end delay is one of the most important characteristics of Internet end-to-end packet dynamics, and it can be applied to quality of service (QoS) management, service level agreement (SLA) management, congestion control algorithm development, etc. Analysis of various delay series measured from different links reveals nonstationarity and nonlinearity, and shows that different types of links have different degrees of self-similarity. By constructing an appropriate network architecture and neural functions, functional networks can be used to model the Internet end-to-end nonlinear delay time series. Furthermore, by using an adaptive parameter learning algorithm, the nonstationarity can also be well modeled. The numerical results show that the provided functional network architecture and adaptive algorithm can precisely characterize the Internet end-to-end delay dynamics.
Funding: Supported by the National Nature Science Foundation (No. 61501529, No. 61331013), the National Language Committee Project of China (No. ZDI125-36), and the Young Teachers' Scientific Research Project in Minzu University of China.
Abstract: With the emergence of large-scale knowledge bases, using triple information to generate natural questions has become a key technology in question answering systems. Traditional ways of generating questions require a lot of manual intervention and produce a lot of noise. To solve these problems, we propose a joint model based on a semi-automated model and an end-to-end neural network to automatically generate questions. The semi-automated model can generate question templates and real questions by combining the knowledge base and the center graph. The end-to-end neural network directly feeds the knowledge base and real questions to a BiLSTM network. Meanwhile, an attention mechanism is utilized in the decoding layer, which makes the triples and generated questions more relevant. Finally, the experimental results on SimpleQuestions demonstrate the effectiveness of the proposed approach.
Funding: Supported by the National Natural Science Foundation of China (No. 60632030) and the National High Technology Research and Development Program of China (No. 2006AA01Z276).
Abstract: This paper presents the multi-step Q-learning (MQL) algorithm as an autonomic approach to joint radio resource management (JRRM) among heterogeneous radio access technologies (RATs) in the B3G environment. Through the "trial-and-error" online learning process, the JRRM controller can converge to the optimized admission control policy. The JRRM controller learns to give the best allocation for each session in terms of both the access RAT and the service bandwidth. Simulation results show that the proposed algorithm realizes the autonomy of JRRM and achieves a good trade-off between spectrum utility and blocking probability compared to the load-balancing algorithm and the utility-maximizing algorithm. Besides, the proposed algorithm has better online performance and convergence speed than the one-step Q-learning (QL) algorithm. Therefore, the user satisfaction degree could also be improved.
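For orientation, the one-step Q-learning (QL) baseline that the abstract above compares against can be sketched as a standard tabular update. This is the generic textbook rule, not the paper's MQL algorithm; the state names, the action labels `RAT_A` and `RAT_B`, the reward, and the learning-rate and discount values are all hypothetical.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, lr=0.1, gamma=0.9):
    """One-step tabular Q-learning: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += lr * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical admission decision: choose which RAT serves a new session.
q = defaultdict(float)                 # all Q-values start at 0.0
actions = ["RAT_A", "RAT_B"]
q_update(q, "low_load", "RAT_A", reward=1.0, next_state="high_load", actions=actions)
```

A multi-step variant would propagate reward over several transitions before updating, which is one reason it may converge faster than this one-step rule; the abstract does not give the details of how MQL does this.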
Funding: Supported by the National High-Tech Research and Development Plan of China (No. 2003AA1Z2130) and the Science and Technology Project of Zhejiang Province, China (No. 2006C11200).
Abstract: An adaptive joint source-channel bit allocation method for video communications over error-prone channels is proposed. To protect the bit-streams from channel bit errors, the rate-compatible punctured convolutional (RCPC) code is used to produce coding rates varying from 4/5 to 1/2 using the same encoder and the Viterbi decoder. An expected end-to-end distortion model is presented to jointly estimate the distortion introduced in compressed source coding by quantization and channel bit errors. Based on the proposed end-to-end distortion model, an adaptive joint source-channel bit allocation method is proposed for time-varying error-prone channel conditions. Simulation results show that the proposed method utilizes the available channel capacity more efficiently and achieves better video quality than fixed-coding bit allocation methods when transmitting over error-prone channels.
Abstract: The ubiquity of instant messaging services on mobile devices and their use of end-to-end encryption in safeguarding the privacy of their users have become a concern for some governments. The WhatsApp messaging service has emerged as the most popular messaging app on mobile devices today. It uses end-to-end encryption, which makes government and secret service efforts to combat organized crime, terrorists, and child pornographers technically impossible. Governments would like a “backdoor” into such apps to access messages and have emphasized that they will only use the “backdoor” if there is a credible threat to national security. Users of WhatsApp have, however, argued against a “backdoor”; they claim a “backdoor” would not only be an infringement of their privacy, but that hackers could also take advantage of it. In light of this security and privacy conflict between the end users of WhatsApp and governments' need to access messages in order to thwart potential terror attacks, this paper presents the advantages of maintaining E2EE in WhatsApp and why governments should not be allowed a “backdoor” to access users' messages. This research presents the benefits encryption has for consumer security and privacy, and also the challenges it poses to public safety and national security.
Abstract: We propose a method that uses a latent regression Bayesian network (LRBN) to extract shared speech features as the input of an end-to-end speech recognition model. The structure of the LRBN is compact and its parameter learning is fast. Compared with a convolutional neural network, it has a simpler, more easily understood structure and fewer parameters to learn. Experimental results show the advantage of the hybrid LRBN/Bidirectional Long Short-Term Memory-Connectionist Temporal Classification architecture for Tibetan multi-dialect speech recognition and demonstrate that the LRBN helps to differentiate among multiple-language speech sets.
Abstract: Saccular extended obstruction is generated when the anastomotic site of a functional end-to-end anastomosis is extended saccularly and blocked by intestinal contents. This is a specific complication of functional end-to-end anastomosis. Saccular extended obstruction of the anastomotic site causes postoperative intestinal obstruction. It places a heavy burden on patients because surgery is necessary to treat the resulting intestinal obstruction. However, saccular extended obstruction is not a commonly recognized complication. The greatest factor contributing to its development is an acute angle between the portions of the intestinal tract oral and aboral to the anastomotic site. When this angle becomes obtuse, preferably close to a straight line, stagnation of the intestinal contents does not occur at the anastomotic site of the functional end-to-end anastomosis and saccular extended obstruction is avoided. To make the angle of the anastomosed intestinal tracts obtuse or straight, it may be effective to close the entry hole of the stapling instrument that creates the anastomotic stoma perpendicular to the intestinal axis.
Funding: Supported by the National Natural Science Foundation Major Research Plan of China (No. 90718003), the National Natural Science Foundation of China (No. 60973027), and the National High Technology Research and Development Program of China (No. 2007AA01Z401).
Abstract: Network calculus provides new tools for the performance analysis of networks, but analyzing networks with complex topologies using statistical network calculus remains a challenging research issue. A service model is proposed to characterize the service process of a network with complex topologies. To obtain closed-form expressions of statistical end-to-end performance bounds for a wide range of traffic source models, the traffic model and service model are expanded according to the error function. Based on the proposed models, the explicit end-to-end delay bound of fractional Brownian motion (FBM) traffic is derived, the factors that affect the delay bound are analyzed, and theoretical and simulation results are compared. The results illustrate that the proposed models not only fit network behavior well but also facilitate network performance analysis.