Abstract: To exploit residual redundancy for reducing the errors induced by fading channels, and to lower the complexity of the field model that describes the probability structure of that redundancy, a simplified statistical model for residual redundancy and a low-complexity joint source-channel decoding (JSCD) algorithm are proposed. The complicated residual redundancy in wavelet-compressed images is decomposed into several independent 1-D probability check equations composed of Markov chains, and is regarded as a natural channel code with a structure similar to that of a low-density parity-check (LDPC) code. A parallel, iterative sum-product (SP) JSCD algorithm is proposed. Simulation results show that the proposed JSCD algorithm can make full use of residual redundancy in different directions to correct errors, improve the peak signal-to-noise ratio (PSNR) of the reconstructed image, and reduce the complexity and delay of JSCD. At the same data rate, JSCD is more robust than the traditional separate coding system with arithmetic coding.
Abstract: In this paper, we present a joint source-channel decoding (JSCD) algorithm for low-density parity-check (LDPC) codes that modifies the sum-product algorithm (SPA) to account for the source redundancy resulting from neighbouring Huffman-coded bits. Simulations demonstrate that, in the presence of source redundancy, the proposed algorithm performs better than the separate source and channel decoding (SSCD) algorithm.
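As a rough illustration of how a sum-product decoder can take source redundancy into account, the sketch below adds a source a priori log-likelihood ratio (LLR) to the channel LLR at a variable node. The function names and the simple per-bit prior `p0 = P(bit = 0)` are assumptions made for illustration only; the abstract does not give the actual modification, and the prior derived from neighbouring Huffman-coded bits is not reproduced here.

```python
import math

def prior_llr(p0):
    """LLR log(P(bit=0)/P(bit=1)) for an assumed source prior p0 = P(bit = 0)."""
    if not 0.0 < p0 < 1.0:
        raise ValueError("p0 must lie strictly between 0 and 1")
    return math.log(p0 / (1.0 - p0))

def variable_node_total(channel_llr, check_msgs, p0=0.5):
    """Total LLR at a variable node: channel LLR + source prior + all check messages."""
    return channel_llr + prior_llr(p0) + sum(check_msgs)

def variable_to_check(channel_llr, check_msgs, p0=0.5):
    """Extrinsic message on each edge: the total minus that edge's own incoming message."""
    total = variable_node_total(channel_llr, check_msgs, p0)
    return [total - m for m in check_msgs]
```

With `p0 = 0.5` the prior term is zero and the update reduces to the usual variable-node rule of a separate decoder; a skewed prior shifts every outgoing message toward the more likely bit value.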
Funding: Supported by the Foundation of Ministry of Education of China (211CERS10)
Abstract: This paper proposes an integrated joint source-channel decoder (I-JSCD) using the Max-Log-MAP method for sources encoded with exp-Golomb codes and convolutional codes, and a system that applies this method to decoding the VLC data of H.264, e.g. motion vector differences (MVDs), over an AWGN channel. The method combines the source-code state space and the channel-code state space into a joint state space, develops a 3-D trellis and a maximum a posteriori (MAP) algorithm to estimate the source sequence symbol by symbol, and then uses the max-log approximation to simplify the algorithm. Experiments indicate that the proposed system improves the peak signal-to-noise ratio (PSNR) significantly (by up to about 15 dB) over a separate scheme. This also leads to higher visual quality of the video stream over a highly noisy channel.
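The max-log approximation mentioned above replaces the log-sum-exp operation of the MAP recursions with a plain maximum. The following minimal sketch shows the general idea only; the function names are illustrative and nothing here is taken from the paper's 3-D trellis.

```python
import math

def log_sum_exp(values):
    """Exact log(sum(exp(v))) as used in Log-MAP, computed in a numerically stable way."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def max_log(values):
    """Max-log approximation: log(sum(exp(v))) is approximated by max(v)."""
    return max(values)

def max_star(a, b):
    """Exact two-term Jacobian logarithm: the maximum plus a correction term."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))
```

The approximation never exceeds the exact value, and the gap is at most log(n) for n terms, which is why it trades a small loss in accuracy for the removal of all exponentials and logarithms.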
Abstract: We improve the iterative decoding algorithm for noisy channels by exploiting the “leaked” residual redundancy at the output of the source encoder, without changing the encoder structure. The experimental results show that using the residual redundancy of the compressed source in channel decoding is an effective way to improve error-correction performance.
Funding: Supported by the National Natural Science Foundation of China (No. 60572081)
Abstract: Real-time speech communications require highly efficient compression algorithms to encode speech signals. Because the compressed speech parameters are highly sensitive to transmission errors, robust source and channel decoding and demodulation schemes are both important and of practical use. In this paper, an iterative joint source-channel decoding and demodulation algorithm is proposed for the mixed excitation linear prediction (MELP) vocoder. It exploits the residual redundancy and passes soft information throughout the receiver, while introducing a systematic global iteration process to further enhance performance. Being fully compatible with the existing transmitter structure, the proposed algorithm introduces no additional bandwidth expansion or transmission delay. Simulations show that, in delay- and bandwidth-constrained channels, the joint source-channel decoding and demodulation (JSCCM) algorithm substantially improves error-correcting performance and synthesized speech quality over conventional separately designed systems.
Funding: Supported by the Foundation of Ministry of Education of China (211CERS10)
Abstract: Most multimedia schemes employ variable-length codes (VLCs), such as Huffman codes, as core components for obtaining high compression rates. However, VLC methods are very sensitive to channel noise. The goal of this paper is to salvage as much data as possible from damaged packets for higher audiovisual quality. This paper proposes a symbol-level integrated joint source-channel decoder (I-JSCD) using a three-dimensional (3-D) trellis representation for first-order Markov sources encoded with a VLC source code and a convolutional channel code. The method combines the source-code and channel-code state spaces and bit lengths to construct a two-dimensional (2-D) state space, and then develops a 3-D trellis and a maximum a posteriori (MAP) algorithm to estimate the source sequence symbol by symbol. Experimental results demonstrate that the method significantly improves decoding performance: it can salvage at least half (50%) of the data at any channel error rate, and can provide additional error resilience to VLC streams such as image, audio, and video streams over high-error-rate links.
Funding: Supported by the Open Research Fund of the National Mobile Communications Research Laboratory of Southeast University (No. W200704)
Abstract: In this paper, a new kind of simple-encoding irregular systematic LDPC code suitable for one-relay coded cooperation is designed. The proposed joint iterative decoding is performed at the destination in accordance with the corresponding joint Tanner graph, which characterizes the two different component LDPC codes used by the source and the relay in ideal and non-ideal relay cooperation. Theoretical analysis and simulations show that the coded cooperation scheme clearly outperforms the coded non-cooperation scheme at the same code rate and decoding complexity. The significant performance improvement can be credited to the additional mutual exchange of extrinsic information between the LDPC code employed by the source and its counterpart used by the relay, in both ideal and non-ideal cooperation.
Funding: Supported by the Postdoctoral Science Foundation of China (2014M561694); the Science and Technology on Avionics Integration Laboratory and National Aeronautical Science Foundation of China (20105552)
Abstract: A network-coding-based multisource LDPC-coded cooperative MIMO scheme is proposed, in which multiple sources transmit their messages to the destination with the assistance of a single relay. The relay cooperates with the multiple sources simultaneously via network coding. This avoids the imperfect frequency/timing synchronization and the large transmission delay that may be introduced by frequency-division multiple access (FDMA), code-division multiple access (CDMA), and time-division multiple access (TDMA). The proposed joint "Min-Sum" iterative decoding is carried out at the destination. The decoding algorithm follows the introduced equivalent joint Tanner graph, which fully characterizes the LDPC codes employed by the sources and the relay. Theoretical analysis and numerical simulation show that the proposed scheme with joint iterative decoding can achieve a significant cooperation diversity gain. Furthermore, at the relay, the proposed scheme has much lower LDPC-encoding complexity than the cascade scheme and is easier to implement in hardware, with similar bit error rate (BER) performance.
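For readers unfamiliar with Min-Sum decoding, the sketch below shows the standard check-node update on which such decoders are built: each outgoing message takes the product of the signs and the minimum of the magnitudes of the other incoming messages. It is a generic illustration with an assumed function name, not the joint Tanner-graph decoder of the paper.

```python
def min_sum_check_update(incoming):
    """Min-Sum check-node update for one parity check.

    incoming: list of variable-to-check LLR messages (at least two).
    Returns one check-to-variable message per edge, each computed from
    all incoming messages except the one arriving on that edge.
    """
    if len(incoming) < 2:
        raise ValueError("a check node needs at least two edges")
    outgoing = []
    for i in range(len(incoming)):
        others = incoming[:i] + incoming[i + 1:]
        sign = 1.0
        for v in others:
            if v < 0:
                sign = -sign  # product of the signs of the other messages
        outgoing.append(sign * min(abs(v) for v in others))
    return outgoing
```

For example, `min_sum_check_update([2.0, -0.5, 1.5])` returns `[-0.5, 1.5, -0.5]`.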
Funding: Supported by the National Natural Science Foundation of China (No. 90304003, No. 60573112, No. 60272056); the Foundation Project of China (No. A1320061262).
Abstract: A novel joint source and channel decoding (JSCD) scheme for variable-length codes (VLCs) concatenated with turbo codes, based on a new super-trellis decoding algorithm, is presented in this letter. The basic idea of the decoding algorithm is that source a priori information, in the form of bit transition probabilities corresponding to the VLC tree, can be derived directly from sub-state transitions in the new composite-state super-trellis. A maximum likelihood (ML) decoding algorithm for VLC sequence estimation based on the proposed super-trellis is also described. Simulation results show that the new iterative decoding scheme obtains a clear coding gain, especially for reversible variable-length codes (RVLCs), compared with classical separate turbo decoding and with previous joint decoding that does not consider source statistics.
Funding: National High-Tech Research and Development Plan of China (No. 2003AA1Z2130); Science and Technology Project of Zhejiang Province, China (No. 2006C11200)
Abstract: An adaptive joint source-channel bit allocation method for video communications over error-prone channels is proposed. To protect the bit-streams from channel bit errors, a rate-compatible punctured convolutional (RCPC) code is used to produce coding rates varying from 4/5 to 1/2 with the same encoder and Viterbi decoder. An expected end-to-end distortion model is presented to jointly estimate the distortion introduced in the compressed source by quantization and by channel bit errors. Based on this end-to-end distortion model, an adaptive joint source-channel bit allocation method is proposed for time-varying error-prone channel conditions. Simulation results show that the proposed methods utilize the available channel capacity more efficiently and achieve better video quality than other fixed-coding-based bit allocation methods when transmitting over error-prone channels.
Funding: The National Natural Science Foundation of China (No. 60332030)
Abstract: A new arithmetic coding system combining joint source-channel coding and maximum a posteriori (MAP) decoding is proposed. It combines the source coding and error correction tasks into one unified process by introducing an adaptive forbidden symbol. The proposed system achieves fixed-length code words by adaptively adjusting the probability of the forbidden symbol and adding tail digits of variable length. The corresponding improved MAP decoding metric is derived. Simulations were performed on AWGN channels with various noise levels, using both hard and soft decisions with BPSK modulation. The results show that its performance is slightly better than that of our adaptive arithmetic error-correcting coding system using a forbidden symbol.
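The redundancy that a forbidden symbol adds to an arithmetic code can be quantified with a standard result: reserving probability ε for a symbol that is never encoded scales every real symbol probability by (1 − ε), which costs −log2(1 − ε) extra bits per encoded symbol. The sketch below illustrates that relationship for a fixed ε; the function names are illustrative, and the adaptive adjustment of ε described in the abstract is not modelled.

```python
import math

def forbidden_symbol_overhead(eps):
    """Extra bits per encoded source symbol when probability eps is reserved."""
    if not 0.0 <= eps < 1.0:
        raise ValueError("eps must satisfy 0 <= eps < 1")
    return -math.log2(1.0 - eps)

def scaled_model(probs, eps):
    """Source symbol probabilities after making room for the forbidden symbol."""
    if not 0.0 <= eps < 1.0:
        raise ValueError("eps must satisfy 0 <= eps < 1")
    return [p * (1.0 - eps) for p in probs]
```

A larger ε gives the decoder more chances to detect an error (the forbidden symbol can only be decoded after corruption) at the price of a lower compression ratio.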
Funding: Supported in part by the National Natural Science Foundation of China under Grant (62001246, 62231017, 62201277, 62071255); the Natural Science Foundation of Jiangsu Province under Grant BK20220390; Key R and D Program of Jiangsu Province Key project and topics under Grant (BE2021095, BE2023035); the Natural Science Research Startup Foundation of Recruiting Talents of Nanjing University of Posts and Telecommunications (Grant No. NY221011); National Science Foundation of Xiamen, China (No. 3502Z202372013); Open Project of the Key Laboratory of Underwater Acoustic Communication and Marine Information Technology (Xiamen University) of the Ministry of Education, China (No. UAC202304).
Abstract: For the future development of sixth-generation (6G) mobile communication, several communication models have been proposed to meet the growing challenges of communication tasks. The rapid development of artificial intelligence (AI) foundation models provides significant support for efficient and intelligent communication interactions. In this paper, we propose an innovative semantic communication paradigm called the task-oriented semantic communication system with foundation models. First, we segment the image using task prompts based on the segment anything model (SAM) and contrastive language-image pretraining (CLIP). Meanwhile, we adopt a Bezier curve to enhance the mask and improve segmentation accuracy. Second, we apply differentiated semantic compression and transmission approaches to the segmented content. Third, we fuse the different semantic information with a conditional diffusion model to generate high-quality images that satisfy the users' specific task requirements. Finally, the experimental results show that the proposed system compresses the semantic information effectively and improves the robustness of semantic communication.
Funding: Supported in part by the National Natural Science Foundation of China under Grant 61873277, in part by the Natural Science Basic Research Plan in Shaanxi Province of China under Grant 2020JQ-758, and in part by the Chinese Postdoctoral Science Foundation under Grant 2020M673446.
Abstract: In video captioning methods based on an encoder-decoder, limited visual features are extracted by an encoder, and a natural sentence describing the video content is generated by a decoder. However, this kind of method depends on a single video input source and few visual labels, and suffers from poor semantic alignment between video contents and generated sentences, which makes it unsuitable for accurately comprehending and describing the video contents. To address this issue, this paper proposes a video captioning method based on semantic topic-guided generation. First, a 3D convolutional neural network is used to extract the spatiotemporal features of videos during encoding. Then, the semantic topics of the video data are extracted using the visual labels retrieved from similar video data. In decoding, a decoder is constructed by combining a novel Enhance-TopK sampling algorithm with a Generative Pre-trained Transformer-2 deep neural network, which decreases the influence of “deviation” in the semantic mapping between videos and texts by jointly decoding a baseline and the semantic topics of the video contents. During this process, the Enhance-TopK sampling algorithm alleviates the long-tail problem by dynamically adjusting the probability distribution of the predicted words. Finally, experiments are conducted on two public datasets, Microsoft Research Video Description and Microsoft Research-Video to Text. The experimental results demonstrate that the proposed method outperforms several state-of-the-art approaches. Specifically, the performance indicators Bilingual Evaluation Understudy, Metric for Evaluation of Translation with Explicit Ordering, Recall Oriented Understudy for Gisting Evaluation-longest common subsequence, and Consensus-based Image Description Evaluation of the proposed method are improved by 1.2%, 0.1%, 0.3%, and 2.4% on the Microsoft Research Video Description dataset, and by 0.1%, 1.0%, 0.1%, and 2.8% on the Microsoft Research-Video to Text dataset, respectively, compared with existing video captioning methods. As a result, the proposed method generates video captions that are more closely aligned with human natural language expression habits.
Funding: This work was supported in part by the National Natural Science Foundation of China (62293481), in part by the Young Elite Scientists Sponsorship Program by CAST (2023QNRC001), in part by the National Natural Science Foundation for Young Scientists of China (62001050), and in part by the Fundamental Research Funds for the Central Universities (2023RC95).
Abstract: With the development of deep learning (DL), joint source-channel coding (JSCC) solutions for end-to-end transmission have gained a lot of attention. Adaptive deep JSCC schemes support dynamically adjusting the rate according to channel conditions during transmission, enhancing robustness in dynamic wireless environments. However, most existing adaptive JSCC schemes consider only the channel conditions and ignore the differing importance of features in image processing and transmission. Uniform compression of different image features may compromise critical image details, particularly in low signal-to-noise ratio (SNR) scenarios. To address these issues, this paper introduces a dual attention mechanism and proposes an SNR-adaptive deep JSCC mechanism with a convolutional block attention module (CBAM), in which matrix operations are applied to features in the spatial and channel dimensions respectively. The proposed solution concatenates the pooled feature with the SNR level and passes it sequentially through the channel attention network and the spatial attention network to obtain the importance evaluation result. Experiments show that the proposed solution outperforms other baseline schemes in terms of peak SNR (PSNR) and structural similarity (SSIM), particularly in low-SNR scenarios or when dealing with complex image content.
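Several of the abstracts above report results as PSNR. For reference, the sketch below computes it from the standard definition, PSNR = 10·log10(MAX² / MSE), on flat sequences of pixel values. The function name and the default peak value of 255 (8-bit images) are assumptions for illustration; none of the papers' evaluation code is reproduced.

```python
import math

def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(test) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the reference and the reconstructed pixels.
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values indicate a reconstruction closer to the reference; identical inputs give an infinite PSNR.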