Funding: National Natural Science Foundation of China (No. 61806006); Innovation Program for Graduate of Jiangsu Province (No. KYLX160-781); University Superior Discipline Construction Project of Jiangsu Province.
Abstract: Self-attention networks and the Transformer have dominated machine translation and natural language processing, and have shown great potential in vision tasks such as image classification and object detection. Inspired by the progress of the Transformer, we propose a novel, general, and robust voxel feature encoder for 3D object detection based on the traditional Transformer. We first investigate the permutation invariance of self-attention over sequence data and apply it to point cloud processing. We then construct a voxel feature layer based on self-attention that adaptively learns a local, robust context for each voxel from the spatial relationships and the exchange of context information among all points within the voxel. Lastly, we construct a general voxel feature learning framework for 3D object detection with the voxel feature layer at its core. The voxel feature with Transformer (VFT) can be plugged easily into any other voxel-based 3D object detection framework, where it serves as the backbone of the voxel feature extractor. Experimental results on the KITTI dataset demonstrate that our method achieves state-of-the-art performance on 3D object detection.
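The permutation property mentioned in this abstract can be illustrated with a small sketch. It is not the VFT layer itself: it assumes a single attention head with identity query/key/value projections and assumes max pooling as the aggregation step, neither of which the abstract specifies. Self-attention without positional encoding only reorders its outputs when the input points are reordered, so a symmetric pooling step yields the same voxel feature for any point order.

```python
import math
import random


def self_attention(points):
    """Single-head self-attention with identity Q/K/V projections (a simplification)."""
    d = len(points[0])
    out = []
    for q in points:
        # Scaled dot-product scores of this point against every point in the voxel.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in points]
        m = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append([sum(w * v[j] for w, v in zip(weights, points)) / total
                    for j in range(d)])
    return out


def voxel_feature(points):
    """Max-pool the attended point features into one voxel descriptor (assumed aggregation)."""
    attended = self_attention(points)
    return [max(column) for column in zip(*attended)]


random.seed(0)
voxel = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(6)]  # 6 points, 4 features
shuffled = voxel[:]
random.shuffle(shuffled)

f1 = voxel_feature(voxel)
f2 = voxel_feature(shuffled)
# Compare with a tolerance: reordering changes the floating-point summation order.
same = all(math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-12) for a, b in zip(f1, f2))
print(same)
```

The names `self_attention` and `voxel_feature` are hypothetical and chosen only for this illustration.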
Funding: Project (2011CB302305) supported by the National Basic Research Program (973 Program) of China; Projects (61232004, 61302094) supported by the National Natural Science Foundation of China; Project (ZQN-PY115) supported by the Promotion Program for Young and Middle-aged Teacher in Science and Technology Research of Huaqiao University, China; Project (JA13012) supported by the Education Science Research Program for Young and Middle-aged Teacher of Fujian Province of China; Project (2014J01238) supported by the Natural Science Foundation of Fujian Province of China.
Abstract: Steganography based on bit modification of speech frames is a commonly used method that targets RTP payloads and offers covert communication over voice over IP (VoIP). However, direct modification of frames is often independent of the inherent speech features, which may greatly degrade speech quality. This work proposes a novel steganography method based on changing the frame bitrate, which uncovers a new covert channel for VoIP and introduces less distortion. The method exploits a feature of multi-rate speech codecs: the actual bitrate of a speech frame is identified only by the speech decoder at the receiving end. Based on this characteristic, two steganography strategies are provided, called bitrate downgrading (BD) and bitrate switching (BS). The first strategy substitutes high-bitrate speech frames with lower-bitrate ones to embed the secret message; in practice it introduces very low distortion, much less than other bit-modification methods with the same embedding capacity. The second strategy encodes secret message bits into different types of speech frames and serves as a supplementary alternative. The two strategies are implemented and tested on our covert communication system StegVoIP. The experimental results show that the proposed method is effective and fulfills the real-time requirement of VoIP communication.
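The idea behind bitrate switching, encoding secret bits in the choice of frame type, can be sketched as follows. This is a toy illustration only: the two mode labels and the one-bit-per-frame mapping are assumptions made for the example, not the mapping used by StegVoIP or by any particular codec.

```python
# Hypothetical two-mode codec: labels and bit mapping are illustrative assumptions.
MODE_FOR_BIT = {0: "low_rate", 1: "high_rate"}
BIT_FOR_MODE = {mode: bit for bit, mode in MODE_FOR_BIT.items()}


def embed(secret_bits):
    """Sender: pick the frame type of each outgoing frame from one secret bit."""
    return [MODE_FOR_BIT[bit] for bit in secret_bits]


def extract(frame_modes):
    """Receiver: read the secret bits back from the observed frame types."""
    return [BIT_FOR_MODE[mode] for mode in frame_modes]


secret = [1, 0, 1, 1, 0]
frames = embed(secret)
recovered = extract(frames)
print(frames)
print(recovered == secret)
```

In this sketch the speech payload is untouched; only the sequence of frame types carries the message.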
Funding: Project (60873230) supported by the National Natural Science Foundation of China.
Abstract: A novel compression method, named CABHG, is proposed for compressing screen image sequences in real-time remote and interactive applications. CABHG employs a hybrid coding scheme consisting of intra-frame and inter-frame coding modes. The intra-frame coding uses a rate-distortion optimized adaptive block size and can also be used to compress a single screen image. The inter-frame coding uses a hierarchical group of pictures (GOP) structure to improve system performance during random access and fast-backward scans. Experimental results demonstrate that CABHG has an approximately 47%-48% higher compression ratio and 46%-53% lower CPU utilization than professional screen image sequence codecs such as the TechSmith Ensharpen codec and the Sorenson 3 codec. Compared with general video codecs such as the H.264 codec, the XviD MPEG-4 codec, and Apple's Animation codec, CABHG also shows an 87%-88% higher compression ratio and 64%-81% lower CPU utilization.
Abstract: This paper presents a concatenated maximum-likelihood (ML) decoder for space-time/space-frequency block coded orthogonal frequency division multiplexing (ST/SFBC-OFDM) systems in doubly selective fading channels. The proposed decoder first detects the space-time or space-frequency codeword elements separately. Then, based on the coarsely estimated codeword elements, ML decoding is performed over a smaller set of constellation elements to search for the final codeword. It is proved that the proposed decoder is optimal if and only if the subchannels are constant during a codeword interval. Simulation results show that the performance of the proposed decoder is close to that of the optimal ML decoder in channels with severe Doppler and delay spread, while its complexity is much lower than that of the optimal ML decoder.
Funding: Supported in part by the National Natural Science Foundation of China under Grant 61201187; by the Importation and Development of High-Caliber Talents Project of Beijing Municipal Institutions under Grant YETP0110; by the Tsinghua University Initiative Scientific Research Program under Grant 20121088074; by the Foundation of Zhejiang Educational Committee under Grant Y201121579; and by the Visiting Scholar Professional Development Project of Zhejiang Educational Committee under Grant FX2014052.
Abstract: This paper considers a low-density parity-check (LDPC) coded three-way relay system, in which three user nodes exchange messages with the help of one relay node. Because physical-layer network coding is applied, two time slots are sufficient for one round of information exchange. We present a decode-and-forward (DF) scheme based on joint LDPC decoding for three-way relay channels, in which the relay decoder partially decodes the network code rather than fully decoding all the user messages. Simulation results show that the new DF scheme considerably outperforms other common schemes in three-way relay fading channels.
Abstract: A new encryption/decryption system for optical information security is proposed in this paper. We used an iterative Fourier transform algorithm to optimize both the encrypted hologram and the decryption key as phase-only elements. Optical decryption was implemented by superimposing the encrypted hologram and the decryption key in a simple optical setup. Numerical simulation and optical experiments confirmed that the proposed technique is simple and easy to implement for optical decryption, demonstrating potential applications in optical information security verification.
Funding: National Natural Science Foundation of China (No. 51204176).
Abstract: To improve the link performance of future wireless relay networks, a network coding scheme with linear block codes was proposed. It can be deployed in a relay network in which multiple sources send data to a common base station (BS) with the assistance of one relay node. At the BS, an iterative decoding structure between one cooperative decoder and a number of single-source decoders was established using the relayed network codes and the source codes. Further, the extrinsic information transfer (EXIT) chart technique was used to predict and analyze the convergence behavior of the iterative decoder. The analysis and simulation results show that the bit error ratio (BER) performance of the proposed scheme is better than that of the reference scheme under different relay network coding matrices. Compared with a reference scheme without multisource cooperation, the proposed scheme obtains network coding gain from the relay network without reducing its code rate.
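A minimal sketch of what a relay network coding matrix does, under stated assumptions: the relay forms each network-coded bit as a GF(2) linear combination (XOR) of one bit from each source. The matrix, the three-source setup, and the function name `relay_encode` are hypothetical, and the single-erasure recovery shown at the end stands in for the iterative soft decoding described in the abstract, which is not reproduced here.

```python
def relay_encode(source_bits, coding_matrix):
    """Each relay bit is the GF(2) sum (XOR) of the source bits selected by one matrix row."""
    return [sum(g * s for g, s in zip(row, source_bits)) % 2 for row in coding_matrix]


sources = [1, 0, 1]          # one bit from each of three sources (illustrative values)
matrix = [[1, 1, 0],         # relay bit 0 = s0 XOR s1
          [0, 1, 1]]         # relay bit 1 = s1 XOR s2
relay = relay_encode(sources, matrix)

# If the base station loses s1 but receives s0 and relay bit 0, it can recover s1.
recovered_s1 = relay[0] ^ sources[0]
print(relay, recovered_s1)
```

The extra relay bits act as parity shared across sources, which is the source of the cooperation gain in this simplified picture.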
Funding: Supported by the National Natural Science Foundation of China (No. 60736043, 60672088) and the National Basic Research Program of China (No. 2009CB32005).
Abstract: Side information (SI) is one of the key issues in distributed video coding (DVC) and strongly affects its compression performance. This paper proposes an SI refinement algorithm in which the Wyner-Ziv (WZ) frame is split into two parts following a checkerboard pattern, and the two parts are encoded independently but decoded sequentially. In the decoding process, part 1 is decoded first with the initial SI, and the partially decoded part (PDP) 1 is used to improve the motion vectors (MVs) and SI of both parts. At the next stage, part 2 is decoded with the improved SI, and PDP 2 is used to further refine the MVs of part 2. The SI of both parts is then refined further. Simulation results show that the proposed algorithm can improve the peak signal-to-noise ratio (PSNR) by up to 1.43 dB compared with a traditional DVC codec.
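The checkerboard split can be sketched as below. The unit being split (a grid of blocks indexed by row and column) and the function name `checkerboard_split` are assumptions for illustration; the abstract does not state the block size or indexing.

```python
def checkerboard_split(rows, cols):
    """Assign each (row, col) block to part 1 or part 2 in a checkerboard pattern."""
    part1, part2 = [], []
    for r in range(rows):
        for c in range(cols):
            # Blocks whose row and column indices have an even sum go to part 1.
            (part1 if (r + c) % 2 == 0 else part2).append((r, c))
    return part1, part2


part1, part2 = checkerboard_split(4, 4)
print(part1)
print(part2)
```

With this pattern, every horizontal or vertical neighbor of a part-2 block belongs to part 1, so a decoded part 1 surrounds each part-2 block, which is consistent with using it to improve the SI for part 2.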
Funding: Supported by the National Natural Science Foundation of China (No. 11802272) and the China Postdoctoral Science Foundation (No. 2019M651085).
Abstract: To address artifacts and noise in low-dose computed tomography (CT) images used in clinical diagnosis, an improved image denoising algorithm based on the generative adversarial network (GAN) architecture was proposed. First, a noise model based on StyleGAN2 was constructed to estimate the real noise distribution, and noise similar to that distribution was generated as the experimental noise data set. Then, a GAN-based network model with an encoder-decoder architecture at its core was constructed and trained on the generated noise data set until it reached the optimal value. Finally, the noise and artifacts in low-dose CT images could be removed by feeding the images into the denoising network. The experimental results showed that the constructed GAN-based network model improved the utilization of noise feature information and the stability of network training, removed image noise and artifacts, and reconstructed images with rich texture and a realistic visual effect.