Funding: Supported in part by the Fundamental Research Funds for the Central Universities (WK2350000002).
Abstract: In this paper we present a CNN-based approach for real-time 3D hand pose estimation from depth sequences. Prior discriminative approaches have achieved remarkable success but face two main challenges. First, they are fully supervised and hence require large amounts of annotated training data to extract the dynamic information from a hand representation. Second, they rely on hand detectors that are unreliable, either because they rest on strong assumptions or because they are weak detectors, and that often fail in situations such as complex environments and multiple hands. In contrast to these methods, this paper presents an approach that can be considered semi-supervised: it performs predictive coding of image sequences of hand poses in order to capture the latent features underlying a given image without supervision. The hand is modelled using a novel latent tree dependency model (LDTM), which transforms internal joint locations into an explicit representation. The modelled hand topology is then integrated with the pose estimator using a data-dependent method to jointly learn latent variables of the posterior pose appearance and the pose configuration, respectively. Finally, an unsupervised error term, which is part of the recurrent architecture, ensures smooth estimates of the final pose. Experiments on three challenging public datasets, ICVL, MSRA, and NYU, demonstrate that the proposed method performs comparably to or better than state-of-the-art approaches.
Funding: Supported by the National Natural Science Foundation of China (No. 61036004) and the Tianjin Research Program of Application Foundation and Advanced Technology (No. 13JCQNJC00600).
Abstract: In this paper, a CMOS image sensor (CIS) is proposed that can accomplish both the decorrelation and the entropy coding stages of image compression directly on the focal plane. The design uses predictive coding for image decorrelation. The predictions are performed in the analog domain by 2×2 pixel units. Both the prediction residuals and the original pixel values are quantized and encoded in parallel. Since the residuals have a distribution that peaks around zero, the output codewords can be replaced by the valid part of the residuals' binary representation. The compressed bit stream is available directly at the output of the CIS without extra processing. Simulation results show that the proposed approach achieves a compression ratio of 2.2 and a PSNR of 51 dB on different test images.
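The decorrelation step described above can be illustrated in software. The following is a minimal sketch only: the abstract does not state which predictor the 2×2 units use, so the sketch assumes the top-left pixel of each unit serves as the prediction for the other three, and it works on integers rather than in the analog domain. The names `encode_2x2` and `decode_2x2` are hypothetical.

```python
import numpy as np

def encode_2x2(image):
    """Split an image into 2x2 units and return (reference pixels, residuals).

    Assumption (not from the abstract): the top-left pixel of each unit
    is used as the prediction for the other three pixels in that unit.
    """
    img = np.asarray(image, dtype=np.int32)
    height, width = img.shape
    if height % 2 or width % 2:
        raise ValueError("image dimensions must be even")
    ref = img[0::2, 0::2]
    residuals = np.stack([
        img[0::2, 1::2] - ref,  # top-right minus reference
        img[1::2, 0::2] - ref,  # bottom-left minus reference
        img[1::2, 1::2] - ref,  # bottom-right minus reference
    ])
    return ref, residuals

def decode_2x2(ref, residuals):
    """Invert encode_2x2: add each residual back to its reference pixel."""
    rows, cols = ref.shape
    out = np.empty((2 * rows, 2 * cols), dtype=np.int32)
    out[0::2, 0::2] = ref
    out[0::2, 1::2] = ref + residuals[0]
    out[1::2, 0::2] = ref + residuals[1]
    out[1::2, 1::2] = ref + residuals[2]
    return out
```

On smooth image regions the residuals cluster near zero, which is the property the abstract relies on to shorten the output codewords.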
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 12047503, 11747601 and 12247104) and the National Innovation Institute of Defense Technology (Grant No. 22TQ0904ZT01025).
Abstract: Lateral predictive coding is a recurrent neural network that creates energy-efficient internal representations by exploiting statistical regularity in sensory inputs. Here, we analytically investigate the trade-off between information robustness and energy in a linear model of lateral predictive coding and numerically minimize a free energy quantity. We observed several phase transitions in the synaptic weight matrix, particularly a continuous transition that breaks reciprocity and permutation symmetry and builds cyclic dominance, and a discontinuous transition with the associated sudden emergence of tight balance between excitatory and inhibitory interactions. The optimal network follows an ideal gas law over an extended temperature range and saturates the efficiency upper bound of energy use. These results provide theoretical insights into the emergence and evolution of complex internal models in predictive processing systems.
Funding: Sponsored by the Fundamental Research Funds for the Central Universities (Grant No. HEUCF11805).
Abstract: To decrease the computational complexity of adaptive inter-layer prediction and improve the encoding efficiency in scalable video coding, a mode decision algorithm for Hierarchical-B pictures is proposed that exploits a subset of the candidate modes used by the co-located reference macroblocks. The scheme reduces the number of candidate modes by generating a dynamic list for the macroblock currently being encoded, according to statistical information derived from the co-located reference macroblocks in different temporal levels. The experimental results show that this fast algorithm reduces encoding time by approximately 31% on average with negligible loss of encoding performance.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 11975295 and 12047503) and the Chinese Academy of Sciences (Grant Nos. QYZDJ-SSW-SYS018 and XDPD15).
Abstract: Predictive coding is a promising theoretical framework in neuroscience for understanding information transmission and perception. It posits that the brain perceives the external world through internal models and updates these models under the guidance of prediction errors. Previous studies on predictive coding emphasized top-down feedback interactions in hierarchical multilayered networks but largely ignored lateral recurrent interactions. In this work we perform analytical and numerical investigations of the effects of single-layer lateral interactions. We consider a simple predictive response dynamics and run it on the MNIST dataset of hand-written digits. We find that learning will generally break the interaction symmetry between peer neurons, and that high input correlation between two neurons does not necessarily bring strong direct interactions between them. The optimized network responds to familiar input signals much faster than to novel or random inputs, and it significantly reduces the correlations between the output states of pairs of neurons.
Funding: Supported by the Ningxia Natural Science Foundation (NZ1024) and the Scientific Research Project of Ningxia Universities (201027).
Abstract: [Objective] To discuss the effects of the major mapping methods for DNA sequences on the accuracy of protein-coding region prediction, and to identify the effective mapping methods. [Method] Taking approximate correlation (AC) as the comprehensive measure of prediction accuracy at the nucleotide level, a prediction algorithm based on the windowed narrow pass-band filter (WNPBF) was applied to study the effects of different mapping methods on prediction accuracy. [Result] On the DNA data sets ALLSEQ and HMR195, the Voss and Z-curve methods proved to be more effective mapping methods than the paired numeric (PN), electron-ion interaction potential (EIIP) and complex number methods. [Conclusion] This study lays a foundation for verifying the effectiveness of new mapping methods by using the predicted AC value, and it is meaningful for revealing DNA structure by means of bioinformatics methods.
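To make the notion of a "mapping method" concrete, here is a minimal sketch of the Voss mapping, which turns a DNA string into four binary indicator sequences, one per nucleotide. The second function measures spectral power at frequency 1/3, the period-3 component that coding-region detectors look for. It is a simple stand-in for illustration and not the paper's WNPBF; both function names are hypothetical.

```python
import numpy as np

def voss_mapping(sequence):
    """Map a DNA string to four 0/1 indicator arrays keyed by 'A', 'C', 'G', 'T'."""
    seq = sequence.upper()
    return {
        base: np.array([1 if symbol == base else 0 for symbol in seq], dtype=np.int8)
        for base in "ACGT"
    }

def period3_power(sequence):
    """Sum over the four indicator sequences of |DFT at frequency 1/3|^2.

    Assumption: a plain single-frequency DFT is used here for illustration;
    the abstract's algorithm uses a windowed narrow pass-band filter instead.
    """
    positions = np.arange(len(sequence))
    phase = np.exp(-2j * np.pi * positions / 3)
    return float(sum(abs(np.dot(indicator, phase)) ** 2
                     for indicator in voss_mapping(sequence).values()))
```

A sequence with a strict three-base repeat, such as `"ATGATGATG"`, concentrates its power at this frequency, whereas a homopolymer such as `"AAAAAA"` has none.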
Funding: Supported by the National Natural Science Foundation of China, No. 81702816 (to Zhao QJ), and the Shandong Provincial Natural Science Foundation, No. ZR2017PH030 (to Zhao QJ).
Abstract: AIM To construct a long non-coding RNA (lncRNA) signature for predicting hepatocellular carcinoma (HCC) prognosis with high efficiency. METHODS Differentially expressed lncRNAs (DELs) between HCC specimens and peritumor liver specimens were identified using the edgeR package to analyze The Cancer Genome Atlas (TCGA) LIHC dataset. Univariate Cox proportional hazards regression was performed to obtain the DELs significantly associated with overall survival (OS) in a training set. These OS-related DELs were further analyzed using a stepwise multivariate Cox regression model. Those lncRNAs fitted in the multivariate Cox regression model and independently associated with overall survival were chosen to build a prognostic risk formula. The prognostic value of this formula was then validated in the test group and the entire cohort and further compared with two previously identified prognostic signatures for HCC. Gene Ontology and Kyoto Encyclopedia of Genes and Genomes pathway analyses were performed to explore the potential biological functions of the lncRNAs in the signature. RESULTS Based on lncRNA expression profiling of 370 HCC patients from the TCGA database, we constructed a 5-lncRNA signature (AC015908.3, AC091057.3, TMCC1-AS1, DCST1-AS1 and FOXD2-AS1) that was significantly associated with prognosis. HCC patients with high risk scores based on the expression of the 5 lncRNAs had significantly shorter survival times than patients with low risk scores in both the training and test groups. Multivariate Cox regression analysis demonstrated that the prognostic value of the 5 lncRNAs was independent of clinicopathological parameters. A comparison with two previously identified prognostic signatures for HCC demonstrated that this 5-lncRNA signature showed improved prognostic power. Functional enrichment analysis indicated that the 5 lncRNAs were potentially involved in metabolic processes, fibrinolysis and complement activation. CONCLUSION Our present study constructed a 5-lncRNA signature that improves survival prediction and can be used as a prognostic biomarker for HCC patients.
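A prognostic risk formula of the kind described above is commonly a linear combination of expression levels weighted by the fitted Cox regression coefficients, with patients then split into high-risk and low-risk groups at a cutoff. The sketch below assumes that form and a median cutoff; neither the coefficients nor the cutoff rule is given in the abstract. The gene names `lncA` and `lncB` and all numbers in the usage note are placeholders, not values from the study.

```python
from statistics import median

def risk_score(expression, coefficients):
    """Linear risk score: sum of coefficient * expression over the signature genes."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)

def split_by_median(scores):
    """Label each score 'high' if it exceeds the cohort median, otherwise 'low'.

    Assumption: the median is used as the cutoff; the abstract does not say
    how the high-risk and low-risk groups were separated.
    """
    cutoff = median(scores)
    return ["high" if score > cutoff else "low" for score in scores]
```

For example, with placeholder coefficients `{"lncA": 0.5, "lncB": -0.2}` and expression `{"lncA": 2.0, "lncB": 1.0}`, the score is 0.5 × 2.0 − 0.2 × 1.0 = 0.8.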
Funding: Supported by the National Basic Research Program of China under grant 2015CB351806, the National Natural Science Foundation of China under contracts No. 61425025, No. 61390515 and No. 61421062, and the Shenzhen Peacock Plan.
Abstract: The second-generation Audio Video Coding Standard (AVS2) is the most recent video coding standard. By introducing several new coding techniques, AVS2 can provide more efficient compression for scene videos such as surveillance videos and conference videos. Because the scenes are limited, scene videos contain great redundancy, especially in the background region. The new scene video coding techniques applied in AVS2 mainly focus on reducing this redundancy in order to achieve higher compression. This paper introduces several important AVS2 scene video coding techniques. Experimental results show that, with the scene video coding tools, AVS2 can save nearly 40% BD-rate (Bjøntegaard-Delta bit-rate) on scene videos.
Abstract: Following the success of the audio video standard (AVS) for 2D video coding, in 2008 the China AVS workgroup started developing 3D video (3DV) coding techniques. In this paper, we discuss the background, technical features, and applications of AVS 3DV coding technology. We introduce two core techniques used in AVS 3DV coding: inter-view prediction and enhanced stereo packing coding. We elaborate on these techniques, which are used in the AVS real-time 3DV encoder. An application of the AVS 3DV coding system is presented to show the great practical value of this system. Simulation results show that the advanced techniques used in AVS 3DV coding provide remarkable coding gain compared with techniques used in a simulcast scheme.
Abstract: Linear predictive coding (LPC) is a widely used technique in voice coders. In this paper we present different implementations of the LPC algorithm used in the majority of voice decoding standards. The windowing/autocorrelation block is implemented in three different versions on a Spartan-3 FPGA. Because the FPGA can integrate a MicroBlaze processor core, the first solution is a pure software implementation of LPC running on this RISC core. The second solution is a pure hardware architecture implemented with a VHDL-based methodology, from description through integration. Finally, the autocorrelation core is implemented as a hardware/software (HW/SW) architecture using the existing processor. The performance of each architecture is compared for different data lengths.
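For readers unfamiliar with the windowing/autocorrelation block, the following software sketch shows the standard autocorrelation method of LPC analysis: window a frame, compute its autocorrelation lags, then solve for the predictor with the Levinson-Durbin recursion. It is an illustration only and says nothing about the paper's FPGA designs; the choice of a Hamming window and the function names are assumptions.

```python
import numpy as np

def autocorrelation(frame, order):
    """Hamming-window a frame and return autocorrelation lags r[0..order]."""
    windowed = np.asarray(frame, dtype=float) * np.hamming(len(frame))
    n = len(windowed)
    return np.array([np.dot(windowed[:n - lag], windowed[lag:])
                     for lag in range(order + 1)])

def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation lags r.

    Returns (a, error), where a[0] == 1 and the prediction-error filter is
    e[n] = x[n] + a[1]*x[n-1] + ... + a[order]*x[n-order].
    """
    a = np.zeros(order + 1)
    a[0] = 1.0
    error = r[0]
    for i in range(1, order + 1):
        # Correlation between the current residual and the lag-i sample.
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / error  # reflection coefficient
        previous = a.copy()
        for j in range(1, i):
            a[j] = previous[j] + k * previous[i - j]
        a[i] = k
        error *= 1.0 - k * k
    return a, error
```

The autocorrelation loop is the regular multiply-accumulate pattern that makes this block a natural candidate for hardware acceleration, while the recursion is short and sequential.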
Abstract: This paper presents a real-time implementation of 4.2 kb/s CELP speech coding on a single DSP chip. An algorithm that reduces the search complexity for the adaptive codebook is suggested, and the method for converting the parameters into LSP parameters is discussed. The real-time implementation of this coder on a commercial development board with a single TMS320C30 is described.
Abstract: The requirements of data coding in multimedia applications are presented, and current coding techniques and the related standards are introduced. The authors' own work is then presented, namely a wavelet-based coding method and a VE (Visual Entropy)-based coding method. The experimental results show that these methods achieve a better perceptual quality of the reconstructed image at a lower bit rate, and that they outperform JPEG (Joint Photographic Experts Group) coding. Finally, topics for future study are put forward.
Funding: Project (40927001) supported by the National Natural Science Foundation of China; Project (2011R09021-06) supported by the Program of Key Scientific and Technological Innovation Team of Zhejiang Province, China; Project supported by the Fundamental Research Funds for the Central Universities of China.
Abstract: Quality degradation occurs when streaming video is transmitted over an error-prone network. By jointly using redundant slices, reference frame selection, and intra/inter mode decision, a content-based and end-to-end rate-distortion based error resilience method is proposed. Firstly, the intra/inter mode decision is implemented using macro-block (MB) refresh, and then redundant pictures and reference frame selection are used together to realize the redundant coding. The estimated error propagation distortion and bit consumption of the refresh MB are used for its mode and reference frame decision. Secondly, by analyzing the statistical properties of successive frames, the error propagation distortion and bit consumption are formulated as functions of temporal distance. The encoding parameters of the current frame are determined by the estimated error propagation distortion and bit consumption. Thirdly, by comparing the rate-distortion cost of different combinations, the proper error resilience method is selected before the current frame is encoded. Finally, the MB mode and bit distribution of the primary picture are analyzed to derive the texture information. The motion information is subsequently incorporated to calculate the video content complexity and implement the content-based redundant coding. Experimental results demonstrate that the proposed algorithm achieves significant performance gains over the LA-RDO and HRP methods when video is transmitted over an error-prone channel.
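The selection step, comparing the rate-distortion cost of different combinations, can be sketched with the usual Lagrangian cost J = D + λR. This is an assumption about the form of the cost; the abstract does not give the paper's exact formulation. The function name and the candidate labels and numbers in the usage note are hypothetical.

```python
def select_option(candidates, lagrange_multiplier):
    """Return the name of the candidate with the lowest cost D + lambda * R.

    candidates maps a name to (estimated distortion, bit consumption).
    Assumption: a standard Lagrangian rate-distortion cost is used.
    """
    def cost(item):
        distortion, bits = item[1]
        return distortion + lagrange_multiplier * bits

    name, _ = min(candidates.items(), key=cost)
    return name
```

With candidates such as `{"intra_refresh": (10.0, 500), "redundant_slice": (14.0, 300), "none": (30.0, 200)}`, a small multiplier favours the option with the lowest distortion, and a large one favours the option that spends the fewest bits.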
Abstract: This paper proposes a back-propagation neural network model for predictive block-matching. Predictive block-matching is a way to significantly decrease the computational complexity of motion estimation, but the traditional prediction model was proposed 26 years ago. It is straightforward but not accurate enough. The proposed back-propagation neural network has 5 inputs, 5 neurons and 1 output. Because of its simplicity, it requires very little computation, which is negligible compared with the existing computational complexity. The test results show 10% to 30% higher prediction accuracy and a PSNR improvement of up to 0.3 dB. These advantages make it a feasible replacement for the current model.
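A network of the stated size (5 inputs, 5 hidden neurons, 1 output) is small enough to write out in full. The sketch below assumes a tanh hidden layer, a linear output, and plain gradient descent on the squared error; the abstract specifies none of these, nor what the five inputs are. The class name `TinyBPNet` is hypothetical.

```python
import numpy as np

class TinyBPNet:
    """5-5-1 feed-forward network trained by back-propagation.

    Assumptions (not from the abstract): tanh hidden units, a linear
    output unit, and full-batch gradient descent on half the mean
    squared error.
    """

    def __init__(self, n_in=5, n_hidden=5, seed=0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.normal(0.0, 0.5, (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.w2 = rng.normal(0.0, 0.5, (n_hidden, 1))
        self.b2 = np.zeros(1)

    def forward(self, x):
        """Return predictions of shape (n_samples, 1); caches activations."""
        self.x = np.asarray(x, dtype=float)
        self.h = np.tanh(self.x @ self.w1 + self.b1)
        return self.h @ self.w2 + self.b2

    def train_step(self, x, target, lr=0.05):
        """One gradient-descent update; returns the pre-update mean squared error."""
        err = self.forward(x) - target                 # (n_samples, 1)
        n = len(self.x)
        grad_w2 = self.h.T @ err / n
        grad_b2 = err.mean(axis=0)
        dh = (err @ self.w2.T) * (1.0 - self.h ** 2)   # back-propagate through tanh
        grad_w1 = self.x.T @ dh / n
        grad_b1 = dh.mean(axis=0)
        self.w2 -= lr * grad_w2
        self.b2 -= lr * grad_b2
        self.w1 -= lr * grad_w1
        self.b1 -= lr * grad_b1
        return float((err ** 2).mean())
```

A forward pass costs 30 multiply-accumulates and 5 tanh evaluations, which is consistent with the abstract's point that the model adds very little computation.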