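Abstract: To address the complex modeling and low accuracy of video-to-text conversion, a video-to-text method based on an adaptive frame sampling algorithm and a bidirectional long short-term memory (LSTM) model is proposed. The adaptive frame sampling algorithm dynamically adjusts the sampling rate so as to provide as many features as possible for training the model. Combined with the bidirectional LSTM model, it can effectively learn relevant information from both the preceding and the future frames of a video. The features used for training come from a deep convolutional neural network, so this doubly deep network structure can learn the spatio-temporally correlated representations of video frames as well as global dependency information. Fusing frame information further increases the variety of features, which improves the experimental results. On the M-VAD and MPIIMD datasets, the proposed method achieves mean METEOR scores of 7.8% and 8.6%, respectively, which are 16.4% and 21.1% higher than those of the original S2VT model, and it also improves the language quality of the generated text.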
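The reported gains can be checked with a little arithmetic. The sketch below assumes that "higher than the original S2VT model" means a relative improvement, which the abstract does not state explicitly, and backs out the S2VT baseline that each figure would imply. The function name `implied_baseline` is introduced here only for illustration.

```python
def implied_baseline(score: float, relative_gain_pct: float) -> float:
    """Baseline implied by a new score and a relative gain given in percent.

    Assumes new_score = baseline * (1 + gain / 100); this reading of the
    abstract is an assumption, not something the source confirms.
    """
    return score / (1.0 + relative_gain_pct / 100.0)


# Figures quoted in the abstract: (mean METEOR in %, gain over S2VT in %).
reported = {"M-VAD": (7.8, 16.4), "MPIIMD": (8.6, 21.1)}

baselines = {
    name: implied_baseline(score, gain)
    for name, (score, gain) in reported.items()
}

for name, base in baselines.items():
    print(f"{name}: implied S2VT METEOR is about {base:.2f}%")
```

Under that assumption, the implied baselines come out near 6.7% and 7.1%, so the two pairs of numbers in the abstract are mutually consistent.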
Abstract: A new wipe transition detection approach is proposed. By analyzing the spatial-temporal characteristics of an ideal wipe production model, the concept of the wipe transition strip (TS) is introduced. The macroblock type information of P-frames is used to extract TS regions. An improved TS region accumulation technique is then applied to detect and verify wipe transitions. The experimental results indicate that the proposed approach can detect various wipe transitions quickly and accurately.
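The accumulation idea can be illustrated with a toy example. The abstract does not give the actual accumulation rule or thresholds, so everything below is a hypothetical sketch: each P-frame is assumed to yield a set of macroblock coordinates forming its transition strip, and a wipe is assumed to show up as strips that keep covering new macroblocks until nearly the whole frame has been swept. The names `accumulate_strips`, `looks_like_wipe`, and `min_coverage` are made up for this illustration.

```python
from typing import List, Set, Tuple

Block = Tuple[int, int]  # (row, col) index of a macroblock


def accumulate_strips(strips: List[Set[Block]]) -> List[int]:
    """Return, per frame, how many distinct macroblocks have been covered so far."""
    covered: Set[Block] = set()
    counts: List[int] = []
    for strip in strips:
        covered |= strip  # union the new strip into the accumulated region
        counts.append(len(covered))
    return counts


def looks_like_wipe(
    strips: List[Set[Block]], total_blocks: int, min_coverage: float = 0.9
) -> bool:
    """Hypothetical check: coverage grows every frame and ends near the full frame."""
    counts = accumulate_strips(strips)
    if not counts:
        return False
    grows = all(later > earlier for earlier, later in zip(counts, counts[1:]))
    return grows and counts[-1] >= min_coverage * total_blocks


# Toy 4x4 macroblock grid: a left-to-right wipe uncovers one column per P-frame.
wipe = [{(row, col) for row in range(4)} for col in range(4)]
# A static region that flags the same macroblock in every frame is not a wipe.
static = [{(0, 0)} for _ in range(4)]

print(looks_like_wipe(wipe, total_blocks=16))
print(looks_like_wipe(static, total_blocks=16))
```

Requiring both steady growth and near-complete coverage is one simple way to separate a sweeping strip from localized motion; the paper's verification step may use different criteria.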
Funding: Supported by the National Key Basic Research Program of China (973 Program) under Grant No. 2009CB320904; the National Natural Science Foundation of China under Grants No. 61121002, No. 61231010, and No. 91120004; and the Key Projects in the National Science and Technology Pillar Program under Grant No. 2011BAH08B03.
Abstract: 2D-to-3D video conversion is a feasible way to generate 3D programs for the current 3DTV industry. However, for large-scale 3D video production, current systems are no longer adequate in terms of the time and labor required for conversion. In this paper, we introduce a distributed 2D-to-3D video conversion system that comprises a 2D-to-3D video conversion module, an architecture for parallel computation on the cloud, and 3D video coding. The system lets multiple users cooperate to complete their conversion tasks simultaneously, which greatly improves conversion efficiency. In the experiments, we evaluate the system using criteria related to both time consumption and video coding performance.
Abstract: The paper presents a framework for developing a variety of video transition effects. The framework addresses the inefficiency programmers face when producing increasingly diverse video transitions, which is caused by excessive coupling between the sub-modules of the system. The framework is therefore designed to be modular, flexible, and extensible. Based on an analysis of the common features of different effects, the implementation of a video transition effect is divided into 4 sub-modules, each of which can be designed and developed independently. Furthermore, these sub-modules can easily be substituted, modified, and reused. We present a formal description of our framework and give typical case studies to show its broad utility.
Funding: This research project is funded by a major science and technology project of Gansu Province, "Research on the complete set technology for highway construction in collapsible loess region of Gansu Province" (No. 1302GKDA009).
Abstract: Because it is difficult to implement optimum inversion using 2D and 3D forward modeling with magnetic-source transient electromagnetics (TEM), this paper explores a novel approach to implementing 2D magnetic-source TEM inversion. In particular, we converted magnetic-source TEM data into magnetotelluric (MT) data and then used a 2D MT inversion method to implement a 2D magnetic-source TEM inversion interpretation. First, we studied the similarity between magnetic-source TEM waves and MT waves and between magnetic-source TEM all-time apparent resistivity and MT Cagniard apparent resistivity. Then, we selected an optimal time-frequency transformation coefficient to implement rapid time-frequency transformation of all-time TEM apparent resistivity to MT Cagniard apparent resistivity. Afterward, we conducted 1D pseudo-MT inversions of magnetic-source 1D TEM theoretical models. The 1D inversion results demonstrated that the difference between the inversion parameters and the model parameters was small, and that the MT 1D inversion method could be used to conduct magnetic-source 1D TEM inversion within a certain margin of error. We further conducted 2D pseudo-MT inversions of 3D magnetic-source TEM theoretical models. The 2D inversion results indicated that selecting a joint 2D pseudo-MT transverse-electric (TE) and transverse-magnetic (TM) inversion method based on the measuring line above a 3D anomalous body can help to accurately implement a 2D inversion interpretation of the 3D TEM response.
Funding: Project (No. CNGI-04-15-2A) supported by the China Next Generation Internet (CNGI).
Abstract: The new H.264 video coding standard achieves significantly higher compression performance than MPEG-2. Because MPEG-2 is widely used in digital TV, DVD, and similar applications, bandwidth or memory space can be saved by transcoding those streams into H.264. Unfortunately, the high complexity of transcoding keeps it from being widely used in practice. This paper proposes an efficient transcoding architecture with a smart downscaling decoder and a fast mode decision algorithm. With the proposed architecture, a large amount of buffering memory is saved and the transcoding complexity is reduced. The performance of the proposed fast mode decision algorithm is validated by experiments.