Funding: Supported by the National Key Basic Research and Development (973) Program of China (No. 2006CB303106), the Specialized Research Fund for the Doctoral Program of Higher Education of MOE, P.R.C. (No. 20060003057), and the Basic Research Foundation of Tsinghua National Laboratory for Information Science and Technology (TNList).
Abstract: Video structure analysis is a basic requirement for most content-based video editing and processing systems. This paper presents a fast video structure analysis method based on image segmentation in each frame, with region matching between frames. The structure analysis decomposes the video into several moving objects, including information about their colors, positions, shapes, movements, and lifetimes. The method also supports user interaction to improve the results. The results show that the method is fast and stable and can complete video analysis interactively.
Abstract: The increasing need for video-based applications highlights the importance of parsing and organizing the content in videos. However, accurate understanding and management of video content at the semantic level is still insufficient. The semantic gap between low-level features and high-level semantics cannot be bridged by manual or semi-automatic methods. In this paper, a semantic-based model named video structural description (VSD) is proposed for representing and organizing the content in videos. Video structural description aims at parsing video content into text information, using spatiotemporal segmentation, feature selection, object recognition, and semantic web technology. The proposed model uses predefined ontologies, including concepts and their semantic relations, to represent the contents of videos. The defined ontologies can be used to retrieve and organize videos unambiguously. In addition to the defined ontologies, the semantic relations between videos are mined. The video resources are linked and organized by their semantic relations.
Funding: Supported by the National Natural Science Foundation of China (61272120, 61602377, 61634004), the Shaanxi Provincial Co-Ordination Innovation Project of Science and Technology (2016KTZDGY02-04-02), and the National Science and Technology Major Project of China (2016ZX03001003-006).
Abstract: The high efficiency video coding (HEVC) transform algorithm for residual coding uses 2-dimensional (2D) 4 × 4 transforms with higher precision than H.264's 4 × 4 transforms, resulting in increased hardware complexity. In this paper, we present a shared architecture that can compute the 4 × 4 forward discrete cosine transform (DCT) and inverse discrete cosine transform (IDCT) of HEVC using a new mapping scheme in the video processor array structure. The architecture is implemented with only adders and shifts to achieve an area-efficient design. The proposed architecture is synthesized using ISE 14.7 and implemented on the BEE4 platform with the Virtex-6 FF1759 LX550T field programmable gate array (FPGA). The results show that the video processor array structure achieves a maximum operating frequency of 165.2 MHz. The architecture and its implementation are presented in this paper to demonstrate its programmability and high performance.
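The adder-and-shift idea in the abstract above can be illustrated with a minimal software sketch. It is not the paper's mapping scheme or processor-array design, which the abstract does not detail. It only shows that the 1D 4-point HEVC core transform (coefficients 64, 83, 36) can be evaluated without multipliers. The function names and the butterfly decomposition are assumptions made for illustration, and the scaling and rounding stages of the standard are omitted.

```python
# HEVC 4x4 core transform matrix (integer approximation of the DCT).
HEVC_DCT4 = [
    [64, 64, 64, 64],
    [83, 36, -36, -83],
    [64, -64, -64, 64],
    [36, -83, 83, -36],
]


def mul64(x):
    """64*x as a single left shift."""
    return x << 6


def mul83(x):
    """83*x as shifts and adds: 83 = 64 + 16 + 2 + 1."""
    return (x << 6) + (x << 4) + (x << 1) + x


def mul36(x):
    """36*x as shifts and adds: 36 = 32 + 4."""
    return (x << 5) + (x << 2)


def dct4_shift_add(x):
    """1D 4-point forward transform using only adds, subtracts, and shifts.

    Hypothetical helper: an even/odd butterfly followed by shift-add
    constant multiplications. No scaling or rounding is applied.
    """
    x0, x1, x2, x3 = x
    e0, e1 = x0 + x3, x1 + x2  # even part
    o0, o1 = x0 - x3, x1 - x2  # odd part
    return [
        mul64(e0 + e1),
        mul83(o0) + mul36(o1),
        mul64(e0 - e1),
        mul36(o0) - mul83(o1),
    ]


def dct4_reference(x):
    """Plain matrix-vector product, used only to check the sketch."""
    return [sum(c * v for c, v in zip(row, x)) for row in HEVC_DCT4]
```

A 2D 4 × 4 transform would apply such a 1D stage to the rows and then to the columns of the residual block. Because the inverse transform uses the transpose of the same coefficient matrix, the same shift-add constant multipliers could in principle be shared between DCT and IDCT, which is one plausible reading of the "shared architecture" mentioned above.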