8 articles found
1000× Faster Camera and Machine Vision with Ordinary Devices (Cited by: 1)
1
Authors: Tiejun Huang, Yajing Zheng, Zhaofei Yu, Rui Chen, Yuan Li, Ruiqin Xiong, Lei Ma, Junwei Zhao, Siwei Dong, Lin Zhu, Jianing Li, Shanshan Jia, Yihua Fu, Boxin Shi, Si Wu, Yonghong Tian. Engineering (SCIE, EI, CAS, CSCD), 2023, No. 6, pp. 110-119, M0005, 11 pages in total.
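We identify a major flaw in digital cameras: the image and video models inherited from film cameras prevent them from capturing the rapidly changing photonic world. We propose a new visual representation called vform, a bit sequence array in which each bit indicates whether the accumulation of photons has reached a threshold, so that the light intensity of the scene at any moment can be recorded and reconstructed. Using only consumer-grade CMOS sensors and integrated circuits, we developed a spike camera that is 1000 times faster than conventional cameras. Treating vform as the spike trains of biological vision, we further developed a machine vision system based on spiking neural networks that combines the speed of machines with the mechanisms of biological vision, achieving high-speed object detection and tracking 1000 times faster than human vision. We demonstrate the utility of the spike camera and the super vision system with an assistant referee and a target pointing system. The vform model and chips are expected to fundamentally change the concepts of image and video, along with related industries such as photography, film, and visual media, and to open a new era of speed-free machine vision based on spiking neural networks.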
Keywords: spiking neural network; machine vision; biological vision; vision system; CMOS sensor; visual media; film camera; spike train
Toward the Next Generation of Retinal Neuroprosthesis: Visual Computation with Spikes (Cited by: 3)
2
Authors: Zhaofei Yu, Jian K. Liu, Shanshan Jia, Yichen Zhang, Yajing Zheng, Yonghong Tian, Tiejun Huang. Engineering (SCIE, EI), 2020, No. 4, pp. 449-461, 13 pages in total.
A neuroprosthesis is a type of precision medical device that is intended to manipulate the neuronal signals of the brain in a closed-loop fashion, while simultaneously receiving stimuli from the environment and controlling some part of a human brain or body. Incoming visual information can be processed by the brain in millisecond intervals. The retina computes visual scenes and sends its output to the cortex in the form of neuronal spikes for further computation. Thus, the neuronal signal of interest for a retinal neuroprosthesis is the neuronal spike. Closed-loop computation in a neuroprosthesis includes two stages: encoding a stimulus as a neuronal signal, and decoding it back into a stimulus. In this paper, we review some of the recent progress that has been achieved in visual computation models that use spikes to analyze natural scenes that include static images and dynamic videos. We hypothesize that in order to obtain a better understanding of the computational principles in the retina, a hypercircuit view of the retina is necessary, in which the different functional network motifs that have been revealed in the cortex neuronal network are taken into consideration when interacting with the retina. The different building blocks of the retina, which include a diversity of cell types and synaptic connections (both chemical synapses and electrical synapses, i.e., gap junctions), make the retina an ideal neuronal network for adapting the computational techniques that have been developed in artificial intelligence to model the encoding and decoding of visual scenes. An overall systems approach to visual computation with neuronal spikes is necessary in order to advance the next generation of retinal neuroprosthesis as an artificial visual system.
Keywords: Visual coding; Retina; Neuroprosthesis; Brain-machine interface; Artificial intelligence; Deep learning; Spiking neural network; Probabilistic graphical model
Cluster structure prediction via CALYPSO method (Cited by: 1)
3
Authors: Yonghong Tian, Weiguo Sun, Bole Chen, Yuanyuan Jin, Cheng Lu. Chinese Physics B (SCIE, EI, CAS, CSCD), 2019, No. 10, pp. 1-9, 9 pages in total.
Cluster science, as a bridge linking atomic and molecular physics with condensed matter, has inspired the development of nanomaterials over the past decades, ranging from single-atom catalysis to ligand-protected noble metal clusters. The corresponding studies have not been restricted to the search for the geometrical structures of clusters, but have also promoted the development of cluster-assembled materials as building blocks. The CALYPSO cluster prediction method, combined with other computational techniques, has significantly stimulated the development of cluster-based nanomaterials. In this review, we summarize some good cases of cluster structures predicted by the CALYPSO method that have also been successfully identified by photoelectron spectra experiments. Beginning with the alkali-metal clusters, which serve as benchmarks, a series of studies are performed on size-dependent elemental clusters that possess relatively high stability and interesting chemical and physical properties. Special attention is paid to boron-based clusters because of their promising applications. The NbSi12 and BeB16 clusters, for example, are two classic representatives of the silicon- and boron-based clusters, which can be viewed as building blocks of nanotubes and borophene. This review offers a detailed description of the structural evolution and electronic properties of medium-sized pure and doped clusters, which will advance fundamental knowledge of cluster-based nanomaterials and provide valuable information for further theoretical and experimental studies.
Keywords: CALYPSO method; cluster structure prediction; boron cluster; silicon cluster
Introduction to AVS2 Scene Video Coding Techniques (Cited by: 1)
4
Authors: Jiaying Yan, Siwei Dong, Yonghong Tian, Tiejun Huang. ZTE Communications, 2016, No. 1, pp. 50-53, 4 pages in total.
The second generation Audio Video Coding Standard (AVS2) is the most recent video coding standard. By introducing several new coding techniques, AVS2 can provide more efficient compression for scene videos such as surveillance videos and conference videos. Because their scenes are limited, scene videos have great redundancy, especially in background regions. The new scene video coding techniques applied in AVS2 mainly focus on reducing this redundancy in order to achieve higher compression. This paper introduces several important AVS2 scene video coding techniques. Experimental results show that with scene video coding tools, AVS2 can save nearly 40% BD-rate (Bjøntegaard-Delta bit-rate) on scene videos.
Keywords: AVS2; scene video coding; background prediction
Parsing Objects at a Finer Granularity: A Survey
5
Authors: Yifan Zhao, Jia Li, Yonghong Tian. Machine Intelligence Research (EI, CSCD), 2024, No. 3, pp. 431-451, 21 pages in total.
Fine-grained visual parsing, including fine-grained part segmentation and fine-grained object recognition, has attracted considerable critical attention due to its importance in many real-world applications, e.g., agriculture, remote sensing, and space technologies. Predominant research efforts tackle these fine-grained sub-tasks following different paradigms, while the inherent relations between these tasks are neglected. Moreover, given that most of the research remains fragmented, we conduct an in-depth study of the advanced work from a new perspective of learning the part relationship. In this perspective, we first consolidate recent research and benchmark syntheses with new taxonomies. Based on this consolidation, we revisit the universal challenges in fine-grained part segmentation and recognition tasks and propose new solutions by part relationship learning for these important challenges. Furthermore, we conclude with several promising lines of research in fine-grained visual parsing for future research.
Keywords: Finer granularity; visual parsing; part segmentation; fine-grained object recognition; part relationship
Large-scale Multi-modal Pre-trained Models: A Comprehensive Survey (Cited by: 5)
6
Authors: Xiao Wang, Guangyao Chen, Guangwu Qian, Pengcheng Gao, Xiao-Yong Wei, Yaowei Wang, Yonghong Tian, Wen Gao. Machine Intelligence Research (EI, CSCD), 2023, No. 4, pp. 447-482, 36 pages in total.
With the urgent demand for generalized deep models, many pre-trained big models have been proposed, such as bidirectional encoder representations (BERT), vision transformer (ViT), and generative pre-trained transformers (GPT). Inspired by the success of these models in single domains (like computer vision and natural language processing), multi-modal pre-trained big models have also drawn more and more attention in recent years. In this work, we give a comprehensive survey of these models and hope this paper can provide new insights and help new researchers track the most cutting-edge work. Specifically, we first introduce the background of multi-modal pre-training by reviewing conventional deep learning and pre-training work in natural language processing, computer vision, and speech. Then, we introduce the task definition, key challenges, and advantages of multi-modal pre-training models (MM-PTMs), and discuss the MM-PTMs with a focus on data, objectives, network architectures, and knowledge-enhanced pre-training. After that, we introduce the downstream tasks used for the validation of large-scale MM-PTMs, including generative, classification, and regression tasks. We also give visualization and analysis of the model parameters and results on representative downstream tasks. Finally, we point out possible research directions for this topic that may benefit future work. In addition, we maintain a continuously updated paper list for large-scale pre-trained multi-modal big models: https://github.com/wangxiao5791509/MultiModal_BigModels_Survey.
Keywords: Multi-modal (MM); pre-trained model (PTM); information fusion; representation learning; deep learning
Digital Retina: A Key Link in the Evolution of Smart City Systems (Cited by: 20)
7
Authors: Wen Gao, Yonghong Tian, Jian Wang. Scientia Sinica Informationis (CSCD, PKU Core), 2018, No. 8, pp. 1076-1082, 7 pages in total.
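This paper presents the authors' main views on the construction and development of smart cities. (1) A key problem that smart city development urgently needs to solve is how to aggregate all kinds of urban big data in real time, especially image and video data from video surveillance networks, and how to build a cloud-computing-based "city brain" that analyzes and mines the value of this data in service of city operation and management. (2) Smart city construction today "has eyes and a brain", but the cameras that serve as the "eyes" are too limited in function, which leaves a "strong brain with weak eyes". The root cause is that the technical architecture of traditional surveillance camera networks was designed for storage rather than analysis. Although some recent smart cameras can recognize license plates or faces, solutions that emphasize only "edge computing" still cannot solve the problem of "unifying eye and brain". (3) To overcome the obstacles that currently hinder the rapid evolution of smart city system functions, we should draw on a property of the human visual system, which has evolved over hundreds of thousands of years: the human retina performs both image coding and feature coding. On this basis we should research and design a digital retina that has a unified timestamp and precise geographic location, jointly optimizes efficient video coding and compact feature representation, and effectively supports large-scale surveillance video analysis and fast visual search in the cloud. (4) To use the digital retina to build the "sharp eyes" of the smart city, we should actively plan and promote the related standards, chip and hardware implementations, supporting software development, and open-source software and hardware communities, and carry out large-scale testing and application.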
Keywords: smart city; city brain; digital retina
Perspective: Societally connected multimedia across cultures
8
Authors: Zhongfei ZHANG, Zhengyou ZHANG, Ramesh JAIN, Yueting ZHUANG, Noshir CONTRACTOR, Alexander G. HAUPTMANN, Alejandro (Alex) JAIMES, Wanqing LI, Alexander C. LOUI, Tao MEI, Nicu SEBE, Yonghong TIAN, Vincent S. TSENG, Qing WANG, Changsheng XU, Huimin YU, Shiwen YU. Journal of Zhejiang University-Science C (Computers and Electronics) (SCIE, EI), 2012, No. 12, pp. 875-880, 6 pages in total.
The advance of the Internet in the past decade has radically changed the way people communicate and collaborate with each other. Physical distance is no longer a barrier in online social networks, but cultural differences (at the individual, community, and societal levels) still govern human-human interactions and must be considered and leveraged in the online world. The rapid deployment of high-speed Internet allows humans to interact using a rich set of multimedia data such as texts, pictures, and videos. This position paper proposes to define a new research area called 'connected multimedia', which is the study of a collection of research issues of the super-area social media that receive little attention in the literature. By connected multimedia, we mean the study of the social and technical interactions among users, multimedia data, and devices across cultures, explicitly exploiting the cultural differences. We justify why it is necessary to bring attention to this new research area and what benefits it may bring to the broader scientific research community and to humanity.
Keywords: multimedia data; cross-cultural; connection; society; Internet; cultural differences; online world; interaction