Abstract: 3D sensing represents the main channel through which humans, or robotic agents, understand and interact with each other and with the real world. As such, many 3D acquisition technologies and devices have been developed and applied in emerging applications, such as autonomous systems, augmented reality, and digital production. A typical 3D visual system takes RGB and/or range images of an object or scene and generates 3D geometry.
Abstract: Following the success of the past eight years, Tsinghua University Press has continued to sponsor an annual award for the best papers published in Computational Visual Media. Eight papers published in Computational Visual Media in 2023 were recommended by Associate Editors as candidates for the Best Paper Award. Associate Editors-in-Chief Ming C. Lin and Ralph Martin led the Best Paper Award Committee to select the Best Paper. After careful deliberation, the following paper was chosen by unanimous consensus as the winner.
Funding: We acknowledge support by the DFG Center for Functional Nanostructures, the Helmholtz International Research School of Teratronics, the Karlsruhe School of Optics and Photonics, the EU-FP7 projects SOFI (grant 248609) and EURO-FOS (grant 224402), the BMBF joint project MISTRAL, which is funded by the German Ministry of Education and Research under grant 01BL0804, and the European Research Council (ERC Starting Grant 'EnTeraPIC', number 280145).
Abstract: Electro-optic modulation at frequencies of 100 GHz and beyond is important for photonic-electronic signal processing at the highest speeds. To date, however, only a small number of devices can operate up to this frequency. In this study, we demonstrate that this frequency range can be addressed by nanophotonic, silicon-based modulators. We exploit the ultrafast Pockels effect by using the silicon–organic hybrid (SOH) platform, which combines highly nonlinear organic molecules with silicon waveguides. Until now, the bandwidth of these devices was limited by the losses of the radiofrequency (RF) signal and the RC (resistor-capacitor) time constant of the silicon structure. The RF losses are overcome by using a device as short as 500 μm, and the RC time constant is decreased by using a highly conductive electron accumulation layer and an improved gate insulator. Using this method, we demonstrate for the first time an integrated silicon modulator with a 3 dB bandwidth at an operating frequency beyond 100 GHz. Our results clearly indicate that the RC time constant is not a fundamental speed limitation of SOH devices at these frequencies. Our device has a voltage–length product of only V_πL = 11 V mm, which compares favorably with the best silicon-photonic modulators available today. Using cladding materials with stronger nonlinearities, the voltage–length product is expected to improve by more than an order of magnitude.
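The two figures of merit in this abstract can be illustrated with a short calculation. The sketch below is not taken from the paper: the function names are hypothetical, and it assumes the RC limit behaves like a first-order low-pass filter with cutoff f = 1/(2πRC), which is a simplification. Only the 11 V mm voltage–length product and the 500 μm device length come from the abstract.

```python
import math


def half_wave_voltage(v_pi_l_v_mm: float, length_mm: float) -> float:
    """Half-wave voltage V_pi (in volts) implied by a voltage-length
    product (in V mm) and a device length (in mm)."""
    return v_pi_l_v_mm / length_mm


def rc_cutoff_hz(resistance_ohm: float, capacitance_f: float) -> float:
    """3 dB cutoff of a first-order RC low-pass: f = 1 / (2*pi*R*C).
    Assumption: the modulator's RC limit is modeled as a single pole."""
    return 1.0 / (2.0 * math.pi * resistance_ohm * capacitance_f)


# Values from the abstract: V_pi*L = 11 V mm, device length 500 um = 0.5 mm.
v_pi = half_wave_voltage(11.0, 0.5)
print(f"V_pi for a 0.5 mm device: {v_pi} V")

# Hypothetical R and C, chosen only to show the scaling: halving the
# resistance (e.g. with a more conductive layer) doubles the cutoff.
print(rc_cutoff_hz(100.0, 1e-14), rc_cutoff_hz(50.0, 1e-14))
```

Under this model, a shorter device trades a higher drive voltage for lower RF loss, and a lower series resistance raises the RC-limited cutoff in direct proportion.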
Funding: This work was supported in part by the National Key R&D Program of China (2018YFB1403901, 2019YFF0302902), NSF China (61902007), and the Joint NSFC-ISF Research Grant, China (62161146002).
Abstract: We present a multiscale deformed implicit surface network (MDISN) that reconstructs 3D objects from single images by adapting the implicit surface of the target object to the input image, from coarse to fine. The basic idea is to optimize the implicit surface according to the change between consecutive feature maps of the input image. With multi-resolution feature maps, the implicit field is refined progressively, such that lower resolutions outline the main object components and higher resolutions reveal fine-grained geometric details. To better exploit the changes in feature maps, we devise a simple field deformation module that receives two consecutive feature maps and refines the implicit field with finer geometric details. Experimental results on both synthetic and real-world datasets demonstrate the superiority of the proposed method over state-of-the-art methods.
Funding: This study was funded by the National Key Research & Development Plan of China (No. 2016YFB1001404) and the National Natural Science Foundation of China (No. 61602273).
Abstract: Understanding semantic similarity among images is at the core of a wide range of computer graphics and computer vision applications. However, the visual context of images is often ambiguous, as images can be perceived with emphasis on different attributes. In this paper, we present a method for learning the semantic visual similarity among images, inferring their latent attributes, and embedding them into multiple spaces, one for each latent attribute. We formulate the multi-embedding problem as an optimization function that evaluates the embedded distances with respect to qualitative crowdsourced clusterings. The key idea of our approach is to collect and embed qualitative pairwise tuples that share the same attributes in clusters. To ensure similarity attribute sharing among multiple measures, image classification clusters are presented to, and solved by, users. The collected image clusters are then converted into groups of tuples, which are fed into our group optimization algorithm that jointly infers the attribute similarity and the multi-attribute embedding. Our multi-attribute embedding allows retrieving similar objects in different attribute spaces. Experimental results show that our approach outperforms state-of-the-art multi-embedding approaches on various datasets, and demonstrate the use of the multi-attribute embedding in an image retrieval application.
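The step of converting collected clusters into tuples can be pictured with a small sketch. This is an illustration under assumptions, not the authors' algorithm: the function name `clusters_to_pairs` is hypothetical, images are represented by integer IDs, and treating cross-cluster pairs as "dissimilar" is an assumption that the abstract does not state.

```python
from itertools import combinations


def clusters_to_pairs(clusters):
    """Turn one user's clustering (a list of lists of image IDs) into
    pairwise tuples.

    Returns (similar, dissimilar):
      similar    - pairs of images placed in the same cluster
      dissimilar - pairs of images placed in different clusters
                   (assumption: not stated in the abstract)
    """
    similar, dissimilar = [], []
    # Every pair inside one cluster shares the attribute the user grouped by.
    for cluster in clusters:
        similar.extend(combinations(sorted(cluster), 2))
    # Every pair spanning two clusters is treated as differing in that attribute.
    for a, b in combinations(clusters, 2):
        dissimilar.extend((min(x, y), max(x, y)) for x in a for y in b)
    return similar, dissimilar


similar, dissimilar = clusters_to_pairs([[0, 1, 2], [3]])
print(similar)     # [(0, 1), (0, 2), (1, 2)]
print(dissimilar)  # [(0, 3), (1, 3), (2, 3)]
```

Tuples from one clustering form one group, so a downstream optimizer could assign the whole group to a single attribute space; how the paper's group optimization does this is not described in the abstract.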
Funding: This work was supported by the National Basic Research Program of China (973 Program) (2015CB352501, 2015CB352500).
Abstract: Vast amounts of data are produced with the development of smart cities and urban computing technologies. The data is often captured from multiple sensors, with heterogeneous structures and highly decentralized connections. Integrated data representation and smart computational models are required for more complex tasks in urban computing. We focus on two fundamental questions. Can we provide an integrated data representation for the whole cyber–physical–social system? And can we provide an integrated framework to choose the appropriate data for understanding a specific urban event? A holography data representation and the quasi-holography computational model have been proposed to address these problems. We describe case studies using the quasi-holography computational model, and discuss further problems to be solved regarding our model.