Funding: This work was supported by the National Natural Science Foundation of China (Nos. 61831015 and 61901260) and the Key Research and Development Program of China (No. 2019YFB1405902).
Abstract: Objective image quality assessment (IQA), which automatically and efficiently predicts the perceived quality of images, plays an important role in various visual communication systems. The human eye is the ultimate evaluator of visual experience, so modeling the human visual system (HVS) is a core issue for objective IQA and visual experience optimization. Traditional models based on black-box fitting have low interpretability and can hardly guide experience optimization effectively, while models based on physiological simulation are hard to integrate into practical visual communication services because of their high computational complexity. To bridge the gap between signal distortion and visual experience, this paper proposes a novel perceptual no-reference (NR) IQA algorithm based on structural computational modeling of the HVS. Following the mechanism of the human brain, visual signal processing is divided into a low-level visual layer, a middle-level visual layer and a high-level visual layer, which perform pixel information processing, primitive information processing and global image information processing, respectively. Natural scene statistics (NSS) based features, deep features and free-energy based features are extracted from these three layers. Support vector regression (SVR) is employed to aggregate the features into the final quality prediction. Extensive experimental comparisons on three widely used benchmark IQA databases (LIVE, CSIQ and TID2013) demonstrate that the proposed metric is highly competitive with or outperforms state-of-the-art NR IQA measures.
Funding: Supported by the Tianjin Municipal Natural Science Foundation of China (Grant No. 19JCJQJC61600) and the Hebei Provincial Natural Science Foundation of China (Grant Nos. F2020202051, F2020202053).
Abstract: Visual odometry is critical in visual simultaneous localization and mapping for robot navigation. However, the pose estimation performance of most current visual odometry algorithms degrades in scenes with unevenly distributed features, because dense features receive excessive weight. Herein, a new human visual attention mechanism for point-and-line stereo visual odometry, called point-line-weight-mechanism visual odometry (PLWM-VO), is proposed to describe scene features in a global and balanced manner. A weight-adaptive model based on region partition and region growth is generated for the human visual attention mechanism, in which sufficient attention is assigned to position-distinctive objects (sparse features in the environment). Furthermore, the sum of absolute differences algorithm is used to improve the accuracy of initialization for line features. Compared with the state-of-the-art method (ORB-VO), PLWM-VO shows a 36.79% reduction in the absolute trajectory error on the KITTI and EuRoC datasets. Although PLWM-VO takes more time than ORB-VO, online test results indicate that it satisfies the real-time demand. The proposed algorithm not only significantly improves the environmental adaptability of visual odometry, but also quantitatively demonstrates the superiority of the human visual attention mechanism.
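The sum of absolute differences (SAD) mentioned in this abstract is a standard patch-matching cost. The abstract does not say how PLWM-VO applies it to line initialization, so the following is only a minimal sketch of the cost itself, used in a one-dimensional horizontal search of the kind common in stereo matching. The function names `sad` and `best_match_offset` and the toy data are hypothetical.

```python
def sad(patch_a, patch_b):
    """Sum of absolute differences between two equally sized 2-D patches."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(patch_a, patch_b)
        for a, b in zip(row_a, row_b)
    )


def best_match_offset(template, strip):
    """Slide `template` horizontally across `strip`.

    Returns (offset, cost) for the column offset with the lowest SAD.
    Assumes `strip` has at least as many rows and columns as `template`.
    """
    height = len(template)
    width = len(template[0])
    best = None
    for offset in range(len(strip[0]) - width + 1):
        # Cut a window of the template's size out of the search strip.
        window = [row[offset:offset + width] for row in strip[:height]]
        cost = sad(template, window)
        if best is None or cost < best[1]:
            best = (offset, cost)
    return best


# Toy example: the template appears at column 2 of the strip.
template = [[10, 200], [10, 200]]
strip = [[0, 0, 10, 200, 0], [0, 0, 10, 200, 0]]
offset, cost = best_match_offset(template, strip)
```

A lower SAD means a closer match; a cost of zero means the window is identical to the template.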
Abstract: Vision-simulated imagery, the process of generating images that mimic the human visual system, is a valuable tool with a wide spectrum of possible applications, including visual acuity measurements, personalized planning of corrective lenses and surgeries, vision-correcting displays, vision-related hardware development, and extended reality discomfort reduction. A critical property of human vision is that it is imperfect because of highly influential wavefront aberrations that vary from person to person. This study provides an overview of the existing computational image generation techniques that properly simulate human vision in the presence of wavefront aberrations. These algorithms typically either apply ray tracing with a detailed description of the simulated eye or use the point-spread function of the eye to perform convolution on the input image. Based on the description of the vision simulation techniques, several of their characteristic features are evaluated, and some potential application areas and research directions are outlined.
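The convolution-based approach named in this abstract can be illustrated with a small sketch. It assumes a normalized Gaussian kernel as a stand-in for the eye's point-spread function (PSF); a real PSF would be derived from the measured wavefront aberrations of a particular eye, which the abstract does not detail. The names `gaussian_psf` and `convolve` are hypothetical, and borders are zero-padded for simplicity.

```python
import math


def gaussian_psf(radius, sigma):
    """Normalized (2*radius+1)^2 Gaussian kernel, a stand-in for a real eye PSF."""
    size = 2 * radius + 1
    kernel = [
        [
            math.exp(-((x - radius) ** 2 + (y - radius) ** 2) / (2 * sigma ** 2))
            for x in range(size)
        ]
        for y in range(size)
    ]
    total = sum(map(sum, kernel))
    return [[v / total for v in row] for row in kernel]


def convolve(image, psf):
    """Same-size 2-D convolution of `image` with `psf`, zero-padded at the borders."""
    h, w = len(image), len(image[0])
    r = len(psf) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky in range(-r, r + 1):
                for kx in range(-r, r + 1):
                    yy, xx = y - ky, x - kx
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += image[yy][xx] * psf[ky + r][kx + r]
            out[y][x] = acc
    return out


# A single bright pixel (point source) is spread into a copy of the PSF.
image = [[0.0] * 5 for _ in range(5)]
image[2][2] = 1.0
psf = gaussian_psf(radius=1, sigma=1.0)
blurred = convolve(image, psf)
```

Because the kernel sums to one, the total intensity of the image is preserved away from the borders; blurring a point source reproduces the PSF itself, which is what the name "point-spread function" refers to.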
Funding: Supported by the National Natural Science Foundation of China (61472324).
Abstract: To overcome the shortcomings of the Lee image enhancement algorithm and its improvement based on the logarithmic image processing (LIP) model, this paper proposes what we believe to be an effective image enhancement algorithm. The algorithm introduces fuzzy entropy and makes full use of neighborhood information, fuzzy information and human visual characteristics. To enhance an image, the paper first carries out a fuzzy 3-partition of its histogram into a dark region, an intermediate region and a bright region. It then extracts the statistical characteristics of the three regions and adaptively selects the parameter α according to the statistical characteristics of the image's gray-scale values. It also adds a nonlinear transform, which increases the generality of the algorithm. Finally, the causes of the gray-scale value overcorrection that occurs in traditional image enhancement algorithms are analyzed and solutions are proposed. The simulation results show that the proposed algorithm can effectively suppress image noise, enhance contrast and visual effect, sharpen edges and adjust the dynamic range.
Funding: Supported by the Innovation Fund of China (00C26224210641).
Abstract: A Robust Adaptive Video Encoder (RAVE) based on a human visual model is proposed. The encoder combines the best features of Fine Granularity Scalable (FGS) coding, frame-dropping coding, video redundancy coding, and the human visual model. According to the packet loss and available bandwidth of the network, the encoder adjusts the output bit rate by jointly adapting the quantization step size as directed by the human visual model, applying rate shaping, and periodically inserting key frames. The proposed encoder is implemented on top of an MPEG-4 encoder and is compared with a conventional FGS algorithm. The comparison shows that RAVE is an efficient, robust video encoder that provides improved visual quality at the receiver while consuming equal or fewer network resources. The results are confirmed by subjective tests and simulation tests.
Abstract: The key to wavelet-based denoising techniques is how to manipulate the wavelet coefficients. Borrowing the idea of the Inclusive-OR from circuit design, this paper proposes a new algorithm called the wavelet domain Inclusive-OR denoising algorithm (WDIDA), which decides whether a wavelet coefficient belongs to the image or to noise by considering its phase and modulus maxima simultaneously. With this algorithm, the denoising effect is improved and the computation time is reduced. Furthermore, in order to enhance the edges of the image without magnifying noise, a nonlinear contrast enhancement algorithm is presented according to human visual properties. Compared with traditional enhancement algorithms, the proposed algorithm reduces noise better while preserving edges and improving the visual quality of images.
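Abstract: Infrared small-target detection has long been a key technology in infrared tracking systems. Existing infrared small-target detection methods are prone to false alarms and detect slowly against complex backgrounds. To address these shortcomings, this paper takes the perspective of the human visual system and, with reference to the multiscale local contrast measure using a local energy factor (MLCM-LEF), proposes an infrared small-target detection method based on a double-layer local energy factor. Targets are detected from two angles, local energy difference and local brightness difference: the double-layer local energy factor describes the dissimilarity between a small target and the background in terms of energy, while a weighted brightness-difference factor detects targets in terms of brightness. The two results are fused by a two-dimensional Gaussian, and adaptive threshold segmentation based on the image mean and standard deviation finally extracts the infrared small targets. Experiments on public datasets show that the method improves on mainstream detection methods in suppressing background noise and reducing the false-alarm probability, and that, compared with the MLCM-LEF algorithm, it reduces the single-frame detection time to one third.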
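The final segmentation step described in this abstract uses the image mean and standard deviation to set an adaptive threshold. A minimal sketch follows, assuming the commonly used form T = mean + k * std applied to the fused saliency map. The abstract does not give the exact formula or the value of k, so both are assumptions here, and the name `adaptive_threshold` is hypothetical.

```python
import statistics


def adaptive_threshold(saliency_map, k=3.0):
    """Segment a 2-D saliency map with threshold T = mean + k * std.

    The form of T and the default k are assumptions for illustration.
    Returns (threshold, mask), where mask[y][x] is True for candidate targets.
    """
    values = [v for row in saliency_map for v in row]
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation of the map
    threshold = mean + k * std
    mask = [[v > threshold for v in row] for row in saliency_map]
    return threshold, mask


# Toy saliency map: flat background with one strong response at the centre.
saliency = [[0.0] * 5 for _ in range(5)]
saliency[2][2] = 100.0
threshold, mask = adaptive_threshold(saliency, k=3.0)
```

Because the threshold follows the statistics of each frame, a larger k suppresses more background clutter at the cost of missing weaker targets.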
Funding: Supported by the National Natural Science Foundation of China under Grant No.610700800973 Sub-Program Projects under Grant No.2009CB320906; National Science and Technology of Major Special Projects under Grant No. 2010ZX03004-003; S&T Planning Project of Hubei Provincial Department of Education under Grant No. Q20112805; H&S Planning Project of Hubei Provincial Department of Education under Grant No. 2011jyte142; Science Foundation of Hubei Provincial under Grant No. 2010CDB05103.
Abstract: To further improve the efficiency of video compression, we introduce the perceptual characteristics of the Human Visual System (HVS) into video coding and propose a novel rate control algorithm for H.264/AVC based on a human visual saliency model. First, we modify Itti's saliency model. Second, the target bits of each frame are allocated according to the correlation of the salient regions in the current and previous frames, and the complexity of each macroblock (MB) is adjusted using its saliency value and its Mean Absolute Difference (MAD) value. Finally, the algorithm is implemented in JVT JM12.2. Simulation results show that, compared with the traditional rate control algorithm, the proposed one reduces the coding bit rate and improves the subjective quality of the reconstructed video, especially in visually salient regions. It is well suited to wireless video transmission.
Funding: Supported by the National Science and Technology Innovation 2030 Major Program (2022ZD0204802, 2022ZD0204804), the National Natural Science Foundation of China (31930053, 32171039) and the Beijing Academy of Artificial Intelligence (BAAI).
Abstract: The concept of the receptive field (RF) is central to sensory neuroscience. Neuronal RF properties have been studied extensively in animals, while those in humans remain nearly unexplored. Here, we measured neuronal RFs with intracranial local field potentials (LFPs) and spiking activity in human visual cortex (V1/V2/V3). We recorded LFPs via macro-contacts and discovered that RF sizes estimated from low-frequency activity (LFA, 0.5–30 Hz) were larger than those estimated from low-gamma activity (LGA, 30–60 Hz) and high-gamma activity (HGA, 60–150 Hz). We then took a rare opportunity to record LFPs and spiking activity simultaneously via microwires in V1. We found that RF sizes and temporal profiles measured from LGA and HGA closely matched those from spiking activity. In sum, this study reveals that the spiking activity of neurons in human visual cortex can be well approximated by LGA and HGA in RF estimation and temporal profile measurement, implying pivotal functions of LGA and HGA in early visual information processing.
Funding: Partially supported by the Research Grants Council of the Hong Kong SAR, China (Project CUHK 415712) and the Ministry of Education Academic Research Fund (AcRF) Tier 2 in Singapore under Grant No. T208B1218.
Abstract: Quality assessment is essential for testing, optimizing, benchmarking, monitoring, and inspecting related systems and services, and it also plays an essential role in the design of virtually all visual signal processing and communication algorithms, as well as various related decision-making processes. In this paper, we first provide an overview of recently derived quality assessment approaches for traditional visual signals (i.e., 2D images/videos), highlighting new trends such as machine learning approaches. Meanwhile, with the ongoing development of devices and multimedia services, newly emerged visual signals (e.g., mobile/3D videos) are becoming more and more popular. This work therefore focuses on recent progress in quality metrics for these newly emerged forms of visual signals, which include scalable and mobile videos, High Dynamic Range (HDR) images, image segmentation results, 3D images/videos, and retargeted images.
Funding: Supported by the National Natural Science Foundation of China under Grants No. 61773094, No. 61573080, No. 91420105, and No. 61375115; the National Program on Key Basic Research Project (973 Program) under Grant No. 2013CB329401; the National High-Tech R&D Program of China (863 Program) under Grant No. 2015AA020505; and the Sichuan Province Science and Technology Project under Grants No. 2015SZ0141 and No. 2018ZA0138.
Abstract: Recent no-reference image quality assessment (NR-IQA) methods usually learn to evaluate image quality by regressing on human subjective scores of the training samples. This study presents an NR-IQA method based on basic image visual parameters that does not use human-scored image databases in learning. We demonstrate that these features comprise the most basic characteristics for constructing an image and influencing its visual quality. The definitions, computational methods, and relationships among these visual metrics are described. We then propose a no-reference assessment function, referred to as the visual parameter measurement index (VPMI), which integrates these visual metrics to assess image quality. It is established that the maximum of VPMI corresponds to the best quality of the color image. We verified this method on the popular LIVE image quality assessment database, and the results indicate that the proposed method matches the subjective assessment of human vision better. Compared with other image quality assessment models, it is highly competitive. VPMI has low computational complexity, which makes it promising for real-time image assessment systems.
Funding: Supported by the Public Projects of Zhejiang Province, China (Nos. 2016C33110, 2015C33067) and the National Natural Science Foundations of China (Nos. 61602141, 61473108, 61402141).
Abstract: Bicycle sharing systems have emerged as a new mode of transportation in many big cities over the past decade. Because the large number of bicycle stations are widely distributed across a city, it is difficult to identify their unique attributes and characteristics directly. Using a real bicycle hire dataset from Hangzhou, China, a clustering analysis of the bicycle stations based on their temporal flow data was first carried out. Then, based on the spatial distribution and temporal attributes of the resulting clusters, visual diagrams and maps were used to analyze bicycle hire behavior in relation to station category and to study the travel patterns of citizens. The experimental results demonstrate the relation between human mobility, the time of day, the day of the week and the station location.