Funding: Supported by the National Natural Science Foundation of China (Nos. 61572316, 61133009); the National High-tech R&D Program of China (863 Program) (Grant No. 2015AA015904); the Science and Technology Commission of Shanghai Municipality Program (No. 13511505000); the Interdisciplinary Program of Shanghai Jiao Tong University (No. 14JCY10); a grant from the Research Grants Council of Hong Kong (Project No. 28200215); and a grant from The Education University of Hong Kong (Project No. FLASS/DRF/ECR-7).
Abstract: In the software development process, the last step is usually the graphical user interface (GUI) test, which is part of the final user experience (UE) test. Existing GUI test tools, such as Abbot Java GUI Test Framework and Pounder, have testers pre-configure in a script all desired actions and instructions for the computer. These tools require the GUI environment to remain largely invariant and need reconfiguration whenever the GUI changes, so testing is still done mostly manually and is hard for non-programmer testers. Consequently, we propose GUI testing by image recognition to automate this last step. We build on existing algorithms such as SIFT and Random Fern to develop a new algorithm scheme that retains the most efficient features of each algorithm and discards the inefficient parts. The computer then applies the algorithm to search for target patterns itself and automatically performs the subsequent mouse, keyboard, and screen I/O actions that would otherwise be done manually, testing the GUI without any manual instructions. Test results show that the proposed approach can substantially accelerate GUI testing compared with current benchmarks.
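The search-then-act loop described in this abstract can be illustrated with plain template matching. This is only a minimal sketch under stated assumptions, not the authors' SIFT/Random Fern scheme: it swaps in an exhaustive sum-of-absolute-differences search over a small grayscale grid, and the names `find_template` and `click_point` are hypothetical.

```python
def find_template(screen, template):
    """Return ((row, col), cost) of the best match of `template` in `screen`.

    Both arguments are lists of lists of grayscale ints. The cost is the
    sum of absolute differences (SAD); 0 means a pixel-exact match.
    """
    sh, sw = len(screen), len(screen[0])
    th, tw = len(template), len(template[0])
    best_pos, best_cost = None, None
    for r in range(sh - th + 1):
        for c in range(sw - tw + 1):
            cost = sum(
                abs(screen[r + i][c + j] - template[i][j])
                for i in range(th)
                for j in range(tw)
            )
            if best_cost is None or cost < best_cost:
                best_pos, best_cost = (r, c), cost
    return best_pos, best_cost


def click_point(top_left, template):
    """Centre of the matched region, where a test driver would send a click."""
    r, c = top_left
    return (r + len(template) // 2, c + len(template[0]) // 2)


# Toy 5x6 "screenshot" with a bright 2x2 "button" at row 2, column 3.
screen = [[0] * 6 for _ in range(5)]
for i in range(2):
    for j in range(2):
        screen[2 + i][3 + j] = 255
button = [[255, 255], [255, 255]]

position, cost = find_template(screen, button)
target = click_point(position, button)
```

A real test driver would capture the screen, locate each target pattern this way, and then send the mouse or keyboard event to `target`. Feature-based matchers such as SIFT are typically preferred over raw pixel comparison because they tolerate changes in scale and appearance.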
Funding: Supported by the National Natural Science Foundation of China under Grants 61872241, 62077037 and 62077037, and by the Shanghai Municipal Science and Technology Major Project under Grant 2021SHZDZX0102.
Abstract: Background: Monocular depth estimation aims to predict a dense depth map from a single RGB image and has important applications in 3D reconstruction, autonomous driving, and augmented reality. However, existing methods feed the original RGB image directly into the model to extract depth features, without suppressing the interference of depth-irrelevant information, which degrades depth-estimation accuracy. Methods: To remove the influence of depth-irrelevant information and improve depth-prediction accuracy, we propose RADepthNet, a novel reflectance-guided network that fuses boundary features. Specifically, our method predicts depth maps in three steps. (1) Intrinsic image decomposition: we propose a reflectance extraction module with an encoder-decoder structure to extract the depth-related reflectance. Through an ablation study, we demonstrate that the module can reduce the influence of illumination on depth estimation. (2) Boundary detection: a boundary extraction module, consisting of an encoder, a refinement block, and an upsample block, is proposed to better predict the depth at object boundaries using gradient constraints. (3) Depth prediction: we use an encoder different from the one in (2) to obtain depth features from the reflectance map and fuse them with the boundary features to predict depth. In addition, we propose FIFADataset, a depth-estimation dataset for soccer scenarios. Results: Extensive experiments on a public dataset and our proposed FIFADataset show that our method achieves state-of-the-art performance.
Funding: Supported by the National Natural Science Foundation of China under Grants 61872241, 62077037 and 62272298, and in part by the Shanghai Municipal Science and Technology Major Project under Grant 2021SHZDZX0102.
Abstract: Background: Exploring correspondences across multiview images is the basis of various computer vision tasks. However, most existing methods have limited accuracy under challenging conditions. Method: To learn more robust and accurate correspondences, we propose DSD-MatchingNet for local feature matching. First, we develop a deformable feature extraction module to obtain multilevel feature maps, which harvest contextual information from dynamic receptive fields. The dynamic receptive fields provided by the deformable convolution network ensure that our method obtains dense and robust correspondences. Second, we use sparse-to-dense matching with symmetry of correspondence to implement accurate pixel-level matching, which enables our method to produce more accurate correspondences. Result: Experiments show that the proposed DSD-MatchingNet achieves better performance on the image matching benchmark as well as on the visual localization benchmark. Specifically, our method achieves 91.3% mean matching accuracy on the HPatches dataset and 99.3% visual localization recall on the Aachen Day-Night dataset.
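One common way to enforce symmetry of correspondence is a mutual nearest-neighbour check, sketched below. This is an illustrative assumption about what the symmetry constraint could look like, not DSD-MatchingNet itself: descriptors are plain lists of floats, and `nearest` and `mutual_matches` are hypothetical helper names.

```python
def nearest(query, candidates):
    """Index of the candidate descriptor closest to `query` (squared L2)."""
    dists = [
        sum((q - c) ** 2 for q, c in zip(query, cand))
        for cand in candidates
    ]
    return min(range(len(candidates)), key=dists.__getitem__)


def mutual_matches(desc_a, desc_b):
    """Keep (i, j) only if j is i's nearest neighbour in B and i is j's in A."""
    a_to_b = [nearest(d, desc_b) for d in desc_a]
    b_to_a = [nearest(d, desc_a) for d in desc_b]
    return [(i, j) for i, j in enumerate(a_to_b) if b_to_a[j] == i]


# Toy 2-D descriptors for two images.
desc_a = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [4.0, 4.0]]
desc_b = [[0.9, 0.1], [0.1, 0.0], [9.0, 9.0]]

matches = mutual_matches(desc_a, desc_b)
```

Here the fourth descriptor of image A picks index 0 in image B as its nearest neighbour, but that descriptor prefers index 1 in A, so the asymmetric pair is rejected and only the three mutually consistent matches remain.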
Abstract: Imperfect material effects are among the most important factors in obtaining photo-realistic rendering results. Textile material rendering has always been a key area of computer graphics. So far, a great deal of effort has been invested in the unique appearance of textiles and in their physics-based simulation, but the appearance of the dyeing effects commonly found in textiles has received little attention. This paper introduces techniques for simulating staining effects on textiles. Pulling, wearing, squeezing, tearing, and breaking are common imperfections of fabrics; these external forces change the fabric structure and thus affect how stains diffuse. Based on the microstructure of yarn, we handle the effect of stains on the imperfect textile surface. Our simulation results achieve a photo-realistic effect.
Abstract: Fluid and solid simulation aims to generate realistic simulations of fluids and solids, in particular fluids such as water and smoke, by computing the Euler or Navier-Stokes equations that govern real fluid physics. Fluid simulation is an important field because of its wide applications in many industries, such as film and game production, weather forecasting, natural disaster simulation and protection, and maritime and aviation simulation. There are two main categories of fluid simulation methods: data-driven methods and physically based methods. Data-driven models establish a direct mapping between variables and extract their relationships from historically measured data using algorithms developed in statistics, computational intelligence, machine learning, and data mining.
Funding: National High Technology R&D Program of China (2017YFB1002701, M2019YFB1600702); NSFC (62072449); Science and Technology Development Fund, Macao SAR (0018/2019/AKP, 0008/2019/AGJ, SKL-IOTSC-2018-2020); University of Macao Grant (MYRG2019-00006-FST).
Abstract: Background: The interaction of gas and liquid can produce many interesting phenomena, such as bubbles rising from the bottom of a liquid. The simulation of two-phase fluids is a challenging topic in computer graphics. To animate the interaction of a gas and a liquid, MultiFLIP samples the two types of particles and uses an Euler grid to track the liquid-gas interface. However, MultiFLIP uses the fluid implicit particle (FLIP) method to interpolate particle velocities onto the Euler grid, which suffers from additional noise and instability. Methods: To solve the problems caused by FLIP, we present a novel velocity transport technique for the two types of particles based on the affine particle-in-cell (APIC) method. First, we design a weighted coupling method for interpolating the velocities of liquid and gas particles onto the Euler grid so that the APIC method can be applied to the simulation of a two-phase fluid. Second, we introduce a narrowband method into our system because MultiFLIP is time-consuming owing to its large number of particles. Results: Experiments show that our method integrates well with the APIC method and provides visually credible two-phase fluid animation. Conclusions: The proposed method can successfully handle the simulation of a two-phase fluid.
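The APIC particle-to-grid transfer that this abstract builds on can be sketched in one dimension. The code below is a minimal sketch under loud assumptions, not the paper's method: it uses linear interpolation weights, treats the coupling of liquid and gas as a plain mass-weighted average on a shared grid, and the particle fields `x`, `v`, `c`, `m` and the function names are hypothetical.

```python
def linear_weights(x, dx, n):
    """Linear (hat-function) weights of a particle at x on grid nodes 0..n-1."""
    i = min(int(x / dx), n - 2)   # index of the node to the left of the particle
    t = x / dx - i                # fractional offset inside the cell
    return [(i, 1.0 - t), (i + 1, t)]


def p2g_apic(particles, dx, n):
    """Scatter particle momentum to the grid with the APIC affine term.

    Each particle carries a velocity v and an affine coefficient c (the 1-D
    analogue of the APIC matrix), so it contributes v + c * (x_i - x_p) to
    node i. Liquid and gas share the grid; mass weighting is an assumption.
    """
    mass = [0.0] * n
    momentum = [0.0] * n
    for p in particles:
        for i, w in linear_weights(p["x"], dx, n):
            x_i = i * dx
            mass[i] += w * p["m"]
            momentum[i] += w * p["m"] * (p["v"] + p["c"] * (x_i - p["x"]))
    velocity = [momentum[i] / mass[i] if mass[i] > 0.0 else 0.0 for i in range(n)]
    return velocity, mass, momentum


# A heavy liquid particle and a light gas particle sampling v(x) = 1 + 2x.
particles = [
    {"x": 0.25, "v": 1.5, "c": 2.0, "m": 1000.0},  # liquid
    {"x": 0.75, "v": 2.5, "c": 2.0, "m": 1.0},     # gas
]
grid_velocity, grid_mass, grid_momentum = p2g_apic(particles, dx=1.0, n=4)
```

Because both particles sample the same affine velocity field, the affine term lets the grid recover that field exactly at the nodes that receive mass, regardless of the large density ratio. The transfer also conserves total momentum, since the linear weights reproduce each particle's position.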
Funding: Partially supported by the National Natural Science Foundation of China (Grant Nos. 61802197, 62072449, and 61632003); the Science and Technology Development Fund, Macao SAR (Grant Nos. 0018/2019/AKP and SKL-IOTSC(UM)-2021-2023); the Guangdong Science and Technology Department (Grant No. 2020B1515130001); and the University of Macao (Grant Nos. MYRG2020-00253-FST and MYRG2022-00059-FST).
Abstract: We present a lightweight and efficient semi-supervised video object segmentation network based on the space-time memory framework. To some extent, our method addresses two difficulties encountered in traditional video object segmentation: the per-frame computation time is too long, and the segmentation of the current frame should use more information from past frames. The algorithm uses a global context (GC) module to achieve high-performance, real-time segmentation. The GC module can effectively integrate multi-frame image information without increasing memory use and can process each frame in real time. Moreover, the predicted mask of the previous frame is helpful for segmenting the current frame, so we input it into a spatial constraint module (SCM), which constrains the areas of segments in the current frame. The SCM effectively alleviates mismatching of similar targets yet consumes few additional resources. We also add a refinement module to the decoder to improve boundary segmentation. Our model achieves state-of-the-art results on various datasets, scoring 80.1% on YouTube-VOS 2018 and a J&F score of 78.0% on DAVIS 2017, while taking 0.05 s per frame on the DAVIS 2016 validation dataset.
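The J part of the J&F score reported in this abstract is the region similarity, that is, the intersection over union (Jaccard index) between a predicted mask and the ground-truth mask. The following is a minimal sketch of that measure on toy binary masks; the function name `jaccard` and the mask values are illustrative, and the boundary measure F is not covered.

```python
def jaccard(pred, truth):
    """Region similarity J: intersection over union of two binary masks."""
    inter = 0
    union = 0
    for row_p, row_t in zip(pred, truth):
        for p, t in zip(row_p, row_t):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Two empty masks agree perfectly by convention.
    return 1.0 if union == 0 else inter / union


# Toy 3x3 masks: 3 pixels overlap, 5 pixels are covered by either mask.
pred = [
    [0, 1, 1],
    [0, 1, 1],
    [0, 0, 0],
]
truth = [
    [0, 1, 1],
    [0, 1, 0],
    [0, 1, 0],
]
j = jaccard(pred, truth)
```

A dataset-level score would average this value over all annotated frames and objects.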
Funding: Supported by NSFC (National Natural Science Foundation of China, No. 61272326); the research grant of University of Macao (No. MYRG202(Y1L4)-FST11-WEH); and the research grant of University of Macao (No. MYRG2014-00139-FST).
Abstract: Interactive image segmentation aims to classify image pixels into foreground and background classes given some foreground and background markers. In this paper, we propose a novel framework for interactive image segmentation that builds on the graph-based manifold ranking model, a graph-based semi-supervised learning technique that can learn very smooth functions with respect to the intrinsic structure revealed by the input data. The final segmentation results are improved by overcoming two core problems of graph construction in traditional models: graph structure and graph edge weights. The user-provided scribbles are treated as must-link and must-not-link constraints. We then model the graph as an approximately k-regular sparse graph by integrating these constraints and our extended neighboring spatial relationships into the graph structure. A locally adaptive kernel parameter driven by content and labels is proposed to address the shortcoming of previous models, which usually employ a single global kernel parameter. After the graph is constructed, a novel three-stage strategy produces the final segmentation results. Owing to the sparsity and extended neighboring relationships of the constructed graph and the use of superpixels, our model provides nearly real-time segmentations that are insensitive to the user's scribbles, which are two core demands of interactive image segmentation. Finally, our framework is easy to extend to multi-label segmentation, and for some less complicated scenarios it can even extract the object from a single-line interaction. Experimental results and comparisons with other state-of-the-art methods demonstrate that our framework can efficiently and accurately extract foreground objects from the background.
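The manifold ranking model underlying this framework can be sketched with its standard iteration f ← αSf + y, where S is the symmetrically normalized affinity matrix and y holds the seed labels. The code below is a minimal sketch under stated assumptions: a five-node chain stands in for a superpixel graph, the edge weights are hand-picked rather than produced by the paper's adaptive kernel, and the constraints and three-stage strategy are not modeled.

```python
import math


def manifold_ranking(weights, seeds, alpha=0.9, iterations=200):
    """Iterate f <- alpha * S f + y with S = D^{-1/2} W D^{-1/2}.

    `weights` is a symmetric affinity matrix in which every node has at
    least one edge; `seeds` holds +1 for foreground scribbles, -1 for
    background scribbles, and 0 for unlabeled nodes.
    """
    n = len(weights)
    degree = [sum(row) for row in weights]
    s = [
        [
            weights[i][j] / math.sqrt(degree[i] * degree[j]) if weights[i][j] else 0.0
            for j in range(n)
        ]
        for i in range(n)
    ]
    f = list(seeds)
    for _ in range(iterations):
        f = [
            alpha * sum(s[i][j] * f[j] for j in range(n)) + seeds[i]
            for i in range(n)
        ]
    return f


# Chain of 5 superpixels; the weak 2-3 edge mimics an object boundary.
W = [
    [0.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.1, 0.0],
    [0.0, 0.0, 0.1, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0],
]
y = [1.0, 0.0, 0.0, 0.0, -1.0]   # node 0: foreground seed, node 4: background seed

scores = manifold_ranking(W, y)
foreground = [score > 0 for score in scores]
```

With these weights the foreground label spreads across the strong edges to nodes 1 and 2, and the weak edge keeps nodes 3 and 4 on the background side.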