Funding: Supported by the National Natural Science Foundation of China (No. 90707002) and the Natural Science Foundation of Zhejiang Province, China (No. Z104441)
Abstract: This paper presents a multi-mode control scheme for a soft-switched flyback converter that achieves high efficiency and excellent load regulation over the entire load range. At heavy load, critical conduction mode with valley switching (CCMVS) is employed to realize soft switching, reducing the turn-on loss of the power switch as well as conducted electromagnetic interference (EMI). At light load, the converter operates in discontinuous conduction mode (DCM) with valley switching and adaptive off-time (AOT) control to limit the switching frequency range and maintain load regulation. At extremely light load or in standby mode, burst mode operation is adopted to lower power consumption by reducing both the switching frequency and the static power dissipation of the controller. The multi-mode control is implemented by an oscillator whose pulse duration is adjusted by output feedback. An accurate valley switching control circuit guarantees the minimum turn-on voltage drop of the power switch. A prototype of the controller IC was fabricated in a 1.5-μm BiCMOS process and applied to a 310 V/20 V, 90 W flyback DC/DC converter. Experimental results showed that all expected functions were realized successfully. The flyback converter achieved an efficiency of over 80% from full load down to 2.5 W, with a maximum of 88.8%, while the total power consumption in standby mode was about 300 mW.
Funding: Project (No. 2005AA1Z1271) supported by the Hi-Tech Research and Development Program (863) of China
Abstract: To efficiently exploit the performance of single instruction multiple data (SIMD) architectures for video coding, a parallel memory architecture with a power-of-two number of memory modules is proposed. It employs two novel skewing schemes to provide conflict-free access to adjacent elements (8-bit and 16-bit data types) or to elements at power-of-two intervals in both the horizontal and vertical directions, which was not possible in previous parallel memory architectures. Area and delay estimates are given for 4, 8, and 16 memory modules. In a 0.18-μm CMOS technology, synthesis results show that the proposed system can achieve a 230 MHz clock frequency with 16 memory modules at the cost of 19k gates when the read and write latencies are 3 and 2 clock cycles, respectively. We implement the proposed parallel memory architecture on a video signal processor (VSP). The results show that the VSP enhanced with the proposed architecture achieves a 1.28× speedup for H.264 real-time decoding.
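To make the idea of a skewing scheme concrete, the sketch below uses the classic textbook linear skew, in which the element at (row, col) is stored in module (row * skew + col) mod N. This is an assumption chosen for illustration and is not either of the two schemes proposed in the paper; the function names and the 4-module configuration are likewise hypothetical.

```python
def module_index(row, col, num_modules=4, skew=1):
    """Map a 2-D element address to a memory module using a linear skew."""
    return (row * skew + col) % num_modules


def is_conflict_free(cells, num_modules=4, skew=1):
    """Return True if every cell in one parallel access hits a different module."""
    modules = [module_index(r, c, num_modules, skew) for r, c in cells]
    return len(set(modules)) == len(modules)


# Four horizontally adjacent elements and four vertically adjacent elements.
row_access = [(2, c) for c in range(5, 9)]
col_access = [(r, 5) for r in range(2, 6)]

print(is_conflict_free(row_access))          # horizontal access with skew
print(is_conflict_free(col_access))          # vertical access with skew
print(is_conflict_free(col_access, skew=0))  # vertical access without skew
```

Without the skew (skew=0), all four elements of a column fall into the same module and must be read serially, which is the conflict a skewing scheme is meant to avoid.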
Funding: Project (No. 2005AA1Z1271) supported by the Hi-Tech Research and Development Program (863) of China
Abstract: We present a semi-custom design methodology based on transistor tuning to optimize design performance. Compared with other transistor tuning approaches, our tuning process takes the cross-talk effect into account and markedly reduces the complexity of circuit simulation and analysis by decomposing the circuit network using graph theory. Furthermore, our methodology does not require the incremental placement and routing that accompanies transistor tuning in conventional approaches, which might induce timing graph variation and additional iterations for design convergence. The methodology combines flexible automated circuit tuning with physical design tools to provide more opportunities for design optimization throughout the design cycle.
Abstract: Inverse lithography technology (ILT), also known as pixel-based optical proximity correction (PB-OPC), has shown promising capability in pushing current 193 nm lithography to its limit. By treating the mask optimization process as an inverse problem in lithography, ILT provides a more complete exploration of the solution space and better pattern fidelity than traditional edge-based OPC. However, existing ILT methods are extremely time-consuming due to the slow convergence of the optimization process. To address this issue, we propose a support vector machine (SVM) based layout retargeting method for ILT, which is designed to generate a good initial input mask for the optimization process and improve the convergence speed. Supervised by optimized masks of training layouts generated by conventional ILT, SVM models are trained and used to predict the initial pixel values in the 'undefined areas' of the new layout. In this way, an initial input mask close to the final optimized mask of the new layout is generated, which reduces the iterations needed in the subsequent optimization process. Manufacturability is another critical issue in ILT; however, the mask generated by our layout retargeting method is quite irregular due to the prediction inaccuracy of the SVM models. To compensate for this drawback, a spatial filter is employed to regularize the retargeted mask and reduce its complexity. We implemented our layout retargeting method with a regularized level-set based ILT (LSB-ILT) algorithm under partially coherent illumination conditions. Experimental results show that with an initial input mask generated by our layout retargeting method, the number of iterations needed in the optimization process and the runtime of the whole ILT process are reduced by 70.8% and 69.0%, respectively.
Abstract: Inverse lithography technology (ILT) is one of the promising resolution enhancement techniques, as advanced IC technology nodes still use the 193 nm light source. In ILT, optical proximity correction (OPC) is treated as an inverse imaging problem, and the optimal solution is sought using a set of mathematical approaches. Among the algorithms for ILT, level-set-based ILT (LSB-ILT) is a feasible choice that works well in practice. However, the manufacturability of the optimized mask is one of the critical issues in ILT; that is, the topology of the result is usually too complicated to manufacture. We put forward a new algorithm with high pattern fidelity, called regularized LSB-ILT, implemented under partially coherent illumination (PCI). It reduces mask complexity by suppressing the isolated irregular holes and edge protrusions generated in the optimization process. A new regularization term, named the Laplacian term, is also proposed in the regularized LSB-ILT optimization process to reduce mask complexity further than the total variation (TV) term does. Experimental results show that the new algorithm with the Laplacian term can reduce mask complexity by over 40% compared with ordinary LSB-ILT.
Funding: Project supported by the National Natural Science Foundation of China (No. 61100074), the National Science and Technology Major Project of China (No. 2012ZX01039-004), and the Fundamental Research Funds for the Central Universities, China
Abstract: Accurate and fast performance estimation is necessary to drive design space exploration and thus support important design decisions. Current techniques are either time-consuming or not accurate enough. In this paper, we address these problems by presenting a hybrid method for multimedia multiprocessor system-on-chip (MPSoC) performance estimation. A general coverage analysis tool, GNU gcov, is employed to profile execution statistics during native simulation. To tackle the complexity and keep the analysis and simulation manageable, the communication and computation parts are orthogonalized. The estimation result for the computation part is annotated onto a transaction-accurate model for further analysis, which supports a gradual refinement of the MPSoC performance estimate. The implementation and its experimental results demonstrate the feasibility and efficiency of the proposed method.
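One simple way to turn execution counts of the kind gcov reports into a computation-time estimate is to weight each code block by a per-execution cycle cost and sum the products. The sketch below shows only that arithmetic; the block names, the cost table, and the function name are hypothetical, and the abstract does not state how the paper derives or annotates its costs.

```python
def estimate_cycles(exec_counts, cycles_per_block):
    """Estimate total cycles as sum(execution count * per-execution cost).

    exec_counts:      block name -> number of executions (e.g. from a profile)
    cycles_per_block: block name -> assumed cycles per execution on the target
    """
    return sum(count * cycles_per_block[block]
               for block, count in exec_counts.items())


# Hypothetical profile of a small kernel: one setup block and a hot loop body.
exec_counts = {"init": 1, "loop_body": 1000}
cycles_per_block = {"init": 40, "loop_body": 12}

print(estimate_cycles(exec_counts, cycles_per_block))
```

The resulting figure covers computation only; communication cost would be modeled separately, in line with the separation of the two parts described in the abstract.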
Abstract: The application-specific multiprocessor system-on-chip (MPSoC) architecture is becoming an attractive solution for increasingly complex embedded applications, which require both high performance and flexible programmability. As an effective method for MPSoC development, we present a gradual refinement flow that starts from a high-level Simulink model and ends with a synthesizable and executable hardware and software specification. The proposed methodology consists of five abstraction levels: the Simulink combined algorithm and architecture model (CAAM), virtual architecture (VA), transaction-accurate architecture (TA), virtual prototype (VP), and field-programmable gate array (FPGA) emulation. Experimental results for Motion-JPEG and H.264 show that the proposed gradual refinement flow can generate various MPSoC architectures from an original Simulink model, allowing design space exploration of processors, communication, and tasks.
Abstract: In existing integrated circuit (IC) fabrication methods, the yield is typically limited by defects generated in the manufacturing process. In fact, the yield often correlates well with the type and density of the defects. As a result, an accurate defect-limited yield model is essential for accurate correlation analysis and yield prediction. Since real defects exhibit a great variety of shapes, accurate yield prediction requires selecting the most appropriate defect model and extracting the critical area based on that model. Considering the realistic outline of scratches introduced by the chemical mechanical polishing (CMP) process, we propose a novel scratch-concerned yield model. A linear model is introduced to model scratches. Based on the linear model, the related critical area extraction algorithm and defect density distribution are discussed. Because it corresponds more closely to the realistic outline of scratches, the linear defect model enables a more accurate prediction of the yield loss caused by scratches, and hence a more accurate total product yield prediction, than the traditional circular model.
Funding: Project supported by the National Natural Science Foundation of China (No. 61100074) and the Fundamental Research Funds for the Central Universities, China (No. 2013QNA5008)
Abstract: Context-based adaptive binary arithmetic coding (CABAC) is the major entropy-coding algorithm employed in H.264/AVC. In this paper, we present a new VLSI architecture for an H.264/AVC CABAC decoder, which optimizes both the decode decision and decode bypass engines for high throughput and improves context model allocation for efficient external memory access. Because the most probable symbol (MPS) branch is much simpler than the least probable symbol (LPS) branch, a newly organized decode decision engine consisting of two serially concatenated MPS branches and one LPS branch is proposed to achieve better parallelism at a lower timing path cost. A look-ahead context index (ctxIdx) calculation mechanism is designed to provide the context model for the second MPS branch. A head-zero detector is proposed to improve the performance of the decode bypass engine according to UEGk encoding features. In addition, to lower the frequency of memory access, we reorganize the context models in external memory and use three circular buffers to cache the context models, neighboring information, and bit stream, respectively. A pre-fetching mechanism with a prediction scheme is adopted to load the corresponding content into a circular buffer to hide external memory latency. Experimental results show that our design can operate at 250 MHz with a 20.71k gate count in SMIC18 silicon technology, and that it achieves an average data decoding rate of 1.5 bins/cycle.
Funding: Project (No. 2009ZX02023-004-1) supported by the National Science and Technology Major Project, China
Abstract: Because metal layers are important to product yield, serpentine test structures are usually fabricated on test chips to extract parameters for yield prediction. In this paper, the confidence level and estimation precision of the average defect density on metal layers are investigated to minimize the randomness of experimental results and make the measured parameters more convincing. On the basis of the Poisson yield model, a method to determine the total area of all serpentine test structures is obtained using the law of large numbers and the Lindeberg-Levy theorem. Furthermore, a method to determine an adequate area for each serpentine test structure is proposed for a specified confidence level and estimation precision. Monte Carlo simulation results show that the proposed method is consistent with the theoretical analyses. Wafer experimental results also show that serpentine test structures designed with the proposed method perform better.
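For readers unfamiliar with the Poisson yield model mentioned above, the sketch below shows its standard form, Y = exp(-A * D), where A is the (critical) area of a structure and D the average defect density, together with the inverse relation used to estimate D from pass/fail counts. The function names and numbers are illustrative assumptions, the full structure area is treated as the critical area for simplicity, and the paper's area-sizing method itself is not reproduced here.

```python
import math


def poisson_yield(area, defect_density):
    """Poisson yield model: probability that a structure has zero fatal defects."""
    return math.exp(-area * defect_density)


def estimate_defect_density(passing, total, area):
    """Invert Y = exp(-A * D) using the measured yield passing / total."""
    if passing <= 0 or passing > total:
        raise ValueError("need 0 < passing <= total to take the logarithm")
    measured_yield = passing / total
    return -math.log(measured_yield) / area


# Illustrative numbers: 100 structures of area 0.5 cm^2, 90 of them pass.
density = estimate_defect_density(passing=90, total=100, area=0.5)
print(density)                       # estimated defects per cm^2
print(poisson_yield(0.5, density))   # recovers the measured yield
```

The estimate is itself a random quantity, which is why the number and size of test structures matter for the confidence level and precision discussed in the abstract.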
Funding: Project (No. 2009ZX02023-004-1) supported by the National Science and Technology Major Project, China
Abstract: For accurate prediction of via yield, via chains are usually fabricated on test chips to investigate via-related issues. To minimize the randomness of experiments and make the test results more convincing, the confidence level and estimation precision of the via failure rate are investigated in this paper. Based on the Poisson yield model, a method of determining an adequate total number of vias is obtained using the law of large numbers and the de Moivre-Laplace theorem. Moreover, for a specific confidence level and estimation precision, a method of determining a suitable via chain length is proposed. For area minimization, an optimal combination of total vias and via chain length is further determined. Monte Carlo simulation results show that the method is in good accordance with the theoretical analyses. Via failure rates measured on test chips also show that via chains designed using the proposed method perform better. In addition, the proposed methodology can be extended to investigate statistical significance for other failure modes.
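The de Moivre-Laplace theorem is the normal approximation to the binomial distribution, and its textbook use in sample sizing is the bound n >= z^2 * p * (1 - p) / e^2, where p is the failure probability, e the allowed absolute error, and z the normal quantile for the chosen confidence level. The sketch below implements only that generic bound, treating each trial as independent; it is an assumption for illustration, not the paper's derivation, which also accounts for via chain length.

```python
import math
from statistics import NormalDist


def required_samples(p, abs_error, confidence):
    """Smallest n with |p_hat - p| <= abs_error at the given two-sided confidence,
    using the normal approximation to the binomial distribution."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return math.ceil(z * z * p * (1.0 - p) / (abs_error ** 2))


# Worst-case variance (p = 0.5), +/-5 percentage points, 95% confidence.
print(required_samples(0.5, 0.05, 0.95))  # 385
```

Because the bound scales with p * (1 - p) / e^2, estimating a very small failure rate to a fixed relative precision requires a number of trials that grows roughly as 1 / p.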
Funding: Supported by the National Natural Science Foundation of China (No. 90407011), the National High-Tech Research and Development Program (863) of China (No. 2007AA01Z2b3), and the China Postdoctoral Science Foundation (No. 20090451439)
Abstract: This paper presents an approach for analyzing the key parts of a general digital radio frequency (RF) charge sampling mixer based on discrete-time charge values. The cascaded sampling and filtering stages are analyzed and expressed in theoretical formulae. The effects of a pseudo-differential structure and of CMOS switch-on resistances on the transfer function are addressed in detail. The DC gain is suppressed by the pseudo-differential structure. When CMOS switch-on resistances are taken into account, the transfer gain is reduced because of the charge-sharing time constant. The unfolded transfer gains of a typical digital RF charge sampling mixer are analyzed in different cases using this approach. A circuit-level model of the typical mixer is then constructed and simulated in Cadence SpectreRF to verify the results. This work informs the design of charge-sampling, infinite impulse response (IIR) filtering, and finite impulse response (FIR) filtering circuits. The discrete-time approach can also be applied to other multi-rate receiver systems based on charge sampling techniques.
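To illustrate what analyzing the filtering stages on discrete-time charge values can look like, the sketch below models two commonly described building blocks: an FIR stage that accumulates N consecutive charge samples and decimates by N, and a single-pole IIR stage produced by charge sharing between a history capacitor and a rotating capacitor, with pole a = C_H / (C_H + C_R). This idealized model (ideal switches, no on-resistance) and all names and values are assumptions for illustration; they are not the formulae derived in the paper.

```python
def moving_sum_decimate(samples, n_taps):
    """FIR stage: sum each block of n_taps charge samples (decimation by n_taps)."""
    return [sum(samples[i:i + n_taps])
            for i in range(0, len(samples) - n_taps + 1, n_taps)]


def charge_sharing_iir(charges, c_history, c_rotating):
    """IIR stage: s[n] = a * s[n-1] + w[n], with a = C_H / (C_H + C_R)."""
    a = c_history / (c_history + c_rotating)
    state = 0.0
    out = []
    for w in charges:
        state = a * state + w  # charge kept on C_H plus newly deposited charge
        out.append(state)
    return out


print(moving_sum_decimate([1, 2, 3, 4, 5, 6, 7, 8], 4))

# With C_H = 3 and C_R = 1 (arbitrary units), a = 0.75 and the DC gain is 1 / (1 - a).
response = charge_sharing_iir([1.0] * 200, c_history=3.0, c_rotating=1.0)
print(response[0], response[1], response[-1])
```

In this idealized recursion a constant input charge settles to 1 / (1 - a) times the input, which is the DC gain of the single-pole filter.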
Funding: Supported by the National Natural Science Foundation of China (No. 61106034) and the National Science and Technology Major Project (No. 2009ZX02023-004-1)
Abstract: We present a new data structure for representing an integrated circuit layout. It is a modified HV/VH tree that uses arrays as the primary container in bisector lists and leaf nodes. By grouping and sorting objects within these arrays and applying a customized binary search algorithm, the new data structure provides excellent performance in both memory usage and region query speed. Experimental results show that, in comparison with the original HV/VH tree, which has been regarded as the best layout data structure to date, the new data structure uses much less memory and can be 30% faster on region queries.
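The benefit of keeping objects in a sorted array can be seen in a minimal sketch: if rectangles are sorted by their left edge, a binary search finds the cut-off beyond which no rectangle can overlap the query window, and only the remaining prefix needs an exact overlap test. The class below is a hypothetical single-array illustration using Python's standard `bisect` module; it is not the paper's modified HV/VH tree or its customized search.

```python
from bisect import bisect_right


class SortedRectArray:
    """Rectangles (x_lo, y_lo, x_hi, y_hi) kept in an array sorted by x_lo."""

    def __init__(self, rects):
        self.rects = sorted(rects)
        self.x_los = [r[0] for r in self.rects]  # search keys for bisect

    def query(self, window):
        """Return all rectangles overlapping the window (edges inclusive)."""
        wx_lo, wy_lo, wx_hi, wy_hi = window
        # Rectangles starting right of the window cannot overlap it.
        end = bisect_right(self.x_los, wx_hi)
        return [r for r in self.rects[:end]
                if r[2] >= wx_lo and r[1] <= wy_hi and r[3] >= wy_lo]


layout = SortedRectArray([(0, 0, 2, 2), (5, 5, 7, 7), (1, 1, 6, 6), (10, 0, 12, 2)])
print(layout.query((3, 3, 8, 8)))  # [(1, 1, 6, 6), (5, 5, 7, 7)]
```

A contiguous array also avoids per-node pointer overhead, which is one plausible reason an array-based container would use less memory than a linked one.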