Feature-based image matching algorithms play an indispensable role in automatic target recognition (ATR). In this work, a fast image matching algorithm (FIMA) is proposed which utilizes the geometry feature of the extended centroid (EC) to build affine invariants. Based on affine invariants of the length ratio of two parallel line segments, FIMA overcomes the invalidation problem of state-of-the-art algorithms based on affine geometry features and increases the feature diversity of different targets, thus reducing the misjudgment rate during target recognition. However, it is found that FIMA suffers from the parallelogram contour problem and the coincidence invalidation. An advanced FIMA is designed to cope with these problems. Experiments show that the proposed algorithms are more robust to Gaussian noise, gray-scale change, contrast change, illumination and small three-dimensional rotation. Compared with the latest fast image matching algorithms based on geometry features, FIMA achieves a speedup of approximately 1.75 times. Thus, FIMA would be more suitable for actual ATR applications.
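The invariant this abstract relies on, that an affine map preserves the length ratio of two parallel line segments, can be checked numerically. The following is a minimal sketch of that geometric fact only, not of FIMA itself; the helper names `affine`, `length` and `length_ratio` and the sample matrix are illustrative assumptions.

```python
import math


def affine(p, m, t):
    """Apply the affine map x -> m @ x + t to a 2D point p (m is a 2x2 matrix)."""
    x, y = p
    return (m[0][0] * x + m[0][1] * y + t[0],
            m[1][0] * x + m[1][1] * y + t[1])


def length(p, q):
    """Euclidean length of the segment from p to q."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def length_ratio(seg1, seg2):
    """Ratio of the lengths of two segments, each given as a pair of points."""
    return length(*seg1) / length(*seg2)


# Two parallel segments: both have direction (2, 1), the second is twice as long.
seg1 = ((0.0, 0.0), (2.0, 1.0))
seg2 = ((1.0, 3.0), (5.0, 5.0))

# An arbitrary invertible affine map (hypothetical values chosen for illustration).
m = ((2.0, 1.0), (0.5, 3.0))
t = (4.0, -1.0)

before = length_ratio(seg1, seg2)
after = length_ratio(tuple(affine(p, m, t) for p in seg1),
                     tuple(affine(p, m, t) for p in seg2))
```

Both segments are scaled by the same factor because they share a direction, so `before` and `after` agree up to floating-point rounding. The ratio of two non-parallel segments is generally not preserved.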
Searchable encryption allows cloud users to outsource massive encrypted data to a remote cloud and to search over the data without revealing sensitive information. Many schemes have been proposed to support keyword search in a public cloud. However, they have some potential limitations. First, most of the existing schemes only consider the scenario with a single data owner. Second, they need secure channels to guarantee the secure transmission of secret keys from the data owner to data users. Third, in some schemes, the data owner must be online to help data users when they intend to perform a search, which is inconvenient. In this paper, we propose a novel searchable scheme which supports multi-owner keyword search without secure channels. Moreover, our scheme is a non-interactive solution in which all users only need to communicate with the cloud server. Furthermore, the analysis proves that our scheme can guarantee security even without secure channels. Unlike most existing public key encryption based searchable schemes, we evaluate the performance of our scheme, which shows that our scheme is practical.
The resource allocation problem in data centre networks refers to mapping the workloads provided by cloud users/tenants onto the substrate network (SN) provided by the cloud provider. Existing studies consider the dynamic arrival and departure of the workloads, while the dynamics of the substrate are ignored. In this paper, we first formulate the resource allocation problem with a dynamic SN, and denote it as GraphMap-DS. Then, we propose an efficient mapping algorithm for GraphMap-DS. The performance of the proposed algorithm is evaluated by simulation experiments. Our results show that the proposed algorithm can effectively solve GraphMap-DS.
Abstract: Massive multiple-input multiple-output (MIMO) provides improved energy efficiency and spectral efficiency in 5G. However, it requires large-scale matrix computation with tremendous complexity, especially for data detection and precoding. Recently, many detection and precoding methods have been proposed using approximate iteration methods, which meet the demand for precision with low complexity. In this paper, we compare these approximate iteration methods in precision and complexity, and then improve them with iteration refinement at the cost of little complexity and no extra hardware resources. By derivation, our proposal is in essence a combination of three approximate iteration methods and provides remarkable precision improvement on the desired vectors. The results show that our proposal provides a 27%-83% normalized mean-squared error improvement of the detection symbol vector and precoding symbol vector. Moreover, we find that the bit-error rate is mainly controlled by soft-input soft-output Viterbi decoding when using approximate iteration methods. Further, considering only the effect on soft-input soft-output Viterbi decoding, the simulation results show that using a rough estimate of the filter matrix of minimum mean square error detection to calculate the log-likelihood ratio could provide sufficiently good bit-error rate performance, especially when the ratio of the number of base station antennas to the number of users is not too large.
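As an illustration of the general idea, the sketch below approximates the solution of a linear system with a truncated Neumann series and then applies one residual-based refinement step that reuses the same approximate solver. This is a sketch under stated assumptions, not the paper's method: the abstract does not name the three iteration methods it combines, the Neumann series is an assumed stand-in, and a small real-valued diagonally dominant matrix stands in for the complex-valued MMSE filter matrix. The function names `matvec`, `neumann_solve` and `refine` are hypothetical.

```python
def matvec(a, x):
    """Dense matrix-vector product for a list-of-lists matrix."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def neumann_solve(a, b, terms=3):
    """Approximate x = A^{-1} b with a truncated Neumann series.

    Split A = D + E with D the diagonal of A. Then
    A^{-1} b ~= sum_{k < terms} (-D^{-1} E)^k D^{-1} b,
    which converges when A is strongly diagonally dominant.
    """
    n = len(a)
    d_inv = [1.0 / a[i][i] for i in range(n)]
    term = [d_inv[i] * b[i] for i in range(n)]  # k = 0 term: D^{-1} b
    x = term[:]
    for _ in range(terms - 1):
        # Next term = -D^{-1} E applied to the previous term.
        e_term = [sum(a[i][j] * term[j] for j in range(n) if j != i)
                  for i in range(n)]
        term = [-d_inv[i] * e_term[i] for i in range(n)]
        x = [xi + ti for xi, ti in zip(x, term)]
    return x


def refine(a, b, x, terms=3):
    """One refinement step: approximately solve A d = r for r = b - A x, return x + d."""
    r = [bi - axi for bi, axi in zip(b, matvec(a, x))]
    d = neumann_solve(a, r, terms)
    return [xi + di for xi, di in zip(x, d)]


# Small diagonally dominant example (assumed values, for illustration only).
a = [[4.0, 1.0, 0.0], [1.0, 5.0, 1.0], [0.0, 1.0, 6.0]]
x_true = [1.0, 2.0, 3.0]
b = matvec(a, x_true)

x0 = neumann_solve(a, b, terms=3)
x1 = refine(a, b, x0, terms=3)
err0 = max(abs(u - v) for u, v in zip(x0, x_true))
err1 = max(abs(u - v) for u, v in zip(x1, x_true))
```

In this sketch the refinement step calls the same approximate solver on the residual, so it needs no additional solver of its own, and for this example `err1` is smaller than `err0`.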
Funding: Supported by the National Natural Science Foundation of China (Grant No. 61376109)
Abstract: FinFET technologies are becoming the mainstream process as technology scales down. Based on a 28-nm bulk p-FinFET device, we have investigated the fin width and height dependence of bipolar amplification in heavy-ion-irradiated FinFETs by 3D TCAD numerical simulation. Simulation results show that, because bipolar conduction takes place through the well rather than through the channel (fin), transistors with narrower fins exhibit a diminished bipolar amplification effect, while the fin height has a trivial effect on bipolar amplification and charge collection. The results also indicate that the single event transient (SET) pulse width can be reduced by at least about 35% by optimizing the ratio of fin width to height, which can provide guidance for radiation-hardened applications in bulk FinFET technology.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 61376109, 61434007, and 61176030)
Abstract: Charge sharing is becoming an important topic as the feature size scales down in fin field-effect-transistor (FinFET) technology. However, studies of charge-sharing-induced single-event transient (SET) pulse quenching in bulk FinFETs are seldom reported. Using three-dimensional technology computer-aided design (3D TCAD) mixed-mode simulations, the effects of supply voltage and body biasing on SET pulse quenching are investigated for the first time in a bulk FinFET process. The results indicate that, due to an enhanced charge sharing effect, the propagating SET pulse width decreases as the supply voltage is reduced. Moreover, compared with reverse body biasing (RBB), a circuit with forward body biasing (FBB) is more susceptible to charge sharing, which effectively reduces the propagating SET pulse width by at least 53%. This can provide guidance for radiation-hardened bulk FinFET technology, especially in low-power and high-performance applications.
Funding: Project (2008AA01A201) supported by the National High-tech Research and Development Program of China; Projects (60833004, 60633050) supported by the National Natural Science Foundation of China
Abstract: Developing parallel applications on heterogeneous processors faces the challenge of the 'memory wall', due to the limited capacity of local storage, limited bandwidth and long latency of memory access. To address this problem, a parallelization approach with six memory optimization schemes is proposed for CG, four of which target all kinds of sparse matrix-vector multiplication (SPMV) operations. Evaluated on an IBM QS20, the parallelization approach reaches speedups of up to 21 and 133 times with sizes A and B, respectively, compared with a single Power Processor Element. Finally, the conclusion is drawn that the peak memory access bandwidth of the Cell BE can be attained in SPMV, that simple computation is more efficient on heterogeneous processors, and that loop unrolling can hide local storage access latency while executing scalar operations on SIMD cores.
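For readers unfamiliar with the SPMV kernel mentioned above, here is a minimal sequential sketch of sparse matrix-vector multiplication. It assumes the compressed sparse row (CSR) layout, which the abstract does not specify, and it shows none of the paper's memory optimizations; the function name `spmv_csr` and the sample matrix are illustrative.

```python
def spmv_csr(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    values  : nonzero entries, row by row
    col_idx : column index of each entry in `values`
    row_ptr : row i occupies values[row_ptr[i]:row_ptr[i + 1]]
    """
    y = []
    for row in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            # Indirect access to x through col_idx is what makes SPMV memory-bound.
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


# CSR encoding of the 3x3 matrix [[4, 0, 1], [0, 3, 0], [2, 0, 5]].
values = [4.0, 1.0, 3.0, 2.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]

y = spmv_csr(values, col_idx, row_ptr, [1.0, 2.0, 3.0])
```

The inner loop performs one multiply-add per nonzero while loading a value, a column index and an element of `x`, which is why memory bandwidth and latency dominate the cost of this kernel.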
Funding: Supported by the National Science Foundation for Distinguished Young Scholars of China (Grant Nos. 61003082 and 60903059); the National Natural Science Foundation of China (Grant No. 60873014); the Foundation for Innovative Research Groups of the National Natural Science Foundation of China (Grant No. 60921062)
Abstract: Many real-world networks are found to be scale-free. However, graph partitioning, a key enabling technology for parallel computing, performs poorly on scale-free graphs. The reason is that traditional partitioning algorithms are designed for random networks and regular networks, rather than for scale-free networks. Multilevel graph-partitioning algorithms are currently considered to be the state of the art and are used extensively. In this paper, we analyse the reasons why traditional multilevel graph-partitioning algorithms perform poorly and present a new multilevel graph-partitioning paradigm, top-down partitioning, which derives its name from the comparison with traditional bottom-up partitioning. A new multilevel partitioning algorithm, named the betweenness-based partitioning algorithm, is also presented as an implementation of the top-down partitioning paradigm. An experimental evaluation on seven different real-world scale-free networks shows that the betweenness-based partitioning algorithm significantly outperforms the existing state-of-the-art approaches.
Funding: This work is sponsored in part by the National Basic Research Program of China (973) under Grant No. 2014CB340303, the National Natural Science Foundation of China under Grant No. 61222205, the Program for New Century Excellent Talents in University, and the Fok Ying-Tong Education Foundation under Grant No. 141066.
Abstract: Stragglers can delay jobs and seriously reduce cluster efficiency. Much research has been devoted to this problem, such as Blacklist [8], speculative execution [1, 6], and Dolly [8]. In this paper, we put forward a new approach for mitigating stragglers in MapReduce, named Hummer. It starts task clones only for tasks with a high risk of delay. Experiments have been carried out, and the results show that it can decrease the risk of job delay with less resource consumption. For small jobs, Hummer also improves job completion time by 48% and 10% compared to LATE and Dolly, respectively.
Funding: Supported by the National Grand Fundamental Research of China (973 Program) under Grant No. 2011CB302601; the National High Technology Research and Development of China (863 Program) under Grant No. 2013AA01A213; the National Natural Science Foundation of China under Grant No. 60873215; the Natural Science Foundation for Distinguished Young Scholars of Hunan Province under Grant No. S2010J5050; the Specialized Research Fund for the Doctoral Program of Higher Education under Grant No. 20124307110015
Abstract: To reduce the time required to complete the regeneration process of erasure codes, we propose a Tree-structured Parallel Regeneration (TPR) scheme for multiple data losses in distributed storage systems. Under the scheme, two algorithms are proposed for the construction of multiple regeneration trees, namely the edge-disjoint algorithm and the edge-sharing algorithm. The edge-disjoint algorithm constructs multiple independent trees, and is simple and appropriate for environments where newcomers and their providers are distributed over a large area and have few intersections. The edge-sharing algorithm constructs multiple trees that compete for the bandwidth and makes better use of it, although it needs to measure the available bandwidth and deal with bandwidth changes; it is therefore difficult to implement in practical systems. The parallel regeneration for multiple data losses in TPR primarily includes two optimizations: first, transferring the data through bandwidth-optimized paths in a pipelined manner; second, executing data regeneration over multiple trees in parallel. To evaluate the proposal, we implement an event-based simulator and make a detailed comparison with some popular regeneration methods. The quantitative comparison results show that TPR, with either the edge-disjoint algorithm or the edge-sharing algorithm, reduces the regeneration time significantly.
Funding: Supported by the National Natural Science Foundation of China (Nos. 61434007 and 61376109)
Abstract: Based on 3D-TCAD simulations, single-event transient (SET) effects and charge collection mechanisms in fully depleted silicon-on-insulator (FDSOI) transistors are investigated. This work presents a comparison between 28-nm technology and 0.2-μm technology to analyze the impact of strike location on SET sensitivity in FDSOI devices. Simulation results show that the most SET-sensitive region in FDSOI transistors is the drain region near the gate. An in-depth analysis shows that the bipolar amplification effect in FDSOI devices depends on the strike location. In addition, when the drain contact is moved toward the drain direction, the most sensitive region drifts toward the drain and collects more charge. This provides theoretical guidance for SET hardening.
Funding: This work is supported by the National Key Research and Development Program of China under Grant No. 2016YFB1000105 and the National Natural Science Foundation of China under Grant Nos. 61272154 and 61421091.
Abstract: Cloud computing has been widely adopted by enterprises because of its on-demand and elastic resource usage paradigm. Currently, most cloud applications run on a single cloud. However, more and more applications need to run across several clouds to satisfy requirements such as best cost efficiency, avoidance of vendor lock-in, and geolocation-sensitive service. JointCloud computing is a new research initiative launched by Chinese institutes to address the computing issues concerned with multiple clouds. In JointCloud, users' diverse and dynamic requirements on cloud resources are satisfied by providing users with virtual clouds (VCs) for special purposes. A virtual cloud for special purposes is in essence a user-specific cloud working environment with customized software stacks, configurations and computing resources readily available. This paper first introduces what JointCloud computing is and then describes the design rationales, motivating examples, mechanisms and enabling technologies of VC in JointCloud.
Funding: Supported by the Young Scientists Fund of the National Natural Science Foundation of China (Grant No. 61003082); the Science Fund for Creative Research Groups of the National Natural Science Foundation of China (Grant No. 60921062)
Abstract: As the fourth passive circuit component, a memristor is a nonlinear resistor that can "remember" the amount of charge passing through it. The ability to "remember" the charge, together with non-volatility, makes memristors promising candidates in many fields. At present, only a few groups have the ability to fabricate memristors, and most researchers study them by theoretical analysis and simulation. In this paper, we first analyse the theoretical basis and characteristics of memristors, then use the Simulation Program with Integrated Circuit Emphasis (SPICE) as our tool to simulate the theoretical model of memristors, and vary the parameters in the model to see the influence of each parameter on the characteristics. Our work supplies researchers engaged in memristor-based circuits with advice on how to choose the proper parameters.
Funding: Project (2016-YFB1000805) supported by the National Grand R&D Plan, China; Projects (61502512, 61432020, 61472430, 61532004) supported by the National Natural Science Foundation of China
Funding: Project supported by the National Basic Research Program (973) of China (No. 2014CB340303); the National Natural Science Foundation of China (Nos. 61222205 and 61402490); the Program for New Century Excellent Talents in University, China (No. 141066); the Fok Ying-Tong Education Foundation
Funding: Project supported by the National Natural Science Foundation of China (Grant Nos. 61376109, 61434007, and 61176030); the Advanced Research Project of National University of Defense Technology, China (Grant No. 0100066314001)
Abstract: In this paper, the influence of the floating body effect (FBE) on the single event transient (SET) generation mechanism in fully depleted (FD) silicon-on-insulator (SOI) technology is investigated using three-dimensional technology computer-aided design (3D-TCAD) numerical simulation. The results indicate that the main SET generation mechanism is not carrier drift/diffusion but the FBE, for both positive and negative channel metal oxide semiconductor (PMOS and NMOS) devices. Two stacking layout designs that mitigate the FBE are investigated as well, and the results indicate that the in-line stacking (IS) layout can mitigate the FBE completely and incurs a smaller area penalty than the conventional stacking layout.
Funding: Project supported by the Science Fund for Creative Research Groups of the National Natural Science Foundation of China (Grant No. 60921062), the National Natural Science Foundation of China (Grant No. 60873014), and the Young Scientists Fund of the National Natural Science Foundation of China (Grant Nos. 61003082 and 60903059)
Abstract: The availability of computers and communication networks allows us to gather and analyse data on a far larger scale than previously. At present, statistics is believed to be a suitable method for analysing networks with millions of vertices or more. The MATLAB language, with its large library of statistical functions, is a good choice for rapidly prototyping complex network algorithms. The performance of MATLAB code can be further improved by using graphics processing units (GPUs). This paper presents the strategies and performance of a GPU implementation of a complex networks package, built with the Jacket toolbox for MATLAB. Compared with some commercially available CPU implementations, the GPU achieves an average speedup of 11.3x. The experimental results show that the GPU platform combined with the MATLAB language is a good combination for complex network research.
Funding: Supported by the National Natural Science Foundation of China (Nos. 61379055 and 61379053)
Abstract: The Internet-based virtual computing environment (iVCE) has been proposed to combine data centers and other kinds of computing resources on the Internet to provide efficient and economical services. Virtual machines (VMs) have been widely used in iVCE to isolate different users/jobs and ensure trustworthiness, but traditionally VMs take a long time to boot, which cannot meet the requirements of iVCE's large-scale and highly dynamic applications. To address this problem, in this paper we design and implement VirtMan, a fast booting system for a large number of virtual machines in iVCE. VirtMan uses the Linux Small Computer System Interface (SCSI) target to remotely mount the source image in a scalable hierarchy, and leverages the homogeneity of a set of VMs to transfer only the necessary image data at runtime. We have implemented VirtMan both as a standalone system and for OpenStack. In our 100-server testbed, VirtMan boots 1000 VMs (with a 15 GB image of Windows Server 2008) on 100 physical servers in less than 120 s, which is three orders of magnitude lower than in current public clouds.
Funding: Projects (2012AA010901, 2012AA01A301) supported by the National High Technology Research and Development Program of China; Projects (61272142, 61103082, 61003075, 61170261, 61103193) supported by the National Natural Science Foundation of China; Projects (B120601, CX2012A002) supported by the Fund Sponsor Project of Excellent Postgraduate Student of NUDT, China
Abstract: Feature-based image matching algorithms play an indispensable role in automatic target recognition (ATR). In this work, a fast image matching algorithm (FIMA) is proposed which uses the geometric feature of the extended centroid (EC) to build affine invariants. Based on the affine invariant formed by the length ratio of two parallel line segments, FIMA overcomes the invalidation problem of state-of-the-art algorithms based on affine geometric features and increases the feature diversity of different targets, thus reducing the misjudgment rate during target recognition. However, FIMA is found to suffer from the parallelogram contour problem and from coincidence invalidation. An advanced FIMA is designed to cope with these problems. Experiments show that the proposed algorithms are more robust to Gaussian noise, gray-scale change, contrast change, illumination and small three-dimensional rotation. Compared with the latest fast image matching algorithms based on geometric features, FIMA achieves a speedup of approximately 1.75 times. Thus, FIMA would be more suitable for practical ATR applications.
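The invariant that the abstract above builds on, the length ratio of two parallel line segments, can be checked numerically. The sketch below is not FIMA and does not construct an extended centroid; it only verifies the underlying property for one hypothetical pair of segments and one arbitrary invertible affine map. All point coordinates, the matrix `A`, and the translation `t` are made-up example values.

```python
import math

def apply_affine(a, t, p):
    """Map point p by x -> A x + t, with A a 2x2 matrix given as nested tuples."""
    return (a[0][0] * p[0] + a[0][1] * p[1] + t[0],
            a[1][0] * p[0] + a[1][1] * p[1] + t[1])

def length(p, q):
    """Euclidean distance between points p and q."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def length_ratio(seg1, seg2):
    """Ratio of the lengths of two segments, each given as (start, end)."""
    return length(*seg1) / length(*seg2)

# Two parallel segments, both along direction (2, 1) (hypothetical data).
seg_a = ((0.0, 0.0), (4.0, 2.0))
seg_b = ((1.0, 3.0), (3.0, 4.0))

# An arbitrary invertible affine map (determinant 3.28, so not a similarity).
A = ((1.5, 0.7), (-0.4, 2.0))
t = (5.0, -2.0)

mapped_a = tuple(apply_affine(A, t, p) for p in seg_a)
mapped_b = tuple(apply_affine(A, t, p) for p in seg_b)

ratio_before = length_ratio(seg_a, seg_b)
ratio_after = length_ratio(mapped_a, mapped_b)
print(ratio_before, ratio_after)
```

The individual lengths change under the map, but both segments are scaled by the same factor because they share a direction, so the ratio survives. For non-parallel segments the ratio generally changes, which is why the parallel condition matters.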
Funding: Supported by the Natural Science Foundation of China (No. 61303264)
Abstract: Searchable encryption allows cloud users to outsource massive encrypted data to a remote cloud and to search over the data without revealing sensitive information. Many schemes have been proposed to support keyword search in a public cloud. However, they have some potential limitations. First, most existing schemes only consider the scenario with a single data owner. Second, they need secure channels to guarantee the secure transmission of secret keys from the data owner to data users. Third, in some schemes the data owner must be online to help data users whenever they intend to perform a search, which is inconvenient. In this paper, we propose a novel searchable scheme which supports multi-owner keyword search without secure channels. Moreover, our scheme is a non-interactive solution, in which all users only need to communicate with the cloud server. Furthermore, the analysis proves that our scheme can guarantee security even without secure channels. Unlike most existing searchable schemes based on public key encryption, we also evaluate the performance of our scheme, and the evaluation shows that it is practical.
Funding: Supported by the National Basic Research of China (973 Program) under Grant No. 2011CB302601, the National Natural Science Foundation of China under Grants No. 90818028, No. 6903043, and No. 61202117, and the National High Technology Research and Development Program of China (863 Program) under Grant No. 2012AA011201
Abstract: The resource allocation problem in data centre networks refers to mapping the workloads submitted by cloud users/tenants onto the Substrate Network (SN) provided by the cloud providers. Existing studies consider the dynamic arrival and departure of workloads, while the dynamics of the substrate are ignored. In this paper, we first formulate the resource allocation problem with a dynamic SN and denote it as GraphMap-DS. Then, we propose an efficient mapping algorithm for GraphMap-DS. The performance of the proposed algorithm is evaluated through simulation experiments. Our results show that the proposed algorithm can effectively solve GraphMap-DS.
Funding: Project (2018-YFB1004202) supported by the National Key R&D Program of China; Project (61702534) supported by the National Natural Science Foundation of China