Speaker separation in complex acoustic environments is one of the challenging tasks in speech separation. In practice, speakers are very often stationary or moving slowly during normal communication. In this case, the spatial features of consecutive speech frames become highly correlated, which helps speaker separation by providing additional spatial information. To fully exploit this information, we design a separation system based on a recurrent neural network (RNN) with long short-term memory (LSTM), which effectively learns the temporal dynamics of spatial features. In detail, an LSTM-based speaker separation algorithm is proposed that extracts the spatial features in each time-frequency (TF) unit and forms the corresponding feature vector. We then treat speaker separation as a supervised learning problem, where a modified ideal ratio mask (IRM) is defined as the training target during LSTM learning. Simulations show that the proposed system achieves attractive separation performance in noisy and reverberant environments. Specifically, in tests under untrained acoustic conditions with limited priors, e.g., unmatched signal-to-noise ratio (SNR) and reverberation, the proposed LSTM-based algorithm still outperforms the existing DNN-based method in terms of PESQ and STOI, which indicates that our method is more robust in untrained conditions.
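As an illustration of the mask-based training target mentioned in this abstract, the sketch below computes the standard ideal ratio mask per TF unit. It is not the authors' modified IRM, whose definition the abstract does not give; the function name, the `beta` exponent, and the toy power values are assumptions for illustration only.

```python
def ideal_ratio_mask(speech_power, noise_power, beta=0.5):
    """Standard IRM per TF unit: (S / (S + N)) ** beta.

    speech_power and noise_power are 2-D lists (frames x frequency bins)
    of non-negative power values. This is the textbook IRM, not the
    modified mask used in the paper.
    """
    mask = []
    for s_row, n_row in zip(speech_power, noise_power):
        row = []
        for s, n in zip(s_row, n_row):
            total = s + n
            # A TF unit with no energy at all gets a mask value of 0.
            row.append((s / total) ** beta if total > 0 else 0.0)
        mask.append(row)
    return mask


# Toy example: 2 frames x 2 frequency bins (hypothetical values).
speech = [[4.0, 1.0], [0.0, 9.0]]
noise = [[4.0, 3.0], [2.0, 0.0]]
irm = ideal_ratio_mask(speech, noise)
```

Mask values close to 1 mark TF units dominated by the target speaker, and values close to 0 mark units dominated by interference.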
A 1 kbit antifuse one-time programmable (OTP) memory IP, a type of non-volatile memory IP, was designed for use in power management integrated circuits (ICs). A conventional antifuse OTP cell using a single positive program voltage (VPP) has difficulty applying a voltage higher than the breakdown voltage of the thin gate oxide while at the same time securing the reliability of the medium-voltage (VM) devices, which are thick-gate transistors. A new antifuse OTP cell using a dual program voltage was proposed to prevent failures in qualification tests and yield drops. In the proposed cell, stable sensing is secured with post-program resistances of several tens of thousands of ohms or below, because a voltage higher than the hard breakdown voltage is applied to the terminals of the antifuse. The layout size of the designed 1 kbit antifuse OTP memory IP in Dongbu HiTek's 0.18 μm Bipolar-CMOS-DMOS (BCD) process is 567.9 μm × 205.135 μm, and the post-program resistance of an antifuse is predicted to be several tens of thousands of ohms.
In the conventional single-ended eFuse cell, sensing failures can occur because of variation in the post-program eFuse resistance during the data retention time and a relatively high programmed resistance of several kilohms. A differential paired eFuse cell is designed whose sensing resistance for a programmed eFuse link is about half that of the conventional single-ended eFuse cell. In addition, a sense-amplifier circuit based on a D flip-flop structure is proposed to keep the sensing circuit simple. Furthermore, a sensing margin test circuit with variable pull-up loads is proposed to account for the resistance variation of a programmed eFuse. When an 8 bit eFuse OTP IP is designed in TSMC's 0.18 μm standard CMOS logic process, the layout dimensions are 229.04 μm × 100.15 μm. All 20 test chips function successfully when tested with a program voltage of 4.2 V.
People use their hands in various ways for most daily activities. Many applications are based on the position, direction, and joints of the hand, including gesture recognition, gesture prediction, and robotics. This paper proposes a gesture prediction system that uses hand joint coordinate features collected by the Leap Motion to predict dynamic hand gestures. The model is applied to the NAO robot to verify the effectiveness of the proposed method. First, to reduce the jitter and jumps generated during data acquisition by the Leap Motion, a Kalman filter is applied to the raw data. New feature descriptors are then introduced: length, angle, and angular velocity features are extracted from the filtered data. These features are fed into a long short-term memory recurrent neural network (LSTM-RNN) in different combinations. Experimental results show that the combination of coordinate, length, and angle features achieves the highest accuracy, 99.31%, and can run in real time. Finally, the trained model is applied to the NAO robot to play the finger-guessing game. Based on the predicted gesture, the NAO robot can respond in advance.
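The jitter-reduction step can be pictured with a minimal scalar Kalman filter applied to one joint coordinate. This is only a sketch under a constant-position (random-walk) assumption; the abstract does not state the state model or noise settings used, so `process_var`, `meas_var`, and the sample values are hypothetical.

```python
def kalman_smooth(measurements, process_var=1e-3, meas_var=1e-1):
    """Smooth a 1-D coordinate stream with a scalar Kalman filter.

    Assumes a random-walk state model; process_var and meas_var are
    illustrative tuning values, not parameters from the paper.
    """
    if not measurements:
        return []
    estimate = measurements[0]   # initial state estimate
    error_var = 1.0              # initial estimate variance
    smoothed = [estimate]
    for z in measurements[1:]:
        # Predict: the state is assumed unchanged, uncertainty grows.
        error_var += process_var
        # Update: blend the prediction and the measurement by the Kalman gain.
        gain = error_var / (error_var + meas_var)
        estimate += gain * (z - estimate)
        error_var *= 1.0 - gain
        smoothed.append(estimate)
    return smoothed


# Hypothetical x-coordinate of one joint with a single jump at index 3.
raw = [10.0, 10.2, 9.9, 14.0, 10.1, 10.0]
filtered = kalman_smooth(raw)
```

In practice each coordinate of each joint would be filtered, and a smaller `process_var` gives stronger smoothing at the cost of slower response to genuine motion.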
Traffic flow prediction, as the basis of signal coordination and travel time prediction, has become a research focus in the field of transportation. Researchers have proposed a variety of methods for traffic flow prediction, but most of them use only the time-domain information of traffic flow data, ignoring the impact of spatial correlation on the flow of the target road segment, which leads to poor prediction accuracy. In this paper, a combined traffic flow prediction model called long short-term memory and random forest (LSTM-RF) is proposed. In the prediction process, the long short-term memory (LSTM) model is used to extract the time-sequence features of the target road segment. The LSTM prediction and the information collected from the adjacent upstream and downstream sections are then used together as the input features of the random forest model to analyze the spatial-temporal correlation of traffic flow and obtain the final prediction. The model was tested and verified on traffic flow data from 132 urban road sections collected by the license plate recognition system in Guiyang City. The results show that the method achieves better prediction accuracy than the single models, with a clearly reduced prediction error.
Efficient resource utilization requires that emerging datacenter interconnects support both high-performance communication and efficient remote resource sharing. These goals require that the network be more tightly coupled with the CPU chips. Designing a new interconnection technology thus requires considering not only the interconnection itself, but also the design of the processors that will rely on it. In this paper, we study memory hierarchy implications for the design of high-speed datacenter interconnects, particularly as they affect remote memory access, and we use PCIe as the vehicle for our investigations. To that end, we build three complementary platforms: a PCIe-interconnected prototype server with which we measure and analyze current bottlenecks; a software simulator that lets us model microarchitectural and cache hierarchy changes; and an FPGA prototype system with a streamlined, switchless, customized protocol, Thunder, with which we study hardware optimizations outside the processor. We highlight several architectural modifications to better support remote memory access and communication, and quantify their impact and limitations.
Car taillights are ubiquitous during deceleration in real traffic, and drivers retain a memory of historical information. The combined effect may greatly affect driving behavior and traffic flow performance. In this paper, we propose a continuum model that accounts for the driver's memory time and the preceding vehicle's taillight. To better reflect reality, the continuous driving process is also considered. To this end, we first develop a unique version of a car-following model. By converting micro variables into macro variables with a macro conversion method, the micro car-following model is transformed into a new continuum model. Based on a linear stability analysis, the stability conditions of the new continuum model are obtained. We then derive the modified KdV-Burgers equation of the model in a nonlinear stability analysis, whose solution can be used to describe the propagation and evolution of the density wave near the neutral stability curve. The results show that memory time has a negative impact on the stability of traffic flow, whereas information from the preceding vehicle's taillight helps mitigate traffic congestion and reduce energy consumption.
Accesses Per Cycle (APC), Concurrent Average Memory Access Time (C-AMAT), and Layered Performance Matching (LPM) are three memory performance models that consider both data locality and memory access concurrency. The APC model measures the throughput of a memory architecture and therefore reflects the quality of service (QoS) of a memory system. The C-AMAT model provides a recursive expression for the memory access delay and can therefore be used to identify potential bottlenecks in a memory hierarchy. The LPM method transforms a global memory system optimization into localized optimizations at each memory layer by matching the data access demands of the applications with the underlying memory system design. These three models were proposed separately in prior work. This paper reexamines the three models under one coherent mathematical framework. More specifically, we present a new memory-centric view of data accesses. We divide the memory cycles at each memory layer into four distinct categories and use them to recursively define the memory access latency and concurrency along the memory hierarchy. This new perspective offers new insights, with a clear formulation of memory performance that considers both locality and concurrency. Consequently, the performance model can be easily understood and applied in engineering practice. As such, the memory-centric approach helps establish a unified mathematical foundation for model-driven performance analysis and optimization of contemporary and future memory systems.
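To make the locality-versus-concurrency distinction concrete, the sketch below contrasts the classic AMAT formula with the single-layer C-AMAT expression as it is commonly stated in the C-AMAT literature, C-AMAT = H/C_H + pMR × pAMP/C_M, and takes APC as its reciprocal. The abstract itself gives no formulas, so the expressions, the function and parameter names, and all numeric values here are assumptions for illustration and do not reproduce the paper's four-category cycle framework.

```python
def amat(hit_time, miss_rate, avg_miss_penalty):
    """Classic AMAT = H + MR * AMP (locality only, no concurrency)."""
    return hit_time + miss_rate * avg_miss_penalty


def c_amat(hit_time, hit_concurrency, pure_miss_rate,
           pure_avg_miss_penalty, miss_concurrency):
    """Single-layer C-AMAT = H / C_H + pMR * pAMP / C_M (assumed form)."""
    return (hit_time / hit_concurrency
            + pure_miss_rate * pure_avg_miss_penalty / miss_concurrency)


def apc(c_amat_value):
    """Accesses per cycle, taken here as the reciprocal of C-AMAT."""
    return 1.0 / c_amat_value


# Hypothetical cache layer, all times in cycles.
sequential = amat(hit_time=2.0, miss_rate=0.1, avg_miss_penalty=100.0)
concurrent = c_amat(hit_time=2.0, hit_concurrency=2.0, pure_miss_rate=0.05,
                    pure_avg_miss_penalty=80.0, miss_concurrency=4.0)
throughput = apc(concurrent)
```

With these made-up numbers, overlapping hits and misses lowers the effective access time well below the sequential AMAT, which is the effect the concurrency-aware models are meant to capture.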
Data access delay has become the prominent performance bottleneck of high-end computing systems. The key to reducing data access delay in system design is to diminish data stall time. Memory locality and concurrency are the two essential factors influencing the performance of modern memory systems. However, existing studies on reducing data stall time rarely focus on data access concurrency, because the impact of memory concurrency on overall memory system performance is not well understood. In this study, a pair of novel data stall time models is presented: the L-C model for the combined effect of locality and concurrency, and the P-M model for the effect of pure misses on data stall time. The models provide a new understanding of data access delay and new directions for performance optimization. Based on these models, a summary table of advanced cache optimizations is presented. It has 38 entries contributed by data concurrency but only 21 contributed by data locality, which shows the value of data concurrency. The L-C and P-M models and their associated results and opportunities are important for future data-centric architecture and algorithm design of modern computing systems.
Rotating machinery is important to industrial production. Any failure of rotating machinery, especially of rolling bearings, can lead to equipment shutdown and even more serious incidents. Accurate residual life prediction therefore plays a crucial role in guaranteeing machine operation safety and reliability and in reducing maintenance cost. To increase the forecasting precision of the remaining useful life (RUL) of rolling bearings, an approach combining the elastic net with a long short-term memory (LSTM) network is proposed, referred to as E-LSTM. The E-LSTM algorithm consists of an elastic net and an LSTM, and takes temporal-spatial correlation into consideration to forecast the RUL through the LSTM. To address over-fitting of the LSTM neural network during training, an elastic-net-based regularization term is introduced into the LSTM structure. In this way, the change of the output can be well characterized to express the bearing degradation mode. Experimental results on real-world data demonstrate that the proposed E-LSTM method obtains higher stability and relevant values that are useful for bearing RUL forecasting. These results also indicate that E-LSTM achieves better performance.
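The regularization idea can be sketched as a generic elastic net penalty, a weighted sum of the L1 and squared L2 norms of the weights, added to a training loss. How E-LSTM attaches this term to the LSTM structure is not described in the abstract, so the function names, the weighting coefficients, and the numbers below are hypothetical.

```python
def elastic_net_penalty(weights, l1_weight, l2_weight):
    """Generic elastic net penalty: l1 * sum|w| + l2 * sum(w^2)."""
    l1_term = sum(abs(w) for w in weights)
    l2_term = sum(w * w for w in weights)
    return l1_weight * l1_term + l2_weight * l2_term


def regularized_loss(data_loss, weights, l1_weight=0.1, l2_weight=0.01):
    """Training objective = data-fit loss + elastic net penalty."""
    return data_loss + elastic_net_penalty(weights, l1_weight, l2_weight)


# Hypothetical flattened network weights and data-fit loss.
weights = [1.0, -2.0, 0.5]
penalty = elastic_net_penalty(weights, l1_weight=0.1, l2_weight=0.01)
total = regularized_loss(data_loss=0.8, weights=weights)
```

The L1 part pushes small weights toward zero while the L2 part discourages large weights, which is why the combination is a common choice against over-fitting.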
The interception of Hypersonic Gliding Vehicles (HGVs) has been an important aspect of missile defense systems. To provide interceptors with accurate information on the target trajectory, a model based on an improved Long Short-Term Memory (LSTM) network for a trajectory prediction pipeline is proposed for the interception of a skip-gliding hypersonic target. First, for the trajectory prediction required by intercepting guidance laws, the altitude, velocity, and velocity direction of the target are formulated as analytic functions consisting of linear decay terms and amplitude-decaying sinusoidal terms. Then, the dynamic characteristics of the model parameters are analyzed, and the target trajectory prediction pipeline is proposed with the prediction error considered. Finally, an improved LSTM network is designed to estimate the parameters in a dynamically updated manner, and the estimation results are used to calculate the final trajectory prediction pipeline. The proposed prediction algorithm provides information on the velocity vector for midcourse guidance, with the effect of prediction errors on interception taken into account. Simulations are conducted, and the results show the high accuracy of the algorithm in HGV trajectory prediction, which is conducive to increasing the interception success rate.
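One way to picture an analytic function built from a linear decay term and an amplitude-decaying sinusoidal term is the sketch below. The abstract does not give the actual functional form, so the exponential amplitude decay, the function name, and every parameter value here are assumptions for illustration only.

```python
import math


def skip_glide_profile(t, offset, slope, amplitude, decay, omega, phase=0.0):
    """Assumed form: (offset - slope*t) + amplitude*exp(-decay*t)*sin(omega*t + phase)."""
    linear = offset - slope * t                      # linear decay term
    oscillation = (amplitude * math.exp(-decay * t)  # decaying amplitude
                   * math.sin(omega * t + phase))    # skip oscillation
    return linear + oscillation


# Hypothetical altitude profile (km) sampled every 50 s.
altitudes = [
    skip_glide_profile(t, offset=50.0, slope=0.02, amplitude=5.0,
                       decay=0.005, omega=0.05)
    for t in range(0, 300, 50)
]
```

In such a formulation the quantities to be estimated are the parameters (here `offset`, `slope`, `amplitude`, `decay`, `omega`, `phase`), which matches the abstract's description of a network that estimates model parameters and updates them dynamically.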
Funding (speaker separation abstract): This work is supported by the National Natural Science Foundation of China (NSFC) under Grant Nos. 61571106, 61501169, and 41706103, and by the Fundamental Research Funds for the Central Universities under Grant No. 2242013K30010.
Funding (antifuse OTP memory IP abstract): Work supported by the Second Stage of Brain Korea 21 Projects and by Changwon National University in 2009-2010.
Funding (gesture prediction abstract): Supported in part by the National Natural Science Foundation of China (NSFC) (U20A20200, 61861136009), in part by the Guangdong Basic and Applied Basic Research Foundation (2019B1515120076, 2020B1515120054), and in part by the Industrial Key Technologies R&D Program of Foshan (2020001006308).
Funding (datacenter interconnect abstract): This work was supported by the Strategic Priority Research Program of the Chinese Academy of Sciences under Grant No. XDA06010401, and the National Natural Science Foundation of China under Grant Nos. 61100010, 61402438, and 61402439.
Funding (continuum traffic model abstract): Jointly supported by the Foundation and Applied Research Funds Project of Guangdong, China (Project No. 2019A1515111200), the Youth Innovation Talents Funds of Colleges and Universities in Guangdong Province (Project Nos. 2018KQNCX287, 2019KTSCX008), the Science and Technology Program of Guangzhou, China (Project No. 201904010202), and the National Science Foundation of China (Project No. 61703165).
Funding (APC, C-AMAT, and LPM abstract): Supported in part by the U.S. National Science Foundation under Grant Nos. CCF-2008000, CNS-1730488, and CCF-2008907, and by the U.S. Department of Homeland Security under Grant No. 2017-ST-062-000002.
Funding (data stall time models abstract): The work was supported in part by the National Science Foundation of USA under Grant Nos. CNS-1162540, CCF-0937877, and CNS-0751200. We would like to thank the Scalable Computing Software (SCS) group at the Illinois Institute of Technology and the anonymous reviewers for their valuable and professional comments on earlier drafts of this work.
Funding (E-LSTM abstract): Supported by the National Natural Science Foundation of China (No. 61972443), the National Key Research and Development Plan Program of China (No. 2019YFE0105300), the Hunan Provincial Hu-Xiang Young Talents Project of China (No. 2018RS3095), and the Hunan Provincial Natural Science Foundation of China (No. 2020JJ5199).
Funding (HGV trajectory prediction abstract): Co-supported by the National Natural Science Foundation of China (No. 61427809).