21 articles found
A reordered first fit algorithm based novel storage scheme for parallel turbo decoder
1
Authors: 张乐, 贺翔, 徐友云, 罗汉文. Journal of Shanghai University (English Edition) [CAS], 2007, No. 4, pp. 380-384 (5 pages)
In this paper we discuss a novel storage scheme for simultaneous memory access in a parallel turbo decoder. The new scheme employs vertex coloring from graph theory. Compared to a similar method that also uses unnatural order in storage, our scheme requires 25 more memory blocks but allows a simpler configuration for variable code lengths that can be implemented on-chip. Experiments show that for a moderate to high decoding throughput (40-100 Mbps), the hardware cost is still affordable for the 3GPP (3rd Generation Partnership Project) interleaver.
Keywords: turbo codes; parallel turbo decoding; interleaver; vertex coloring; reordered first fit algorithm (RFFA); field-programmable gate array (FPGA)
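The abstract above casts conflict-free memory access as a vertex coloring problem. As a rough illustration only, the sketch below shows a plain first-fit greedy coloring; it is not the paper's reordered first fit algorithm, whose vertex ordering is not described in this listing. The function name `first_fit_coloring` and the toy conflict graph are assumptions made for the example.

```python
def first_fit_coloring(adjacency):
    """Give each vertex the smallest color unused by its already-colored neighbors."""
    colors = {}
    for vertex in adjacency:  # visiting order is the dict's insertion order
        used = {colors[n] for n in adjacency[vertex] if n in colors}
        color = 0
        while color in used:
            color += 1
        colors[vertex] = color
    return colors


# Hypothetical conflict graph: an edge joins two data addresses that are
# accessed in the same cycle and therefore must sit in different memory blocks.
conflicts = {
    0: [1, 2],
    1: [0, 2],
    2: [0, 1, 3],
    3: [2],
}
banks = first_fit_coloring(conflicts)  # color index = memory block index
```

The number of colors a greedy pass uses depends on the order in which vertices are visited, which is presumably why a reordering step matters in the scheme the abstract describes.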
BAR: a branch-alternation-resorting algorithm for locality exploration in graph processing
2
Authors: 邓军勇, WANG Junjie, JIANG Lin, XIE Xiaoyan, ZHOU Kai. High Technology Letters [EI, CAS], 2024, No. 1, pp. 31-42 (12 pages)
Unstructured and irregular graph data cause strong randomness and poor locality of data accesses in graph processing. This paper optimizes the depth-branch-resorting (DBR) algorithm and proposes a branch-alternation-resorting (BAR) algorithm. To run the algorithm in parallel and improve its efficiency, the BAR algorithm is mapped onto the reconfigurable array processor (APR-16) to achieve vertex reordering, effectively improving the locality of graph data. This paper validates the BAR algorithm on the GraphBIG framework by running breadth-first search (BFS), single-source shortest path (SSSP), and betweenness centrality (BC) traversals on datasets reordered with BAR. The results show that, compared with the DBR and Corder algorithms, BAR can reduce execution time by up to 33.00% and 51.00%, respectively. In terms of data movement, the BAR algorithm achieves a maximum reduction of 39.00% compared with the DBR algorithm and 29.66% compared with the Corder algorithm. In terms of computational complexity, the BAR algorithm achieves a maximum reduction of 32.56% compared with the DBR algorithm and 53.05% compared with the Corder algorithm.
Keywords: graph processing; vertex reordering; branch-alternation-resorting (BAR) algorithm; reconfigurable array processor
Efficient semi-quantum secret sharing protocol using single particles
3
Authors: 邢丁, 王艺霏, 窦钊, 李剑, 陈秀波, 李丽香. Chinese Physics B [SCIE, EI, CAS, CSCD], 2023, No. 7, pp. 273-278 (6 pages)
Semi-quantum secret sharing (SQSS) is a branch of quantum cryptography which only requires the dealer to have quantum capabilities, reducing the difficulty of protocol implementation. However, the efficiency of SQSS protocols still needs further study. In this paper, we propose a semi-quantum secret sharing protocol whose efficiency can approach 100% as the length of the message increases. The protocol is based on single particles to reduce the difficulty of resource preparation. Particle reordering, a simple but effective operation, is used in the protocol to improve efficiency and ensure security. Furthermore, our protocol can share specific secrets, while most SQSS protocols cannot. We also prove that the protocol is secure against common attacks.
Keywords: semi-quantum secret sharing; efficiency; single particles; specific secret; particle reordering
Chaos Identification Based on Component Reordering and Visibility Graph (Cited by: 1)
4
Authors: 朱胜利, 甘露. Chinese Physics Letters [SCIE, CAS, CSCD], 2017, No. 5, pp. 18-21 (4 pages)
Distinguishing chaotic systems from stochastic processes is not easy since they have numerous similarities. In this study, we propose a novel approach to distinguish between chaotic systems and stochastic processes based on the component reordering procedure and the visibility graph algorithm. It is found that time series and their reordered components show diverse characteristics in the 'visibility domain'. For chaotic series, there are huge differences between the degree distribution obtained from the original series and that obtained from the corresponding reordered component. For correlated stochastic series, there are only small differences between the two degree distributions. For uncorrelated stochastic series, there are slight differences between them. Based on this discovery, the well-known Kullback-Leibler divergence is used to quantify the difference between the two degree distributions and to distinguish between chaotic systems and correlated and uncorrelated stochastic processes. Moreover, one chaotic map, three chaotic systems, and three different stochastic processes are used to illustrate the feasibility and effectiveness of the proposed method. Numerical results show that the proposed method is not only effective in distinguishing between chaotic systems and correlated and uncorrelated stochastic processes, but also easy to operate.
Keywords: Chaos Identification Based on Component Reordering and Visibility Graph
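The pipeline in the abstract above (visibility graph, degree distribution, Kullback-Leibler divergence) can be sketched roughly as follows. This is a minimal sketch using the standard natural visibility criterion; the component reordering procedure itself is not specified in this listing, so a sorted copy of the series is used purely as a placeholder for it. All function names, the toy series, and the `eps` smoothing for degrees missing from the second distribution are assumptions.

```python
import math
from collections import Counter


def visibility_degrees(series):
    """Node degrees of the natural visibility graph of a time series."""
    n = len(series)
    degree = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            # i and j see each other if every sample between them lies
            # strictly below the straight line joining (i, y_i) and (j, y_j).
            visible = all(
                series[k] < series[j] + (series[i] - series[j]) * (j - k) / (j - i)
                for k in range(i + 1, j)
            )
            if visible:
                degree[i] += 1
                degree[j] += 1
    return degree


def degree_distribution(degrees):
    """Empirical probability of each degree value."""
    counts = Counter(degrees)
    return {k: c / len(degrees) for k, c in counts.items()}


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q); eps stands in for degrees that never occur in q (assumption)."""
    return sum(pk * math.log(pk / q.get(k, eps)) for k, pk in p.items())


original = [0.2, 0.9, 0.4, 0.7, 0.1, 0.8]  # toy data, not from the paper
placeholder_reordered = sorted(original)    # placeholder, NOT the paper's procedure
p = degree_distribution(visibility_degrees(original))
q = degree_distribution(visibility_degrees(placeholder_reordered))
divergence = kl_divergence(p, q)
```

The pairwise visibility check is cubic in the series length in the worst case, which is acceptable for a short illustration but not for long series.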
Robust scalable pre-compressed video coding based on MPEG4-FGS over the Internet
5
Authors: Zhang Fang, Xiao Song, Wu Chengke. Journal of Systems Engineering and Electronics [SCIE, EI, CSCD], 2005, No. 1, pp. 167-172 (6 pages)
Streaming video is becoming increasingly popular among Internet multimedia applications. A robust coding scheme for DCT-based scalable video streaming over the Internet is proposed in this paper. Compared with conventional MPEG4-FGS (fine granular scalable) and progressive FGS (PFGS), the proposed method generates a base layer that includes several sub-base layers by DCT coefficient reordering and VLC reshuffling, which enables the video stream to adapt itself to long-term time-varying channel bandwidth. Furthermore, a novel end-to-end transmission architecture for scalable video streaming over the Internet is also presented, in which an adaptive unequal packet loss protection (AUPLP) strategy is proposed to determine the currently available network bandwidth and adjust the sending rates according to different situations, such as network congestion or unreliable transmission. Experimental results show that the proposed progressive scalable scheme can improve the average coding efficiency by up to 1.2 dB compared with MPEG4-FGS and PFGS at lower bandwidth, and that the AUPLP strategy can improve the transmission performance not only of the proposed scheme but also of MPEG4-FGS and PFGS systems.
Keywords: DCT reordering; MPEG4-FGS; PFGS; AUPLP; streaming video
Dynamic graph exploration by interactively linked node-link diagrams and matrix visualizations
6
Authors: Michael Burch, Kiet Bennema ten Brinke, Adrien Castella, Ghassen Karray, Sebastiaan Peters, Vasil Shteriyanov, Rinse Vlasvinkel. Visual Computing for Industry, Biomedicine, and Art [EI], 2021, No. 1, pp. 219-232 (14 pages)
The visualization of dynamic graphs is a challenging task owing to the various properties of the underlying relational data and the additional time-varying property. For sparse and small graphs, the most efficient approach to such visualization is node-link diagrams, whereas for dense graphs with attached data, adjacency matrices might be the better choice. Because graphs can contain both properties, being globally sparse and locally dense, a combination of several visual metaphors as well as static and dynamic visualizations is beneficial. In this paper, a visually and algorithmically scalable approach that provides views and perspectives on graphs as interactively linked node-link and adjacency matrix visualizations is described. As the novelty of this technique, insights such as clusters or anomalies from one or several combined views can be used to influence the layout or reordering of the other views. Moreover, the importance of nodes and node groups can be detected, computed, and visualized by considering several layout and reordering properties in combination as well as different edge properties for the same set of nodes. As an additional feature set, an automatic identification of groups, clusters, and outliers is provided over time, and based on the visual outcome of the node-link and matrix visualizations, the repertoire of the supported layout and matrix reordering techniques is extended, and more interaction techniques are provided when considering the dynamics of the graph data. Finally, a small user experiment was conducted to investigate the usability of the proposed approach. The usefulness of the proposed tool is illustrated by applying it to graph datasets such as co-authorships, co-citations, and a Comprehensive Perl Archive Network distribution.
Keywords: dynamic graph visualization; node-link diagrams; adjacency matrices; layouts; reorderings
TCP-R with EPDN: Handling out of Order Packets in Error Prone Satellite Networks
7
Author: Arjuna SATHIASEELAN. International Journal of Communications, Network and System Sciences, 2009, No. 7, pp. 675-686 (12 pages)
Studies have shown that packet reordering is common, especially in satellite networks where there are link-level retransmissions and multipath routing. Moreover, traditional satellite networks exhibit high corruption rates that cause packet losses. Reordering and corruption of packets decrease the TCP performance of a network, mainly because they lead to overestimation of the congestion in the network. We consider satellite networks and analyze their performance when reordering and corruption of packets occur. We propose a solution that could significantly improve the performance of a satellite network under these conditions. We report results of our simulation experiments, which support this claim.
Keywords: TCP; packet reorder; throughput; drops; congestion; multipath; satellite
Optimization of High-Concurrency Conflict Issues in Execute-Order-Validate Blockchain
8
Authors: MA Qianli, ZHANG Shengli, WANG Taotao, YANG Qing, WANG Jigang. ZTE Communications, 2024, No. 2, pp. 19-29 (11 pages)
With the maturation and advancement of blockchain technology, a novel execute-order-validate (EOV) architecture has been proposed, allowing transactions to be executed in parallel during the execution phase. However, parallel execution may lead to multi-version concurrency control (MVCC) conflicts during the validation phase, resulting in transaction invalidation. Based on their different causes, we categorize conflicts in the EOV blockchain into two types, within-block conflicts and cross-block conflicts, and propose an optimization solution called FabricMan based on Fabric v2.4. For within-block conflicts, a reordering algorithm is designed to improve the transaction success rate, and parallel validation is implemented based on the transaction conflict graph. We also merge transfer transactions to avoid triggering multiple version checks. For cross-block conflicts, a cache-based version validation mechanism is implemented to detect and terminate invalid transactions in advance. Experimental comparisons are conducted between FabricMan and two other systems, Fabric and Fabric++. The results show that FabricMan outperforms the other two systems in terms of throughput, transaction abort rate, algorithm execution time, and other experimental metrics.
Keywords: blockchain; MVCC conflict; reordering; parallel validation; transaction merging
Memory access optimization for particle operations in computational fluid dynamics-discrete element method simulations
9
Authors: Deepthi Vaidhynathan, Hariswaran Sitaraman, Ray Grout, Thomas Hauser, Christine M. Hrenya, Jordan Musser. Particuology [SCIE, EI, CAS, CSCD], 2023, No. 7, pp. 97-110 (14 pages)
The Computational Fluid Dynamics-Discrete Element Method (CFD-DEM) is used to model gas-solid systems in several applications in the energy, pharmaceutical, and petrochemical industries. Computational performance bottlenecks often limit the problem sizes that can be simulated at industrial scale. The data structures used to store several million particles in such large-scale simulations have a large memory footprint that does not fit into the processor cache hierarchies on current high-performance-computing platforms, leading to reduced computational performance. This paper specifically addresses this aspect of memory access bottlenecks in industrial-scale simulations. The use of space-filling curves to improve memory access patterns is described, and their impact on computational performance is quantified in both shared and distributed memory parallelization paradigms. The Morton space-filling curve applied to uniform grids and k-dimensional tree partitions are used to reorder the particle data structure, thus improving spatial and temporal locality in memory. The performance impact of these techniques when applied to two benchmark problems, namely the homogeneous cooling system and a fluidized bed, is presented. These optimization techniques lead to an approximately two-fold performance improvement in particle-focused operations such as neighbor-list creation and data exchange, with an overall improvement of about 1.5 times in a fluidization simulation with 1.27 million particles.
Keywords: CFD-DEM; memory access optimization; spatial reordering; performance optimization
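To illustrate the Morton-curve reordering mentioned in the abstract above, the following minimal sketch bins particles into a uniform grid and sorts them by interleaved cell index. It is a generic Z-order sort under stated assumptions, not the paper's implementation: the names `interleave_bits` and `morton_order`, the 10-bit-per-axis limit, and the requirement that coordinates be non-negative are all choices made for the example.

```python
def interleave_bits(x, y, z, bits=10):
    """Interleave the low `bits` bits of three cell indices into one Morton code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i + 2)
    return code


def morton_order(positions, cell_size):
    """Return particle indices sorted along the Morton curve of a uniform grid.

    Assumes all coordinates are non-negative and fit in 2**10 cells per axis.
    """
    def key(index):
        ix, iy, iz = (int(c // cell_size) for c in positions[index])
        return interleave_bits(ix, iy, iz)

    return sorted(range(len(positions)), key=key)


# Toy particle positions (hypothetical data).
particles = [(1.5, 0.2, 0.1), (0.1, 0.1, 0.1), (0.2, 1.3, 0.0)]
order = morton_order(particles, cell_size=1.0)
reordered = [particles[i] for i in order]  # spatial neighbors end up close in memory
```

Applying the same permutation to every per-particle array (velocities, radii, and so on) keeps the data consistent after the sort.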
EXAFS study of cation reordering in NaYF_4:Yb^(3+),Tb^(3+) up-conversion luminescence materials (Cited by: 1)
10
Authors: H. F. Brito, J. Hölsä, T. Laamanen, T. Laihinen, M. Lastusaari, L. Pihlgren, L. C. V. Rodrigues, T. Soukka. Journal of Rare Earths [SCIE, EI, CAS, CSCD], 2014, No. 3, pp. 226-229 (4 pages)
The NaYF_4:Yb^(3+),Tb^(3+) (x_Yb: 0.20, x_Tb: 0.04) materials were prepared using the co-precipitation method. The as-prepared material was washed either with or without water in addition to ethanol and thereafter annealed for 5 h at 500 ℃. This resulted in materials with moderate or very high up-conversion luminescence intensity, respectively. The structural study carried out with X-ray powder diffraction revealed microstrains in the rare earth (R) sublattice that were relaxed for the material with very high up-conversion intensity, thus decreasing energy losses. The local structural details were investigated with R L_III and Y K edge extended X-ray absorption fine structure (EXAFS) using synchrotron radiation. Around 10 mol% of the Yb^(3+) ions were found to occupy the Na site in the material with very high up-conversion intensity. These Yb species formed clusters with the Tb^(3+) ions occupying the regular Na/R sites. Such clustering enhanced the energy transfer between Yb^(3+) and Tb^(3+), thus intensifying the up-conversion emission.
Keywords: EXAFS; up-conversion; ytterbium; terbium; cation reordering
Adaptive ordering and filament polymerization of cell cytoskeleton by tunable nanoarrays (Cited by: 1)
11
Authors: Jing Dai, Yuan Yao. Nano Research [SCIE, EI, CAS, CSCD], 2021, No. 3, pp. 620-627 (8 pages)
Matrix adaptation reconstructs the architecture of the cell and has important implications for proper biological functioning. However, how the network of cytoskeleton filaments undergoes reconstruction, reordering, and surface adaptation requires systematic investigation. Here, we show that surface sensing and adaptation occur together with a related reorganization of cytoskeleton filaments (actin, tubulin, and vimentin). The microstructure of the filament network is built by adaptive changes in the chemical polymerization of cytoskeleton filaments. The transition of cellular morphology, from a spheroidal architecture on a nanoarray to an extended structure with stress fibers on a flat surface, involves spatial reorganization and polymerization modulation of filaments. The dimensions of the filaments (diameter, orientation, and density) change according to the spatiotemporal distribution of the cytoskeleton network. In addition, our findings elucidate how cells can tune their architecture at the nanoscale by matrix adaptation, and provide novel information on the interplay between the cytoskeleton and pathophysiology.
Keywords: adaptation; cytoskeleton; polymerization; reordering
TCP-ACC: performance and analysis of an active congestion control algorithm for heterogeneous networks (Cited by: 1)
12
Authors: Jun ZHANG, Jiangtao WEN, Yuxing HAN. Frontiers of Computer Science [SCIE, EI, CSCD], 2017, No. 6, pp. 1061-1074 (14 pages)
Transmission control protocol (TCP) is a reliable transport layer protocol that has been widely used in the Internet for decades. However, the performance of existing TCP congestion control algorithms degrades severely in modern heterogeneous networks with random packet losses, packet reordering, and congestion. In this paper, we propose a novel TCP algorithm named TCP-ACC to handle all three challenges mentioned above. It integrates 1) a real-time reorder metric for calculating the probabilities of unnecessary Fast Retransmit (FRetran) and Timeouts (TO), 2) an improved RTT estimation algorithm giving more weight to packets that are sent (as opposed to received) more recently, and 3) an improved congestion control mechanism based on packet loss and reorder rate measurements. Theoretical analysis demonstrates that the equilibrium throughput of TCP-ACC is much higher than that of traditional TCP, while maintaining good fairness with regard to other TCP algorithms in ideal network conditions. Extensive experimental results using both network emulators and real networks show that the algorithm achieves significant throughput improvement in heterogeneous networks compared with other state-of-the-art algorithms.
Keywords: TCP; packet reordering; wireless networks; congestion control
The Case of Using Multiple Streams in Streaming (Cited by: 1)
13
Authors: Muhammad Abid Mughal, Hai-Xia Wang, Dong-Sheng Wang. International Journal of Automation and Computing [EI, CSCD], 2013, No. 6, pp. 587-596 (10 pages)
Off-chip replacement (capacity and conflict) misses and coherent read misses in a distributed shared memory system cause execution to stall for hundreds of cycles. These off-chip replacement and coherent read misses recur and form sequences of two or more misses called streams. Prior streaming techniques ignored reordering of misses and not-recently-accessed streams while streaming data. In this paper, we present a stream prefetcher design that can deal with both problems. Our stream prefetcher design uses stream waiting rooms to store not-recently-accessed streams. Stream waiting rooms help remove more off-chip misses. Using trace-based simulation, our stream prefetcher design can remove 8% to 66% (on average 40%) of replacement misses and 17% to 63% (on average 39%) of coherent read misses. Using cycle-accurate full-system simulation, our design gives speedups from 1.00 to 1.17 on Princeton Application Repository for Shared-Memory Computers (PARSEC) workloads running on a distributed shared memory system, with the exception of the dedup and swaptions workloads.
Keywords: prefetching; stream; first in first out (FIFO); Princeton Application Repository for Shared-Memory Computers (PARSEC); stream waiting rooms; reordering of misses; Sequitur
Dynamic robust optimal reorder point with uncertain lead time and changeable demand distribution
14
Authors: Masaki TAMURA, Kazuko MORIZAWA, Hiroyuki NAGASAWA. Journal of Zhejiang University-Science A (Applied Physics & Engineering) [SCIE, EI, CAS, CSCD], 2010, No. 12, pp. 938-945 (8 pages)
In fixed order quantity systems, uncertainty in lead time is expressed as a set of scenarios with occurrence probabilities, and the mean and variance of the demand distribution are assumed to change according to a known pattern. A new concept of a "dynamic robust optimal reorder point" is proposed in this paper, and its value is calculated as a "robust optimal reorder point function with respect to reorder time". Two approaches are employed to determine the dynamic optimal reorder point. The first is a shortage rate satisfaction approach and the second is a backorder cost minimization approach. The former aims to find, at each reorder time, the minimum reorder point that satisfies the condition that the cumulative distribution function (CDF) of the shortage rate under a given set of lead-time scenarios is greater than or equal to a basic CDF of the shortage rate predetermined by a decision-maker. In the latter approach, the CDF of closeness of the reorder point is defined at each reorder time to express how close it is to the optimal reorder points under the set of scenarios, and the dynamic optimal reorder point is defined according to stochastic ordering. Some numerical examples demonstrate the features of these dynamic robust optimal reorder points.
Keywords: reorder point; lead time; robust optimum; uncertainty; scenario
Ginix: Generalized Inverted Index for Keyword Search
15
Authors: Hao Wu, Guoliang Li, Lizhu Zhou. Tsinghua Science and Technology [SCIE, EI, CAS], 2013, No. 1, pp. 77-87 (11 pages)
Keyword search has become a ubiquitous method for users to access text data in the face of information explosion. Inverted lists are usually used to index underlying documents so that documents can be retrieved efficiently according to a set of keywords. Since inverted lists are usually large, many compression techniques have been proposed to reduce the storage space and disk I/O time. However, these techniques usually perform decompression operations on the fly, which increases the CPU time. This paper presents a more efficient index structure, the Generalized INverted IndeX (Ginix), which merges consecutive IDs in inverted lists into intervals to save storage space. With this index structure, more efficient algorithms can be devised to perform basic keyword search operations, i.e., the union and the intersection operations, by taking advantage of intervals. Specifically, these algorithms do not require conversions from interval lists back to ID lists. As a result, keyword search using Ginix can be more efficient than search using traditional inverted indices. The performance of Ginix is also improved by reordering the documents in datasets using two scalable algorithms. Experiments on the performance and scalability of Ginix on real datasets show that Ginix not only requires less storage space but also improves keyword search performance compared with traditional inverted indexes.
Keywords: keyword search; index compression; document reordering
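The interval idea described in the abstract above can be sketched as follows: consecutive document IDs collapse into closed intervals, and two interval lists are intersected directly without expanding them back to IDs. This is a minimal sketch of the general technique, not the algorithms from the paper; the names `to_intervals` and `intersect` and the sample posting lists are assumptions.

```python
def to_intervals(ids):
    """Merge a sorted list of distinct IDs into closed (lo, hi) intervals."""
    intervals = []
    for doc_id in ids:
        if intervals and doc_id == intervals[-1][1] + 1:
            intervals[-1][1] = doc_id  # extend the current run
        else:
            intervals.append([doc_id, doc_id])  # start a new run
    return [tuple(iv) for iv in intervals]


def intersect(a, b):
    """Intersect two sorted, non-overlapping interval lists with a two-pointer merge."""
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo <= hi:
            result.append((lo, hi))
        # Advance whichever interval ends first; it cannot overlap anything later.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return result


# Hypothetical posting lists for two keywords.
list_a = to_intervals([1, 2, 3, 5, 8, 9])
list_b = [(2, 6), (9, 12)]
common = intersect(list_a, list_b)
```

Intervals pay off most when document IDs cluster into long runs, which is consistent with the abstract's remark that reordering documents improves the index.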
An Effective Discrete Artificial Bee Colony Based SPARQL Query Path Optimization by Reordering Triples
16
Authors: Zeynep Banu Ozger, Nurgul Yuzbasioglu Uslu. Journal of Computer Science & Technology [SCIE, EI, CSCD], 2021, No. 2, pp. 445-462 (18 pages)
The Semantic Web has emerged to make web content machine-readable, and its importance has grown with the rapid increase in the number of web pages. Resource description framework (RDF) is a special data graph format in which Semantic Web data are stored, and it can be queried with the SPARQL query language. The challenge is to find the optimal query order that results in the shortest execution time. In this paper, the discrete Artificial Bee Colony (dABCSPARQL) algorithm is proposed, based on a novel heuristic approach, namely reordering SPARQL queries. The processing time of queries with different shapes and sizes is minimized using the dABCSPARQL algorithm. The performance of the proposed method is evaluated on chain, star, cyclic, and chain-star queries of different sizes from the Lehigh University Benchmark (LUBM) dataset. The results obtained by the proposed method are compared with those of the ARQ (a SPARQL processor for Jena) query engine and the Ant System, Elitist Ant System, and MAX-MIN Ant System algorithms. The experiments demonstrate that the proposed method significantly reduces the processing time, and in most queries, the reduction rate is higher than with the other optimization methods.
Keywords: artificial bee colony; resource description framework (RDF); query optimization; reordering triple pattern; SPARQL
Scan Cell Positioning for Boosting the Compression of Fan-Out Networks
17
Authors: Ozgur Sinanoglu, Mohammed Al-Mulla, Noora A. Shunaiber, Alex Orailoglu. Journal of Computer Science & Technology [SCIE, EI, CSCD], 2009, No. 5, pp. 939-948 (10 pages)
Ensuring high manufacturing test quality for an integrated electronic circuit mandates the application of a large-volume test set. Even if the test data fit into the memory of an external tester, the consequent increase in test application time translates into elevated production costs. Test data compression solutions have been proposed to address the test time and data volume problem by storing and delivering the test data in a compressed format and subsequently expanding the data on-chip. In this paper, we propose a scan cell positioning methodology that accompanies a compression technique in order to boost the compression ratio and squash the test data even further. While we present the application of the proposed approach in conjunction with the fan-out based decompression architecture, this approach can be extended for application along with other compression solutions as well. The experimental results also confirm the compression enhancement of the proposed methodology.
Keywords: scan-based testing; test data compression; scan cell reordering; scan architecture design
Revisiting the space-time gradient method: A time-clocking perspective, high order difference time discretization and comparison with the harmonic balance method
18
Authors: Boqian WANG, Dingxi WANG, Mohammad RAHMATI, Xiuquan HUANG. Chinese Journal of Aeronautics [SCIE, EI, CAS, CSCD], 2022, No. 11, pp. 45-58 (14 pages)
This paper revisits the Space-Time Gradient (STG) method, which was developed for efficient analysis of unsteady flows due to rotor-stator interaction, and presents the method from an alternative time-clocking perspective. The STG method requires reordering of blade passages according to their relative clocking positions with respect to blades of an adjacent blade row. As the space-clocking is linked to an equivalent time-clocking, the passage reordering can be performed according to the alternative time-clocking. With the time-clocking perspective, unsteady flow solutions from different passages of the same blade row are mapped to flow solutions of the same passage at different time instants or phase angles. Accordingly, the time derivative of the unsteady flow equation is discretized in time directly, which is more natural than transforming the time derivative to a spatial one as in the original STG method. To improve the solution accuracy, a ninth-order difference scheme has been investigated for discretizing the time derivative. To achieve a stable solution for the high order scheme, the implicit solution method of Lower-Upper Symmetric Gauss-Seidel/Gauss-Seidel (LU-SGS/GS) has been employed. The NASA Stage 35 and its blade-count-reduced variant are used to demonstrate the validity of the time-clocking based passage reordering and the advantages of the high order difference scheme for the STG method. Results from an existing harmonic balance flow solver are also provided to contrast the two methods in terms of solution stability and computational cost.
Keywords: harmonic balance method; high order difference scheme; passage reordering; space-time gradient method; unsteady flows
Optimizing non-coalesced memory access for irregular applications with GPU computing
19
Authors: Ran ZHENG, Yuan-dong LIU, Hai JIN. Frontiers of Information Technology & Electronic Engineering [SCIE, EI, CSCD], 2020, No. 9, pp. 1285-1301 (17 pages)
General purpose graphics processing units (GPGPUs) can be used to improve computing performance considerably for regular applications. However, irregular memory access exists in many applications, and the benefits of graphics processing units (GPUs) are less substantial for irregular applications. In recent years, several studies have presented solutions to remove static irregular memory access. However, eliminating dynamic irregular memory access with software remains a serious challenge. A pure software solution without hardware extensions or offline profiling is proposed to eliminate dynamic irregular memory access, especially indirect memory access. Data reordering and index redirection are suggested to reduce the number of memory transactions, thereby improving the performance of GPU kernels. To improve the efficiency of data reordering, the reordering operation is offloaded to a GPU to reduce overhead and data transfer. By concurrently executing the compute unified device architecture (CUDA) streams of data reordering and the data processing kernel, the overhead of data reordering can be reduced. After these optimizations, the volume of memory transactions can be reduced by 16.7%-50% compared with CUSPARSE-based benchmarks, and the performance of irregular kernels can be improved by 9.64%-34.9% using an NVIDIA Tesla P4 GPU.
Keywords: general purpose graphics processing units; memory coalescing; non-coalesced memory access; data reordering
ROCO: Using a Solid State Drive Cache to Improve the Performance of a Host-Aware Shingled Magnetic Recording Drive
20
Authors: Wen-Guo Liu, Ling-Fang Zeng, Dan Feng, Kenneth B. Kent. Journal of Computer Science & Technology [SCIE, EI, CSCD], 2019, No. 1, pp. 61-76 (16 pages)
Shingled magnetic recording (SMR) can effectively increase the capacity of hard disk drives (HDDs). Host-aware SMR (HA-SMR) is expected to be more popular than other SMR models because of its backward compatibility and new SMR-specific APIs. However, an HA-SMR drive often suffers performance degradation under write-intensive workloads because of frequent non-sequential writes buffered in the disk cache. The non-sequential writes mainly come from update writes, small random writes, and out-of-order writes. In this paper, we propose a hybrid storage system called ROCO, which aims to use a solid state drive (SSD) cache to improve the performance of an HA-SMR drive. ROCO reorders out-of-order writes belonging to the same zone and uses the SSD cache to absorb update writes and small random writes. We also design a data replacement algorithm called CREA for the SSD cache. CREA first conducts zone-oriented hot/cold data identification to identify cold-cached zones and hot-cached zones, and then evicts, with higher priority, data blocks belonging to colder zones that can be written sequentially or written through host-side read-modify-write operations. It gives the lowest priority to data blocks belonging to the hottest-cached zone that have to be written non-sequentially. Experimental results show that ROCO can effectively reduce non-sequential writes to the HA-SMR drive and improve the performance of the HA-SMR drive.
Keywords: solid state drive (SSD) cache; host-aware shingled magnetic recording (HA-SMR) drive; reordering; zone-oriented block hot/cold data identification; data replacement algorithm