This book is a revision of my Ph.D. dissertation, submitted to Carnegie Mellon University in 1987. It documents the research and results of the compiler technology developed for the Warp machine. Warp is a systolic array built out of custom, high-performance processors, each of which can execute up to 10 million floating-point operations per second (10 MFLOPS). Under the direction of H. T. Kung, the Warp machine matured from an academic, experimental prototype into a commercial product of General Electric. The Warp machine demonstrated that the scalable architecture of high-performance, programmable systolic arrays represents a practical, cost-effective solution to present and future computation-intensive applications. The success of Warp led to the follow-on iWarp project, a joint project with Intel to develop a single-chip 20 MFLOPS processor. The availability of the highly integrated iWarp processor will have a significant impact on parallel computing.
One of the major challenges in the development of Warp was to build an optimizing compiler for the machine. First, the processors in the array cooperate at a fine granularity of parallelism, so the interaction between processors must be considered when generating code for the individual processors. Second, the individual processors themselves derive their performance from a VLIW (Very Long Instruction Word) instruction set and a high degree of internal pipelining and parallelism. The compiler therefore contains optimizations pertaining to the array level of parallelism, as well as optimizations for the individual VLIW processors.
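The central cell-level technique the book develops for these VLIW processors is software pipelining (Chapter 5), which overlaps operations from different loop iterations so the cell's pipelined functional units stay busy. As a rough, hypothetical illustration only, not taken from the book or the W2 compiler, the following C sketch contrasts a straightforward loop with a hand-software-pipelined version whose steady-state body holds the store of iteration i, the multiply of iteration i+1, and the load of iteration i+2; the array size N, the function names, and the scaling constant are invented for the example, and N is assumed to be at least 2.

/* Minimal sketch (not from the book) of the idea behind software
 * pipelining: overlap the load, multiply, and store of different
 * iterations so a pipelined VLIW cell has independent work to do
 * every cycle. All names here are hypothetical examples. */
#include <stdio.h>

#define N 8   /* example size; the pipelined version assumes N >= 2 */

/* Straightforward loop: each iteration performs its load, multiply,
 * and store strictly in sequence, leaving functional units idle. */
static void scale_naive(const float *a, float *b, float k) {
    for (int i = 0; i < N; i++) {
        float t = a[i];   /* load     */
        t = t * k;        /* multiply */
        b[i] = t;         /* store    */
    }
}

/* Hand-software-pipelined version: the steady-state kernel performs
 * the store of iteration i, the multiply of iteration i+1, and the
 * load of iteration i+2, mimicking the prologue/kernel/epilogue
 * structure a software-pipelining scheduler would emit. */
static void scale_pipelined(const float *a, float *b, float k) {
    float loaded, multiplied;

    /* Prologue: start the first two iterations. */
    loaded = a[0];                 /* load i = 0     */
    multiplied = loaded * k;       /* multiply i = 0 */
    loaded = a[1];                 /* load i = 1     */

    /* Kernel: one store, one multiply, and one load per trip. */
    for (int i = 0; i < N - 2; i++) {
        b[i] = multiplied;         /* store i        */
        multiplied = loaded * k;   /* multiply i + 1 */
        loaded = a[i + 2];         /* load i + 2     */
    }

    /* Epilogue: drain the last two iterations. */
    b[N - 2] = multiplied;
    b[N - 1] = loaded * k;
}

int main(void) {
    float a[N] = {1, 2, 3, 4, 5, 6, 7, 8}, b[N], c[N];
    scale_naive(a, b, 2.0f);
    scale_pipelined(a, c, 2.0f);
    for (int i = 0; i < N; i++)
        printf("%g %g\n", b[i], c[i]);   /* the two versions agree */
    return 0;
}

On a VLIW cell, a software-pipelining scheduler would pack the three steady-state operations of the kernel into a single long instruction word and generate the prologue and epilogue automatically; the C version above only mimics that overlap at the source level.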
The information in the "Summary" section may refer to different editions of this title.
1. Introduction
1.1. Research approach
1.2. Overview of results
1.2.1. A machine abstraction for systolic arrays
1.2.2. Cell level optimizations
1.3. This presentation
2. Architecture of Warp
2.1. The architecture
2.1.1. Warp cell
2.1.2. Interface unit
2.1.3. The host system
2.2. Application domain of Warp
2.3. Programming complexity
3. A Machine Abstraction
3.1. Previous systolic array synthesis techniques
3.2. Comparisons of machine abstractions
3.2.1. Programmability
3.2.1.1. Partitioning methods
3.2.1.2. Programmability of synchronous models
3.2.2. Efficiency
3.3. Proposed abstraction: asynchronous communication
3.3.1. Effect of parallelism in cells
3.3.2. Scheduling constraints between receives and sends
3.3.2.1. The problem
3.3.2.2. The analysis
3.3.2.3. Implications
3.3.3. Discussion on the asynchronous communication model
3.4. Hardware and software support
3.4.1. Minimum hardware support: queues
3.4.2. Compile-time flow control
3.4.2.1. The skewed computation model
3.4.2.2. Algorithm to find the minimum skew
3.4.2.3. Hardware design
3.5. Chapter summary
4. The W2 Language and Compiler
4.1. The W2 language
4.2. Compiler overview
4.3. Scheduling a basic block
4.3.1. Problem definition
4.3.2. List scheduling
4.3.3. Ordering and priority function
5. Software Pipelining
5.1. Introduction to software pipelining
5.2. The scheduling problem
5.2.1. Scheduling constraints
5.2.2. Definition and complexity of problem
5.3. Scheduling algorithm
5.3.1. Bounds on the initiation interval
5.3.2. Scheduling an acyclic graph
5.3.3. Scheduling a cyclic graph
5.3.3.1. Combining strongly connected components
5.3.3.2. Scheduling a strongly connected component
5.3.3.3. Complete algorithm
5.4. Modulo variable expansion
5.5. Code size requirement
5.6. Comparison with previous work
5.6.1. The FPS compiler
5.6.2. The polycyclic machine
5.7. Chapter summary
6. Hierarchical Reduction
6.1. The iterative construct
6.2. The conditional construct
6.2.1. Branches taking different amounts of time
6.2.2. Code size
6.3. Global code motions
6.4. Comparison with previous work
6.4.1. Trace scheduling
6.4.1.1. Loop branches
6.4.1.2. Conditionals
6.4.2. Comparison with vector instructions
7. Evaluation
7.1. The experiment
7.1.1. Status of compiler
7.1.2. The programs
7.2. Performance analysis of global scheduling techniques
7.2.1. Speed up of global scheduling techniques
7.2.2. Efficiency of scheduler
7.2.2.1. Exclusive I/O time
7.2.2.2. Global resource use count
7.2.2.3. Data dependency
7.2.2.4. Other factors
7.2.3. Discussion on effectiveness of the Warp architecture
7.3. Performance of software pipelining
7.3.1. Characteristics of the loops
7.3.2. Effectiveness of software pipelining
7.3.3. Feasibility of software pipelining
7.4. Livermore Loops
7.5. Summary and discussion
8. Conclusions
8.1. Machine abstraction for systolic arrays
8.2. Code scheduling techniques
References
The information in the "About this book" section may refer to different editions of this title.
EUR 29.37 shipping from United Kingdom to U.S.A.
EUR 3.53 shipping within U.S.A.
From: Lucky's Textbooks, Dallas, TX, U.S.A.
Condition: New. Item code ABLIING23Mar2716030030171
Quantity: More than 20 available
From: BuchWeltWeit Ludwig Meier e.K., Bergisch Gladbach, Germany
Paperback. Condition: New. This item is printed on demand, so it takes 3-4 days longer. New stock. 228 pp. English. Item code 9781461289616
Quantity: 2 available
From: Ria Christie Collections, Uxbridge, United Kingdom
Condition: New. Item code ria9781461289616_new
Quantity: More than 20 available
From: AHA-BUCH GmbH, Einbeck, Germany
Paperback. Condition: New. Print on demand; printed after ordering. Item code 9781461289616
Quantity: 1 available
From: THE SAINT BOOKSTORE, Southport, United Kingdom
Paperback / softback. Condition: New. This item is printed on demand. New copy; usually dispatched within 5-9 working days. Item code C9781461289616
Quantity: More than 20 available
From: Mispah books, Redhill, Surrey, United Kingdom
Paperback. Condition: Like New. Item code ERICA77314612896106
Quantity: 1 available