Course Curriculum
- HPCA-Module-01 Introduction and Course Outline
- HPCA-Module-02 Performance
- HPCA-Module-03 Instruction Set Architecture
- HPCA-Module-04 MIPS ISA Processor Part-1
- HPCA-Module-04 MIPS ISA Processor Part-2
- HPCA-Module-05 Pipelining Introduction
- HPCA-Module-06 Instruction Pipelining
- HPCA-Module-07 Pipeline Hazards
- HPCA-Module-08 Data Hazards
- HPCA-Module-09 Software Pipelining
- HPCA-Module-10 In Quest of Higher ILP Part-1
- HPCA-Module-10 In Quest of Higher ILP Part-2
- HPCA-Module-11 Dynamic Instruction Scheduling Part-1
- HPCA-Module-11 Dynamic Instruction Scheduling Part-2
- HPCA-Module-12 Control Hazards
- HPCA-Module-13 Branch Prediction Part-1
- HPCA-Module-13 Branch Prediction Part-2
- HPCA-Module-14 Dynamic Instruction Scheduling with Branch Prediction
- HPCA-Module-15 Hardware Based Speculation
- HPCA-Module-16 Tutorial - I
- HPCA-Module-17 Hierarchical Memory Organization Part-1
- HPCA-Module-17 Hierarchical Memory Organization Part-2
- HPCA-Module-17 Hierarchical Memory Organization Part-3
- HPCA-Module-17 Hierarchical Memory Organization Part-4
- HPCA-Module-18 Cache Optimization Techniques Part-1
- HPCA-Module-18 Cache Optimization Techniques Part-2
- HPCA-Module-19 High Performance Computer Architecture
- HPCA-Module-20 Main Memory Optimizations
- HPCA-Module-21 Virtual Memory Part-1
- HPCA-Module-21 Virtual Memory Part-2
- HPCA-Module-22 Virtual Machines
- HPCA-Module-23 Storage Technology Part-1
- HPCA-Module-23 Storage Technology Part-2
- HPCA-Module-24 Case Studies Part-1
- HPCA-Module-24 Case Studies Part-2
- HPCA-Module-24 Case Studies Part-3
- HPCA-Module-25 Multithreading and Multiprocessing
- HPCA-Module-26 Simultaneous Multithreading
- HPCA-Module-27 Symmetric Multiprocessors
- HPCA-Module-28 Distributed Memory Multiprocessors
- HPCA-Module-29 Cluster, Grid and Cloud Computing
- What is the Historical perspective of Computers?
- What are the Five Generations of Electronic Computers?
- What are the Elements of Modern Computers and what is Instruction set architecture (ISA)?
- How does Computer Architecture differ from Computer Organization?
- What is Moore's Law and its interpretation?
- How to improve the performance of Processor?
- What is Thread-level Parallelism?
- What is Process-Level Parallelism and what are the objectives of this course?
- How to measure the performance?
- How to define performance in terms of Time and what is Execution Time?
- What is the Iron Law of Processor Performance?
- How to enhance Processor Performance and what is the example of MIPS?
- How to measure performance using benchmarks?
- What is Amdahl's Law?
- What is Instruction Set Architecture (ISA)?
- What are the various ISA design choices?
- How are different architectures compared?
- How does data transfer take place?
- What are the different Addressing Modes?
- What is the controversy of RISC/CISC, what are the features of a CISC and RISC processor?
- What are the different operands in MIPS?
- What are the different conventions for register usage?
- What are the various data types in MIPS?
- What are the various addressing modes in MIPS?
- What does the MIPS Instruction Set survey say?
- How to compare single and multi-clock cycle design?
- What is the Design summary and the basic abstract view of the data path?
- What is the data path for instruction fetching and R-type instruction and how to add different data paths?
- How to combine these data paths?
- What is the Truth Table for Main Control Unit?
- How to add the Unconditional Jump?
- What are the various Pipelining Instructions and its example?
- What is a Synchronous and Asynchronous Pipeline and what are the different Pipeline concepts?
- What is an Ideal Pipeline Speedup and what are the different types of Pipeline?
- What is a Pipelined Fixed Point Multiplier?
- What is a Pipelined Floating Point Adder?
- What is an Instruction Pipeline?
- How to implement Instruction Pipeline and what is the datapath for simple RISC?
- How to implement Pipelining in a RISC-like Processor and Execute?
- How to access memory and what is the CPI for the Multiple-Cycle Implementation?
- What is the CPI for the Multiple-Cycle Implementation and what are Pipeline Registers?
- How are Pipeline Registers depicted and why is Pipelining RISC Processors easy?
- How to add Pipeline Registers and what are the limits of Pipelining?
- What is Speedup and how does an optimal number of pipelines improve the performance?
- What are Pipeline Hazards?
- What is a Structural Hazard and what are the common methods to eliminate them?
- What is Data Hazard and what are Program and Data Dependences?
- How to detect Data Dependences and what are Name Dependences and its types?
- What is Control Dependence and what are the different aspects of Data Hazard?
- How are Data Hazards classified and what are the different techniques to reduce Data Hazards?
- What is Forwarding and Bypassing technique to reduce data hazard?
- What is Basic Compiler Pipeline Scheduling?
- What is the Loop program to implement Scheduling?
- What is Loop unrolling?
- What is Loop unrolling with Scheduling?
- What is the concept of Software Pipelining?
- How is Static loop unrolling illustrated using an example?
- What is Software Pipelined Code, what are the limitations of Scalar Pipelines and what are the two paths to Higher ILP?
- What is the need of Dynamic Instruction Scheduling?
- What is Dynamic Instruction Scheduling?
- What is a Higher ILP Processor?
- What are the two paths to Higher ILP and what is VLIW Processor?
- What is the basic VLIW approach?
- What are the different examples of VLIW and Transmeta's Crusoe Processor?
- What is Code Morphing Software and how is PC software translated to VLIW?
- How does Dynamic Software Execution take place and what are the different problems with VLIW?
- What is the Instruction pipeline cycle, how to classify ILP Machines and what are the limitations of Scalar Pipelines?
- What are the two Paths to higher ILP and what are the various drawbacks of VLIW?
- What are the Limits on ILP and what is the motivation behind Superscalar processor?
- What are the two Paths to higher ILP and what is the proposal for Superscalar processor?
- What is the Superpipelined Organization?
- How to classify ILP Machines, what is Superpipelined Performance and what are the limitations of Scalar Pipelines?
- What is the need of Dynamic Instruction Scheduling and how to design a Superscalar Pipeline?
- How does Dataflow Execution take place and what are the advantages of Dynamic Scheduling?
- What is Dynamic Instruction Scheduling and Scoreboarding?
- What is Instruction Parallelism and what are the implications of Scoreboard and four Stages of Scoreboard Control?
- What is the Detailed Scoreboard Pipeline Control and how to assess Scoreboarding?
- What is the example to illustrate Scoreboarding?
- What is Dynamic Scheduling and Summary of Scoreboard?
- What is Tomasulo's Algorithm and its example?
- How does Tomasulo's Algorithm differ from Scoreboarding and what is Tomasulo's Scheme?
- What are the key innovations in Dynamic Instruction Scheduling and what is Reservation station and Tomasulo's algorithm?
- What are the different stages in Tomasulo's Algorithm and its example?
- How to illustrate Tomasulo's algorithm with the help of an example?
- What are the advantages of Tomasulo's Scheme and its drawbacks?
- What are Control (Branch) Hazards?
- What is the Branch Penalty of 3 Cycles Stall and how to reduce Branch Penalty to 1 cycle and what are Control Instruction Statistics?
- How to Deal with Control Hazards?
- What are Delayed Branches?
- What is the example of Delayed Branches and how to schedule the Branch-Delay Slot?
- What is the performance for Different Alternatives and importance of Stall Reduction?
- What is the importance of Stall Reduction?
- What is Control Hazard and Branch Prediction?
- What is the limitation of 1-bit predictor and its example?
- What is a 2-bit Dynamic Branch Prediction Scheme, a 2-bit Predictor and its Prediction Accuracy?
- What is a Correlating Branch Predictor?
- What is the prediction accuracy of Correlating Predictor and its example?
- What is the example for 1-bit Predictor and (1,1) Correlating Predictor?
- What are the various Branch Prediction Schemes and what is a 1-bit and 2-bit Branch Predictor?
- What are Tournament Predictors?
- What Fraction of predictions comes from the local predictor and how is the performance comparison of the Predictors done?
- What is the need of Branch Target Buffers?
- What is Prediction and Address and what are Branch Target Buffers?
- How to combine Target and Prediction Buffers?
- What are Return Address Predictors and what are the misprediction rates for Different Sizes of Return Stack (SPEC CPU95) and what is Branch Folding?
- How are predictors used in Pentium processors and what is the overall summary of Dynamic Branch Prediction?
- What are the Data-Flow Architectures and what is Dynamic Instruction Scheduling?
- What is the example of Tomasulo's Loop and what are the possible hazards Due to out-of-order Execution?
- What is the example of Loop execution (part - 1)?
- What is the example of Loop execution (part - 2)?
- Why can Tomasulo's Scheme Overlap Iterations of Loops and what are its advantages and drawbacks?
- What is the Hardware-Based Speculation?
- How to add the speculation to Tomasulo's Scheme and what are the Exceptions, Interrupts and Major Changes Over Tomasulo's Scheme?
- What is the Support for Shadow Execution and what is Reorder Buffer?
- What are the four steps of Speculative Execution and how Tomasulo's algorithm with Reorder Buffer looks like?
- How to avoid Memory Hazards and its examples?
- What is the Multiple Issue without and with Speculation?
- What are the advantages of Speculation and how does it differ from Heat Dissipation?
- How are problems illustrated and solved (Part - 1)?
- How are problems illustrated and solved (Part - 2)?
- How are problems illustrated and solved (Part - 3)?
- How are problems illustrated and solved (Part - 4)?
- What is Von Neumann Computer Architecture?
- What are the Key Characteristics of Computer Memory Systems (Location, Capacity and Access Methods)?
- What are the Key Characteristics (Performance and Physical Type)?
- What are the Key Characteristics (Organization)?
- What are the Key Characteristics (Storage Capacity and Cost)?
- What is Hierarchical Memory Organization (Part - 1)?
- What is Hierarchical Memory Organization (Part - 2)?
- What are the basic principles of Cache Memory?
- What are the basic issues related to Cache Memory?
- What is Block Identification?
- What is Direct Mapping (Part - 1)?
- What is Direct Mapping (Part - 2)?
- What is Associative Mapping?
- What are Mapping Functions and what is Fully Associative Mapping?
- What is Set-Associative Mapping?
- How are Size of Tags and Associativity compared?
- What is the issue of replacement algorithms in Set-Associative Mapping?
- What is the issue of a write in Set-Associative Mapping?
- What is the issue of Block Size in Set-Associative Mapping?
- What is the issue of Block Size in Set-Associative Mapping and what is unified, split memory and caches?
- Can we look at the Case Study of the Alpha 21264 Cache?
- What is Alpha 21264 Data Cache, Memory System Performance and Average Memory Access Time (AMAT)?
- What is Cache Performance, its example and parameters?
- How to improve Cache Performance (Part - 1)?
- How to improve Cache Performance (Part - 2)?
- How to improve Cache Performance (Part - 3)?
- What is Virtual Cache and how to access Pipelined cache?
- How to reduce Miss rate and how to classify Cache Misses?
- What is 3Cs Absolute Miss Rate (SPEC92)?
- What are the insights of cache, what will be impact of Larger Cache, higher associativity and Larger block size?
- What are Compiler Optimizations, how to reduce Misses by Compiler Optimizations and what are the examples of Merging Arrays and Loop Interchange?
- What is Row-Major, Column-Major and example of Loop Interchange and Loop Fusion?
- What is the example of Blocking and Dense Matrix Multiplication?
- How to reduce Miss Penalty and what are the various definitions of Multi-Level Cache?
- How to compare Global and Local Miss Rates?
- How do Write Buffer and Victim Cache reduce Miss Penalty?
- How does the concept of Read Priority over Write on Miss reduce Miss Penalty?
- How to reduce Miss Penalty by Subblock Placement and what is Early start and Critical Word First?
- How to reduce Miss Penalty by Non-blocking Caches and what is the Cache performance for Out of Order Processors?
- What is the concept of Hardware, Software and Controlled Prefetching for reducing Miss Penalty?
- What is the overall summary of Cache Optimization?
- How is the main memory organized and what are the different types of Semiconductor Memories?
- What is Static RAM (SRAM) Cell?
- How is the SRAM chip organized and what are its Read/Write operations?
- What is Dynamic RAM (DRAM) Cell?
- How is the DRAM chip organized and what are its characteristics?
- What is an EPROM, layout of DRAM pin and what are the various Read Only Memories (ROM)?
- What is an EEPROM and Flash Memory?
- What is the DRAM Memory Gap or Latency and what is the impact of Higher Bandwidth and Wider Memory?
- How to reduce Miss Penalty and what is Interleaved Memory and Memory Bank?
- What is DIMM, what are the advanced DRAM Organizations and what is FPM and EDO DRAM?
- What is Synchronous DRAM (SDRAM), its example and what is Asynchronous DRAM timing?
- What is Dual Inline Memory Module (DIMM), RAMBUS DRAM (RDRAM) and what is the use of RDRAM?
- Why is Virtual Memory needed?
- What are the Motivations for Virtual Memory?
- What is a Cache for Disk and how does Cache Memory differ from Virtual Memory?
- What are the design issues in Virtual Memory Design and how is Virtual to Physical Address Mapping done?
- What is the concept of Address Mapping, how is an address translated via the Page Table and what is the operation of the Page Table?
- How is an address translated in a Paging System and what are Page Faults?
- How to service a Page Fault and what are Page Table Entries?
- How to make Address Translation Faster and what is the use of a Translation Lookaside Buffer (TLB)?
- What is TLB and Cache Operation and how to handle Page Faults and TLB misses?
- How to manage Memory?
- What is the Page Table Organization (Forward Mapped or Hierarchical Page Table)?
- What are the advantages of Two-level Page Table, what is an inverted Page Table and its structure?
- What is Segmentation and Segment Table and how does the address translation take place in it?
- What is Combined Paging and Segmentation and how does the address translation take place in it?
- What is the Memory Address Translation Mechanism in Pentium II and what is Fetch and Placement Policy?
- What are the Basic Replacement Algorithms?
- How is Protection done via Virtual Memory?
- What is Virtual Machine Monitor (VMM)?
- What are the advantages and disadvantages of VM and what must a VMM do?
- What is Processor Virtualization and ISA Support for VMs?
- What is the impact of Virtual Machines over Virtual Memory and what are Process Virtual Machines?
- What are Magnetic Disks?
- How is formatting of Magnetic Disks done and what are the disk areas?
- What is Constant Bit Density and how to Read/Write in a disk?
- What is Seek Time and Transfer Time?
- What is Average Disk Access Time and Locality?
- What are Optical Disks and how to read data from them?
- What is CD-ROM, DVD-ROM and Magnetic Tapes?
- What is Flash Memory and what are Solid State Disks?
- How to combine multiple disks together?
- What are RAIDs and what is RAID-0 level?
- What is RAID-1, RAID-2 and RAID-3 level?
- What is RAID-4 and RAID-5 level?
- What is RAID-6 and which RAID Level to choose?
- What is the Historical perspective of Intel Processors?
- What are the features of 80186/80188 and 80286 microprocessors?
- What are the features of 80386 and 80486 microprocessors and their pipelining?
- What are the features and specifications of Intel P5 and P6 family and what is Pentium and its different aspects?
- What are the various features and operations of Intel P6 family?
- What are the various features of Pentium Pro and Pentium II/III?
- What are the different features of Pentium 4 and what is Netburst Micro and Instruction Set Architecture?
- How are SSE and SSE2, Pentium III and Pentium IV compared?
- What are the various Caches?
- What is Branch Prediction and Branch Hints?
- What is the concept of Advanced Dynamic Execution?
- What is a System Bus, EPIC, IA-64 and Itanium?
- What is Pentium 2, 3, 4, EPIC and IA-64?
- What is Itanium and main ideas of EPIC and IA-64 Register model?
- What are Register Windows and IA-64 Micro-Architecture?
- What is the General Organization of the IA-64 Architecture, its instruction, Template Field and Stops?
- What is Branch Prediction, IA-64 Solution to Memory Latencies and Speculative Load example?
- What are the other features of IA-64, Itanium Pipeline, functional units and Itanium II?
- What is a Thread and Process-Level Parallelism and how is a process different from a thread?
- What are Single and Multithreaded Processes and How can Threads be Created?
- Can we look at the case for Processor Support for Thread-level Parallelism and its example?
- What is a Taxonomy of Parallel Architectures?
- How are MIMD Computers classified and what is shared and distributed memory?
- What is Multithreading within a Single Processor?
- How is Multithreading explained pictorially?
- What is Coarse-Grained Multithreading and its processors?
- What is Fine-Grained Multithreading and Simultaneous Multithreading (SMT)?
- What are the advantages of SMT and its issues?
- What is the block diagram of SMT, its model, caching and Performance Implications?
- How are UMA and NUMA Computers compared and what are the different SMP Organizations, their Pros and Cons?
- Why are Multicores needed and what are the Cache Organizations for Multicores and SMPs?
- What is Cache Coherence problem, its Possible Approaches and Solutions?
- What is Snooping protocol and its categories and Cache coherence problem?
- What is a Snoopy-Cache State Machine-I and Machine-II?
- What are the Limitations of SMPs and what is Directory-based Protocol?
- What is the Directory-Based Solution for NUMA computers?
- What is the State Transition Diagram for the Directory?
- What is a Directory State Machine and its example?
- How is communication overhead explained with the help of examples?
- What is Cluster Computing and motivation behind it?
- What are the components, configurations, their types, Pros and Cons?
- What are Storage Area Networks (SANs), Flat Neighbourhood Networks and what is Beowulf Cluster and its example?
- What are SMP Clusters and what is the concept of Grid Computing?
- What are Cluster Grids, Desktop Grids (SETI@home) and components of Grid Middleware?
- What is Cloud Computing, its benefits and segments?