Outline Pipeline, Pipelined datapath Dependences, Hazards Structural, Data - Stalling, Forwarding Control Hazards Branch prediction

Similar documents
Outline. Pipeline, Pipelined datapath Dependences, Hazards. Control Hazards Branch prediction. Structural, Data - Stalling, Forwarding

Consider executing this sequence of instructions in the pipeline:

Control Hazards. EE 457 Unit 6c. Option 1: Stalling. An Opening Example. Control Hazards

Computer Architecture ELEC2401 & ELEC3441

EE457 Lab 6 Design of a Pipelined CPU Lab 6 Part 4 Questions 10/29/2006. Lecture by Gandhi Puvvada. University of Southern California

Florida State University Libraries

Abstract. 1 Introduction

Chapter 8 Multioperand Addition

An Analysis of Correlation and Predictability: What Makes Two-Level Branch Predictors Work

Confidence Estimation for Speculation Control

Reducing Power Requirements of Instruction Scheduling Through Dynamic Allocation of Multiple Datapath Resources*

MultiCycle MIPS. Motivation

KINGS COLLEGE OF ENGINEERING DEPARTMENT OF MECHANICAL ENGINEERING QUESTION BANK. Subject Name: ELECTRONICS AND MICRIPROCESSORS UNIT I

Confidence Estimation for Speculation Control

Improving Branch Prediction by considering Affectors and Affectees Correlations

The Significance of Affectors and Affectees Correlations for Branch Prediction

ADVANCED VBA FOR PROJECT FINANCE Near Future Ltd. Registration no

CHAPTER 4 CONTENT LECTURE 1 November :28 AM

Sporting Capital Resource Sheet 1 1 Sporting Capital what is it and why is it important to sports policy and practice?

SNJB College of Engineering Department of Computer Engineering

Adaptive Mode Control: A Static-Power-Efficient Cache Design

Interrupts in Detail. A hardware and software look at interrupts

Instruction latencies and throughput for AMD and Intel x86 processors

Biosecurity Logical, Implementable Biosecurity Plans for Horseshows

AUTOMATED DEBUGGING. Session Leader: Zongxu Mu

Post-Silicon Bug Diagnosis with Inconsistent Executions

Section 2 8. Section 2. Instructions How To Get The Most From This Program Dr. Larry Van Such.

40 GbE and 100 GbE PCS Considerations Key Questions to be Answered concerning OTN mapping for MLD (CTBI) architecture

Institutional Review Boards and Human Subjects Protection

Genes and Inheritance (11-12)

Logic Model. When do you create a Logic Model? Create your LM at the beginning of an evaluation and review and update it periodically.

Copyright Strengthworks International Publishing. All rights are reserved. Updated

Cnvlutin: Ineffectual-Neuron-Free Deep Neural Network Computing

Fully Automated IFA Processor LIS User Manual

Trio Motion Technology Ltd. Shannon Way, Tewkesbury, Gloucestershire. GL20 8ND United Kingdom Tel: +44 (0) Fax: +44 (0)

GLP-1 Initiation and Follow Up. Version Date Summary of Changes 1.0 April 2013 This is a new template

GENETICS - CLUTCH CH.2 MENDEL'S LAWS OF INHERITANCE.

Professional learning: Helping children who are experiencing mental health difficulties Topic 1: Understanding mental health. Leadership team guide

Dräger Alcotest 8610

Rational Agents (Ch. 2)

Appendix B. Nodulus Observer XT Instructional Guide. 1. Setting up your project p. 2. a. Observation p. 2. b. Subjects, behaviors and coding p.

Dräger Alcotest Hard protective case Robust and weatherproof. All case components power on when the case is opened.

Understanding Models (chapter 1) Information (chapter 2)

Automatic Fault Tree Derivation from Little-JIL Process Definitions

Intelligent Agents. Soleymani. Artificial Intelligence: A Modern Approach, Chapter 2

Diagnosis via Model Based Reasoning

AUDIT OUTLINE INFORMATION SUMMARY

System. Humeral Nail. Surgical Technique

Speed Accuracy Trade-Off

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Midterm, 2016

A Better World for Women: Moving Forward

Understanding myself and others. Evaluation questions

TG100 Implementation Guide - FTA

COPING WITH POTS RESULTS FROM SURVEY. Georgina Hardy

Transfusion Medicine Kris0ne Kra1s, M.D.

- Conduct effective follow up visits when missing children return home ensuring intelligence is shared with appropriate partners.

OECD QSAR Toolbox v.4.2. An example illustrating RAAF scenario 6 and related assessment elements

The Glenn Garcelon Foundation Sponsorship Package

3/6/2017-6/15/2017 Permission to Take Part in a Human Research Study Page 1 of 6

Enhancing LTP-Driven Cache Management Using Reuse Distance Information

An Energy Efficient Sensor for Thyroid Monitoring through the IoT

Pocket Reference Chart

TIMS Lab 1: Phase-Locked Loop

WINSTA-R. Distal Radius System

MRSH-MEM: Approximate Matching on Raw Memory Dumps

A SAS sy Study of ediary Data

Rational Agents (Ch. 2)

SUB-REGIONAL SEMINAR: THE ROLE OF RESEARCH IN

Revision notes 7.5. Section 7.5 Health promotion

Foundations of software engineering

Implications of Longitudinal Data in Machine Learning for Medicine and Epidemiology

Introduction to HACCP for the Agri Feed/ Food Supply chain

worry free diabetes care Ian Holmes [Manager], Tang Zhang [Designer], Albert Chen (hselin) [Documentation]

UK Biobank. Arterial Pulse-Wave Velocity. Version th April 2011

Design Methodology for Fine-Grained Leakage Control in MTCMOS

Review of Appropriate Adult provision for vulnerable adults

DEPARTMENT OF ANIMAL HEALTH TECHNOLOGY COURSE OUTLINE AH 343 DIAGNOSTIC IMAGING

ACAPMA Tobacco Compliance Webinar powered by ACAPMAcademy. This webinar will commence at 11:00am ESDT, thank you for joining us

Man-Environment Relationship the elementatary concept in the HUMAN BEHAVIOUR IN THE BUILT ENVIRONMENT

Progress Report. Flood Hazard Mapping in Thailand

FM SYSTEMS. with the FONIX 6500-CX Hearing Aid Analyzer. (Requires software version 4.20 or above) FRYE ELECTRONICS, INC.

Timing-Based Delay Test for Screening Small Delay Defects

GlucCell TM SYSTEM USER S GUIDE Ver 2.1 CELL CULTURE GLUCOSE METER. Important Information. Intended Use. Caution. About the System

DATE 2006 Session 5B: Timing and Noise Analysis

Single Axis Revision Knee System

Connectivity Issues Impacting Patient Safety

Voice Switch Manual. InvoTek, Inc Riverview Drive Alma, AR USA (479)

5. Low-Power OpAmps. Francesc Serra Graells

Title: Biopsychology Specification: The divisions of the nervous system: central and peripheral (somatic and autonomic). SAMPLE

A BIST Scheme for Testing and Repair of Multi-Mode Power Switches

Excerpts from Eat, Drink, Heal, by Dr. Gregory A. Buford

Integrating the Patient Perspective Into Value Frameworks

HTLV Serology Quality Assessment Program Summary for Panel HTLVSER 2017Apr19

PFL780 PCB Fault Locator

OneTouch Reveal Web Application. User Manual for Healthcare Professionals Instructions for Use

Reactive agents and perceptual ambiguity

Section 32: BIMM Institute Student Disciplinary Procedure

Point of Care testing refers to all laboratory testing that is done outside of the walls of the clinical laboratory in the proximity of the patient.

HIV Navigation Services (HNS)

LogiCORE IP Fast Simplex Link (FSL) V20 Bus (v2.11c)

Transcription:

Control Hazard

Outline Pipeline, Pipelined datapath Dependences, Hazards Structural, Data - Stalling, Forwarding Control Hazards Branch prediction

Control Hazard Arise from the pipelining of branches and other instructions that change the PC

Control Hazard Arise from the pipelining of branches and other instructions that change the PC Also called Branch Hazards

Branch Evaluation

beq beq $1, $1, $3, $3, 28 28 and and $12, $2, $2, $5 $5 or or $13, $6, $6, $2 $2 add add $14, $2, $2, $2 $2...... lw lw $4, $4, 50($7) Control Hazard

beq Control Hazard beq beq $1, $1, $3, $3, 28 28 and and $12, $2, $2, $5 $5 or or $13, $6, $6, $2 $2 add add $14, $2, $2, $2 $2...... lw Time lw $4, $4, 50($7) (clock cycles) 1 2 3 4 5 6 7 8 9

Control Hazard beq beq $1, $1, $3, $3, 28 28 and and $12, $2, $2, $5 $5 or or $13, $6, $6, $2 $2 add add $14, $2, $2, $2 $2...... lw Time lw $4, $4, 50($7) (clock cycles) 1 2 3 4 5 6 7 8 9 beq IF and

Control Hazard beq beq $1, $1, $3, $3, 28 28 and and $12, $2, $2, $5 $5 or or $13, $6, $6, $2 $2 add add $14, $2, $2, $2 $2...... lw Time lw $4, $4, 50($7) (clock cycles) 1 2 3 4 5 6 7 8 9 beq IF ID EX and or IF ID IF

beq beq $1, $1, $3, $3, 28 28 and and $12, $2, $2, $5 $5 or or $13, $6, $6, $2 $2 add add $14, $2, $2, $2 $2 Control Hazard Branch Taken...... lw Time lw $4, $4, 50($7) (clock cycles) 1 2 3 4 5 6 7 8 9 beq IF ID EX and or IF ID IF

beq beq $1, $1, $3, $3, 28 28 and and $12, $2, $2, $5 $5 or or $13, $6, $6, $2 $2 add add $14, $2, $2, $2 $2 Control Hazard NPC Updates...... lw Time lw $4, $4, 50($7) (clock cycles) 1 2 3 4 5 6 7 8 9 beq and IF ID EX MEM IF ID EX or add IF ID IF

beq beq $1, $1, $3, $3, 28 28 and and $12, $2, $2, $5 $5 or or $13, $6, $6, $2 $2 add add $14, $2, $2, $2 $2 Control Hazard NPC Updates...... lw Time lw $4, $4, 50($7) (clock cycles) 1 2 3 4 5 6 7 8 9 beq and IF ID EX MEM IF ID EX or add IF ID IF Insert NOP NOP

Control Hazard beq beq $1, $1, $3, $3, 28 28 and and $12, $2, $2, $5 $5 or or $13, $6, $6, $2 $2 Branch add add $14, $2, $2, $2 target $2 enters the the pipeline...... lw Time lw $4, $4, 50($7) (clock cycles) 1 2 3 4 5 6 7 8 9 beq and or IF ID EX MEM WB IF ID EX MEM IF ID EX add IF ID Target: lw IF

Control Hazard beq beq $1, $1, $3, $3, 28 28 and and $12, $2, $2, $5 $5 or or $13, $6, $6, $2 $2 add add $14, $2, $2, $2 $2...... lw Time lw $4, $4, 50($7) (clock cycles) 1 2 3 4 5 6 7 8 9 beq and or add IF ID EX MEM WB IF ID EX MEM WB IF ID EX MEM IF ID EX Target: lw IF ID Target+1 IF

Control Hazard beq beq $1, $1, $3, $3, 28 28 and and $12, $2, $2, $5 $5 or or $13, $6, $6, $2 $2 add add $14, $2, $2, $2 $2...... lw Time lw $4, $4, 50($7) (clock cycles) 1 2 3 4 5 6 7 8 9 beq IF ID EX MEM WB and IF ID EX MEM WB or IF ID EX MEM WB add IF ID EX MEM WB Target: lw IF ID EX MEM WB Target+1 IF ID EX MEM

Control Hazard beq beq $1, $1, $3, $3, 28 28 and and $12, $2, $2, $5 $5 or or $13, $6, $6, $2 $2 add add $14, $2, $2, $2 $2...... lw Time lw $4, $4, 50($7) (clock cycles) 1 2 3 4 5 6 7 8 9 Target: beq and or add lw IF ID EX MEM WB IF ID EX MEM WB IF ID EX MEM WB IF ID EX MEM WB IF ID EX MEM WB Branch Penalty = 3 cycles

Control Hazard Branch evaluation occurs in EX stage. PC updates in MEM stage. Branch target loads in the next cycle The delay in determining the proper instruction to fetch is called a Control Hazard or Branch Hazard.

Reducing Pipeline Branch Penalties

Reducing Pipeline Branch Penalties Freeze the pipeline

Reducing Pipeline Branch Penalties Freeze the pipeline Time 1 2 3 4 5 6 (clock cycles) 7 8 9 beq nop nop nop lw IF ID EX MEM WB IF ID EX MEM WB IF ID EX MEM WB IF ID EX MEM WB IF ID EX MEM WB

Reducing Branch Delay

Reducing Branch Delay Most branches rely on simple tests (equality or sign) Move the branch decision up

Reducing Branch Delay Most branches rely on simple tests (equality or sign) Move the branch decision up Compute NPC earlier Evaluate branch at the earliest

Reducing Branch Delay

NPC Reducing Branch Delay

Reducing Branch Delay NPC Branch Delay: Delay between Branch instruction fetch and Branch Target fetch of of a taken branch.

Branch Delay ADD BEQ SUB... R2, R3,R4 R2, R4, loop R5, R5,R4 Time 1 2 3 4 5 6 (clock cycles) 7 8 9 10 11 ADD BEQ IF ID IF

Branch Delay ADD BEQ SUB... R2, R3,R4 R2, R4, loop R5, R5,R4 Time 1 2 3 4 5 6 (clock cycles) 7 8 9 10 11 ADD IF ID EX BEQ IF ID SUB IF

Branch Delay ADD BEQ SUB... R2, R3,R4 R2, R4, loop R5, R5,R4 Time 1 2 3 4 5 6 (clock cycles) 7 8 9 10 11 ADD IF ID EX BEQ IF ID SUB IF PC = PC + Offset

Branch Delay ADD BEQ SUB... R2, R3,R4 R2, R4, loop R5, R5,R4 Time 1 2 3 4 5 6 (clock cycles) 7 8 9 10 11 ADD IF ID EX MEM BEQ IF ID EX SUB IF NOP ID Branch Target IF

Branch Delay ADD BEQ SUB... R2, R3,R4 R2, R4, loop R5, R5,R4 Time 1 2 3 4 5 6 (clock cycles) 7 8 9 10 11 ADD IF ID EX MEM WB BEQ IF ID EX MEM SUB IF NOP ID NOP EX Branch Target IF ID

Branch Delay ADD BEQ SUB... R2, R3,R4 R2, R4, loop R5, R5,R4 Time 1 2 3 4 5 6 (clock cycles) 7 8 9 10 11 ADD IF ID EX MEM WB BEQ IF ID EX MEM WB SUB IF NOP ID NOP EX MEM NOP NOP WB Branch Target IF ID EX MEM WB

Branch Delay Slot Time 1 2 3 4 5 6 (clock cycles) 7 8 9 Branch IF ID EX MEM WB Instruction i + 1 Branch Target Branch Target + 1 IF ID EX MEM WB IF ID EX MEM WB IF ID EX MEM WB Branch Delay Slot

Branch Hazards Time 1 2 3 4 5 6 (clock cycles) 7 8 9 Branch IF ID EX MEM WB Instruction i + 1 Branch Target Branch Target + 1 IF ID EX MEM WB IF ID EX MEM WB IF ID EX MEM WB Branch Target IF IF ID EX MEM WB 1 stall cycle for every branch yields a performance loss of 10% to 30%!

Example Branch Taken

Branch Evaluation in ID Steps Are beq inputs ready?

Branch Evaluation in ID Steps Decode the instruction Decide whether to bypass/forward to the equality unit Forwarding Logic has to be modfied Complete the comparison If branch taken, set the PC to the branch target address.

Branch Evaluation Forwarding New forwarding logic required Source operands can come from ALU/MEM or MEM/WB If source operands are not ready, data hazard can occur and a stall will be needed....... add add $1, $1, $2, $2, $2 $2 beq beq $1, $1, $3, $3, 28 28......

Reducing Pipeline Branch Penalties Freeze the pipeline Static Prediction Predict Untaken, Predict Taken Delayed Branch Fill Branch Delay Slot

Predict Untaken Scheme Time (clock cycles) 1 2 3 4 5 6 7 8 9 Untaken Branch IF ID EX MEM WB Instruction i + 1 Instruction i + 2 Instruction i + 3 IF ID EX MEM WB IF ID EX MEM WB IF ID EX MEM WB Time (clock cycles) 1 2 3 4 5 6 7 8 9 Taken Branch IF ID EX MEM WB Instruction i + 1 Branch Target Branch Target + 1 IF ID EX MEM WB IF ID EX MEM WB IF ID EX MEM WB

Reducing Pipeline Branch Penalties beq $1, $3, 28 and and $12, $12, $2, $2, $5 $5 or or $13, $13, $6, $6, $2 $2 add add $14, $14, $2, $2, $2 $2 lw lw $4, $4, 50($7) 50($7) Predict Taken beq $1, $3, 28

Reducing Pipeline Branch Penalties Predict Taken Great if prediction is correct beq beq $1, $1, $3, $3, 28 28 and and $12, $12, $2, $2, $5 $5 or or $13, $13, $6, $6, $2 $2 add add $14, $14, $2, $2, $2 $2 lw lw $4, $4, 50($7) 50($7)

Reducing Pipeline Branch Penalties Predict Taken Great if prediction is correct. Penalty on misprediction only. Time 1 2 3 4 5 6 (clock cycles) 7 8 9 beq beq $1, $1, $3, $3, 28 28 and and $12, $12, $2, $2, $5 $5 or or $13, $13, $6, $6, $2 $2 add add $14, $14, $2, $2, $2 $2 lw lw $4, $4, 50($7) 50($7) beq lw...... and IF ID EX MEM WB IF ID EX MEM WB IF ID EX MEM WB IF ID EX MEM WB IF ID EX MEM WB

Reducing Pipeline Branch Penalties sll... srl srl $14, $14, $2, $2, $2 $2 beq beq $1, $1, $3, $3, 28 28 and and $12, $12, $2, $2, $5 $5...... Fill the branch delay slot sll...

Reducing Pipeline Branch Penalties Fill the branch delay slot sll sll...... srl srl $14, $14, $2, $2, $2 $2 beq beq $1, $1, $3, $3, 28 28 and and $12, $12, $2, $2, $5 $5...... Time 1 2 3 4 5 6 (clock cycles) 7 8 9 beq srl IF ID EX MEM WB IF ID EX MEM WB branch target IF ID EX MEM WB

Branch Delay Slot Predict taken Predict untaken

Static Branch Prediction

Dynamic Branch Prediction 0x0100 0x0100 while(1) {{...... for(i=0; i<count; i++) i++) {{ }} BRANCH BRANCH }}

Dynamic Branch Prediction 0x0100 0x0100 while(1) {{...... for(i=0; i<count; i++) i++) {{ }} BRANCH BRANCH }} BRANCH TARGET BUFFER PC Prediction Target 0x0100 1 0x128 0x0154 0 0x080 0x0210 1 0x300... 1 Addresses of branches in the program

Dynamic Branch Prediction Branch Target Buffer Single bit predictors (1-bit bimodal predictor) Change prediction with branch behaviour No. of wrong predictions? 0x0100 0x0100 while(1) {{...... for(i=0; i<count; i++) i++) {{ }} BRANCH BRANCH }} BRANCH TARGET BUFFER PC Prediction Target 0x0100 1 0x128 0x0154 0 0x080 0x0210 1 0x300... 1 Addresses of branches in the program

Dynamic Branch Prediction Branch Target Buffer Single bit predictors (1-bit bimodal predictor) Change prediction with branch behaviour No. of wrong predictions? 0x0100 0x0100 while(1) {{...... for(i=0; i<count; i++) i++) {{ }} BRANCH BRANCH }} Branch instruction behaviour T T T T N T T T T T T T T T T T T Wrong Predictions BRANCH TARGET BUFFER PC Prediction Target 0x0100 1 0x128 0x0154 0 0x080 0x0210 1 0x300... 1 Addresses of branches in the program

Dynamic Branch Prediction Branch Target Buffer Single bit predictors (1-bit bimodal predictor) Change prediction with branch behaviour No. of wrong predictions? 0x0100 0x0100 while(1) {{...... for(i=0; i<count; i++) i++) {{ }} BRANCH BRANCH }} Branch instruction behaviour T T T T N T T T T T T T T T T T T Wrong Predictions BRANCH TARGET BUFFER PC Prediction Target 0x0100 1 0x128 0x0154 0 0x080 0x0210 1 0x300... 1 Addresses of branches in the program Can Can we we do do better?

Dynamic Branch Prediction 0010 Branch Prediction Bits 00 11 10 11 11 00 00 00

Dynamic Branch Prediction 2-bit predictors Branch Prediction Bits 2-bit Bimodal Saturating Counter 00 0010 11 10 11 11 00 00 00

Dynamic Branch Prediction 2-bit predictors Branch Prediction Bits 2-bit Bimodal Saturating Counter 00 0010 11 10 11 11 00 00 00

Dynamic Branch Prediction 2-bit predictors... T T T N T T T T T T T T...

Dynamic Branch Prediction 2-bit predictors... T T T N T T T T T T T T...

Dynamic Branch Prediction 2-bit predictors... T T T N T T T T T T T T...

Dynamic Branch Prediction 2-bit predictors... T T T N T T T T T T T T...

Dynamic Branch Prediction 2-bit predictors... T T T N T T T T T T T T...

Dynamic Branch Prediction 2-bit predictors... T T T N T T T T T T T T...

Dynamic Branch Prediction 2-bit predictors T T T N T T T T T T T T...

Dynamic Branch Prediction n-bit saturating counters

Branch Predictors Without Branch Predictor With With Branch Predictor

Observations? Dynamic Branch Prediction

Correlating Branch Predictors eqntott code code if (aa == 2) aa = 0; if (bb == 2) bb = 0; if (aa!=bb) {

Correlating Branch Predictors eqntott code code if (aa == 2) aa = 0; if (bb == 2) bb = 0; if (aa!=bb) { Outcome of the previous branch X 0010 Branch Prediction Buffer 00 11 10 11 11 00 00 01

Correlating Branch Predictors eqntott code code if (aa == 2) aa = 0; if (bb == 2) bb = 0; if (aa!=bb) { Two-level predictors (1,2) predictor Outcome of the previous branch X 0010 Branch Prediction Buffer 00 11 10 11 11 00 00 01

Correlating Branch Predictors eqntott code code if (aa == 2) aa = 0; if (bb == 2) bb = 0; if (aa!=bb) { Two-level predictors (1,2) predictor Outcome of the previous branch X 0010 Branch Prediction Buffer 00 11 10 11 11 00 00 01 Yeh and Patt, Alternative implementations of two-level adaptive branch prediction, ISCA, 1992.

Correlating Branch Predictors if (aa == 2) aa = 0; Outcomes of the previous 2 branches Branch Prediction Buffer if (bb == 2) bb = 0; if (aa!=bb) { XX 0010 A 4096 bit, (2,2) buffer supports how many branch instructions? 00 11 10 11 11 00 00 01 Yeh and Patt, Alternative implementations of two-level adaptive branch prediction, ISCA, 1992.

Correlating Branch Predictors if (aa == 2) aa = 0; Outcomes of the previous 2 branches Branch Prediction Buffer if (bb == 2) bb = 0; if (aa!=bb) { XX 0010 A 4096 bit, (2,2) buffer supports how many branch instructions? 00 11 10 11 11 00 00 01 (m,n) BPB bits=2 m n No. of prediction entries Yeh and Patt, Alternative implementations of two-level adaptive branch prediction, ISCA, 1992.

Correlating Branch Predictors Yeh and Patt, ISCA, 1992.

Local Predictors Previous outcomes of the same branch Branch Prediction Buffer XXXXXXXXXX 00 11............ 00 01 1K entries

Local Predictors A history of branch behaviour is recorded Previous outcomes of the same branch Branch Prediction Buffer XXXXXXXXXX 00 11............ 00 01 1K entries

Local Predictors A history of branch behaviour is recorded One for each possible combination of outcomes for the last n occurrences of this branch Previous outcomes of the same branch Branch Prediction Buffer XXXXXXXXXX 00 11............ 00 01 1K entries

Local Predictors Example 0x0100 0x0100 while(1) {{...... for(i=0; i<count; i++) i++) {{ }} BRANCH BRANCH }} Branch instruction behaviour T T N T T T T N T T T T N T T 1 1 0 1 1 1 1 0 1 1 1 1 0 1 1

Local Predictors Example 0x0100 0x0100 while(1) {{...... for(i=0; i<count; i++) i++) {{ }} BRANCH BRANCH }} Branch instruction behaviour T T N T T T T N T T T T N T T 1 1 0 1 1 1 1 0 1 1 1 1 0 1 1 Local History 0 1 1 1 1 1 1 1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1 1 1 Prediction NT NT T T T T T T T T NT NT

Tournament Predictors

Tournament Predictors Branch PC PC Local predictor Global predictor Predictor Selector Prediction

Tournament Predictors Use multiple predictors: Global, local or mix Branch PC PC Local predictor Global predictor Predictor Selector Prediction

Tournament Predictors Use multiple predictors: Global, local or mix Combine them with a selector 2 bit saturating counter to select the right predictor for the branch (global vs. local) Branch PC PC Local predictor Global predictor Predictor Selector Prediction

Tournament Predictors

Outline Pipeline, Pipelined datapath Dependences, Hazards Structural, Data - Stalling, Forwarding Control Hazards Branch prediction