Technical Report GIT-CERCS The Sleepy Keeper Approach: Methodology, Layout and Power Results for a 4-bit Adder


Technical Report GIT-CERCS-06-03
The Sleepy Keeper Approach: Methodology, Layout and Power Results for a 4-bit Adder
Se Hun Kim, Vincent J. Mooney III and Jun Cheol Park
Center for Research on Embedded Systems and Technology
School of Electrical and Computer Engineering
Georgia Institute of Technology, Atlanta, Georgia, U.S.A.
29 March 2006

1. Introduction

This technical report explains a new approach to low leakage power Very Large Scale Integration (VLSI) design; we name the new approach sleepy keeper. This report first introduces previous approaches to reduce leakage power consumption and then explains the methodology and findings regarding the sleepy keeper approach. The scope of this report includes test procedures with schematics and layouts for all considered approaches as well as test results such as data on delay plus dynamic and static power. The sleepy keeper results are compared with the previous approaches.

2. Base Case

All layouts and schematics are designed using the North Carolina State University (NCSU) [10] design kit targeting the Taiwan Semiconductor Manufacturing Company (TSMC) 0.18μm process [18]. Transistor sizes are specified as a ratio of Width / Length (W/L). The smallest possible transistor for the TSMC 0.18μm process has a width of 270nm and a length of 180nm, resulting in a ratio of W/L = 270nm / 180nm = 1.5. This ratio of W/L = 1.5 indicates the smallest feasible transistor size throughout this report.

This report evaluates all considered approaches using a 4-bit adder as a test case. The base case for this test circuit is a basic Complementary Metal Oxide Semiconductor (CMOS) implementation [13]. In all approaches, transistors are placed in-between two parallel rows of continuous VDD and GND. For the base case, the 4-bit adder is implemented by using the full adder shown in Figure 1 (repeated here, for convenience, from Figure A.1.a of Appendix A). In Figure 1, a and b are the two inputs, c is the carry input, and Carry and Sum are the outputs. Figure 1 also shows the transistor sizing.

3. Prior Static Current Reduction Approaches

In order to compare with the sleepy keeper approach, this section explains several previous leakage reduction approaches: transistor stacking [4][5], source gating via sleep transistors [1][6], selective source gating via alternating sleep transistors (the so-called zigzag approach) [7], and a combination of the stack and sleep approaches called sleepy stack [2][3].

Figure 1. A 1-bit adder schematic for base case

3.1 Stack

For the stack approach, every transistor in the base case network is duplicated, with both original and duplicate bearing half the original transistor width as shown in Figure 2. Duplicated transistors cause a slight reverse bias between the gate and source when both transistors are turned off. Because subthreshold current is exponentially dependent on gate bias, a substantial current reduction is obtained [4]. Since all transistors are placed in-between two parallel rows of continuous VDD and GND, a stack approach design forces an increase in row length because of an increase in the number of transistors and a decrease in transistor width.

3.2 Sleep

For the sleep approach, transistors gating VDD and GND are added to the base case [1][6]. The added transistors cut off the supply of power when in sleep mode. Each added transistor is referred to as a sleep transistor and takes the width of the largest transistor in the base case. As shown in Figure 3, a PMOS sleep transistor is placed between VDD and the pull-up network, and an NMOS sleep transistor is placed between GND and the pull-down network. The sleep transistors are driven by the Sleep (S) and complementary Sleep (S') signals. Note that the transistor widths in Figure 2 are set to show equal resistances, and the transistor widths in Figure 3 are set based on the widths shown in Figure 2.

Figure 2. Stack approach.
Figure 3. Sleep approach.

The sleep transistors disconnect the circuit from VDD and GND when the logic circuit is not in use (i.e., when in sleep mode). By isolating logic circuitry using sleep transistors, the approach reduces subthreshold leakage current but unfortunately also loses state. In addition, time and energy for waking up are necessary. Also, the additional transistors as well as the wires for S and S' require an increase in area. Finally, subthreshold leakage current can further be reduced by utilizing high threshold voltage (high-Vth) sleep transistors.

3.3 Zigzag

By placement of alternating sleep transistors based on which particular network (pull-up or pull-down) is off given a specific input vector, the zigzag approach reduces the wake-up overhead delay caused by sleep transistors [7]. For example in Figure 4, if the output is 1 when the input is asserted to a particular value, then a sleep transistor is placed in the associated pull-down network; if the output is 0, then a sleep transistor is placed in the associated pull-up network. In order to evaluate this approach, the result of static power dissipation for all zero inputs is chosen for comparison with other approaches because reset input values are typically all zeros in most cases. In addition, subthreshold leakage can further be reduced by using high-Vth sleep transistors. The reduced number of sleep transistors in this zigzag approach results in a smaller increase in area than by using the sleep approach.
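The zigzag placement rule can be illustrated with a minimal Python sketch. It assumes the logic value at each gate output is already known for the input vector held during sleep; the function names and the three-inverter example are hypothetical and are not taken from the report.

```python
def zigzag_sleep_placement(gate_outputs):
    """Return, per gate, the network that receives a sleep transistor.

    gate_outputs maps a gate name to the logic value (0 or 1) that the gate
    drives under the input vector held in sleep mode (hypothetical interface).
    """
    placement = {}
    for gate, value in gate_outputs.items():
        if value not in (0, 1):
            raise ValueError(f"{gate}: expected 0 or 1, got {value!r}")
        # Output 1 means the pull-down network is off, so it is the one gated;
        # output 0 means the pull-up network is off, so it is gated instead.
        placement[gate] = "pull-down" if value == 1 else "pull-up"
    return placement


def inverter_chain_outputs(chain_input, stages):
    """Logic value at the output of each inverter in a simple chain."""
    outputs = {}
    value = chain_input
    for index in range(stages):
        value = 1 - value
        outputs[f"inv{index}"] = value
    return outputs


if __name__ == "__main__":
    # All-zero (reset) input applied to a three-inverter chain.
    chain = inverter_chain_outputs(0, 3)
    for gate, network in zigzag_sleep_placement(chain).items():
        print(gate, "->", network)
```

For the inverter chain the placement alternates between pull-down and pull-up from stage to stage, which is the pattern that gives the approach its name.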

3.4 Sleepy Stack

The sleepy stack approach has a structure combining the stack and sleep approaches by dividing every transistor into two transistors of half width and placing a sleep transistor in parallel with one of the divided transistors [2][3]. As shown in Figure 5, sleep transistors are placed in parallel to the divided transistor closest to VDD for pull-up and in parallel to the divided transistor closest to GND for pull-down. The sleepy stack approach can have the advantages of both the stack approach and the sleep approach. During active mode, the sleepy stack approach results in lower delay than the stack approach because sleep transistors placed in parallel (i) reduce resistance and (ii) are already on. When sleep transistors are turned off, the existence of a path from either VDD or GND prevents a floating output. Also, leakage current can further be reduced by applying high-Vth on sleep transistors and the transistors in parallel to the sleep transistors (e.g., the slightly shaded/colored transistors in Figure 5). However, the area penalty is a significant matter since every transistor is replaced by three transistors and since additional wires are added for S and S', which are sleep signals.

Figure 4. Zigzag approach.
Figure 5. Sleepy-stack approach.

We briefly summarized several prior static current reduction approaches in this section. We will mention these approaches again as we motivate and explain our new approach in the following sections.

4. Motivation

Leakage power consists mainly of subthreshold leakage and gate-oxide leakage. A potential solution widely reported for gate-oxide leakage power is the possible use of high-k (high dielectric constant) gate insulators [16]. Currently, subthreshold leakage power seems to be the majority contributor to total leakage power [17]. In any case, this technical paper targets reduction of the subthreshold leakage component of static power consumption; other approaches (most likely orthogonal to what we propose here in this paper) should be considered for reduction of gate-oxide leakage. Do please note, however, that all results reported in this paper include all sources of leakage power (to the extent that the HSPICE models we use accurately model sources of leakage).

With application of dual Vth techniques, the sleep, zigzag and sleepy stack approaches result in orders of magnitude subthreshold leakage power reduction [3]. The major advantage of the sleepy stack approach (see previous section) over the sleep and zigzag approaches is that the sleepy stack approach saves exact logic state. However, the sleepy stack approach carries a nontrivial penalty: each transistor in the original, base case, traditional CMOS design results in three transistors in the sleepy stack equivalent. The goal of our new approach is to achieve the benefits of the sleepy stack approach without the large associated penalties due to the tripled transistor count.

One final comment about motivation is that we assume proper logic design and timing for transition to sleep mode for sleepy keeper VLSI circuits. In particular, we assume that there is a small delay (perhaps a few clock cycles of a gigahertz clock) between the final computation in active mode and the transition to sleep mode. This allows the transition to sleep mode to only require that existing logic state/values be maintained. Finally, we further assume that transition from sleep mode back to active mode also has a few clock cycles of delay between turning sleep transistors back on and beginning to actively calculate new logic values (i.e., beginning to change state again).

5. New Static Current Reduction Approach: Sleepy Keeper

In this section we will describe the new VLSI approach to leakage power reduction proposed in this technical report: the "sleepy keeper" approach. First, we will discuss the structure of the sleepy keeper approach and how it operates. Then, we explain how layouts for the sleepy keeper approach are created.
The basic problem with traditional CMOS is that the transistors are used only in their most efficient, and naturally inverting, way: namely, PMOS transistors connect to VDD and NMOS transistors connect to GND. It is well known that PMOS transistors are not efficient at passing GND; similarly, it is well known that NMOS transistors are not efficient at passing VDD. However, to maintain a value of 1 in sleep mode, given that the 1 value has already been calculated, the sleepy keeper approach uses this output value of 1 and an NMOS transistor connected to VDD to maintain an output value equal to 1 when in sleep mode. For example, when the output is 1 for an inverter designed utilizing the sleepy keeper approach, the current path is shown in Figure 6. Similarly, to maintain a value of 0 in sleep mode, given that the 0 value has already been calculated, the sleepy keeper approach uses this output value of 0 and a PMOS transistor connected to GND to maintain an output value equal to 0 when in sleep mode. For example, when the output is 0 for an inverter implemented using the sleepy keeper approach, the current path is shown in Figure 7.

Figure 6. Inverter for sleepy keeper approach (output = 1)
Figure 7. Inverter for sleepy keeper approach (output = 0)

For this sleepy keeper approach to work, all that is needed is for the NMOS connected to VDD and the PMOS connected to GND to be able to maintain proper logic state. This seems likely to be possible as other researchers have described ways to use far lower VDD values to maintain logic state. For example, Flautner et al. propose some significantly reduced VDD values sufficient to maintain state [14]. In any case, we do not investigate beyond the use of HSPICE [8] simulations all of the possible side effects due to using PMOS transistors to connect to GND and NMOS transistors to connect to VDD. Instead, we assume that the HSPICE simulations are roughly accurate and report results based on HSPICE.

Consider Figure 8. Note that there is a sleepy keeper PMOS transistor connecting GND to the pull-down network. When in sleep mode, this PMOS transistor is the only source of GND since the sleep transistor is off. On the other hand, in Figure 8, there is an additional single NMOS transistor connecting VDD to the pull-up network. During sleep mode, this NMOS transistor is the only source of VDD, which is the dual case of the PMOS transistor case explained above.

Figure 8. Sleepy keeper approach structure.

We wish to emphasize here that, as explained at the end of Section 4, we emphatically do not use sleepy keeper transistors (the NMOS connected to VDD and the PMOS connected to GND) to dynamically change the output voltage but instead only use them to maintain an already calculated output voltage. Specifically, only from a few clock cycles after entering sleep to a few clock cycles prior to exiting sleep do the sleepy keeper transistors act as the sole connection to keep the output voltage unchanged.

6. Experimental Methodology

In this section, we explain our experimental methods. First, we describe how we create layouts and schematics in preparation for HSPICE simulation. Second, we explain how we obtain estimated results for delay, power consumption and area.

6.1 Layouts, Schematics and HSPICE

Schematics and layouts are designed for all considered design approaches. Schematics are used to obtain netlists corresponding to the test circuit, and the netlists are used to simulate and test performance for the Berkeley Predictive Technology Model (BPTM) [11][12] 0.18, 0.13, 0.10, and 0.07μm processes and the TSMC 0.18μm process using HSPICE. Layouts are used to measure and predict area usage. The estimation procedure we use is summarized in Figure 10.

Figure 10. Experimental methodology. (Flow: the NCSU CDK for TSMC 0.18μm is used to produce layouts and schematics; layouts feed area estimation; schematics feed HSPICE simulation with the TSMC 0.18μm and BPTM 0.18, 0.13, 0.10 and 0.07μm models, which yields power and delay estimation.)

We create schematics of 4-bit adders for all considered approaches using Cadence Virtuoso Schematic Editor [9]. We extract netlists from the schematics by using Cadence Virtuoso Analog Environment [9]. For example, the schematic of an inverter with the sleepy keeper approach is shown in Figure 11. Since the schematics are designed for the TSMC 0.18μm process, the netlists do not exactly match the BPTM processes because the netlists include the library and parameters for the TSMC 0.18μm process (e.g., see Example 1 below). Since modification of netlists and performing HSPICE simulation include many repetitions of the same or similar procedure, we use an automatic system which generates template netlists and performs HSPICE simulation for the BPTM 0.18, 0.13, 0.10 and 0.07μm processes and the TSMC 0.18μm process. The template netlists are modified from the original extracted netlist as needed so that the netlists can be used for all considered technologies. Some Perl scripts are used to make template netlists and run the HSPICE simulations for different technologies. Note that a variety of programming languages can be used to implement this automatic system. Example 1 shows the process of generating a sample template netlist.
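The placeholder-filling step of such an automatic system can be sketched in a few lines. The report's own scripts are written in Perl and are not reproduced in the report; the Python below is only an illustrative stand-in. The `#name#` placeholder style and the include paths follow the template netlist of Figure 13, while the function name `fill_template` and the example values (`0.07`, `static`, `0`, `bptm`) are assumptions made for this sketch.

```python
import re

# Template lines modeled on the .include lines of the template netlist
# in Figure 13; #name# marks a slot filled in per technology and test.
TEMPLATE = "\n".join([
    ".include /home/hspice/parameters/parameters_#size#u.sp",
    '.include "/home/hspice/test_vectors/adder_#testtype##input#.sp"',
    '.include "/home/hspice/berkeley_models/#library#_#size#u.sp"',
    '.include "/home/hspice/reports/adder_#testtype#_report.sp"',
])

PLACEHOLDER = re.compile(r"#(\w+)#")


def fill_template(template, values):
    """Replace every #name# placeholder with values[name]."""
    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for placeholder #{name}#")
        return str(values[name])

    return PLACEHOLDER.sub(substitute, template)


if __name__ == "__main__":
    # Hypothetical values; the report does not list the strings it used.
    settings = {"size": "0.07", "testtype": "static", "input": "0",
                "library": "bptm"}
    print(fill_template(TEMPLATE, settings))
```

Raising an error on an unknown placeholder, rather than leaving it in place, is a design choice of this sketch: a netlist with an unfilled slot would otherwise only fail later, inside the simulator.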

Figure 11. Schematic capture of an inverter with sleepy keeper approach.

* # FILE NAME: /HOME/SYNTHESIS/CADENCE/SIMULATION/4ADDER_CHAIN/
* HSPICES/SCHEMATIC/NETLIST/4ADDER_CHAIN.C.RAW
* NETLIST OUTPUT FOR HSPICES
MN2 VDD! A_INV NET28 VDD! TSMC18DN L=180E-9 W=1.08E-6 AD=486E-15 AS=486E-15
+PD=3.06E-6 PS=3.06E-6 M=1
MN1 A_INV A NET9 0 TSMC18DN L=180E-9 W=540E-9 AD=243E-15 AS=243E-15
+PD=1.98E-6 PS=1.98E-6 M=1
....
.lib "/ncsu/cadence/local/models/hspice/public/publicmodel/tsmc18dp" PMOS
.lib "/ncsu/cadence/local/models/hspice/public/publicmodel/tsmc18dn" NMOS
.END

Figure 12. An example raw netlist (TSMC 0.18μm process).

Example 1: Figure 12 shows a sample netlist extracted from the schematic of Figure 11. This netlist includes some information, such as references to the TSMC 0.18μm process (e.g., TSMC18DP), which is not proper content for BPTM processes. First, we keep all listed connections from this netlist. In Figure 12, the line MN2 VDD! A_INV NET28 VDD! TSMC18DN L=180E-9 W=1.08E-6 AD=486E-15 AS=486E-15 +PD=3.06E-6 PS=3.06E-6 M=1 defines an NMOS transistor named MN2 with its drain connected to node VDD!, gate to A_INV, source to NET28 and bulk to node VDD!. Second, in our template netlists, the variable length is used for length (L), and the variable pwidth is used for width (W) for different technologies (e.g., see the 'length' and 'pwidth' expressions in Figure 13). For different technologies, we define proper values for the variables. Lastly, appropriate parameters, test vectors, libraries and reports (e.g., the #...# placeholders in Figure 13) are fed into the template netlists for the HSPICE simulations for different technologies.

* autogenerated netlist file for adder
* based on Keeper-4adder_test.sp
.include /home/hspice/parameters/parameters_#size#u.sp
MN2 VDD! A_INV NET28 VDD! NMOSH L='length' W='1.5*pwidth'
MN1 A_INV A NET9 0 NMOS L='length' W='4*pwidth'
....
.global vdd!
.include "/home/hspice/test_vectors/adder_#testtype##input#.sp"
.include "/home/hspice/berkeley_models/#library#_#size#u.sp"
.include "/home/hspice/reports/adder_#testtype#_report.sp"
.end

Figure 13. An example template netlist.

All six considered approaches are evaluated for performance by using a single threshold voltage (Vth) for all transistors. Dual Vth technology is applied and tested only for the sleep, zigzag, sleepy stack, and sleepy keeper approaches since applying high-Vth to the base case and the stack approach causes a dramatic increase of delay (at least 2-5X). For both single Vth and dual Vth techniques, delay, dynamic power consumption and static power consumption are measured by using HSPICE. In order to measure performance of sleep, zigzag, sleepy stack, and sleepy keeper with dual Vth values, every sleep transistor and any transistor parallel to the sleep transistor are configured as high-Vth transistors. The high-Vth is set to be 2.0 times the Vth of a normal transistor (low-Vth). The Delvto option of HSPICE is used to change Vth. In order to distinguish the two different Vth values, NMOSH or PMOSH is used to indicate high-Vth and NMOS or PMOS is used for low-Vth. Figure 12 shows an example of the PMOS case.

6.2 Delay

Worst case propagation delay is measured for each approach. Input vectors and input/output triggers are chosen to measure the delay of the critical path. The propagation delay is measured from the trigger input edge reaching 50% of the supply voltage to the circuit output edge reaching 50% of the supply voltage value.

6.3 Static Power

Static power is measured by asserting sets of input vectors in HSPICE. The input vectors include subsets of possible input combinations. The average power dissipation over the specific subset of input combinations chosen is determined as the static power for the base case, stack, sleep, sleepy stack and sleepy keeper techniques. All sleep transistors are turned off for the HSPICE measurements. As mentioned in the section on static current reduction approaches (Section 3.3), static power for the zigzag approach is determined to be the power dissipation of the tested result for all-zero inputs (reset input values).

6.4 Dynamic Power

In order to measure dynamic power, clocked semi-random input vectors are asserted for a number of clock cycles, and the average power dissipation during this time reported by HSPICE is considered as an estimation of dynamic power consumption. All sleep transistors are turned on for HSPICE measurements.

6.5 Area

Layouts of a 1-bit full adder for all the considered approaches are designed based on the TSMC 0.18μm process by using Cadence Virtuoso Layout Editor [9] and the NCSU Cadence Design Kit. Layouts are verified with Virtuoso's Design Rule Checker (DRC). Areas for technologies below 0.18μm are estimated by scaling the area of each approach layout designed based on the TSMC 0.18μm process. The areas are scaled by a ratio of squares with the addition of 10% overhead for nonlinear scaling layers (i.e., metal layers). For example, if an area of 100.00μm² is measured for 0.18μm technology, the area for 0.10μm technology would be 100.00μm² * (0.10² / 0.18²) * 1.1 = 33.95μm².

6.6 Equations used for comparison

When we compare our results to another result, we often say one is less than the other. In particular, "X is n% less than Y" means what Eq. 1 shows:

$$\frac{Y}{X} = 1 + \frac{n}{100} \qquad \text{(Eq. 1) [19]}$$

For example, when two propagation delay measurements result in X = 8.18E-10s and Y = 1.23E-09s, n is 50 from calculation using Eq. 1. In this case, we say X is 50% less delay than Y. This equation is used for all other comparisons such as area and power consumption.

7. Test Circuit: 4-bit adder

We use a full adder as an example of a typical complex CMOS gate. Our 4-bit adder is implemented by using four 1-bit full adders. A 1-bit full adder is created from four logic blocks: one block to generate an inverted Carry out (Cout'), one block to generate an inverted Sum (Sum') and two inverters, as shown in Figure 13. The complex blocks are sized to have an equal rise and fall time. Appendix A.1.a shows the sizing for the base case. Indeed, please see Appendix A for exact transistor sizing for all considered VLSI approaches.

Figure 13. Network of complex gates and inverters composing a 1-bit full adder.

a. Delay

The critical path of our base case 4-bit adder is the path B0 → Cout0 → Cin1 → Cout1 → Cin2 → Cout2 → Cin3 → Sum3. In order to measure the worst case propagation delay, initial input signals are set as shown in Figure 14. When B0 is changed to 1, the delay is measured from B0 to Sum3.

Figure 14. Inputs of 4-bit adder for critical path delay
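The structure of the test circuit and its critical path can be checked with a small behavioral model. This is a sketch under stated assumptions: the Boolean equations for the inverted-carry and inverted-sum blocks are the common static CMOS full adder form and are assumed here rather than read from the report's schematics, and the initial input values (A = 1111, B = 0000, Cin0 = 0) are one choice that makes a change on B0 ripple to Sum3; the actual values of Figure 14 are not recoverable from the text.

```python
def full_adder(a, b, cin):
    """1-bit full adder built from inverted-carry and inverted-sum blocks."""
    # Complex gate 1: inverted carry out (assumed equation).
    cout_n = 1 - ((a & b) | (cin & (a | b)))
    # Complex gate 2: inverted sum, reusing the inverted carry (assumed).
    sum_n = 1 - ((a & b & cin) | (cout_n & (a | b | cin)))
    # Two output inverters restore the true polarity.
    return 1 - sum_n, 1 - cout_n


def ripple_adder(a_bits, b_bits, cin=0):
    """Chain full adders; bit lists are least-significant bit first."""
    sums, carries = [], []
    carry = cin
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        sums.append(s)
        carries.append(carry)
    return sums, carries


def to_bits(value, width=4):
    """Integer to a list of bits, least-significant bit first."""
    return [(value >> i) & 1 for i in range(width)]


if __name__ == "__main__":
    a = to_bits(0b1111)
    before = ripple_adder(a, to_bits(0b0000))
    after = ripple_adder(a, to_bits(0b0001))  # B0 changes from 0 to 1
    print("before (sums, carries):", before)
    print("after  (sums, carries):", after)
```

With these assumed inputs, toggling B0 flips every carry along the chain and finally Sum3, which matches the path B0 → Cout0 → ... → Cin3 → Sum3 named above.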

b. Static Power

Nine input bits (A[3:0], B[3:0], Cin0) provide 2^9 (512) possible input combinations. Eight input vectors out of the 512 possible input combinations are chosen for the 4-bit adder. Table 1 shows the eight input combinations. The average power dissipation for each input vector during 20ns (per static input vector) is recorded as the static power of each circuit considered.

Table 1. Static power assessment inputs used for 4-bit adder.

Cin  A0  B0  A1  B1  A2  B2  A3  B3
0    0   0   0   0   0   0   0   0
0    1   0   0   0   0   0   0   0
0    1   0   1   0   1   0   1   0
1    1   0   1   0   1   0   1   0
1    1   1   1   0   1   0   1   0
1    1   1   1   1   1   0   1   0
1    1   1   1   1   1   1   1   0
1    1   1   1   1   1   1   1   1

c. Dynamic Power

Similar to our approach to static power estimation, to estimate dynamic power we assert input vectors covering eight possible inputs out of the 512 possible input combinations. Each input vector is asserted followed by all zero inputs (reset values), except the case when the inputs are all zeros for the first time. The waveform in Figure 15 shows the input vectors asserted for each one-bit adder, where the input vector changes every 4ns. The average power dissipation during the 30ns of Figure 15 is recorded as the dynamic power of the circuit.

Figure 15. Dynamic power assessment waveform for full adder (signals A, B, C, Sum, Cout)

d. Area

We create a full transistor-level layout of a 1-bit adder based on TSMC 0.18μm technology. The area for the 1-bit adder is measured; the area for the 4-bit adder is determined as the sum of four 1-bit adders. Area results for the other technologies considered (e.g., 0.07μm) are calculated as explained in Section 6.5.

8. Experimental Results

For the 4-bit adder circuit, propagation delay, static power, dynamic power, and area are shown in Figure 16 and Table 2.

Figure 16. Results for 4-bit adder (* dual Vth). Panels: propagation delay, static power, dynamic power and area for the base case, stack, sleep, zigzag, sleepy stack and sleepy keeper approaches, plus the dual Vth variants marked *, each plotted for TSMC 0.18μm and Berkeley 0.18, 0.13, 0.10 and 0.07μm.
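The area scaling rule of Section 6.5, which is referenced in the area paragraph above, can be written as a one-line function. The sketch below reproduces the worked example from that section; the function and parameter names are chosen here for illustration.

```python
def scale_area(area_um2, from_node_um, to_node_um, overhead=0.10):
    """Scale a layout area between feature sizes.

    The area is multiplied by the ratio of the squared feature sizes and
    then increased by a fixed overhead (10% by default) for layers that do
    not scale linearly, such as metal layers.
    """
    ratio_of_squares = (to_node_um ** 2) / (from_node_um ** 2)
    return area_um2 * ratio_of_squares * (1.0 + overhead)


if __name__ == "__main__":
    # Worked example from the text: 100.00 um^2 at 0.18 um scaled to 0.10 um.
    print(f"{scale_area(100.00, 0.18, 0.10):.2f} um^2")
```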

Table 2. Power, delay, area estimation for 0.07μm

Berkeley 0.07μm            Propagation delay (s)   Static power (W)   Dynamic power (W)   Area (μm²)
Base case                  3.82E-10                8.97E-08           8.28E-06            91.84
Stack                      1.16E-09                6.83E-09           7.41E-06            123.76
Sleep                      5.29E-10                1.25E-08           8.66E-06            123.76
ZigZag                     5.25E-10                1.84E-08           8.37E-06            110.48
Sleepy Stack               8.64E-10                1.08E-08           7.06E-06            263.52
Sleepy Keeper              5.85E-10                4.40E-09           9.62E-06            177.11
Sleep (dual Vth)           7.45E-10                2.23E-11           9.02E-06            123.76
ZigZag (dual Vth)          7.43E-10                5.05E-11           8.46E-06            110.48
Sleepy Stack (dual Vth)    1.24E-09                3.50E-11           7.26E-06            263.52
Sleepy Keeper (dual Vth)   8.33E-10                1.22E-11           9.99E-06            177.11

In 0.07μm technology, the sleepy keeper approach (with dual Vth) achieves 7350X leakage reduction over the base case and 560X leakage reduction over the stack approach. The result is similar to the previous best leakage reduction technique with state saving, sleepy stack, but sleepy keeper achieves less delay than sleepy stack. In 0.07μm technology, sleepy keeper results in 48% less delay than sleepy stack with single Vth and 49% less delay with dual Vth. Sleepy keeper consumes 36% more dynamic power than sleepy stack with single Vth and 38% more dynamic power than sleepy stack with dual Vth. With dual Vth, the dynamic power is roughly 21% higher than that of the base case. Finally, area usage of the sleepy keeper is 93% larger than the base case, but it is 49% smaller than the area usage of the sleepy stack.

The experimental results for all other considered technologies are available in Appendix B. Figure 16 and Table 2 are based on the results in Appendix B.

9. Conclusion

Based on the 4-bit adder test results, we have verified that the sleepy keeper approach can result in ultra-low static power consumption with state saving. Furthermore, the sleepy keeper approach is applicable to single and multiple threshold voltages. Compared to the sleepy stack approach, sleepy keeper requires 49% less area and achieves smaller propagation delay (up to 49% less).
Therefore, sleepy keeper can reduce the main penalties of using the sleepy stack approach, while still achieving the same twin advantages of ultra-low leakage and maintenance of precise logic state in sleep mode. Based on these results, sleepy keeper appears to be the most efficient approach known to reduce leakage current with the smallest delay while saving state. In terms of area, sleepy keeper is expected to be more attractive for complex logic circuits, because the portion of increased area for the required additional transistors will be smaller for complex logic circuits than for simple logic circuits (e.g., for an inverter).

The sleepy keeper approach causes a dynamic power increase, which seems to be the main disadvantage of the approach. The increase is most likely due to placing an NMOS transistor in the pull-up network and a PMOS transistor in the pull-down network, where the two added transistors are controlled by the output voltage. In order to reduce sleepy keeper dynamic power consumption, additional issues, including solid state circuit issues, should be investigated; this is left as future work.
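As a sanity check, the headline comparisons quoted in Sections 8 and 9 can be re-derived from the Table 2 values. The sketch below assumes the comparison rule Y / X = 1 + n / 100, which is the form that reproduces the worked delay example accompanying Eq. 1 (X = 8.18E-10s, Y = 1.23E-09s, n = 50); the dictionary layout and function name are chosen here for illustration.

```python
# Selected Table 2 rows (Berkeley 0.07 um):
# (propagation delay in s, static power in W, dynamic power in W, area in um^2)
TABLE_2 = {
    "base": (3.82e-10, 8.97e-08, 8.28e-06, 91.84),
    "stack": (1.16e-09, 6.83e-09, 7.41e-06, 123.76),
    "sleepy_stack": (8.64e-10, 1.08e-08, 7.06e-06, 263.52),
    "sleepy_keeper": (5.85e-10, 4.40e-09, 9.62e-06, 177.11),
    "sleepy_stack_dual": (1.24e-09, 3.50e-11, 7.26e-06, 263.52),
    "sleepy_keeper_dual": (8.33e-10, 1.22e-11, 9.99e-06, 177.11),
}
DELAY, STATIC, DYNAMIC, AREA = range(4)


def percent_less(x, y):
    """Return n such that Y / X = 1 + n / 100, i.e. X is n% less than Y."""
    return (y / x - 1.0) * 100.0


def compare(smaller, larger, column):
    """Percentage by which one table entry is less than another."""
    return percent_less(TABLE_2[smaller][column], TABLE_2[larger][column])


if __name__ == "__main__":
    print("worked example n:", round(percent_less(8.18e-10, 1.23e-09)))
    print("delay, single Vth:", round(compare("sleepy_keeper", "sleepy_stack", DELAY)))
    print("delay, dual Vth:", round(compare("sleepy_keeper_dual", "sleepy_stack_dual", DELAY)))
    print("area vs sleepy stack:", round(compare("sleepy_keeper", "sleepy_stack", AREA)))
    print("area vs base case:", round(compare("base", "sleepy_keeper", AREA)))
    leak = TABLE_2["base"][STATIC] / TABLE_2["sleepy_keeper_dual"][STATIC]
    print("leakage reduction vs base case (X):", round(leak))
```

The same function covers the "more than" statements by swapping the arguments, for example the dynamic power of sleepy keeper relative to sleepy stack.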

References

[1] S. Mutoh et al., "1-V Power Supply High-speed Digital Circuit Technology with Multithreshold-Voltage CMOS," IEEE Journal of Solid-State Circuits, Vol. 30, No. 8, pp. 847-854, August 1995.
[2] J.C. Park, V. J. Mooney III and P. Pfeiffenberger, "Sleepy Stack Reduction of Leakage Power," Proceeding of the International Workshop on Power and Timing Modeling, Optimization and Simulation, pp. 148-158, September 2004.
[3] J. Park, "Sleepy Stack: a New Approach to Low Power VLSI and Memory," Ph.D. Dissertation, School of Electrical and Computer Engineering, Georgia Institute of Technology, 2005. [Online]. Available http://etd.gatech.edu/theses/available/etd-07132005-131806/.
[4] Z. Chen, M. Johnson, L. Wei and K. Roy, "Estimation of Standby Leakage Power in CMOS Circuits Considering Accurate Modeling of Transistor Stacks," International Symposium on Low Power Electronics and Design, pp. 239-244, August 1998.
[5] S. Narendra, S. Borkar, V. De, D. Antoniadis and A. Chandrakasan, "Scaling of Stack Effect and its Application for Leakage Reduction," International Symposium on Low Power Electronics and Design, pp. 195-200, August 2001.
[6] M. Powell, S.-H. Yang, B. Falsafi, K. Roy and T. N. Vijaykumar, "Gated-VDD: A Circuit Technique to Reduce Leakage in Deep-submicron Cache Memories," International Symposium on Low Power Electronics and Design, pp. 90-95, July 2000.
[7] K.-S. Min, H. Kawaguchi and T. Sakurai, "Zigzag Super Cut-off CMOS (ZSCCMOS) Block Activation with Self-Adaptive Voltage Level Controller: An Alternative to Clock-gating Scheme in Leakage Dominant Era," IEEE International Solid-State Circuits Conference, pp. 400-401, February 2003.
[8] Synopsys Inc., http://www.synopsys.com/.
[9] Cadence Design Systems, http://www.cadence.com/.
[10] NC State University Cadence Tool Information, http://www.cadence.ncsu.edu/.
[11] Berkeley Predictive Technology Model (BPTM), http://www.eas.asu.edu/~ptm/.
[12] Y. Cao, T. Sato, D. Sylvester, M. Orshansky and C. Hu, "New Paradigm of Predictive MOSFET and Interconnect Modeling for Early Circuit Design," Proc. of IEEE Custom Integrated Circuits Conference, pp. 201-204, June 2000.
[13] N. Weste and K. Eshraghian, Principles of CMOS VLSI Design. Santa Clara, California: Addison Wesley, 1992.
[14] K. Flautner, N. S. Kim, S. Martin, D. Blaauw and T. Mudge, "Drowsy Caches: Simple Techniques for Reducing Leakage Power," Proceedings of the International Symposium on Computer Architecture, pp. 148-157, May 2002.
[15] MOSIS, http://www.mosis.org/.
[16] G. Ribes, J. Mitard, M. Denais, S. Bruyere, F. Monsieur, C. Parthasarathy, E. Vincent and G. Ghibaudo, "Review on High-k Dielectrics Reliability Issues," IEEE Transactions on Device and Materials Reliability, Vol. 5, Issue 1, pp. 5-19, March 2005.
[17] A.B. Kahng, S. Muddu and P. Sharma, "Defocus-aware Leakage Estimation and Control," International Symposium on Low Power Electronics and Design, pp. 263-268, Aug. 2005.
[18] Taiwan Semiconductor Manufacturing Company, http://www.tsmc.com/.
[19] D. Patterson and J. Hennessy, Computer Architecture: A Quantitative Approach. Palo Alto, California: Morgan Kaufmann Publishers, pp. 5-7, 1990.

Appendix A: Schematics and Layouts

1) Base approach
   a) Schematic
   b) Layout
2) Stack approach
   a) Schematic: (i) Cout (ii) Sum
   b) Layout
3) Sleep approach
   a) Schematic: (i) Cout (ii) Sum
   b) Layout
4) Zigzag approach
   a) Schematic: (i) Cout (ii) Sum
   b) Layout: (i) Cout (ii) Sum (iii) Full Adder
5) Sleepy stack approach
   a) Schematic: (i) Cout (ii) Sum
   b) Layout: (i) Cout (ii) Sum (iii) Full Adder
6) Sleepy keeper approach
   a) Schematic: (i) Cout (ii) Sum
   b) Layout: (i) Cout (ii) Sum (iii) Full Adder

A.1.a. Base case full adder schematic (labels in figure: W/L = 12, 9, 6 and 4; outputs Carry and Sum)

A.1.b. Base case full adder layout

A.2.a.i. Stack approach Full Adder Cout schematic (labels in figure: W/L = 2.25; output Carry)

A.2.a.ii. Stack approach Full Adder Sum schematic (labels in figure: W/L = 8, 6, 2.25 and 2; Carry and Sum)

A.2.b. Stack approach Full Adder layout

A.3.a.i. Sleep approach Full Adder Cout schematic (labels in figure: S; W/L = 9; output Carry)

A.3.a.ii. Sleep approach Full Adder Sum schematic (labels in figure: S; W/L = 12 and 4; Carry and Sum)

A.3.b. Sleep approach Full Adder layout

A.4.a.i. Zigzag approach Full Adder Cout schematic (labels in figure: S; W/L = 9; output Carry)

A.4.a.ii. Zigzag approach Full Adder Sum schematic (labels in figure: S; W/L = 12 and 4; Carry and Sum)

A.4.b.i. Zigzag approach Full Adder Cout layout

A.4.b.ii. Zigzag approach Full Adder Sum layout

A.4.b.iii. Zigzag approach Full Adder layout

A.5.a.i. Sleepy stack approach Full Adder Cout schematic (labels in figure: S; W/L = 2.25; output Carry)

A.5.a.ii. Sleepy stack approach Full Adder Sum schematic (labels in figure: S; W/L = 6, 2.25 and 2; Carry and Sum)

A.5.b.i. Sleepy stack approach Full Adder Cout layout

A.5.b.ii. Sleepy stack approach Full Adder Sum layout

A.5.b.iii. Sleepy stack approach Full Adder layout

A.6.a.i. Sleepy keeper approach Full Adder Cout schematic (labels in figure: S; W/L = 9; output Carry)

A.6.a.ii. Sleepy keeper approach Full Adder Sum schematic

A.6.b.i. Sleepy keeper approach Full Adder Cout layout

A.6.b.ii. Sleepy keeper approach Full Adder Sum layout

A.6.b.iii. Sleepy keeper approach Full Adder layout

Appendix B: 4-bit adder results

TSMC 0.18 µm
Approach                    Propagation delay (s)   Static Power (W)   Dynamic Power (W)   Area (µm²)
Base case                   7.05E-10                3.89E-10           1.48E-04            552.06
Stack                       1.70E-09                2.23E-10           1.27E-04            743.94
Sleep                       9.49E-10                1.15E-10           1.49E-04            743.94
ZigZag                      9.42E-10                5.49E-11           1.40E-04            664.11
Sleepy Stack                1.35E-09                1.77E-10           1.25E-04            1584.05
Sleepy Keeper               1.02E-09                1.16E-10           1.66E-04            1064.61
Sleep (dual Vth)            1.07E-09                3.70E-11           1.50E-04            743.94
ZigZag (dual Vth)           1.07E-09                1.21E-11           1.40E-04            664.11
Sleepy Stack (dual Vth)     1.47E-09                3.44E-11           1.19E-04            1584.05
Sleepy Keeper (dual Vth)    1.20E-09                2.78E-11           1.68E-04            1064.61

Berkeley 0.18 µm
Approach                    Propagation delay (s)   Static Power (W)   Dynamic Power (W)   Area (µm²)
Base case                   5.06E-10                3.08E-08           1.38E-04            552.06
Stack                       1.50E-09                3.05E-09           1.17E-04            743.94
Sleep                       6.78E-10                4.73E-09           1.39E-04            743.94
ZigZag                      6.83E-10                2.51E-09           1.32E-04            664.11
Sleepy Stack                1.18E-09                4.43E-09           1.22E-04            1584.05
Sleepy Keeper               7.39E-10                4.57E-09           1.51E-04            1064.61
Sleep (dual Vth)            7.96E-10                4.01E-11           1.43E-04            743.94
ZigZag (dual Vth)           8.07E-10                8.12E-12           1.33E-04            664.11
Sleepy Stack (dual Vth)     1.34E-09                2.89E-11           1.16E-04            1584.05
Sleepy Keeper (dual Vth)    8.73E-10                3.86E-11           1.55E-04            1064.61

Berkeley 0.13 µm
Approach                    Propagation delay (s)   Static Power (W)   Dynamic Power (W)   Area (µm²)
Base case                   4.86E-10                1.45E-08           4.25E-05            316.75
Stack                       1.43E-09                8.13E-10           3.71E-05            426.85
Sleep                       6.32E-10                1.77E-09           4.28E-05            426.85
ZigZag                      6.33E-10                1.07E-09           4.07E-05            381.04
Sleepy Stack                1.09E-09                1.35E-09           3.71E-05            908.88
Sleepy Keeper               6.98E-10                1.50E-09           4.68E-05            610.84
Sleep (dual Vth)            7.77E-10                2.67E-11           4.36E-05            426.85
ZigZag (dual Vth)           7.71E-10                2.82E-12           4.08E-05            381.04
Sleepy Stack (dual Vth)     1.26E-09                1.66E-11           3.49E-05            908.88
Sleepy Keeper (dual Vth)    8.53E-10                2.73E-11           4.77E-05            610.84

Berkeley 0.10 µm
Approach                    Propagation delay (s)   Static Power (W)   Dynamic Power (W)   Area (µm²)
Base case                   4.00E-10                3.74E-08           1.90E-05            187.43
Stack                       1.20E-09                2.23E-09           1.63E-05            252.57
Sleep                       5.46E-10                4.29E-09           1.92E-05            252.57
ZigZag                      5.38E-10                2.33E-09           1.83E-05            225.47
Sleepy Stack                9.10E-10                3.52E-09           1.64E-05            537.80
Sleepy Keeper               5.95E-10                4.27E-09           2.11E-05            361.44
Sleep (dual Vth)            7.05E-10                1.52E-11           1.96E-05            252.57
ZigZag (dual Vth)           6.93E-10                4.71E-12           1.83E-05            225.47
Sleepy Stack (dual Vth)     1.15E-09                1.66E-11           1.53E-05            537.80
Sleepy Keeper (dual Vth)    7.91E-10                1.46E-11           2.17E-05            361.44

Berkeley 0.07 µm
Approach                    Propagation delay (s)   Static Power (W)   Dynamic Power (W)   Area (µm²)
Base case                   3.76E-10                8.90E-08           8.63E-06            91.84
Stack                       1.16E-09                6.83E-09           7.41E-06            123.76
Sleep                       5.38E-10                1.36E-08           8.77E-06            123.76
ZigZag                      5.25E-10                9.09E-09           8.37E-06            110.48
Sleepy Stack                8.64E-10                1.08E-08           7.39E-06            263.52
Sleepy Keeper               5.90E-10                1.30E-08           9.71E-06            177.11
Sleep (dual Vth)            7.52E-10                3.65E-11           9.03E-06            123.76
ZigZag (dual Vth)           7.43E-10                2.19E-11           8.46E-06            110.48
Sleepy Stack (dual Vth)     1.24E-09                3.50E-11           7.06E-06            263.52
Sleepy Keeper (dual Vth)    8.30E-10                3.89E-11           1.00E-05            177.11
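To make the Appendix B numbers easier to compare, the following sketch normalizes the single-Vth sleepy keeper row to the base case for each technology. The values are copied from the tables above; the dictionary layout and the helper name `keeper_vs_base` are illustrative choices, not part of the report's methodology.

```python
# Columns: (propagation delay [s], static power [W], dynamic power [W], area).
# Values are the "Base case" and single-Vth "Sleepy Keeper" rows of Appendix B.
RESULTS = {
    "TSMC 0.18":     {"base":   (7.05e-10, 3.89e-10, 1.48e-04, 552.06),
                      "keeper": (1.02e-09, 1.16e-10, 1.66e-04, 1064.61)},
    "Berkeley 0.18": {"base":   (5.06e-10, 3.08e-08, 1.38e-04, 552.06),
                      "keeper": (7.39e-10, 4.57e-09, 1.51e-04, 1064.61)},
    "Berkeley 0.13": {"base":   (4.86e-10, 1.45e-08, 4.25e-05, 316.75),
                      "keeper": (6.98e-10, 1.50e-09, 4.68e-05, 610.84)},
    "Berkeley 0.10": {"base":   (4.00e-10, 3.74e-08, 1.90e-05, 187.43),
                      "keeper": (5.95e-10, 4.27e-09, 2.11e-05, 361.44)},
    "Berkeley 0.07": {"base":   (3.76e-10, 8.90e-08, 8.63e-06, 91.84),
                      "keeper": (5.90e-10, 1.30e-08, 9.71e-06, 177.11)},
}

METRICS = ("delay", "static", "dynamic", "area")


def keeper_vs_base(results):
    """Return sleepy keeper / base case ratios per technology and metric."""
    ratios = {}
    for tech, rows in results.items():
        ratios[tech] = {
            metric: keeper / base
            for metric, base, keeper in zip(METRICS, rows["base"], rows["keeper"])
        }
    return ratios


ratios = keeper_vs_base(RESULTS)

for tech, r in ratios.items():
    print(f"{tech}: delay x{r['delay']:.2f}, static x{r['static']:.2f}, "
          f"dynamic x{r['dynamic']:.2f}, area x{r['area']:.2f}")
```

A ratio below 1 means sleepy keeper improves on the base case for that metric. In every technology listed, the static power ratio is below 1 while the dynamic power ratio is above 1, consistent with the dynamic power increase noted in the conclusions.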