Measuring Dispersion


05-Sirki-4731.qxd 6/9/005 6:40 PM Page 17

CHAPTER 5 Measuring Dispersion

PROLOGUE

Comparing two groups by a measure of central tendency may run the risk for each group of failing to reveal valuable information. In particular, information about the distribution of the scores within each group may be useful to us but not revealed by the mean, median, or mode. In some groups, the scores may all fall near the middle score, whereas in other groups, the scores may be more widely spread above and below the central scores. Accordingly, it is possible that the more bigoted group of the two we compared, using a measure of central tendency, might contain some highly bigoted individuals but possibly also several less bigoted people than could be found in the less bigoted group. So in addition to central tendency, we should examine the dispersion of the scores in each group as well.

STATISTICS FOR THE SOCIAL SCIENCES

INTRODUCTION

In addition to finding measures of central tendency for a set of scores, we also calculate measures of dispersion to aid us in describing the data. Measures of dispersion, also called measures of variability, address the degree of clustering of the scores about the mean. Are most scores relatively close to the mean, or are they scattered over a wider interval and thus farther from the mean? The extent of clustering or spread of the scores about the mean determines the amount of dispersion. In the instance where all scores are exactly at the mean, there is no dispersion at all; dispersion increases from zero as the spread of scores widens about the mean. In this chapter, we will cover four measures of dispersion: the range, the mean deviation, the variance, and the standard deviation.

Measures of dispersion: Measures of variability that address the degree of clustering of the scores about the mean.

Dispersion: The extent of clustering or spread of the scores about the mean.

VISUALIZING DISPERSION

To begin our discussion, let us suppose that in a penology class, three teaching assistants (Tom, Dick, and Harriet) had their respective discussion groups role-play court-employed social case workers who read the files of convicted criminals and recommended to the judge the penalty to be imposed for each criminal. The teaching assistants then compared each student's recommended sentence to the one actually imposed by the real judge. The teaching assistants then rated each student on a 0 to 10 scale, with 10 being a totally accurate reproduction of the sentences that were actually handed down. There were four students in each discussion group. The results were as follows:

Tom's Group       Dick's Group      Harriet's Group
\(x\)             \(x\)             \(x\)
8                 9                 10
8                 8                 10
8                 8                 6
8                 7                 6
\(\sum x = 32\)   \(\sum x = 32\)   \(\sum x = 32\)

\[ \bar{x}_{\text{Tom}} = \frac{32}{4} = 8 \qquad \bar{x}_{\text{Dick}} = \frac{32}{4} = 8 \qquad \bar{x}_{\text{Harriet}} = \frac{32}{4} = 8 \]

The three groups share the same mean, but the dispersion of the scores varies from none in Tom's group to some in Dick's group to even more in Harriet's group. This is illustrated in the histograms for the three groups. Because the distributions of individual scores clearly differ from one another in their dispersion, we need to measure that dispersion in addition to measuring central tendency. In this chapter, we will discuss measures of dispersion in an order that will ultimately bring us to the two measures used to the virtual exclusion of the others, the variance and its positive square root, the standard deviation. The first two measures we will discuss, the range and the mean deviation, may be thought of as building blocks for understanding the variance and standard deviation. Since such measures are rarely used with data having a level of measurement less sophisticated than interval level, they are usually calculated along with the calculation of the mean. With the mean as our measure of central tendency, we then calculate a measure of dispersion, most often the standard deviation.

[Figure: three histograms of frequency (f) against score (1 to 10), one each for Tom's Group, Dick's Group, and Harriet's Group]

THE RANGE

The range is the simplest measure of dispersion. It compares the highest score and the lowest score achieved for a given set of scores. The range can be expressed in two ways: (a) with a statement such as, "The scores ranged from (the lowest score) to (the highest score)," or (b) with a single number representing the difference between the highest and lowest score.

Range: The simplest measure of dispersion that compares the highest score and the lowest score achieved for a given set of scores.

In the case of Harriet's group, whose scores were 6, 6, 10, and 10, we would say, "The scores ranged from 6 to 10." Or we could express the range as the difference between 10 and 6, \(10 - 6\), or 4. The scores in Harriet's group had a mean of 8 and range of 4.
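The two ways of expressing the range can be sketched in a few lines of Python. This is a minimal illustration, not part of the original text; the function name `score_range` is a hypothetical helper chosen for this example.

```python
def score_range(scores):
    """Return (lowest, highest, difference) for a non-empty list of scores."""
    lowest, highest = min(scores), max(scores)
    return lowest, highest, highest - lowest


harriet = [10, 10, 6, 6]  # Harriet's group, as given in the text

low, high, spread = score_range(harriet)
print(f"The scores ranged from {low} to {high}.")  # form (a): a statement
print(f"Range = {high} - {low} = {spread}")        # form (b): a single number
```

Only the two extreme scores enter the calculation; the scores in between never affect the result.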
Now we can compare the ranges of the three groups.

Harriet's Group: Scores ranged from 6 to 10. Range \(= 10 - 6 = 4\).
Dick's Group: Scores ranged from 7 to 9. Range \(= 9 - 7 = 2\).
Tom's Group: Scores ranged from 8 to 8. Range \(= 8 - 8 = 0\).

These ranges correspond to the spread on the histograms for the three groups, with Harriet's group's scores being most dispersed about the mean, Dick's being less dispersed, and Tom's having no dispersion at all.

Although we commonly make use of the range in our day-to-day discourse, it really is not a very meaningful measure of dispersion. Because only the highest and lowest scores are taken into consideration in finding the range, the other scores have no impact. Just as in the case of the mean where an extreme value of \(x\) can distort the mean and lessen its usefulness, the use of only the extreme values can render the range less useful. Our next measure, the mean deviation, rectifies this situation.

THE MEAN DEVIATION

The mean deviation (M.D.) (also called the average deviation or the mean absolute deviation) is sensitive to every score in the set. It is based on a strategy of first finding out how far each score deviated from the mean of the scores (the distance from each score to the mean), summing these distances to find the total amount of deviation from the mean in the entire set of scores, and dividing by the number of scores in the set. The result is a mean, or average, distance that a score deviates from the mean.

Mean deviation: An average distance that a score deviates from the mean.

To get the mean deviation, we first find the distance between each score and the mean by subtracting the mean from each score. Let us use Harriet's group as an example.

Harriet's Group

\(x\)    \(\bar{x}\)    \(x - \bar{x}\)
10       8              2
10       8              2
6        8              -2
6        8              -2

\[ \sum x = 32 \qquad \bar{x} = \frac{32}{4} = 8 \]

At this juncture, we encounter a problem: We cannot add up the \(x - \bar{x}\) column to get the total amount of deviation in the system. Recalling that the mean is the value of \(\bar{x}\) that satisfies the expression \(\sum (x - \bar{x}) = 0\), we can see that if \(\bar{x} = 8\), adding algebraically, the \((x - \bar{x})\)'s for each student in Harriet's group produce a sum of zero:

\[ \sum (x - \bar{x}) = 2 + 2 - 2 - 2 = 4 - 4 = 0 \]

This is because the positive deviations (where \(x\) is greater than the mean) exactly balance the negative deviations (where \(x\) is less than the mean). Recall that we currently are seeking the distance from each score to the mean, without regard to direction; that is, we do not care whether \(x\) is greater or less than \(\bar{x}\). Like a car's odometer, we want to count the distances traveled, disregarding the direction or directions in which we drove. We do this by taking the absolute value of each \(x - \bar{x}\), the distance disregarding its sign (in effect treating all \((x - \bar{x})\)'s as if they were positive numbers). We symbolize the absolute value of a deviation as \(|x - \bar{x}|\). When we add up all these absolute values, \(\sum |x - \bar{x}|\), we get the total amount of deviation of the scores from the mean. When we divide that sum by the total number of scores, we get the average amount (the mean amount) that a score deviated from the mean of all of the scores: the mean deviation.

Absolute value: The distance or difference disregarding its sign. Here, the distance between each value of \(x\) and the mean, regardless of whether \(x\) is greater than the mean (a positive distance) or less than the mean (a negative distance).

Thus,

\[ \text{M.D.} = \frac{\sum |x - \bar{x}|}{n} \]

For Harriet's Group:

\(x\)    \(\bar{x}\)    \(x - \bar{x}\)    \(|x - \bar{x}|\)
10       8              2                  2
10       8              2                  2
6        8              -2                 2
6        8              -2                 2

\[ \sum x = 32 \qquad \sum |x - \bar{x}| = 8 \qquad \bar{x} = \frac{32}{4} = 8 \qquad \text{M.D.} = \frac{8}{4} = 2.0 \]
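The same steps can be sketched in Python. This is an illustrative sketch only; `mean_deviation` is a hypothetical helper name, and the data are Harriet's scores from the table above.

```python
def mean_deviation(scores):
    """Mean deviation (M.D.): the average absolute distance from the mean."""
    n = len(scores)
    mean = sum(scores) / n
    return sum(abs(x - mean) for x in scores) / n


harriet = [10, 10, 6, 6]  # Harriet's group, as given in the text
mean = sum(harriet) / len(harriet)

# Signed deviations cancel each other, so their sum says nothing about spread.
signed_total = sum(x - mean for x in harriet)
print("Sum of signed deviations:", signed_total)

# Absolute deviations ignore direction, like an odometer.
print("M.D. =", mean_deviation(harriet))
```

Taking `abs()` before summing is the whole difference between the two totals: the first is always zero, the second measures spread.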

For Dick's Group:

\(x\)    \(\bar{x}\)    \(x - \bar{x}\)    \(|x - \bar{x}|\)
9        8              1                  1
8        8              0                  0
8        8              0                  0
7        8              -1                 1

\[ \sum x = 32 \qquad \sum |x - \bar{x}| = 2 \qquad \bar{x} = \frac{32}{4} = 8 \qquad \text{M.D.} = \frac{2}{4} = 0.5 \]

For Tom's Group:

\(x\)    \(\bar{x}\)    \(x - \bar{x}\)    \(|x - \bar{x}|\)
8        8              0                  0
8        8              0                  0
8        8              0                  0
8        8              0                  0

\[ \sum x = 32 \qquad \sum |x - \bar{x}| = 0 \qquad \bar{x} = \frac{32}{4} = 8 \qquad \text{M.D.} = \frac{0}{4} = 0 \]

These results are in keeping with our expectations: Harriet's group has the largest mean deviation, Dick's has a smaller one, and Tom's has the smallest (a value of zero).

THE VARIANCE AND STANDARD DEVIATION

The formula for the variance resembles that of the mean deviation except that \(|x - \bar{x}|\) is replaced by the expression \((x - \bar{x})^2\). Instead of taking the absolute value of each deviation, we square it to get rid of negative numbers. (Remember that a negative number times itself is a positive number, just as a positive number times itself is a positive number.) Since the squares of the deviations greater than one unit will be much larger than their respective absolute values, \(\sum (x - \bar{x})^2\) will usually be larger than \(\sum |x - \bar{x}|\), and the final variance will usually be larger than the mean deviation. To adjust for this and produce a result more comparable to the

mean deviation (more like an average amount of deviation), we often take the positive square root of the variance, thus producing the standard deviation, indicated for now by the letter \(s\). Thus,

\[ \text{Variance} = s^2 = \frac{\sum (x - \bar{x})^2}{n} \]

\[ \text{Standard Deviation} = s = \sqrt{\frac{\sum (x - \bar{x})^2}{n}} \]

Variance: An average or mean value of the squared deviations of the scores from the mean.

Standard deviation: The positive square root of the variance, which provides a measure of dispersion closer in size to the mean deviation.

Let us calculate \(s^2\) and \(s\) for our three groups (Tom's, Dick's, and Harriet's), whose mean deviations were 0, 0.5, and 2.0, respectively.

Tom's Group

\(x\)    \(\bar{x}\)    \(x - \bar{x}\)    \((x - \bar{x})^2\)
8        8              0                  0
8        8              0                  0
8        8              0                  0
8        8              0                  0

\[ \sum (x - \bar{x})^2 = 0 \]

Thus,

\[ s^2 = \frac{\sum (x - \bar{x})^2}{n} = \frac{0}{4} = 0 \]

\[ s = \sqrt{\frac{\sum (x - \bar{x})^2}{n}} = \sqrt{\frac{0}{4}} = \sqrt{0} = 0 \]

The variance and standard deviation both equal zero, as does the mean deviation, for this group in which there is no dispersion at all.

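Dick's Group

\(x\)    \(\bar{x}\)    \(x - \bar{x}\)    \((x - \bar{x})^2\)
9        8              1                  1
8        8              0                  0
8        8              0                  0
7        8              -1                 1

\[ \sum (x - \bar{x})^2 = 2 \]

Thus,

\[ s^2 = \frac{\sum (x - \bar{x})^2}{n} = \frac{2}{4} = \frac{1}{2} = 0.5 \]

\[ s = \sqrt{\frac{\sum (x - \bar{x})^2}{n}} = \sqrt{\frac{2}{4}} = \sqrt{0.5} = 0.707 \]

Remember that it is the standard deviation (0.7), not the variance, which substitutes for the mean deviation (0.5).

Harriet's Group

\(x\)    \(\bar{x}\)    \(x - \bar{x}\)    \((x - \bar{x})^2\)
10       8              2                  4
10       8              2                  4
6        8              -2                 4
6        8              -2                 4

\[ \sum (x - \bar{x})^2 = 16 \]

Thus,

\[ s^2 = \frac{\sum (x - \bar{x})^2}{n} = \frac{16}{4} = 4.0 \]

\[ s = \sqrt{\frac{\sum (x - \bar{x})^2}{n}} = \sqrt{\frac{16}{4}} = \sqrt{4.0} = 2.0 \]

Let us compare our measures.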
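As a rough check on the three hand calculations, the definitional formulas can be sketched in Python. The function names `variance` and `standard_deviation` are hypothetical helpers for this illustration; both divide by \(n\), as the text does.

```python
import math


def variance(scores):
    """Definitional formula: the mean of the squared deviations (divides by n)."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((x - mean) ** 2 for x in scores) / n


def standard_deviation(scores):
    """Positive square root of the variance."""
    return math.sqrt(variance(scores))


# The three discussion groups from the text.
groups = {
    "Tom": [8, 8, 8, 8],
    "Dick": [9, 8, 8, 7],
    "Harriet": [10, 10, 6, 6],
}

for name, scores in groups.items():
    s2 = variance(scores)
    s = standard_deviation(scores)
    print(f"{name}: variance = {s2:.2f}, standard deviation = {s:.3f}")
```

Python's standard library offers `statistics.pvariance` and `statistics.pstdev`, which also divide by \(n\) and should agree with these helpers.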

[Figure: three histograms of frequency (f) against score (1 to 10), one each for Tom's Group, Dick's Group, and Harriet's Group, annotated with the measures below]

Group      Range    Mean Deviation    Variance    Standard Deviation
Tom        0        0                 0           0
Dick       2.0      0.5               0.5         0.7
Harriet    4.0      2.0               4.0         2.0

Below are the dispersion measures for artistic freedom for the non-liberal arts majors, Group A, presented in Chapter 4.

Group A

\(x\)    \(\bar{x}\)    \(x - \bar{x}\)    \(|x - \bar{x}|\)    \((x - \bar{x})^2\)
8        7              1                  1                    1
8        7              1                  1                    1
8        7              1                  1                    1
7        7              0                  0                    0
7        7              0                  0                    0
7        7              0                  0                    0
6        7              -1                 1                    1
6        7              -1                 1                    1
6        7              -1                 1                    1

\[ n = 9 \qquad \sum x = 63 \qquad \sum |x - \bar{x}| = 6 \qquad \sum (x - \bar{x})^2 = 6 \]

The scores range from 6 to 8. Range \(= 8 - 6 = 2\).

\[ \bar{x} = \frac{\sum x}{n} = \frac{63}{9} = 7.0 \]

\[ \text{M.D.} = \frac{\sum |x - \bar{x}|}{n} = \frac{6}{9} = \frac{2}{3} = 0.67 \]

\[ \text{Variance} = s^2 = \frac{\sum (x - \bar{x})^2}{n} = \frac{6}{9} = \frac{2}{3} = 0.67 \]

\[ \text{Standard Deviation} = s = \sqrt{0.67} = 0.82 \]

Summary, Group A

Range                 2.00
Mean Deviation        0.67
Variance              0.67
Standard Deviation    0.82

As mentioned, the variance and standard deviation are the most widely used measures of dispersion in statistics, even though on the face of it, the mean deviation would appear to be the most logical measure (and easiest to calculate) of the three. The reason is that the standard deviation has meaning in terms of a common frequency distribution known as the normal curve, which we will encounter later in this text.

THE COMPUTATIONAL FORMULAS FOR VARIANCE AND STANDARD DEVIATION

The variance formula \(s^2 = \sum (x - \bar{x})^2 / n\) is often referred to as the definitional formula since it not only calculates the variance but also defines or explains what the variance is: the mean amount of the squared deviations of the scores from the mean. (It is often quite difficult for those long away from algebraic formulas to see that definition, but it is there.)

Definitional formula: A formula that not only calculates the variance but also defines or explains what the variance is: the mean amount of the squared deviations of the scores from the mean.

For computational purposes, however, it is often easier to use one of several alternative formulas, known as computational formulas, particularly if a calculator is available. One such computational formula is the following:

Computational formulas: Formulas that generate the correct variance but do not seek to define what the variance is.

\[ \text{Variance} = s^2 = \frac{\sum x^2 - \dfrac{(\sum x)^2}{n}}{n} \]

\[ \text{Standard Deviation} = s = \sqrt{\frac{\sum x^2 - \dfrac{(\sum x)^2}{n}}{n}} \]

Before we apply these formulas, we should make note of the difference between two parts of the formula: \(\sum x^2\) and \((\sum x)^2\), which are not the same. The first, \(\sum x^2\), read "summation of x squared," tells us to square each \(x\) and then add up all of the \(x^2\)'s. The second, \((\sum x)^2\), read "summation of x, quantity squared," tells us to first add up all the \(x\)'s to get \(\sum x\) and then square \(\sum x\) to get \((\sum x)^2\). (This follows the convention of first doing what is inside a set of parentheses before doing what is outside of the parentheses.) Thus, we must add the original scores and square the sum, and we must also square each original score and add up the squared values.

Group A

\(x\)    \(x^2\)
8        64
8        64
8        64
7        49
7        49
7        49
6        36
6        36
6        36

\[ n = 9 \qquad \sum x = 63 \qquad \sum x^2 = 447 \qquad (\sum x)^2 = (63)^2 = 63 \times 63 = 3969 \]

\[ s^2 = \frac{\sum x^2 - \dfrac{(\sum x)^2}{n}}{n} = \frac{447 - \dfrac{(63)^2}{9}}{9} = \frac{447 - \dfrac{3969}{9}}{9} = \frac{447 - 441}{9} = \frac{6}{9} = \frac{2}{3} = 0.67 \]

and

\[ s = \sqrt{0.67} = 0.82 \]

The answers are obviously the same as when we use the definitional formula. Often, the two results will differ slightly due to rounding error, particularly if the mean used in the definitional formulas is not a whole number (such as 7, in this case) but possesses several decimals (such as 7.2,

7.23, 7.234, and so on). Notice that the computational formula requires the calculation of several large intermediate figures, such as the \((\sum x)^2 = 3969\). Since such large numbers are not needed when using the definitional formula, we may question the need for a computational formula. If, however, there are many scores (even as few as the 9 scores in Group A), it is faster and easier to use the computational formulas. It is even easier to use the computational formulas with today's advanced scientific, business, and statistical calculators, which usually store \(\sum x\) and \(\sum x^2\) in their memories for easy retrieval.

BOX 5.1 Another Formula for the Standard Deviation

In Chapter 8, you will encounter another formula for the standard deviation, indicated by the lowercase Greek letter sigma with a circumflex above it and read (believe it or not) as "sigma hat."

\[ \hat{\sigma} = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \]

Note that this formula is the same as the definitional formula we have just been using except that \(n - 1\) replaces \(n\) in the denominator. When we wish to generalize about some group (called a population) from data taken from fewer people than the entire group (called a sample), we run into a problem. Suppose I wanted to generalize about the ages of all residents of Thousand Oaks, California (the population), from a sample of that town's residents. If I calculate the mean for my sample, I get the best estimate of the mean age of all that community's residents that my data will allow. However, if I estimate the population's standard deviation from my sample, using the formula with \(n\) in the denominator, my estimate is inaccurate. In fact, the smaller the size of my sample, the less accurate my estimate of the population's standard deviation will be. It turns out that the formula with \(n - 1\) in the denominator gives us a better estimate of the population's standard deviation than the formula with \(n\). Thus, you will see the \(n - 1\) formula widely used in textbooks, calculators, and computer programs.
In fact, rarely can we study whole populations directly; so much of the time, we are really using sample data to estimate population data. That is why the formula with \(n - 1\) in the denominator appears so often.
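The effect of the denominator can be sketched in Python. This is an illustration under stated assumptions: `std_dev` and its `denominator_adjust` parameter are hypothetical names, and Group A's nine scores are treated here as if they were a sample, which the chapter itself does not do. The standard library's `statistics.pstdev` divides by \(n\), and `statistics.stdev` divides by \(n - 1\).

```python
import math
import statistics


def std_dev(scores, denominator_adjust=0):
    """Square root of the summed squared deviations over (n - denominator_adjust)."""
    n = len(scores)
    mean = sum(scores) / n
    squared = sum((x - mean) ** 2 for x in scores)
    return math.sqrt(squared / (n - denominator_adjust))


# Assumption for illustration only: Group A's scores treated as a sample.
group_a = [8, 8, 8, 7, 7, 7, 6, 6, 6]

s = std_dev(group_a)                                # n in the denominator
sigma_hat = std_dev(group_a, denominator_adjust=1)  # n - 1 in the denominator

print("n formula:    ", round(s, 2))
print("n - 1 formula:", round(sigma_hat, 2))

# Cross-check against the standard library's two functions.
print(math.isclose(s, statistics.pstdev(group_a)))
print(math.isclose(sigma_hat, statistics.stdev(group_a)))
```

Because \(n - 1\) is smaller than \(n\), the second result is always the larger of the two, and the gap shrinks as the number of scores grows.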

Finally, note that many authors will state that the formula with \(n\) in the denominator is for a population's standard deviation and the \(n - 1\) formula is for a sample's standard deviation. That is not quite correct, but since most of the time what we really are doing is using sample data to estimate population data, we really are not interested in the sample's standard deviation except as an estimate of the population's standard deviation. So, it is easier just to call the \(n - 1\) formula the formula for a sample's standard deviation. That practice is not followed in this textbook.

VARIANCE AND STANDARD DEVIATION FOR DATA IN FREQUENCY DISTRIBUTIONS

If the data are in frequency distributions, the formulas given above will not find the correct variance or standard deviation. In a frequency distribution, we must account not only for each possible value of \(x\) but also for the number of times, or frequency, that value occurs. This is the same reason we modified the formula for finding the mean of a frequency distribution in the previous chapter. Recall that in calculating the mean for the liberal arts majors, Group B, we first established an \(fx\) column and added it up to get \(\sum fx\). We then divided \(\sum fx\) by \(\sum f\) (our \(n\)) to get the mean. For frequency distribution data, the definitional formula for the variance is also adjusted so that before adding the squared deviations, we multiply each squared deviation by the frequency of that particular value of \(x\). Therefore,

\[ s^2 = \frac{\sum [(x - \bar{x})^2 f]}{n} = \frac{\sum [(x - \bar{x})^2 f]}{\sum f} \]

Group B

\(x\)    \(f\)    \(fx\)    \(\bar{x}\)    \(x - \bar{x}\)    \((x - \bar{x})^2\)    \((x - \bar{x})^2 f\)
9        2        18        7.5            1.5                2.25                   2.25 × 2 = 4.50
8        3        24        7.5            0.5                0.25                   0.25 × 3 = 0.75
7        3        21        7.5            -0.5               0.25                   0.25 × 3 = 0.75
6        2        12        7.5            -1.5               2.25                   2.25 × 2 = 4.50

\[ \sum f = n = 10 \qquad \sum fx = 75 \qquad \sum [(x - \bar{x})^2 f] = 10.50 \]

\[ \bar{x} = \frac{\sum fx}{n} = \frac{\sum fx}{\sum f} = \frac{75}{10} = 7.5 \]

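Thus, the variance is

\[ s^2 = \frac{\sum [(x - \bar{x})^2 f]}{n} = \frac{\sum [(x - \bar{x})^2 f]}{\sum f} = \frac{10.50}{10} = 1.05 \]

and the standard deviation is

\[ s = \sqrt{1.05} = 1.0247 \approx 1.02 \]

For data in frequency distributions, there is also an adjusted computational formula.

\[ s^2 = \frac{\sum x^2 f - \dfrac{(\sum fx)^2}{\sum f}}{\sum f} \]

To apply this to Group B, we must generate a column for \(x^2\) in order to find \(x^2 f\), and a column for \(x^2 f\) in order to find \(\sum x^2 f\). We have already generated an \(fx\) column, but we need to square its summation.

\(x\)    \(f\)    \(fx\)    \(x^2\)    \(x^2 f\)
9        2        18        81         81 × 2 = 162
8        3        24        64         64 × 3 = 192
7        3        21        49         49 × 3 = 147
6        2        12        36         36 × 2 = 72

\[ \sum f = 10 \qquad \sum fx = 75 \qquad \sum x^2 f = 573 \qquad (\sum fx)^2 = (75)^2 = 75 \times 75 = 5625 \]

Thus, the variance is

\[ s^2 = \frac{\sum x^2 f - \dfrac{(\sum fx)^2}{\sum f}}{\sum f} = \frac{573 - \dfrac{(75)^2}{10}}{10} = \frac{573 - \dfrac{5625}{10}}{10} = \frac{573 - 562.5}{10} = \frac{10.5}{10} = 1.05 \]

and the standard deviation is

\[ s = \sqrt{1.05} \approx 1.02 \]

The results are identical to those found using the definitional formulas.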
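Both frequency-distribution formulas can be sketched side by side in Python. This is a minimal sketch, not the book's procedure; the function names are hypothetical, and the distribution is assumed to be stored as a dictionary mapping each score \(x\) to its frequency \(f\).

```python
import math


def freq_variance_definitional(dist):
    """Sum of (x - mean)^2 * f, divided by the sum of f."""
    n = sum(dist.values())
    mean = sum(x * f for x, f in dist.items()) / n
    return sum((x - mean) ** 2 * f for x, f in dist.items()) / n


def freq_variance_computational(dist):
    """(Sum of x^2 * f, minus (sum of f * x)^2 / sum of f), divided by sum of f."""
    n = sum(dist.values())
    sum_fx = sum(x * f for x, f in dist.items())
    sum_x2f = sum(x * x * f for x, f in dist.items())
    return (sum_x2f - sum_fx ** 2 / n) / n


group_b = {9: 2, 8: 3, 7: 3, 6: 2}  # Group B: score -> frequency, from the text

s2_def = freq_variance_definitional(group_b)
s2_comp = freq_variance_computational(group_b)
print("Definitional: ", round(s2_def, 2))
print("Computational:", round(s2_comp, 2))
print("Standard deviation:", round(math.sqrt(s2_def), 2))
```

Applied to Group A written as `{8: 3, 7: 3, 6: 3}`, the computational version reproduces the individual-data calculation shown earlier, \((447 - 441)/9\), since grouping equal scores does not change any of the sums.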

CONCLUSION

We now know the primary measures for describing a single interval- or ratio-level variable: the mean for central tendency and the standard deviation or variance for dispersion. With the latter two, we generally use the standard deviation for descriptive purposes but retain the variance for use in procedures that will be discussed later in this text. With the exception of the range, the measures of dispersion presented in this chapter all assume interval level of measurement. (The range may be applied also to ordinal data: "The guests at the $100-a-plate charity fundraiser ranged from middle class to affluent.") While measures of dispersion are widely used with interval-level data, they are only rarely used with lower levels of measurement. Accordingly, such usage will not be covered here.

We have now covered the last of the basic tools of descriptive data analysis. With the introduction of dispersion measures, particularly the variance and the standard deviation, we can begin the study of several statistical techniques widely applied in many disciplines. We will see that in addition to their role as useful descriptive tools, the mean and the variance often plug into other formulas. Thus, they do double duty. Armed with the tools introduced so far, we will eventually return to the task of finding and describing relationships between two variables.

Chapter 5: Summary of Major Formulas

Individual Data

The Mean Deviation

\[ \text{M.D.} = \frac{\sum |x - \bar{x}|}{n} \]

The Variance, Definitional

\[ s^2 = \frac{\sum (x - \bar{x})^2}{n} \]

The Variance, Computational

\[ s^2 = \frac{\sum x^2 - \dfrac{(\sum x)^2}{n}}{n} \]

Frequency Distributions

Definitional

\[ s^2 = \frac{\sum [(x - \bar{x})^2 f]}{n} = \frac{\sum [(x - \bar{x})^2 f]}{\sum f} \]

Computational

\[ s^2 = \frac{\sum x^2 f - \dfrac{(\sum fx)^2}{\sum f}}{\sum f} \]

Both Individual and Frequency Distribution

The Standard Deviation

\[ s = \sqrt{\text{the variance}} \]

EXERCISES

Note: For the following exercises, refer to the exercises at the end of Chapter 4 for the definitions of the variables.

Exercise 5.1

In the social worker sample (Exercises 4.7 to 4.9), a group of 9 private agency employees was compared to a group of 16 public employees. Following are the health care cost ratings for the private agency employees. Remember that the higher rating indicates more concern about the issue.

Private Agency Employees, Health: 70 55 15 10 5 5 5 0 0

1. Find the mean Health score.
2. Find the median.
3. Find the mean deviation.
4. Find the variance using the definitional formula.
5. Find the variance using the computational formula.
6. Find the standard deviation.

Exercise 5.2

Following are the health care cost ratings for the public employees:

Public Employees, Health: 95 95 95 90 90 90 90 90 90 80 80 75 75 60 40 35

Form a frequency distribution from the above, and using the appropriate formulas:

1. Find the mean Health score.
2. Find the median.
3. Find the variance using the definitional formula.
4. Find the variance using the computational formula.
5. Find the standard deviation.
6. Compare the mean and standard deviation of the public employees to those of the private agency employees found in Exercise 5.1. Which group's scores cluster more closely about its mean?

Exercise 5.3

Management personnel have been scored on a scale measuring assertiveness of leadership style, where more assertiveness indicates less accommodativeness. Are financial and banking managers more assertive than their colleagues in other service industries? Following are scores for 7 managers in finance- or banking-related firms.

Assertiveness: 4 49 9 9 11 68 97

1. Find the mean Assertiveness score.
2. Find the median. (Note that you must first array the data from high to low scores.)
3. Find the mean deviation.
4. Find the variance using the definitional formula.
5. Find the variance using the computational formula.
6. Find the standard deviation.

Exercise 5.4

Following are assertiveness scores for 18 managers from nonfinancial service industries listed in an ungrouped frequency distribution.

x (Assertiveness) f
100 1 97 1 9 1 86 3 54 1 30 1 7 3 4 1 5 1 3 1 0 1

1. Find the mean Assertiveness score.
2. Find the median.
3. Find the variance using the definitional formula.
4. Find the variance using the computational formula.
5. Find the standard deviation.
6. Compare the means and standard deviations of the nonfinancial institution managers to those found in Exercise 5.3. Which group is more assertive? Which group's scores are more spread out about the mean?

Exercise 5.5

Below are the results, in printout format, for the employee sample of Exercise 4.10 (refer to Exercise 4.10 for a definition of the variables). Please note that this was run using SAS, one of several statistical packages available (we will be discussing the most recent version of SAS later in this book). Like most such packages, data are presented with far more decimal places than social scientists need. While suitable for engineers and some scientists, this level of precision is not suitable for the less exact measures that we use. Thus, when discussing the results, we will round to one or two decimal places.

In this exercise, workers have been broken down by region, Midwest versus all other regions combined. Suppose it had been rumored that the corporation was planning to close several plants and move those jobs to plants in other countries with lower wage scales. Suppose it had also been rumored that only plants in the Midwest would be exempt; in all other regions, some plants would be shut down. Let us compare the attitudes of the employees.

Reg = Midwest

Variable    N     Mean          S.D.
ATTEND      13    90.6153846    1.7890
BOARD       13    44.769308     19.663517
DIV         13    76.6153846    16.83031
SECUR       13    67.769308     8.458013
PARTIC      13    39.6153846    35.0868885
OPPOR       13    55.4615385    38.5413531
UNION       13    55.3846154    35.5844968
SALARY      13    65.693077     5.9466909

Reg ≠ Midwest

Variable    N     Mean          S.D.
ATTEND      37    93.79797      5.87008
BOARD       37    34.7837838    18.161509
DIV         37    78.7837838    16.183134
SECUR       37    44.70707      3.619361
PARTIC      37    67.43434      30.5646315
OPPOR       37    30.07070      3.4349661
UNION       37    76.8918919    9.4380553
SALARY      37    49.6486486    7.4764710

1. Compare the means for each variable. What do you conclude?
2. Which region usually has the greater diversity on these dimensions as determined by comparing the standard deviations? In which two scales is that tendency reversed?

Exercise 5.6

Following is a comparison of the managerial group to the employee group.

MGTPOP

Variable    N     Mean          S.D.
ATTEND      89    9.3595506     9.93078
BOARD       89    57.113596     15.1840513
DIV         89    74.94380      16.43054
SECUR       89    56.1685393    3.447949
PARTIC      89    48.5955056    34.673716
OPPOR       89    4.580899      36.3509065
UNION       89    6.4719101     31.6307136
SALARY      89    53.8764045    4.57594

EMPLOY

Variable    N     Mean          S.D.
ATTEND      50    9.900000      8.0997899
BOARD       50    37.3800000    18.783861
DIV         50    78.00000      16.081989
SECUR       50    50.7000000    3.974093
PARTIC      50    60.000000     33.76059
OPPOR       50    36.6400000    35.548614
UNION       50    71.3000000    3.118307
SALARY      50    53.800000     7.7501167

You have already compared the means in Exercise 4.10. Now compare the standard deviations for each variable. What can you conclude? For which variables are the managers more diverse (have larger standard deviations)? For which variables are the employees more diverse?

Exercise 5.7

The two discontented groups, upper-middle management and white-collar employees, are compared in the following sets of data.

UPPER-MIDDLE MANAGEMENT

Variable    N     Mean          S.D.
ATTEND      50    91.8000000    1.8364914
BOARD       50    47.00000      10.9195388
DIV         50    78.6400000    15.160044
SECUR       50    39.8000000    31.07040
PARTIC      50    7.1000000     .0178499
OPPOR       50    16.4800000    18.04378
UNION       50    85.6600000    10.8130873
SALARY      50    36.1000000    11.1158097

WHITE-COLLAR EMPLOYEES

Variable    N    Mean          S.D.
ATTEND      9    93.3103448    5.1137311
BOARD       9    3.8965517     6.9710873
DIV         9    84.068965     9.4354169
SECUR       9    31.3103448    6.7956598
PARTIC      9    81.8965517    17.899115
OPPOR       9    13.174137     16.7333477
UNION       9    9.448758      10.968796
SALARY      9    35.0000000    14.767417

Compare the means and then the standard deviations for each variable. What do you conclude?

Exercise 5.8

For the data in Exercise 4.1, calculate and compare the standard deviations. Use the definitional formula to find the variance for the exporters and the computational formula to find the variance for the nonexporters. Then find and compare the two standard deviations.

Exercise 5.9

For the data in Exercise 4.4, calculate and compare the standard deviations. Use the frequency distribution definitional formula to find the variance for the exporters and the frequency distribution computational formula to find the variance for the nonexporters. Then find and compare the two standard deviations.

KEY CONCEPTS

contingency table
control variable
spurious relationships
causal models
antecedent variable
intervening variable