Constructing a Bivariate Table:

Chapter 6

Introduction

Bivariate Analysis: A statistical method designed to detect and describe the relationship between two nominal or ordinal variables (typically an independent and a dependent variable).

Cross-Tabulation: A technique for analyzing the relationship between two or more nominal or ordinal variables. It allows for consideration of control variables.

Constructing a Bivariate Table

Column variable: A variable whose categories are the columns of a bivariate table (usually the independent variable).
Row variable: A variable whose categories are the rows of a bivariate table (usually the dependent variable).
Marginals: The row and column totals in a bivariate table.

Example of Bivariate Table: Support for Abortion by Job Security (absolute numbers provided)

                              Job Security
Support for       Can Find       Can Not Find
Abortion          Job Easy       Job Easy         Row Total
Yes               24             25               49
No                20             26               46
Column Total      44             51               95

What are the column and row variables? Marginals? What is the disadvantage of providing only absolute numbers? What is the advantage of providing percentages?

Constructing a Bivariate Table: Percentages Can Be Computed in Different Ways:
1. Column Percentages: column totals as base
2. Row Percentages: row totals as base

Column Percentages: Effect of Job Security on Support for Abortion (absolute numbers in parentheses)

                  Can Find       Can Not Find
Abortion          Job Easy       Job Easy         Row Total
Yes               55% (24)       49% (25)         52% (49)
No                45% (20)       51% (26)         48% (46)
Column Total      100%           100%

Questions to Answer When Examining a Bivariate Relationship
1. What are the dependent and independent variables?
2. Does there appear to be a relationship? (One test of significance is chi square.)
3. How strong is it? (Measures of association are used to determine the strength of the relationship.)
4. What is the direction of the relationship?
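The column and row percentages for the Job Security table can be reproduced with a few lines of Python. This is a minimal sketch, not part of the original slides; the names `counts`, `column_percentages`, and `row_percentages` are illustrative choices, and only the four cell counts come from the table.

```python
# Observed counts from the Support for Abortion by Job Security table.
# Rows are the dependent variable, columns the independent variable.
counts = {
    "Yes": {"Can Find Job Easy": 24, "Can Not Find Job Easy": 25},
    "No": {"Can Find Job Easy": 20, "Can Not Find Job Easy": 26},
}


def column_percentages(table):
    """Percentages that use each column total as the base."""
    columns = list(next(iter(table.values())))
    col_totals = {c: sum(row[c] for row in table.values()) for c in columns}
    return {
        label: {c: 100 * row[c] / col_totals[c] for c in columns}
        for label, row in table.items()
    }


def row_percentages(table):
    """Percentages that use each row total as the base."""
    return {
        label: {c: 100 * n / sum(row.values()) for c, n in row.items()}
        for label, row in table.items()
    }


col_pct = column_percentages(counts)
row_pct = row_percentages(counts)

for label, row in col_pct.items():
    cells = ", ".join(f"{c}: {pct:.0f}%" for c, pct in row.items())
    print(f"{label} -> {cells}")
```

Because the independent variable is in the columns, the column percentages are the ones to compare across a row: they show how support differs between the two job security groups.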

Direction of the Relationship

Positive relationship: A bivariate relationship between two variables measured at the ordinal level or higher in which the variables vary in the same direction (both go up or both go down).

Negative relationship: A bivariate relationship between two variables measured at the ordinal level or higher in which the variables vary in opposite directions.

A Positive Relationship (as class goes up, health goes up)

A Negative Relationship (as class goes up, traumas go down)

More Examples
Which are likely to be positive relationships and which negative relationships?
1. The relationship between studying and grades
2. The relationship between partying and grades
3. The relationship between amount of sleep and grades
4. The relationship between color of shoes and grades

Elaboration

Elaboration is a process designed to further explore a bivariate relationship; it involves the introduction of control variables. A control variable is an additional variable considered in a bivariate relationship. The variable is controlled for when we examine the variables in the bivariate relationship.

Elaboration Tests
Spurious relationships
Intervening relationships
Conditional relationships

1. Testing for a spurious relationship

A direct causal relationship is a relationship between two variables that cannot be accounted for by other variables. It is a nonspurious relationship.

A spurious relationship is a relationship in which both the IV and DV are influenced by a third variable. The IV and DV are not causally linked, although they might appear to be if one were unaware of the third variable. The relationship between the IV and DV is said to be explained away by the control variable.

Example of a bivariate relationship that is probably spurious: Number of Firefighters and Property Damage

Number of Firefighters (IV) --(+)--> Property Damage (DV)

A second example of a spurious relationship: a relationship between two variables prior to considering a third variable (that is, prior to elaboration; the two variables appear related)

Sale of Ice Cream --(+)--> Number of Outdoor Crimes

Example of a third (control) variable causing a spurious relationship (elaboration considers control variables):

Winter --(-)--> Sale of Ice Cream
Winter -------> Number of Outdoor Crimes

In-Class Assignment: Write down an example of a spurious relationship (don't confer with your neighbor; just do the best you can to think of one). Identify the dependent and independent variables and the control variable that is causing the spurious relationship.

2. Elaboration can test for an intervening relationship

Intervening relationship: A relationship in which the control variable intervenes between the independent and dependent variables.

Intervening variable: A control variable that follows an independent variable but precedes the dependent variable in a causal sequence.

Intervening relationship: examination of two variables prior to considering a third, intervening variable

Attending Parties (independent variable) ---> Grades (dependent variable)

Examination of an intervening variable between two other variables

Attending Parties (IV) ---> Hours Studying (Intervening Variable) ---> Grades (DV)

In-Class Assignment: What is another example of an intervening relationship? What are the dependent and independent variables, and what is the control variable that intervenes between them?

3. Elaboration tests for conditional relationships

Conditional relationship: A relationship in which the independent variable's effect on the dependent variable depends on (or is conditional on) the category of the control variable. The relationship between the independent and dependent variables will change according to the different conditions (or categories) of the control variable.

Example of a relationship between two variables prior to considering a conditional control variable

Number of Dolls Owned ---> Hours Spent Playing with Dolls

Example of a Conditional Relationship

Number of Dolls Owned ---> Hours Spent Playing with Dolls
Control variable: Gender of Child (two categories: male and female)

Another Example of a Conditional Relationship

Empowering the Employee (letting him/her make decisions) ---> Job Satisfaction
Control variable: Maslow's Hierarchy of Needs (categories range from basic physical needs to esteem and self-actualization needs)

Three Goals of Elaboration
1. Elaboration allows us to test for spurious relationships.
2. Elaboration clarifies the causal sequence of bivariate relationships by introducing variables hypothesized to intervene between the IV and DV.
3. Elaboration specifies the different conditions under which the original bivariate relationship might hold.

Chi Square

Chi square is a test of statistical significance. It tests the null hypothesis of no difference or no association between variables. It answers the question: what is the probability that we would obtain the results we have from our sample if there is truly no association between the two variables?

Example: Table 11.1 provides sample frequencies (taken from your book). What are the dependent and independent variables? In order to calculate a chi square for these data, what would you guess is the next step that should be taken?

Table 11.1 provides actual frequencies from a sample. The next step is to calculate a table of no association. That is, if this sample had been drawn and there were absolutely no association between the variables, what would the table look like?

Table 11.3 provides a perfect table of no association, that is, what one would expect to find if these two variables were not associated. Chi square calculates the chance that you would get the sample data (Table 11.1) if there is actually no association between the two variables (Table 11.3).

[Table 11.3: the column percentages are the same in both columns, 41.9% and 58.1%.]

In this example, the chi square is calculated by doing a mathematical comparison of the two tables. The resulting chi square is 57.99.

The chi square statistic is then looked up in the Distribution of Chi Square table and a p value is obtained (you will not be responsible for calculating the chi square or using the table). If you are using the computer, it will calculate the chi square and provide you with the p value.

In this example the p value = .001. How would you interpret this p value when considering the association between first generation students and sex? How would you relate it to the null hypothesis?
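The comparison between an observed table and its table of no association can be sketched in Python. The counts of Table 11.1 are not reproduced in these notes, so this sketch uses the Support for Abortion by Job Security counts from earlier in the chapter instead; it will not reproduce the 57.99 quoted above. It assumes the standard Pearson formula, chi square = sum of (observed - expected)^2 / expected, with no continuity correction, and the p value shortcut used here is valid only for a 2 x 2 table (1 degree of freedom). All function and variable names are illustrative.

```python
import math

# Observed counts: rows are Yes / No, columns are
# Can Find Job Easy / Can Not Find Job Easy.
observed = [[24, 25], [20, 26]]


def expected_counts(obs):
    """Table of no association: row total * column total / N per cell."""
    row_totals = [sum(row) for row in obs]
    col_totals = [sum(col) for col in zip(*obs)]
    n = sum(row_totals)
    return [[rt * ct / n for ct in col_totals] for rt in row_totals]


def chi_square(obs):
    """Pearson chi square comparing observed and expected counts."""
    exp = expected_counts(obs)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(obs, exp)
        for o, e in zip(obs_row, exp_row)
    )


expected = expected_counts(observed)
chi2 = chi_square(observed)

# With 1 degree of freedom, chi square is a squared standard normal score,
# so the upper-tail probability equals erfc(sqrt(chi2 / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"chi square = {chi2:.2f}, p = {p_value:.2f}")
```

For these counts the chi square is roughly 0.29 with a p value of about 0.59, so the null hypothesis of no association would not be rejected for the job security table.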

In sum, once we have created our table of no association, we can, first, calculate a chi square statistic by mathematically comparing the two tables and, second, determine the p value for that chi square. Thus, chi square is the test statistic that summarizes the association/difference between the observed and the expected values in a bivariate table (the expected values are those we expect to find if there is no association/difference). It allows us to determine the probability that we would obtain the association/difference we see in our sample if there is actually no association/difference between the two variables.

Review

1. Estimating the population parameter from a random sample:

You have been doing research examining the level of physical activity of the students at UNT. You prepared a questionnaire that has a measure for level of physical activity that ranges from 0 (no activity) to 100 (over two hours per day of strenuous physical activity seven days a week). After conducting a random sample survey of 1,500 students, you find the mean level of activity for your sample to be 45 with a standard deviation of 6.5. Use the information from your sample to estimate the level of physical activity for all UNT students (the population parameter) with 95% confidence.

Hint: $S_{\bar{Y}} = S_Y / \sqrt{N}$ and $CI = \bar{Y} \pm Z(S_{\bar{Y}})$

Worked calculation:
$S_{\bar{Y}} = 6.5 / \sqrt{1500} = .168$
$CI = 45 \pm 1.96(.168)$
CI = 44.67 to 45.33

Draw a picture of a normal curve and display the estimated population parameter. What is your conclusion?

2. Estimating whether a group within a population is different from the whole population.

It is known that, nationally, doctors working for health maintenance organizations (HMOs) average 13.5 years of experience in their specialties, with a standard deviation of 7.6 years. The executive director of an HMO in a western state is interested in determining whether the doctors in her HMO have less experience than the national average.
She could not afford to survey all of her HMO doctors. Therefore, she drew a random sample of 150 doctors from her HMO and examined their experience level. She found the sample to have a mean of only 10.9 years of experience.

$Z = \frac{\bar{Y} - \mu_Y}{\sigma_Y / \sqrt{N}}$, or Z = (Group Mean - Population Mean) / (Population SD / square root of Group Size)

Group mean = 10.9 years experience
Group size = 150 doctors
Population mean = 13.5 years experience
Population standard deviation = 7.6 years

Do the doctors in the director's HMO really have fewer years of experience (on average) than doctors nationally, as suggested by the sample, or is the difference found between her sample and the doctors nationally due to sampling error? Calculate and show on a normal curve. What conclusions do you make regarding experience? Why?

Chapter 13

$Z = \frac{10.9 - 13.5}{7.6 / \sqrt{150}} = \frac{-2.6}{.62} = -4.2$, p < .0001
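The arithmetic in review problems 1 and 2 can be checked with Python's standard library. This is a sketch under the same assumptions the slides make (a normal sampling distribution, with the sample standard deviation used for the standard error in problem 1); the function names are illustrative, and the p value reported is one-tailed because the director asks whether her doctors have less experience.

```python
import math
from statistics import NormalDist


def confidence_interval(mean, sd, n, confidence=0.95):
    """Mean plus or minus Z times the standard error of the mean."""
    standard_error = sd / math.sqrt(n)
    # Z score that leaves (1 - confidence) / 2 in each tail, about 1.96 for 95%.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * standard_error, mean + z * standard_error


def one_sample_z(group_mean, pop_mean, pop_sd, n):
    """Z score of a group mean and its lower-tail probability."""
    standard_error = pop_sd / math.sqrt(n)
    z = (group_mean - pop_mean) / standard_error
    return z, NormalDist().cdf(z)


# Problem 1: physical activity of UNT students.
low, high = confidence_interval(mean=45, sd=6.5, n=1500)
print(f"95% CI: {low:.2f} to {high:.2f}")

# Problem 2: experience of doctors in the director's HMO.
z, p_lower = one_sample_z(group_mean=10.9, pop_mean=13.5, pop_sd=7.6, n=150)
print(f"Z = {z:.1f}, one-tailed p = {p_lower:.6f}")
```

The interval matches the 44.67 to 45.33 worked out above, and the Z score rounds to -4.2, far enough into the lower tail that sampling error is an unlikely explanation for the difference.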

3. Estimating whether two groups are different from one another.

Do students who miss a lot of class have lower grades than students who miss only a few classes? A small sample of students was drawn from the UNT population.

                        Missed Many     Missed a Few
Mean grade              63.0            73.0
Standard deviation      13.266          18.205
N                       6               15

Obtained t statistic = -1.450

Can the null hypothesis be rejected with 95% confidence? Explain. What conclusions can you reach about attending class?
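One way to approach the question is to compare the obtained t with a critical value. The sketch below makes two assumptions that the slide does not state: that the test is the equal-variance (pooled) t test, so the degrees of freedom are N1 + N2 - 2 = 19, and that the critical values are the usual ones from a standard t table for 19 degrees of freedom at the .05 level. It does not recompute the t statistic; it only applies the decision rule to the value given.

```python
# Figures taken from the table above.
n_missed_many = 6
n_missed_few = 15
obtained_t = -1.450

# Assumption: equal-variance (pooled) t test, so df = N1 + N2 - 2.
df = n_missed_many + n_missed_few - 2

# Critical t values for df = 19 at alpha = .05, copied from a standard t table.
CRITICAL_T_DF19 = {"two_tailed": 2.093, "one_tailed": 1.729}


def reject_null(t, critical, one_tailed_lower=False):
    """Apply the decision rule for a t test.

    one_tailed_lower=True tests whether the first group's mean is lower,
    so only large negative t values lead to rejection.
    """
    if one_tailed_lower:
        return t < -critical
    return abs(t) > critical


reject_two_tailed = reject_null(obtained_t, CRITICAL_T_DF19["two_tailed"])
reject_one_tailed = reject_null(
    obtained_t, CRITICAL_T_DF19["one_tailed"], one_tailed_lower=True
)

print(f"df = {df}")
print(f"Reject null (two-tailed): {reject_two_tailed}")
print(f"Reject null (one-tailed): {reject_one_tailed}")
```

Under these assumptions the obtained t does not reach either critical value, so the null hypothesis of no difference would not be rejected at the 95% level, even though the sample means differ by ten points.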