Dr. Kelly Bradley    Final Exam    Summer
{2 points} Name: ______________________

You MUST work alone: no tutors; no help from classmates. Email me or see me with questions. You will receive a score of 0 if this rule is violated. This exam is being scored out of 100 points.

EPE/EDP 660 Exam 3

{5 points} Minitab (or other approved software) output must be included. It must be clearly labeled, with all answers clearly identified. In addition, you must include a copy of your session window. Do NOT include a copy of the worksheet.

Directions: Read each question before responding. In order to receive partial credit, work must be shown.

PART A: (21 points) Fill in each blank on the answer sheet with the best choice from the Word Bank. {1 point each blank}

Word Bank for Items 1-11

Accept                  Correlation      Heteroscedastic            Logistic Regression   Replication
ANOVA                   Descriptive      Homoscedastic              Multicollinearity
Block Design            Exaggeration     Hypothesis                 Multiple Regression   Qualitative
Causation               Experiment       Inferential                Parsimony             Quantitative
Completely Randomized   Extrapolation    Interaction                Partitioning          Type I
Corollary               Fail to Reject   Least Squares Regression   Reject                Type II

(1) The two branches of statistics are ________ and ________.
(2) The ________ measures the direction and strength of the linear association between two variables.
(3) ________ is the idea that simpler models are easier to understand and appreciate, and therefore have a "beauty" that their more complicated counterparts often lack.
(4) If H0 is true and we reject it, we have made a ________ error.
(5) In a(n) ________ design, the total sum of squares is made up of the treatment sum of squares and the error sum of squares.
(6) Predicting y when the x values are outside the range of experimentation is ________.

(7) We refer to ________ as the error term ε having constant variance σ² for all levels of the independent variables.
(8) If the effect of a 1-unit change in one independent variable depends on the level of the other independent variable, we have a(n) ________.
(9) In ________, the β parameter is interpreted as the percentage change in odds for every 1-unit increase in x_i, holding all other x's fixed.
(10) In a hypothesis test, if the p-value = .94 and you have set alpha at .05, you would ________ the null hypothesis.
(11) ________ occurs when two (or more) independent variables in a regression are related; they measure essentially the same thing.

True/False: Determine the correctness of each statement by assigning the best choice, True or False.

Using the following table, answer items 12-14.

ID   Age   Score [0-100%]   Sex   Disease [Relapse or Remission]
1    24    89               F     Relapse
2    32    74               F     Remission
3    36    77               F     Remission
4    28    92               M     Relapse

(12) ID is an ordinal measure.
(13) Sex could be classified as a qualitative variable and a nominal measure.
(14) Score is a ratio measure.
(15) When choosing a measure of central tendency, if the data set has extreme values, the mean would be the best measure.

(16) Range and standard deviation are both measures of variability.
(17) To test if all of the slope parameters are zero, we use an F test.
(18) The value of SST does not change with the model, as it depends only on the values of the dependent variable y.
(19) Once an interaction has been deemed important in a model, we cannot remove any associated first-order terms in the model.
(20) In a completely randomized experimental design with 4 factors and 4 levels, 8 treatments exist.

PART B: Short Answer (30 POINTS) Answer the questions below. {5 points each}

(1) In hypothesis testing, does rejecting the null hypothesis prove that the research hypothesis is correct? Specifically, can we accept the alternative? Explain.
(2) A colleague conducts a study and finds a positive correlation between income and health. She concludes that higher income causes better health. Is this a suitable conclusion? Explain.
(3) Explain when we might use stepwise regression, and note at least one reason we would need to use caution in drawing inferences from a stepwise model.
(4) In an experimental design, what is the purpose of blocking? Explain.
(5) Consider the assumption of equal population variances in ANOVA. Why is this important? Explain.
(6) In an ANOVA, why is it preferable to use a follow-up analysis such as Tukey's Multiple Comparisons of Means as opposed to multiple t-tests?
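As a hedged illustration of the concern behind Part B item (6), the sketch below computes how the chance of at least one Type I error grows when every pair of group means is compared with its own t-test. It assumes the pairwise tests are independent, which is only an approximation when the tests share groups, and `familywise_error` is a hypothetical helper name rather than a function from any statistics package.

```python
from math import comb


def familywise_error(num_groups: int, alpha: float = 0.05) -> float:
    """Approximate P(at least one Type I error) over all pairwise tests.

    Assumption: the pairwise tests are treated as independent.
    """
    num_comparisons = comb(num_groups, 2)  # k(k-1)/2 distinct pairs
    return 1 - (1 - alpha) ** num_comparisons


# Three groups (for example, three SES levels) give three pairwise tests.
fwer_three_groups = familywise_error(3)
print(round(fwer_three_groups, 4))
```

Under this approximation the error rate across the family of tests exceeds the per-test alpha as soon as more than two groups are compared, which is the inflation that a multiple-comparison procedure is designed to control.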

PART C: Data Analysis (42 points)

*** (Use α = .05) for testing purposes ***

Consider the following data set (posted on the website as HSBfinal.xls). The High School and Beyond data set includes the following variables: sex (1-male, 2-female), SES (socioeconomic status: 1-low, 2-middle, 3-upper), school type (1-public, 2-private), type of high school program (1-general, 2-academic or 3-vocational), self-concept scores, and motivation level scores, in addition to test scores on an achievement test in writing. The data are posted as HSB Data for Final under Exams on the website.

1. Descriptive statistics were produced for all the continuous variables, including a correlation matrix. Using the output below, describe the distribution of each variable and their relationship with one another. {5 points}

Descriptive Statistics: self concept, motivation, WRTG

Variable       N     N*   Mean     StDev    Minimum   Q1        Median   Q3
self concept   600   0    0.0049   0.7055   -2.6200   -0.3000   0.0300   0.4400
motivation     600   0    0.6608   0.3427   0.0000    0.3300    0.6700   1.0000
WRTG           600   0    52.385   9.726    25.500    44.300    54.100   59.900

Variable       Maximum   Skewness   Kurtosis
self concept   1.1900    -0.90      1.56
motivation     1.0000    -0.59      -0.88
WRTG           67.100    -0.47      -0.70

Correlations: self concept, motivation, WRTG

             self concept   motivation
motivation   0.289
WRTG         0.019          0.254
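For readers who want to see what an entry in a correlation matrix represents, here is a minimal sketch of the Pearson correlation computed from raw paired scores. The five data pairs are made up purely for illustration and are not taken from the HSB data, and `pearson_r` is a hypothetical helper name.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation: co-variation of x and y scaled by their spreads."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be the same length, with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Made-up illustrative scores (NOT from the exam data set).
toy_x = [1, 2, 3, 4, 5]
toy_y = [2, 4, 5, 4, 5]
toy_r = pearson_r(toy_x, toy_y)
print(round(toy_r, 3))
```

The sign of the result gives the direction of the linear association and its magnitude gives the strength, which is how each entry in a correlation matrix is read.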

Dr. Kelly Bradley    Final Exam    Summer 2013

2. A multiple regression equation was computed to explain the variation in Self-Concept, with a summary residual analysis. Using the output below,
A. Write the regression model in population format. Label each component, i.e., main effect, error, etc. {5 points}
B. Determine if the model has utility. Report your p-value and explain the decision. {3 points}
C. Test the significance of the variables included. Interpret the results. {3 points}
D. Do you feel the assumptions of regression have held in this analysis? Explain. {4 points}

Regression Analysis: self concept versus motivation, SEX, motivation*sex

The regression equation is
self concept = 0.209 + 0.195 motivation - 0.403 SEX + 0.279 motivation*sex

Predictor        Coef      SE Coef   T       P
Constant         0.2094    0.1875    1.12    0.265
motivation       0.1953    0.2595    0.75    0.452
SEX              -0.4034   0.1185    -3.41   0.001
motivation*sex   0.2792    0.1602    1.74    0.082

S = 0.666568   R-Sq = 11.2%   R-Sq(adj) = 10.7%

Analysis of Variance

Source       DF    SS        MS       F       P
Regression   3     33.341    11.114   25.01   0.000
Error        596   264.810   0.444
Total        599   298.151

[Figure: Residual Plots for self concept. Four panels: Normal Probability Plot; Versus Fits (x-axis: Fitted Value); Histogram; Versus Order (x-axis: Observation Order).]
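The summary quantities in a regression ANOVA table are tied together by simple arithmetic, and the sketch below shows those relationships under stated assumptions. The sums of squares are read from the table above with their dropped digits restored (an assumption on our part: 33.341 and 264.810), and `regression_summary` is a hypothetical helper name. The p-value is not computed here; it would come from the F distribution in Minitab or other approved software.

```python
def regression_summary(ss_regression, ss_error, df_regression, df_error):
    """Recompute F, R-Sq and R-Sq(adj) from the sums of squares and df."""
    ss_total = ss_regression + ss_error
    df_total = df_regression + df_error
    ms_regression = ss_regression / df_regression
    ms_error = ss_error / df_error
    return {
        "F": ms_regression / ms_error,                # global utility statistic
        "R-Sq": ss_regression / ss_total,             # share of variation explained
        "R-Sq(adj)": 1 - ms_error / (ss_total / df_total),
    }


# Assumed values: SS entries as reconstructed from the ANOVA table.
summary = regression_summary(33.341, 264.810, df_regression=3, df_error=596)
for name, value in summary.items():
    print(name, round(value, 4))
```

Checking that the recomputed F and R-Sq agree with the printed output is a quick way to confirm that a table has been read correctly before interpreting it.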

3. Using an ANOVA approach
A. Conduct an analysis to determine if there is a significant difference between the self-concept of students by SES (1 = low, 2 = average, 3 = high). {3 points}
   i. Produce the 4 in 1 plot. {1 point}
   ii. Produce the comparative boxplots. {2 points}
   iii. Make sure to run Tukey's post hoc. {2 points}
B. Based on your results, is there sufficient evidence of a difference between the self-concept of students for different SES levels? Explain. {3 points}
C. If you found an overall difference, where did the individual differences lie? Justify your answer. {3 points}

4. Researchers decided to block on School Type to attempt to control for variation.
A. List the explained and unexplained components of the model. List the random effect(s). {4 points}
B. Using the output below, determine if the blocking was useful. Explain. {4 points}

General Linear Model: self concept versus SES, School Type

Factor        Type     Levels   Values
SES           fixed    3        1, 2, 3
School Type   random   2        1, 2

Analysis of Variance for self concept, using Adjusted SS for Tests

Source        DF    Seq SS     Adj SS     Adj MS   F      P
SES           2     4.5017     4.2549     2.1274   4.32   0.014
School Type   1     0.0521     0.0521     0.0521   0.11   0.745
Error         596   293.5972   293.5972   0.4926
Total         599   298.1510

S = 0.701864   R-Sq = 1.53%   R-Sq(adj) = 1.03%

When you are finished, submit your exam and celebrate. You have just completed 660 in the 4-week summer session!
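As a hedged companion to the General Linear Model output in question 4, the sketch below shows how each F ratio in such a table is formed: the adjusted mean square for a source divided by the error mean square. The adjusted sums of squares are read from the output with dropped digits restored (an assumption on our part, for example 0.0521 for School Type), and `f_ratio` is a hypothetical helper name. It assumes an additive model with no SES by School Type interaction, so both sources are tested against the same error term.

```python
def f_ratio(adj_ss: float, df: int, ms_error: float) -> float:
    """F statistic for one source: its adjusted mean square over MSE."""
    return (adj_ss / df) / ms_error


# Assumed values: entries as reconstructed from the GLM table.
ms_error = 293.5972 / 596            # error mean square
f_ses = f_ratio(4.2549, 2, ms_error)        # treatment (SES)
f_school = f_ratio(0.0521, 1, ms_error)     # block (School Type)

print("F for SES:", round(f_ses, 2))
print("F for School Type:", round(f_school, 2))
```

The block's F ratio compares the variation removed by blocking with the variation left unexplained, so it is the natural quantity to inspect alongside its p-value from the software output.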