Kidane Tesfu Habtemariam, MASTAT, Principle of Stat Data Analysis Project work
1. INTRODUCTION

A food label states the number of calories contained in the food package, that is, the amount of energy in the food. People pay attention to calories because eating more calories than the body uses can lead to weight gain. This project paper presents a statistical analysis of a research problem concerning the accuracy of caloric labeling of diet and health foods, taken from the work published by David B. Allison in the Journal of the American Medical Association (JAMA), September 1993.

Research setting: Foods were sampled from retail merchants throughout the borough of Manhattan, New York, NY. The researchers sampled 40 different food items across three categories: regionally distributed, nationally advertised and locally prepared. They measured the caloric content of each food item with a bomb calorimeter and converted the readings into an estimate of total metabolizable energy. In addition, they calculated the percentage difference between the measured calories and the labeled calories, both per item and per gram.

1.1 Objectives

Determine the accuracy of caloric labeling of diet and health foods.
Assess whether the accuracy differs for certain categories of food suppliers.
Evaluate whether there is evidence of overall underreporting or overreporting of calories per gram on food labels.
Assess whether the degree of underreporting or overreporting of calories per gram differs between regional and national foods.
Evaluate whether the variability of underreporting or overreporting of calories per gram differs between regional and national foods.
Analyze whether the degree of underreporting or overreporting of calories per item differs across food suppliers.
Examine whether there is any relationship between the relative frequency of underreporting of calories per gram and the type of food supplier.
1. Data and Methodology

Data: A sample of 40 food items, comprising regionally distributed (n = 12), nationally advertised (n = 20) and locally prepared (n = 8) items. The per-gram label values are missing for all 8 locally prepared foods. The classification variable denotes the three food suppliers as R, N and L. Calories were measured per item and per gram. For each food, the percentage difference between measured and labeled calories was obtained: a positive difference (measured minus labeled > 0) indicates underreporting and a negative difference indicates overreporting.

Methodology: In the first analysis, descriptive plots (box plots, QQ plots, histograms and bar plots) were used to show the nature and pattern of the dataset. The overall histogram is extremely right-skewed for the per-item data and moderately right-skewed for the per-gram data. The QQ plot for the per-item data is far from normal, and the remedy taken was a log transformation, whereas the per-gram data behave roughly normally apart from some extreme values in the upper tail of the QQ plot. Because the per-gram QQ plot is very sensitive to outliers, the solution was to remove them. The 8 missing per-gram values for locally prepared foods are the whole set of values for that group, so there is no way to replace them and they were omitted. After the Shapiro-Wilk test confirmed that the per-gram measurements are approximately normally distributed, a one-sample t test was applied to evaluate whether there is overall overreporting or underreporting.
Furthermore, one-sample t tests were performed on the per-gram food labeling within the regionally distributed and within the nationally advertised foods to check for overreporting or underreporting. To examine whether the variability of per-gram food labeling differs between regional and national foods, an F test (a test of equality of variances) was performed. For the per-item caloric labeling, the 3 food suppliers were compared by one-way analysis of variance. This was done after log transformation and removal of influential outliers, and after checking with the Bartlett test that the transformed measurements are approximately normally distributed with the same variance in each of the 3 groups. Post-hoc analysis was based on Tukey's Honest Significant Difference method. Food labeling effect estimates were obtained as explained in Section., and are reported together with Bonferroni-adjusted p-values and 95% confidence intervals. To examine the relationship between the relative frequency of underreporting or overreporting per gram and the classification, Fisher's exact test and Pearson's chi-square test were applied.

In all analyses, normality was examined with QQ plots and the Shapiro-Wilk test, and homoscedasticity (i.e. constancy of variance) with the F test and the Bartlett test. P-values below 0.05 (or 5%) are termed statistically significant. All analyses were conducted in R Version.6, using the packages faraway, stats and car.

Results

Section Part I: Els Goetghebeur

2.1 Descriptive Statistics

The overall mean percentage over label is 4% per item and 5% per gram. Per item, the mean percentage over label was 5% (SD = 16%) for regionally distributed foods, 0.13% (SD = 11%) for nationally advertised foods and 82% (SD = 84%) for locally prepared foods. Per gram, regionally distributed foods had a mean percentage over label of 15% (SD = 19%), while nationally advertised foods had a mean of -0.95% (SD = 8%). For locally prepared foods the per-gram data are missing (values not reported).
More is explained in Table.1.

Table.1 Statistical indicators: % difference over food labeling, by classification

                      Per gram                     Per item
Classification        L      N        R           L        N        R
Mean % over label     NA     -0.95%   14.67%      81.75%   0.15%    5.1%
Std. deviation        NA     8.10%    18.7%       83.97%   10.5%    16.07%
Median                NA     -1.0%    1.50%       70%      .5%      6.50%

L = locally prepared (n = 8), N = nationally advertised (n = 20), R = regionally distributed (n = 12). Total N = 40. NA = missing.
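The indicators in Table.1 can be computed along the following lines. This is only a minimal sketch: the report does not state the denominator of the percentage difference, so dividing by the labeled value is an assumption, the function names are hypothetical, and the calorie pairs are made-up illustrations rather than the study data.

```python
from statistics import mean, median, stdev


def percent_over_label(measured_kcal, labeled_kcal):
    """Percentage difference relative to the label (assumed denominator).

    Positive values mean underreporting, negative values overreporting.
    """
    return (measured_kcal - labeled_kcal) / labeled_kcal * 100.0


def summarize(values):
    """Mean, sample standard deviation and median of a list of percentages."""
    return {"mean": mean(values), "sd": stdev(values), "median": median(values)}


# Hypothetical (measured, labeled) calorie pairs per supplier class.
items = {
    "N": [(102.0, 100.0), (95.0, 100.0), (150.0, 150.0)],
    "R": [(130.0, 100.0), (220.0, 200.0), (99.0, 90.0)],
}

# One summary row per classification, as in the layout of Table.1.
summary = {
    group: summarize([percent_over_label(m, l) for m, l in pairs])
    for group, pairs in items.items()
}
```

Keeping the sign convention inside one function makes it harder to mix up underreporting and overreporting later in the analysis.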
Check for Normality of per item via plots

Fig: box plot, histogram and normal QQ plot of the per-item data (left panel) and per-gram data (right panel).

The left-panel plots (fig.1) show the untransformed per-item data. The box plot shows many extreme outliers, and the mean and median differ (table.1). The histogram shows that the data are right-skewed. Finally, the normal QQ plot was used to assess normality through the linearity of the graph, and it indicates that the per-item data are not normally distributed. The remedy is to transform the per-item data to the log scale (a solution for right-skewed data).
Check for Normality of per gram via plots

The right-panel plots (fig.1) show the untransformed per-gram data. In the box plot the mean and median are almost the same, but the data contain a few outliers. Skewness was checked further with the histogram, and normality was inspected with the QQ plot. The data behave roughly normally, although the QQ plot deviates in the upper tail. This signals the existence of outliers, and a possible remedy is to remove them.

Normality of the data

Fig.
As the per-item data are right-skewed, transforming them to the log scale was the solution for meeting normality. For the per-gram data, the only measure needed was to cut the outliers. These are very few, so the absence of these values can be tolerated. The removed outliers are the values greater than 5, of which there are only 3. This brings the dataset to normality.

Section Part II (Stefan Van Aelst)

3.1 Evaluation of overall underreporting/overreporting over labeling per gram

To evaluate whether there is evidence of overall underreporting or overreporting of calories per gram, a one-sample t test was performed. To apply the t test its assumptions have to be fulfilled, namely normality and independent observations. As explained in the methodology section, the per-gram dataset contains extreme values (outliers > 5), and one way to handle this problem is to cut the outliers before testing normality. The Shapiro-Wilk test confirmed that the per-gram dataset is approximately normal (P = 0.78), so the t test can be applied. The missing values were omitted during the analysis.

Fig 3.1 (Plot of per gram)

Fig 3.1 depicts the caloric labeling per gram after removal of the outliers, which ensures the normality of the data.
The hypotheses for the t test are $H_0: \mu_{gram} = 0$ versus $H_1: \mu_{gram} \neq 0$, where $\mu_{gram}$ represents the average percentage difference of measured over labeled calories per gram. A value $\mu_{gram} > 0$ (measured calories minus labeled calories positive) corresponds to underreporting.

The output from R is $t_{28,\,0.05} = 0.68$ with P = 0.5, and the 95% CI is (-2.56, 5.11). The p-value is larger than 0.05 and the 95% confidence interval includes zero. Thus there is insufficient evidence to reject the null hypothesis (H0: the mean percentage difference between measured and labeled calories is zero). With 95% confidence, the true mean % difference of food labeling per gram lies somewhere between -2.56 and 5.11.

Section Part III B.1 & B.2 (Stijn Vansteelandt)

3.2 Evaluation of the degree of underreporting/overreporting of calories per gram

To determine the degree of underreporting or overreporting of calories per gram in regionally distributed and in nationally advertised foods, a one-sample t test was performed for each group. To justify the use of this test the assumption of normality has to be fulfilled. The Shapiro-Wilk test confirmed the normality of the per-gram data for regionally distributed foods (P = 0.36) and for nationally advertised foods (P = 0.99). In both cases the assumption is met and the t test can be applied.

Fig 3.2
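The t statistic and confidence interval used for the per-gram tests can be sketched as follows. This is an illustration under stated assumptions, not the report's R code: the function names are hypothetical, the sample is made up, and the critical value is passed in by hand because the Python standard library has no t-distribution quantile function.

```python
import math
from statistics import mean, stdev


def one_sample_t(values, mu0=0.0):
    """Return (t statistic, degrees of freedom, standard error) for H0: mean = mu0."""
    n = len(values)
    se = stdev(values) / math.sqrt(n)  # standard error of the sample mean
    t_stat = (mean(values) - mu0) / se
    return t_stat, n - 1, se


def confidence_interval(center, se, t_crit):
    """Two-sided interval: center +/- t_crit * se."""
    return center - t_crit * se, center + t_crit * se


# Hypothetical percentage differences, not the study data.
sample = [1.0, 2.0, 3.0, 4.0, 5.0]
t_stat, df, se = one_sample_t(sample)

# Two-sided 5% critical value for 4 degrees of freedom, taken from a t table.
T_CRIT_DF4 = 2.776
ci_low, ci_high = confidence_interval(mean(sample), se, T_CRIT_DF4)
```

If the interval contains zero, the test does not reject the null hypothesis of a zero mean difference at the 5% level, which is the reading used in the report.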
For nationally advertised foods, the hypotheses for the t test are $H_0: \mu_{NA} = 0$ versus $H_1: \mu_{NA} \neq 0$, where $\mu_{NA}$ represents the average percentage difference of measured over labeled calories for nationally advertised foods. A value $\mu_{NA} > 0$ (measured calories minus labeled calories positive) corresponds to underreporting.

The output from R is $t_{19,\,0.05} = -0.52$ with P = 0.61, and the 95% CI is (-4.74, 2.84). The p-value is larger than 0.05 and the 95% confidence interval includes zero. Thus there is insufficient evidence to reject the null hypothesis (H0: the mean percentage difference between measured and labeled calories is zero for nationally advertised food labels). With 95% confidence, the true mean % difference of food labeling per gram for nationally advertised foods lies somewhere between -4.74 and 2.84.

For regionally distributed foods, the hypotheses for the t test are $H_0: \mu_{RG} = 0$ versus $H_1: \mu_{RG} \neq 0$, where $\mu_{RG}$ represents the average percentage difference of measured over labeled calories for regionally distributed foods. A value $\mu_{RG} > 0$ corresponds to underreporting.

The output from R is $t_{11,\,0.05} = 2.71$ with P = 0.02, and the 95% CI is (2.77, 26.56). The p-value is less than 0.05 and the 95% confidence interval excludes zero. Thus the null hypothesis is rejected at the 5% level of significance, and we conclude that the mean percentage difference of measured calories per gram for regionally distributed foods is not zero. With 95% confidence, the true mean % difference of food labeling per gram for regionally distributed foods lies somewhere between 2.77 and 26.56.

Evaluation of the variability of underreporting/overreporting of calories per gram on food labels

Fig 3.2 (box plot) indicates that the median varies between the food suppliers. To examine the variability, an F test was performed. The F test compares the variances of the two independent samples and tests whether the variance is constant across them.
Let us formulate the null hypothesis as follows. Let $\sigma_1^2$ be the variance of nationally advertised foods per gram and $\sigma_2^2$ the variance of regionally distributed foods per gram. We test $H_0: \sigma_1^2 = \sigma_2^2$ versus $H_1: \sigma_1^2 \neq \sigma_2^2$ with the F ratio test, $F = S_1^2 / S_2^2$, the ratio of the sample variances.

The output from R is $F_{19,\,11} = 0.19$ with P = 0.001, and the 95% CI is (0.06, 0.5). The p-value is highly significant, so we reject the null hypothesis and conclude that the variance is not constant across nationally advertised and regionally distributed foods per gram.

Section Part III C (Stijn Vansteelandt)

Figure 3.3

Figure 3.3 suggests that the average percentage difference of caloric labeling per item is higher for locally prepared foods than for the regional and national food suppliers.
This is confirmed by the one-way analysis of variance, which reveals a significant difference in the average percentage difference of caloric labeling between the food suppliers. The box plot in fig 3.3 (green colors) shows that the untransformed per-item caloric labeling deviates extremely from normality. One-way analysis of variance requires homogeneity of variance, and before transformation Bartlett's test is highly significant (P = 3.e-07). To meet the normality requirement, the per-item variable was transformed to the log scale. Even then, Bartlett's test showed a significant difference in variance (P = 0.034). There seem to be some influential outliers in the per-item data for locally prepared foods, and one way to handle this problem is to cut off the extreme values (this was judged not to be dangerous). After that, Bartlett's test confirmed constant variance and the appropriateness of the test (P = 0.1).

The hypotheses for the one-way analysis of variance are $H_0: \mu_L = \mu_R = \mu_N$ versus $H_A$: at least one of the population means differs. The test statistic is $F = \mathrm{MSE}_{between} / \mathrm{MSE}_{within}$, which under $H_0$ follows an $F_{k-1,\,n-k}$ distribution. On the log scale the result is $F_{2,\,25} = 5.91$ with P = 0.0078, which is highly significant.

Post-hoc analysis shows that the average difference in per-item caloric labeling on the log scale equals -1.28 (95% confidence interval (-2.49, -0.07), P = 0.036) between national and local food suppliers, -0.01 (95% confidence interval (-1.2, 1.19), P = 0.99) between regional and local food suppliers, and 1.27 (95% confidence interval (0.25, 2.28), P = 0.01) between regional and national food suppliers. All of these values are reported on the log scale.
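The F statistic of the one-way analysis of variance, the ratio of the between-group to the within-group mean square, can be sketched as follows. The function name is hypothetical and the three groups are made-up log-scale values, not the study data. The sketch also stops at the statistic and its degrees of freedom, because the standard library has no F-distribution p-value.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    k = len(groups)                       # number of groups
    n = sum(len(g) for g in groups)       # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n

    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within, k - 1, n - k


# Hypothetical log-scale values for three supplier classes (L, N, R).
example_groups = [[5.0, 6.0, 7.0], [1.0, 2.0, 3.0], [2.0, 3.0, 4.0]]
f_stat, df_between, df_within = one_way_anova_f(example_groups)
```

The returned degrees of freedom correspond to k - 1 and n - k in the distribution of the statistic under the null hypothesis.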
Section Part V (Yves Rossel)

To evaluate a possible relationship between the (relative) frequency of underreporting of calories per gram and the type of food supplier, Fisher's exact test for count data was applied. It shows no significant difference between nationally advertised and regionally distributed food suppliers with respect to the frequency of underreporting of calories per gram (P = 0.06). The test is an exact method suited to small samples. The 95% CI for the odds ratio is (0.015, 1.13). A similar non-significant result was obtained with Pearson's chi-square test (P = 0.056). In sum, there is no significant relationship between the relative frequency of underreporting of calories per gram and the type of food supplier. The degree of association was also examined with Sakoda's method, which gave 0.46 and can be read as a weak relationship. Another technique applied was fitting 3x2 tables (see appendix and R code) in which the missing values for locally prepared foods were treated as structural zeros. The fitted Poisson log-linear model gave a deviance (G = 4.89) with 1 degree of freedom.

Conclusion

These findings suggest that food labels may be inadequate sources for caloric monitoring. Health care professionals should consider the accuracy of caloric labeling when advising patients to use food labels to help monitor their caloric intake. Locally prepared food labels per item showed a significant degree of underreporting/overreporting of measured calories. Regionally distributed food labels per item showed a significant degree of underreporting/overreporting of measured calories. Nationally advertised food labels per item showed no significant underreporting/overreporting of measured calories.
The overall underreporting/overreporting of calories per item differs significantly between regional and national and between national and local food labels, whereas the comparison between regional and local is not significant. The overall underreporting/overreporting of calories per gram differs significantly between the regional and national food labels. There is no significant association between the relative frequency of underreporting/overreporting of calories per gram and the type of food supplier.
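The two association tests from Section Part V can be sketched for a 2x2 table of counts as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the counts are made up, the chi-square statistic is computed without a continuity correction, and the two-sided Fisher p-value sums the probabilities of all tables that are no more likely than the observed one.

```python
import math


def pearson_chi2_2x2(table):
    """Pearson chi-square statistic (no continuity correction) and p-value, 1 df."""
    (a, b), (c, d) = table
    n = a + b + c + d
    rows, cols = (a + b, c + d), (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = rows[i] * cols[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # For 1 degree of freedom the upper tail equals erfc(sqrt(chi2 / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


def fisher_exact_2x2(table):
    """Two-sided Fisher exact p-value from the hypergeometric distribution."""
    (a, b), (c, d) = table
    r1, r2, c1 = a + b, c + d, a + c
    total = math.comb(r1 + r2, c1)

    def prob(k):
        # Probability of k counts in the top-left cell with all margins fixed.
        return math.comb(r1, k) * math.comb(r2, c1 - k) / total

    p_observed = prob(a)
    low, high = max(0, c1 - r2), min(r1, c1)
    # The small tolerance keeps tables tied with the observed one in the sum.
    return sum(prob(k) for k in range(low, high + 1)
               if prob(k) <= p_observed * (1 + 1e-7))


# Hypothetical counts: rows = supplier (N, R), columns = (under, over).
counts = [[3, 1], [1, 3]]
chi2_stat, chi2_p = pearson_chi2_2x2(counts)
fisher_p = fisher_exact_2x2(counts)
```

With counts as small as these the chi-square approximation is rough, which is why an exact test is usually reported alongside it.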
Appendices

Appendix I (The data set)

Appendix II (Barplot of data by classification)
More informationMath Section MW 1-2:30pm SR 117. Bekki George 206 PGH
Math 3339 Section 21155 MW 1-2:30pm SR 117 Bekki George bekki@math.uh.edu 206 PGH Office Hours: M 11-12:30pm & T,TH 10:00 11:00 am and by appointment More than Two Independent Samples: Single Factor Analysis
More informationEXECUTIVE SUMMARY DATA AND PROBLEM
EXECUTIVE SUMMARY Every morning, almost half of Americans start the day with a bowl of cereal, but choosing the right healthy breakfast is not always easy. Consumer Reports is therefore calculated by an
More informationIntroduction & Basics
CHAPTER 1 Introduction & Basics 1.1 Statistics the Field... 1 1.2 Probability Distributions... 4 1.3 Study Design Features... 9 1.4 Descriptive Statistics... 13 1.5 Inferential Statistics... 16 1.6 Summary...
More informationAnalysis of Variance (ANOVA) Program Transcript
Analysis of Variance (ANOVA) Program Transcript DR. JENNIFER ANN MORROW: Welcome to Analysis of Variance. My name is Dr. Jennifer Ann Morrow. In today's demonstration, I'll review with you the definition
More informationSTAT445 Midterm Project1
STAT445 Midterm Project1 Executive Summary This report works on the dataset of Part of This Nutritious Breakfast! In this dataset, 77 different breakfast cereals were collected. The dataset also explores
More informationSouth Australian Research and Development Institute. Positive lot sampling for E. coli O157
final report Project code: Prepared by: A.MFS.0158 Andreas Kiermeier Date submitted: June 2009 South Australian Research and Development Institute PUBLISHED BY Meat & Livestock Australia Limited Locked
More informationResults & Statistics: Description and Correlation. I. Scales of Measurement A Review
Results & Statistics: Description and Correlation The description and presentation of results involves a number of topics. These include scales of measurement, descriptive statistics used to summarize
More informationSurvey research (Lecture 1) Summary & Conclusion. Lecture 10 Survey Research & Design in Psychology James Neill, 2015 Creative Commons Attribution 4.
Summary & Conclusion Lecture 10 Survey Research & Design in Psychology James Neill, 2015 Creative Commons Attribution 4.0 Overview 1. Survey research 2. Survey design 3. Descriptives & graphing 4. Correlation
More informationSurvey research (Lecture 1)
Summary & Conclusion Lecture 10 Survey Research & Design in Psychology James Neill, 2015 Creative Commons Attribution 4.0 Overview 1. Survey research 2. Survey design 3. Descriptives & graphing 4. Correlation
More informationSection 6: Analysing Relationships Between Variables
6. 1 Analysing Relationships Between Variables Section 6: Analysing Relationships Between Variables Choosing a Technique The Crosstabs Procedure The Chi Square Test The Means Procedure The Correlations
More informationIntro to SPSS. Using SPSS through WebFAS
Intro to SPSS Using SPSS through WebFAS http://www.yorku.ca/computing/students/labs/webfas/ Try it early (make sure it works from your computer) If you need help contact UIT Client Services Voice: 416-736-5800
More informationData, frequencies, and distributions. Martin Bland. Types of data. Types of data. Clinical Biostatistics
Clinical Biostatistics Data, frequencies, and distributions Martin Bland Professor of Health Statistics University of York http://martinbland.co.uk/ Types of data Qualitative data arise when individuals
More informationTo open a CMA file > Download and Save file Start CMA Open file from within CMA
Example name Effect size Analysis type Level Tamiflu Hospitalized Risk ratio Basic Basic Synopsis The US government has spent 1.4 billion dollars to stockpile Tamiflu, in anticipation of a possible flu
More informationCOAL COMBUSTION RESIDUALS RULE STATISTICAL METHODS CERTIFICATION SOUTHERN ILLINOIS POWER COOPERATIVE (SIPC)
Regulatory Guidance Regulatory guidance provided in 40 CFR 257.90 specifies that a CCR groundwater monitoring program must include selection of the statistical procedures to be used for evaluating groundwater
More informationEPS 625 INTERMEDIATE STATISTICS TWO-WAY ANOVA IN-CLASS EXAMPLE (FLEXIBILITY)
EPS 625 INTERMEDIATE STATISTICS TO-AY ANOVA IN-CLASS EXAMPLE (FLEXIBILITY) A researcher conducts a study to evaluate the effects of the length of an exercise program on the flexibility of female and male
More informationTutorial 3: MANOVA. Pekka Malo 30E00500 Quantitative Empirical Research Spring 2016
Tutorial 3: Pekka Malo 30E00500 Quantitative Empirical Research Spring 2016 Step 1: Research design Adequacy of sample size Choice of dependent variables Choice of independent variables (treatment effects)
More informationAppendix III Individual-level analysis
Appendix III Individual-level analysis Our user-friendly experimental interface makes it possible to present each subject with many choices in the course of a single experiment, yielding a rich individual-level
More informationStatistics Guide. Prepared by: Amanda J. Rockinson- Szapkiw, Ed.D.
This guide contains a summary of the statistical terms and procedures. This guide can be used as a reference for course work and the dissertation process. However, it is recommended that you refer to statistical
More informationSummary & Conclusion. Lecture 10 Survey Research & Design in Psychology James Neill, 2016 Creative Commons Attribution 4.0
Summary & Conclusion Lecture 10 Survey Research & Design in Psychology James Neill, 2016 Creative Commons Attribution 4.0 Overview 1. Survey research and design 1. Survey research 2. Survey design 2. Univariate
More informationSimple Linear Regression
Simple Linear Regression Assoc. Prof Dr Sarimah Abdullah Unit of Biostatistics & Research Methodology School of Medical Sciences, Health Campus Universiti Sains Malaysia Regression Regression analysis
More informationNORTH SOUTH UNIVERSITY TUTORIAL 1
NORTH SOUTH UNIVERSITY TUTORIAL 1 REVIEW FROM BIOSTATISTICS I AHMED HOSSAIN,PhD Data Management and Analysis AHMED HOSSAIN,PhD - Data Management and Analysis 1 DATA TYPES/ MEASUREMENT SCALES Categorical:
More informationPRINTABLE VERSION. Quiz 1. True or False: The amount of rainfall in your state last month is an example of continuous data.
Question 1 PRINTABLE VERSION Quiz 1 True or False: The amount of rainfall in your state last month is an example of continuous data. a) True b) False Question 2 True or False: The standard deviation is
More informationTable of Contents. Plots. Essential Statistics for Nursing Research 1/12/2017
Essential Statistics for Nursing Research Kristen Carlin, MPH Seattle Nursing Research Workshop January 30, 2017 Table of Contents Plots Descriptive statistics Sample size/power Correlations Hypothesis
More informationAPPENDIX N. Summary Statistics: The "Big 5" Statistical Tools for School Counselors
APPENDIX N Summary Statistics: The "Big 5" Statistical Tools for School Counselors This appendix describes five basic statistical tools school counselors may use in conducting results based evaluation.
More informationMATH 1040 Skittles Data Project
Laura Boren MATH 1040 Data Project For our project in MATH 1040 everyone in the class was asked to buy a 2.17 individual sized bag of skittles and count the number of each color of candy in the bag. The
More informationMidterm STAT-UB.0003 Regression and Forecasting Models. I will not lie, cheat or steal to gain an academic advantage, or tolerate those who do.
Midterm STAT-UB.0003 Regression and Forecasting Models The exam is closed book and notes, with the following exception: you are allowed to bring one letter-sized page of notes into the exam (front and
More informationSTAT 503X Case Study 1: Restaurant Tipping
STAT 503X Case Study 1: Restaurant Tipping 1 Description Food server s tips in restaurants may be influenced by many factors including the nature of the restaurant, size of the party, table locations in
More informationOne way Analysis of Variance (ANOVA)
One way Analysis of Variance (ANOVA) Esra Akdeniz March 22nd, 2016 Introduction Test hypothesis concerning one population mean. Test hypothesis concerning two population means What if we want to compare
More informationReadings: Textbook readings: OpenStax - Chapters 1 13 (emphasis on Chapter 12) Online readings: Appendix D, E & F
Readings: Textbook readings: OpenStax - Chapters 1 13 (emphasis on Chapter 12) Online readings: Appendix D, E & F Plous Chapters 17 & 18 Chapter 17: Social Influences Chapter 18: Group Judgments and Decisions
More informationPOST GRADUATE DIPLOMA IN BIOETHICS (PGDBE) Term-End Examination June, 2016 MHS-014 : RESEARCH METHODOLOGY
No. of Printed Pages : 12 MHS-014 POST GRADUATE DIPLOMA IN BIOETHICS (PGDBE) Term-End Examination June, 2016 MHS-014 : RESEARCH METHODOLOGY Time : 2 hours Maximum Marks : 70 PART A Attempt all questions.
More information9 research designs likely for PSYC 2100
9 research designs likely for PSYC 2100 1) 1 factor, 2 levels, 1 group (one group gets both treatment levels) related samples t-test (compare means of 2 levels only) 2) 1 factor, 2 levels, 2 groups (one
More informationBiostatistics. Donna Kritz-Silverstein, Ph.D. Professor Department of Family & Preventive Medicine University of California, San Diego
Biostatistics Donna Kritz-Silverstein, Ph.D. Professor Department of Family & Preventive Medicine University of California, San Diego (858) 534-1818 dsilverstein@ucsd.edu Introduction Overview of statistical
More informationChapter 11. Experimental Design: One-Way Independent Samples Design
11-1 Chapter 11. Experimental Design: One-Way Independent Samples Design Advantages and Limitations Comparing Two Groups Comparing t Test to ANOVA Independent Samples t Test Independent Samples ANOVA Comparing
More informationIntroduction to Quantitative Methods (SR8511) Project Report
Introduction to Quantitative Methods (SR8511) Project Report Exploring the variables related to and possibly affecting the consumption of alcohol by adults Student Registration number: 554561 Word counts
More informationWhat you should know before you collect data. BAE 815 (Fall 2017) Dr. Zifei Liu
What you should know before you collect data BAE 815 (Fall 2017) Dr. Zifei Liu Zifeiliu@ksu.edu Types and levels of study Descriptive statistics Inferential statistics How to choose a statistical test
More informationCollecting & Making Sense of
Collecting & Making Sense of Quantitative Data Deborah Eldredge, PhD, RN Director, Quality, Research & Magnet Recognition i Oregon Health & Science University Margo A. Halm, RN, PhD, ACNS-BC, FAHA Director,
More informationREVIEW ARTICLE. A Review of Inferential Statistical Methods Commonly Used in Medicine
A Review of Inferential Statistical Methods Commonly Used in Medicine JCD REVIEW ARTICLE A Review of Inferential Statistical Methods Commonly Used in Medicine Kingshuk Bhattacharjee a a Assistant Manager,
More informationBiostatistics for Med Students. Lecture 1
Biostatistics for Med Students Lecture 1 John J. Chen, Ph.D. Professor & Director of Biostatistics Core UH JABSOM JABSOM MD7 February 14, 2018 Lecture note: http://biostat.jabsom.hawaii.edu/education/training.html
More informationIntroduction to SPSS. Katie Handwerger Why n How February 19, 2009
Introduction to SPSS Katie Handwerger Why n How February 19, 2009 Overview Setting up a data file Frequencies/Descriptives One-sample T-test Paired-samples T-test Independent-samples T-test One-way ANOVA
More informationM 140 Test 1 A Name SHOW YOUR WORK FOR FULL CREDIT! Problem Max. Points Your Points Total 60
M 140 Test 1 A Name SHOW YOUR WORK FOR FULL CREDIT! Problem Max. Points Your Points 1-10 10 11 3 12 4 13 3 14 10 15 14 16 10 17 7 18 4 19 4 Total 60 Multiple choice questions (1 point each) For questions
More informationQuantitative Methods in Computing Education Research (A brief overview tips and techniques)
Quantitative Methods in Computing Education Research (A brief overview tips and techniques) Dr Judy Sheard Senior Lecturer Co-Director, Computing Education Research Group Monash University judy.sheard@monash.edu
More information