Today's Agenda Wk 1 - Welcome to Stat 342


Today's Agenda, Wk 1
- Welcome to Stat 342
- Policy
- Some motivation
- Course Schedule
- Installing SAS
- Resources available
Stat 342 Notes. Week 1, Page 1 / 49

Contact:
E-mail: jackd@sfu.ca
Course website: http://www.sfu.ca/~jackd/stat342
Office Hours: 3-4pm Monday, Wednesday, and Thursday at K9510 (the stats lab), starting in Week 2.
The stats lab, K9510, is available for free tutoring 9:30-4:30 M-F, and sometimes later hours. This room also has computers with R installed for you to use.

What is this class about? My assumption is that most of you are senior undergraduates in statistics, and that you have little to no prior experience with SAS, but that you have some general programming knowledge, likely with R. Also, I assume that you are taking this course because you want to learn programming, and because it is paired with Stat 341 - R.

My intention is to bring maximal value to that assumed demographic. By the end of this course, successful students will be able to:
1. Write simple SAS programs (e.g. loading and saving datasets, creating derivative variables, printing the first few lines of a data set, or its variable names) with only minimal reference to outside sources (e.g. textbook, SAS user guide, Stack Exchange).
2. Plan a typical analysis (e.g. regression, generalized model, contingency), including the input, formatting, and cleaning of data, and appropriate use of variables.

By the end of this course, successful students will be able to:
3. Write SAS programs to run these full analyses with the aid of outside sources.
4. Produce and explain the standard output from several typical analyses that are done in SAS.
5. Pass the SAS Base Programmer certification exam with less than half of the preparation time of someone learning SAS from scratch.

Grading policy 1/2
Minimum scores for letter grades:
A+: 90%  A: 85%  A-: 80%
B+: 76%  B: 72%  B-: 68%
C+: 64%  C: 60%  C-: 55%
D: 50%  F: 0%

Grading policy 2/2
Grades are based on:
1 Midterm x 40% = 40%
1 Final x 50% = 50%
3 Assignments x 3% = 9%
1% free.
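The weighting and cutoffs on the two grading slides can be checked with a few lines of arithmetic. What follows is a minimal Python sketch, not part of the course materials: the names `course_percent` and `letter_grade` are hypothetical, and it assumes that the 1% "free" mark goes to everyone and that each assignment is marked on completion only (done or not done).

```python
# Hypothetical helpers; the weights and cutoffs are copied from the grading slides.
CUTOFFS = [
    ("A+", 90), ("A", 85), ("A-", 80),
    ("B+", 76), ("B", 72), ("B-", 68),
    ("C+", 64), ("C", 60), ("C-", 55),
    ("D", 50), ("F", 0),
]


def course_percent(midterm, final, assignments_done):
    """Overall percentage from a midterm and final score (each 0-100)
    and the number of assignments completed (0-3)."""
    if not 0 <= assignments_done <= 3:
        raise ValueError("there are 3 assignments")
    free = 1.0  # assumption: the 1% free mark is given to everyone
    return 0.40 * midterm + 0.50 * final + 3.0 * assignments_done + free


def letter_grade(percent):
    """Return the first letter whose minimum score is met."""
    for letter, minimum in CUTOFFS:
        if percent >= minimum:
            return letter
    return "F"


if __name__ == "__main__":
    total = course_percent(midterm=70, final=80, assignments_done=3)
    print(total, letter_grade(total))
```

Skipping all three assignments in this sketch costs 9 percentage points, which is enough to move a result down by about two letter-grade steps.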

About the assignments
In previous semesters, large groups of assignments have come in nearly identical, which defeats the purpose of having practice work. Therefore, this semester the weight of assignments is very small, and assignments are graded only on completion, not on correctness. Keys will be made available soon after each assignment's due date. I recommend you do them not for the direct 10%, but for the greater preparation they will afford you on the midterm and on the final exam.

Handing in assignments, late policy
- There are some drop boxes labelled Stat 342 outside the stats lab, on the main floor of Shrum Science Centre K. All assignments are to be handed in there by 4:30pm of the due date.
- Assignments that are not in the drop box (or handed in in class) when the drop box is emptied will not be graded.
- Assignments are graded by TAs, and solutions will be e-mailed out after the due date.

About the textbook
SAS and R: Data Management, Statistical Analysis, and Graphics, by Ken Kleinman and Nicholas J. Horton, is ABSOLUTELY ESSENTIAL. By which I mean: You cannot succeed in this course without a copy of the textbook. It will be referenced frequently and extensively.

The textbook used in this course and in Stat 341 is a fantastic reference guide. It has hundreds of examples of common tasks that can be done in both SAS and R with code and minimal commentary. Compared to a lot of textbooks, this feels a lot more like a supplement. I will be using these examples extensively, so this textbook is considered required for Stat 342. I will try to cover the material at the same time as Carl Schwarz does in Stat 341, which uses the same book.

The lectures, however, will focus a lot more on the theory behind the examples in the book. To use the book more directly would involve a lot of memorization and not much understanding. You can think of the textbook as a recipe book, and the lectures as discussions of cooking technique. Additional readings will include excerpts from:
1. Handbook of SAS DATA Step Programming (by Arthur Li)
2. SAS Certification Prep Guide: Base Programming for SAS 9 (by the SAS Institute)

Finally, some motivation. Why learn SAS when we already have R? R is open source and, as Carl Schwarz will tell you, that means it is free but not cheap. Packages for R are written all over the world by people with varying levels of skill, and varying levels of regard for common styles and documentation. SAS, on the other hand, is more consistent. It's developed mainly from a central campus in Cary, North Carolina.

SAS, being closed source, is also able to provide a guarantee of quality. This is a big deal in situations where the output of an analysis has legal ramifications. It's also excellent at handling very large databases, much better than base R is. These two aspects make it the program of choice for analysis work done at Stats Canada in Ottawa, which is by far the biggest employer of BSc level statisticians in Canada. Stats Canada also has an annex at UBC and in the Health Sciences department here at SFU.

Pharmaceutical drug trials in the United States, for example, have very strict controls regarding their data and registration. SAS is the tool of choice for pharmaceuticals and lots of other medical work in industry. Emmes Canada is a statistical consulting firm stationed in IRMACS that primarily operates in SAS.

We will talk about the Base Programmer exam a lot in the course, but it's the first of 13 credentials that SAS offers, including Clinical Trials Programmer. See:
https://support.sas.com/certify/creds/ct.html
and
https://support.sas.com/learn/ap/index.html
SAS Canada, stationed in Toronto, wants to know whenever someone at SFU passes one of these exams, so successful candidates can be added to their hiring list. Employers regularly contact the Toronto office looking for people with SAS experience, and those opportunities get passed on to us.

Finally, there's the SAS Institute itself, which hires thousands of statisticians and programmers to develop SAS and JMP. Here's how it places on the Forbes 2016 best employers list.

Here's SAS among information technology companies.

Course Schedule
Lectures will be Thursday, 12:30 PM - 2:20 PM, in Education Building room 7618 at the Burnaby Campus. The Education Building is attached to AQ, and is on the opposite side from the Shrum Science buildings.

Week 1 Thursday, Sept 7. Introduction to Stat 342, policies, schedule. How to install SAS. Introduction to other references. General introduction to SAS.

Week 2 Thursday, Sept 14. Textbook Chapters 1 and 2. Data steps and proc steps. The compile phase and the execution phase. Proc print. Proc contents. Input and output. Proc import, proc export, delimiters, the set command, the input command. Dbms and file formats. (As time permits) Libraries and libref.

Week 3 Thursday, Sept 21. Spillover from Week 2, finishing input and output. SQL. Proc SQL, RMySQL, and MySQL in general. The artful software collection of SQL programs. Select, from, where. Connections in general. Sorts.

Week 4 Thursday, Sept 28. Assignment 1 Due. SQL inner/left/right/outer merges, concatenations. (If time) Dashboards and PHP. The long format of data, transposing. Making new variables from old ones. Transformations. Probability distributions.

Week 5 Thursday, Oct 6. Textbook Chapter 3. Probability distribution application: Getting p-values from scratch with CDF. Example problems, overview of homework issues. Practice midterm.
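As a preview of what "getting p-values from scratch with CDF" means, here is a minimal sketch. It is written in Python rather than SAS purely as an illustration and is not taken from the course materials; the function names are hypothetical, and it assumes a test statistic that follows a standard normal distribution under the null hypothesis.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, P(Z <= z), written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def two_sided_p_value(z):
    """P(|Z| >= |z|): the area in both tails beyond the observed statistic."""
    return 2.0 * (1.0 - normal_cdf(abs(z)))


def upper_tail_p_value(z):
    """P(Z >= z): the one-sided p-value for an upper-tailed test."""
    return 1.0 - normal_cdf(z)


if __name__ == "__main__":
    for z in (0.0, 1.0, 1.96, 3.0):
        print(z, two_sided_p_value(z))
```

The same idea carries over to other distributions: the p-value is a tail area, and a tail area is one minus (or simply) a CDF evaluated at the observed statistic.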

Week 6 Thursday, Oct 13. The practice midterm key will be given out over the Thanksgiving break, Oct 7-10. Midterm exam, 90 minutes (100 depending on room logistics).

Week 7 Thursday, Oct 20. Integer and floating point operations. The similar subtraction issue. Matrix inversion example. Tolerance. Random number generation, specifically getting a sample of rows instead of the first few.
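To give a flavour of the floating point topics, here is a small sketch. It is in Python, not SAS, and is only an illustration; it assumes that "the similar subtraction issue" refers to the loss of precision when two nearly equal numbers are subtracted, and that "tolerance" refers to comparing numbers up to a small allowed error rather than exactly.

```python
import math

# 0.1, 0.2 and 0.3 have no exact binary representation,
# so the computed sum is not exactly the stored value of 0.3.
total = 0.1 + 0.2
exactly_equal = (total == 0.3)
close_enough = math.isclose(total, 0.3, rel_tol=1e-9)

# Subtracting nearly equal numbers: the true answer is 1e-15, but
# 1.0 + 1e-15 is rounded when stored, and the subtraction exposes
# that rounding error as a large relative error in the result.
a = 1.0 + 1e-15
difference = a - 1.0
relative_error = abs(difference - 1e-15) / 1e-15

if __name__ == "__main__":
    print("exact comparison:", exactly_equal)
    print("comparison with tolerance:", close_enough)
    print("difference:", difference)
    print("relative error:", relative_error)
```

The exact comparison fails while the tolerance-based one succeeds, which is why numerical routines such as matrix inversion are usually checked against a tolerance.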

Week 8 Thursday, Oct 27. Assignment 2 Due. Means, moments, quantiles, standardizing. Correlation. Proc freq, cross tabs and row/column totals. Tables option, Chisq option, cmh option, missprint option. McNemar's test (as time permits).

Week 9 Thursday, Nov 3. Hypothesis tests: normality, equal variance, t-test, proc univariate, proc ttest, proc npar1way. proc reg for regression with more details, model setting. proc glm for a wider range of models. Note that glm stands for general linear models here, as in multivariable linear regression with interactions and categorical variables.

Week 10 Thursday, Nov 10. Dummy variables and categorical variables. Diagnostics: leverage, Cook's D, residuals and residual plots. Prediction bounds, r-squared. proc logistic for binary responses, odds ratios. Theory: the sigmoid curve.

Week 11 Thursday, Nov 17. (As time permits) Automated model selection, such as stepwise regression. Control of flow: IF-THEN, DO loops.

Week 12 Thursday, Nov 24. Assignment 3 Due. Plots. Contour plots, density histograms, bar graphs, CDFs. (If time permits) Survival plots (Kaplan-Meier).

Week 13 Thursday, Dec 1. Buffer time. Final exam prep.
Final exam: Monday, Dec 11, 3:30-6:30pm. Location to be announced.

Note about the drop deadline. Officially the deadline is Monday, Oct 10, but that's a statutory holiday and all offices will be closed. Consider Friday, Oct 7 the deadline, just in case. Also see: http://www.sfu.ca/students/deadlines/fall2016.html

Regarding notes: Many course notes will be in a fill-in-the-blank system. Before each lecture, I will e-mail out notes as PDFs like I did with these ones, but with blanks to be filled in during class. The rest will be written during class using a document camera, and will be scanned into a PDF soon after the class.

If you are taking notes on paper, I recommend printing these notes so that 4-6 slides appear on a single page. There are single-slide breaks between every 10-15 minutes of material. On these break slides, I like to include pictures of cute/funny animals with stupid stats puns. If there are any animals that you feel uncomfortable seeing (mice, reptiles, fish, birds, whatever), please e-mail me a request not to include those.

Regarding collaboration, honesty, and plagiarism
None of the assignments or exams for this course are recycled from previous sources. Anyone claiming to have a test bank for this offering of this course is lying. Please include the names of your collaborators on your assignments. This way, when some solutions look very similar, the markers will understand that there wasn't blind copying.

You are encouraged to work together to do the computational and analytical portions of the assignments. However, all written work is expected to be solely yours. Copying the writing of another student, or using services to write assignments on your behalf, will be considered academically dishonest and will be dealt with as appropriate under SFU's academic dishonesty policy. The use of proofreading and essay skills services, such as those in the Student Learning Commons, is perfectly fine.

Resources for Installing SAS
SFU Software Library: https://www.sfu.ca/itservices/technical/software.html
http://www.sas.com/en_us/software/university-edition.html
VirtualBox for PC, or VMWare for PC/Mac

This is VirtualBox. It can be used to make a computer inside a computer. We'll be using it to make a SAS server locally on your computer. Download it first.

Next, download SAS University Edition, and open it with VirtualBox.

Open the 2 GB file and choose to Import, with all the defaults.

When it's imported, you can start the virtual machine you have created.

The virtual machine will start up and show this screen. Leave it on. From your web browser, go to http://localhost:10000

http://localhost:10000 will bring you here. Press the big red button.

A new tab opens up, letting you write, save, and run SAS code!

Other SAS references. The library has several SAS books available, including SAS Certification Prep Guide: Base Programming for SAS 9.

SAS Certification Prep Guide: Base Programming for SAS 9
This book is available as an ebook, and can be found by searching for it at http://www.lib.sfu.ca/
Or by following this link directly (and using your SFU login):
http://proquest.safaribooksonline.com.proxy.lib.sfu.ca/9781607649243
This book covers data manipulation and the underlying mechanics of SAS in more detail than you'll want, but it is the authoritative guide if you want to take the cert exam.

Stack Exchange
This is a question-and-answer forum for technical and programming questions of all sorts. The statistics part of it is called 'Cross Validated', but searching for 'stack exchange' will get you there. If you have a problem, there's a good chance someone else has had it first.
http://stats.stackexchange.com/questions/tagged/sas

SAS knowledge base documentation. These are most useful after you've spent some time with SAS. Like the R support documents that come up when you type ?function, they have a lot of details that will be overwhelming until you know what you're looking for.
http://support.sas.com/documentation/index.html