Administrative notes. Computational Thinking ct.cs.ubc.ca


Administrative notes
March 14: Midterm 2. This will cover all lectures, labs and readings between Tue Jan 31 and Thu Mar 9 inclusive.
Practice Midterm 2 is on the Exercises webpage: http://www.ugrad.cs.ubc.ca/~cs100/2016w2/exercises.html#exams
March 17: In the News call #3
March 30: Project deliverables and individual report due

Administrative notes
Check Project Rubric on the Connect grade centre to learn which rubric we will be using to grade your project. Find your rubric at http://www.ugrad.cs.ubc.ca/~cs100/2016w2/project-grading.html#projectmarkingscheme. If you have questions, please email your project TA (also listed on Connect).
We will email you which projects you should review. Please ensure that email forwarding for your CS email (CS_ID@ugrad.cs.ubc.ca) works (you should have set this up in Lab 0).

Data Mining 4 Mining by Association: Apriori algorithm wrap-up

Recall: How to predict the future? Association rules
An association rule X → Y suggests that people who buy items in set X are also likely to want items in Y.
Valid association rules are mined from training data, e.g. store purchases.
Association rules are useful to stores, and also in areas such as medical diagnoses, protein sequence composition, health insurance claim analysis and census data.

When is an association rule valid?
We are given two thresholds:
  Support threshold
  Confidence threshold
A rule X → Y is valid with respect to these thresholds if
  The support of X ∪ Y is at least the support threshold
  The confidence of X → Y is at least the confidence threshold

Support: The degree to which items appear together
The support of a set of items is the fraction of transactions that contain all items in the set.
Transaction  Items
T1           Sushi, Chicken, Milk
T2           Sushi, Bread
T3           Bread, Vegetables
T4           Sushi, Chicken, Bread
T5           Sushi, Chicken, Ramen, Bread, Milk
T6           Chicken, Ramen, Milk
T7           Chicken, Milk, Ramen
Here, the set {Chicken, Ramen, Milk} has support 3/7 (it appears in T5, T6 and T7).
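The definition of support above can be sketched in a few lines of Python. This is only an illustration: the names `TRANSACTIONS` and `support` are chosen here and do not come from the slides.

```python
from fractions import Fraction

# Transactions T1..T7 from the slide (variable name is an assumption).
TRANSACTIONS = [
    {"Sushi", "Chicken", "Milk"},
    {"Sushi", "Bread"},
    {"Bread", "Vegetables"},
    {"Sushi", "Chicken", "Bread"},
    {"Sushi", "Chicken", "Ramen", "Bread", "Milk"},
    {"Chicken", "Ramen", "Milk"},
    {"Chicken", "Milk", "Ramen"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)  # <= is the subset test
    return Fraction(hits, len(transactions))


print(support({"Chicken", "Ramen", "Milk"}, TRANSACTIONS))  # 3/7
```

Using `Fraction` keeps the result in the same form as the slides (3/7) instead of a rounded decimal.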

Confidence: Cause → Effect
The confidence of rule X → Y is the fraction of transactions containing all items in X that also contain all items in Y.
The following rules both have confidence 3/3 = 1:
  Ramen → {Milk, Chicken}
  {Ramen, Chicken} → Milk
Transaction  Items
T1           Sushi, Chicken, Milk
T2           Sushi, Bread
T3           Bread, Vegetables
T4           Sushi, Chicken, Bread
T5           Sushi, Chicken, Ramen, Bread, Milk
T6           Chicken, Ramen, Milk
T7           Chicken, Milk, Ramen
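As a rough sketch of the confidence definition above, the following Python counts the transactions that contain X and then the share of those that also contain Y. The function name `confidence` and the choice to raise an error when X never occurs are assumptions made for this example.

```python
from fractions import Fraction

# Transactions T1..T7 from the slide (variable name is an assumption).
TRANSACTIONS = [
    {"Sushi", "Chicken", "Milk"},
    {"Sushi", "Bread"},
    {"Bread", "Vegetables"},
    {"Sushi", "Chicken", "Bread"},
    {"Sushi", "Chicken", "Ramen", "Bread", "Milk"},
    {"Chicken", "Ramen", "Milk"},
    {"Chicken", "Milk", "Ramen"},
]


def confidence(x, y, transactions):
    """Of the transactions containing all of x, the fraction that also contain all of y."""
    x, y = set(x), set(y)
    with_x = [t for t in transactions if x <= t]
    if not with_x:
        # Assumption: confidence is left undefined when x never occurs.
        raise ValueError("no transaction contains x")
    with_both = sum(1 for t in with_x if y <= t)
    return Fraction(with_both, len(with_x))


print(confidence({"Ramen"}, {"Milk", "Chicken"}, TRANSACTIONS))  # 1
print(confidence({"Ramen", "Chicken"}, {"Milk"}, TRANSACTIONS))  # 1
```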

Exercise: Which rules X → Y are valid?
Thresholds: support is 3/7, confidence is 1
Is the support of X ∪ Y at least 3/7? (support: fraction of transactions that contain X ∪ Y)
Is the confidence of X → Y at least 1? (confidence: fraction of transactions containing X that also contain Y)
A. Chicken → Milk
B. Ramen → Milk
C. Both
Transaction  Items
T1           Sushi, Chicken, Milk
T2           Sushi, Bread
T3           Bread, Vegetables
T4           Sushi, Chicken, Bread
T5           Sushi, Chicken, Ramen, Bread, Milk
T6           Chicken, Ramen, Milk
T7           Chicken, Milk, Ramen
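One way to check an answer to this exercise is to code the two tests directly. The sketch below is illustrative only; `support`, `confidence` and `is_valid` are names invented here, and confidence is computed as support(X ∪ Y) / support(X), which equals the fraction described on the slide whenever X occurs at least once.

```python
from fractions import Fraction

# Transactions T1..T7 from the slide (variable name is an assumption).
TRANSACTIONS = [
    {"Sushi", "Chicken", "Milk"},
    {"Sushi", "Bread"},
    {"Bread", "Vegetables"},
    {"Sushi", "Chicken", "Bread"},
    {"Sushi", "Chicken", "Ramen", "Bread", "Milk"},
    {"Chicken", "Ramen", "Milk"},
    {"Chicken", "Milk", "Ramen"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return Fraction(sum(1 for t in transactions if itemset <= t), len(transactions))


def confidence(x, y, transactions):
    """support(X ∪ Y) / support(X); assumes X occurs in at least one transaction."""
    return support(set(x) | set(y), transactions) / support(x, transactions)


def is_valid(x, y, transactions, min_support, min_confidence):
    """A rule is valid when both thresholds are met."""
    return (support(set(x) | set(y), transactions) >= min_support
            and confidence(x, y, transactions) >= min_confidence)


for x, y in [({"Chicken"}, {"Milk"}), ({"Ramen"}, {"Milk"})]:
    print(sorted(x), "->", sorted(y),
          "support", support(x | y, TRANSACTIONS),
          "confidence", confidence(x, y, TRANSACTIONS),
          "valid", is_valid(x, y, TRANSACTIONS, Fraction(3, 7), 1))
```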

The association rule data mining problem
Input: A table of transactions, a support threshold and a confidence threshold
Output: All of the valid association rules

The Apriori algorithm for finding valid association rules
The Apriori algorithm has two main tasks:
  1. Find all frequent itemsets, i.e., those with support at least the given support threshold
  2. Find all rules X → Y with confidence at least the given confidence threshold
Calculating association rules on terabytes of data can be sloooowww. The slowest part is finding the frequent itemsets. Let's get back to these.
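The slides concentrate on the first task. For the second task, a minimal sketch (not taken from the slides) is to split a frequent itemset into every possible left-hand side X and remainder Y, and keep the splits whose confidence meets the threshold. The function name `rules_from_itemset` is hypothetical, and the code assumes the itemset passed in has non-zero support.

```python
from fractions import Fraction
from itertools import combinations

# Transactions T1..T7 from the earlier slides (variable name is an assumption).
TRANSACTIONS = [
    {"Sushi", "Chicken", "Milk"},
    {"Sushi", "Bread"},
    {"Bread", "Vegetables"},
    {"Sushi", "Chicken", "Bread"},
    {"Sushi", "Chicken", "Ramen", "Bread", "Milk"},
    {"Chicken", "Ramen", "Milk"},
    {"Chicken", "Milk", "Ramen"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return Fraction(sum(1 for t in transactions if itemset <= t), len(transactions))


def rules_from_itemset(itemset, transactions, min_confidence):
    """All rules X -> (itemset - X) with confidence >= min_confidence.

    Assumes itemset is frequent, so every subset has non-zero support.
    """
    itemset = frozenset(itemset)
    whole = support(itemset, transactions)
    rules = []
    for size in range(1, len(itemset)):            # proper, non-empty subsets only
        for combo in combinations(sorted(itemset), size):
            x = frozenset(combo)
            conf = whole / support(x, transactions)
            if conf >= min_confidence:
                rules.append((x, itemset - x, conf))
    return rules


for x, y, conf in rules_from_itemset({"Chicken", "Milk", "Ramen"}, TRANSACTIONS, 1):
    print(sorted(x), "->", sorted(y), "confidence", conf)
```

On this data the sketch recovers the two confidence-1 rules shown on the confidence slide, plus {Milk, Ramen} → Chicken.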

A frequent itemset: a set whose support is at least some specified threshold
Example: Let the support threshold be 3/7
Transaction  Items
T1           Sushi, Chicken, Milk
T2           Sushi, Bread
T3           Bread, Vegetables
T4           Sushi, Chicken, Bread
T5           Sushi, Chicken, Ramen, Bread, Milk
T6           Chicken, Ramen, Milk
T7           Chicken, Milk, Ramen
{Chicken, Milk, Ramen} is a frequent itemset

The Apriori algorithm key idea
The Apriori algorithm speeds up the task of finding frequent itemsets, based on the observation that each subset of a frequent itemset must also be a frequent itemset.
Let's see how this is done.

A frequent itemset: a set whose support is at least some specified threshold
Support threshold: 3/7
Claim: Each subset of a frequent itemset is also a frequent itemset
Transaction  Items
T1           Sushi, Chicken, Milk
T2           Sushi, Bread
T3           Bread, Vegetables
T4           Sushi, Chicken, Bread
T5           Sushi, Chicken, Ramen, Bread, Milk
T6           Chicken, Ramen, Milk
T7           Chicken, Milk, Ramen
{Chicken, Milk, Ramen} is a frequent itemset, and so {Chicken, Milk}, {Chicken, Ramen}, {Milk, Ramen} must also be frequent itemsets.

A frequent itemset: a set whose support is at least some specified threshold
Support threshold: 3/7
Claim: Each subset of a frequent itemset is also a frequent itemset
Transaction  Items
T1           Sushi, Chicken, Milk
T2           Sushi, Bread
T3           Bread, Vegetables
T4           Sushi, Chicken, Bread
T5           Sushi, Chicken, Ramen, Bread, Milk
T6           Chicken, Ramen, Milk
T7           Chicken, Milk, Ramen
Conversely, {Vegetables} is not a frequent itemset. So any set containing Vegetables cannot be a frequent itemset. For example, {Sushi, Vegetables} is not frequent.
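The claim holds because any transaction that contains a set also contains each of its subsets, so a subset's support can never be smaller than the set's. The short check below illustrates this on the slide's data; it is a sanity check on one example, not a proof, and the names used are chosen for this sketch.

```python
from fractions import Fraction
from itertools import combinations

# Transactions T1..T7 from the slide (variable name is an assumption).
TRANSACTIONS = [
    {"Sushi", "Chicken", "Milk"},
    {"Sushi", "Bread"},
    {"Bread", "Vegetables"},
    {"Sushi", "Chicken", "Bread"},
    {"Sushi", "Chicken", "Ramen", "Bread", "Milk"},
    {"Chicken", "Ramen", "Milk"},
    {"Chicken", "Milk", "Ramen"},
]
THRESHOLD = Fraction(3, 7)


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return Fraction(sum(1 for t in transactions if itemset <= t), len(transactions))


frequent = {"Chicken", "Milk", "Ramen"}
for size in range(1, len(frequent)):
    for sub in combinations(sorted(frequent), size):
        # Each subset appears in at least as many transactions as the full set.
        assert support(sub, TRANSACTIONS) >= support(frequent, TRANSACTIONS)
        print(list(sub), support(sub, TRANSACTIONS))

# The converse direction: adding items to an infrequent set cannot make it frequent.
print(support({"Vegetables"}, TRANSACTIONS) < THRESHOLD)           # True
print(support({"Sushi", "Vegetables"}, TRANSACTIONS) < THRESHOLD)  # True
```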

The Apriori algorithm: Finding frequent itemsets
Transaction  Items
T1           apple, dates, rice, corn
T2           corn, dates, tuna
T3           apple, corn, dates, tuna
T4           corn, tuna
Support threshold 50%
We'll work through the algorithm to determine the frequent itemsets for this input.

Apriori round 1: Find all frequent itemsets of size 1
List candidate itemsets of size 1:
  {apple} {corn} {dates} {rice} {tuna}
Transaction  Items
T1           apple, dates, rice, corn
T2           corn, dates, tuna
T3           apple, corn, dates, tuna
T4           corn, tuna
Support threshold 50%

Apriori round 1: Find all frequent itemsets of size 1
Calculate the support of each candidate itemset
Support: {apple} = 2/4, {corn}, {dates}, {rice}, {tuna}
What is the support for corn?
a. 1/4
b. 2/4
c. 3/4
d. 4/4
Transaction  Items
T1           apple, dates, rice, corn
T2           corn, dates, tuna
T3           apple, corn, dates, tuna
T4           corn, tuna
Support threshold 50%

Apriori round 1: Find all frequent itemsets of size 1
Calculate the support of each candidate itemset
Support:
  {apple} = 2/4
  {corn} = 4/4
  {dates} = 3/4
  {rice} = 1/4
  {tuna} = 3/4
Transaction  Items
T1           apple, dates, rice, corn
T2           corn, dates, tuna
T3           apple, corn, dates, tuna
T4           corn, tuna
Support threshold 50%

Apriori round 1: Find all frequent itemsets of size 1
Calculate the support of each candidate itemset
Support:
  {apple} = 2/4
  {corn} = 4/4
  {dates} = 3/4
  {rice} = 1/4
  {tuna} = 3/4
Can any itemset containing rice ever be a frequent itemset, when the support threshold is 50%?
A. Yes
B. No
Transaction  Items
T1           apple, dates, rice, corn
T2           corn, dates, tuna
T3           apple, corn, dates, tuna
T4           corn, tuna
Support threshold 50%

Apriori round 1: Find all frequent itemsets of size 1
Set F_1 to be the list of frequent itemsets of size 1:
  {apple} = 2/4
  {corn} = 4/4
  {dates} = 3/4
  {rice} = 1/4 (below the 50% threshold, so not in F_1)
  {tuna} = 3/4
F_1: {apple}, {corn}, {dates}, {tuna}
Transaction  Items
T1           apple, dates, rice, corn
T2           corn, dates, tuna
T3           apple, corn, dates, tuna
T4           corn, tuna
Support threshold 50%

Apriori round 2: Find all frequent itemsets of size 2
List candidate itemsets of size 2:
  {apple, corn} {apple, dates} {apple, tuna} {corn, dates} {corn, tuna} {dates, tuna}
Because {rice} is not frequent, any set that includes rice is not frequent, so we ignore itemsets that include rice.
Transaction  Items
T1           apple, dates, rice, corn
T2           corn, dates, tuna
T3           apple, corn, dates, tuna
T4           corn, tuna
Support threshold 50%

Apriori round 2: Find all frequent itemsets of size 2
Calculate the support of each candidate itemset:
  {apple, corn} {apple, dates} {apple, tuna} {corn, dates} {corn, tuna} {dates, tuna}
Transaction  Items
T1           apple, dates, rice, corn
T2           corn, dates, tuna
T3           apple, corn, dates, tuna
T4           corn, tuna
Support threshold 50%
Group exercise: count support for these itemsets.

Apriori round 2: Find all frequent itemsets of size 2
Calculate the support of each candidate itemset:
  {apple, corn} = 2/4
  {apple, dates} = 2/4
  {apple, tuna} = 1/4
  {corn, dates} = 3/4
  {corn, tuna} = 3/4
  {dates, tuna} = 2/4
Transaction  Items
T1           apple, dates, rice, corn
T2           corn, dates, tuna
T3           apple, corn, dates, tuna
T4           corn, tuna
Support threshold 50%
Group exercise: count support for these itemsets.

Apriori round 2: Find all frequent itemsets of size 2
Set F_2 to be the list of frequent itemsets of size 2:
  {apple, corn} = 2/4
  {apple, dates} = 2/4
  {apple, tuna} = 1/4
  {corn, dates} = 3/4
  {corn, tuna} = 3/4
  {dates, tuna} = 2/4
Transaction  Items
T1           apple, dates, rice, corn
T2           corn, dates, tuna
T3           apple, corn, dates, tuna
T4           corn, tuna
Support threshold 50%
Group exercise: what are the frequent itemsets of size 2?

Apriori round 2: Find all frequent itemsets of size 2
Set F_2 to be the list of frequent itemsets of size 2:
  {apple, corn} = 2/4
  {apple, dates} = 2/4
  {apple, tuna} = 1/4 (below the 50% threshold, so not in F_2)
  {corn, dates} = 3/4
  {corn, tuna} = 3/4
  {dates, tuna} = 2/4
F_2: {apple, corn}, {apple, dates}, {corn, dates}, {corn, tuna}, {dates, tuna}
Transaction  Items
T1           apple, dates, rice, corn
T2           corn, dates, tuna
T3           apple, corn, dates, tuna
T4           corn, tuna
Support threshold 50%

Apriori round 3: Find all frequent itemsets of size 3
Given frequent itemsets of size 2:
  {apple, corn} {apple, dates} {corn, dates} {corn, tuna} {dates, tuna}
Transaction  Items
T1           apple, dates, rice, corn
T2           corn, dates, tuna
T3           apple, corn, dates, tuna
T4           corn, tuna
Support threshold 50%
Without counting support, what are the candidate frequent itemsets of size 3?
(Key: all subsets of a candidate itemset should be frequent itemsets! For example, {apple, corn, rice} is not a candidate itemset because {apple, rice} is not a frequent itemset.)

Apriori round 3: Find all frequent itemsets of size 3
Given frequent itemsets of size 2:
  {apple, corn} {apple, dates} {corn, dates} {corn, tuna} {dates, tuna}
Transaction  Items
T1           apple, dates, rice, corn
T2           corn, dates, tuna
T3           apple, corn, dates, tuna
T4           corn, tuna
Support threshold 50%
Without counting support, what are the candidate frequent itemsets of size 3?
A. {apple, corn, dates}
B. {apple, corn, dates}, {apple, corn, tuna}, {corn, dates, tuna}
C. {apple, corn, tuna}, {corn, dates, tuna}
D. None of the above

Apriori round 3: Find all frequent itemsets of size 3
Great! We now have a list of candidate itemsets of size 3:
  {apple, corn, dates} {corn, dates, tuna}
Transaction  Items
T1           apple, dates, rice, corn
T2           corn, dates, tuna
T3           apple, corn, dates, tuna
T4           corn, tuna
Support threshold 50%
Group exercise: calculate the support for these candidate itemsets.
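The "without counting support" step can be sketched as follows: build every size-k combination of the items that still appear in frequent itemsets, and keep only those whose size k-1 subsets are all frequent. This simple enumeration is an assumption made for readability; it is not necessarily how a production Apriori implementation generates candidates. The function name `candidates` is hypothetical.

```python
from itertools import combinations


def candidates(frequent_prev, k):
    """Size-k itemsets whose every subset of size k-1 is in frequent_prev."""
    frequent_prev = {frozenset(s) for s in frequent_prev}
    # Only items that occur in some frequent itemset can appear in a candidate.
    items = sorted(set().union(*frequent_prev)) if frequent_prev else []
    result = []
    for combo in combinations(items, k):
        if all(frozenset(sub) in frequent_prev for sub in combinations(combo, k - 1)):
            result.append(frozenset(combo))
    return result


# Frequent itemsets of size 2 from round 2 of the worked example.
F2 = [
    {"apple", "corn"}, {"apple", "dates"}, {"corn", "dates"},
    {"corn", "tuna"}, {"dates", "tuna"},
]
for c in candidates(F2, 3):
    print(sorted(c))
```

Here {apple, corn, tuna} and {apple, dates, tuna} are rejected because {apple, tuna} is not in the size-2 list, which leaves the two candidates named on the slide.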

Apriori round 3: Find all frequent itemsets of size 3
Calculate the support of each candidate itemset:
  {apple, corn, dates} = 2/4
  {corn, dates, tuna} = 2/4
Transaction  Items
T1           apple, dates, rice, corn
T2           corn, dates, tuna
T3           apple, corn, dates, tuna
T4           corn, tuna
Support threshold 50%

Apriori round 3: Find all frequent itemsets of size 3
Set F_3 to be the list of frequent itemsets of size 3:
  {apple, corn, dates} = 2/4
  {corn, dates, tuna} = 2/4
Transaction  Items
T1           apple, dates, rice, corn
T2           corn, dates, tuna
T3           apple, corn, dates, tuna
T4           corn, tuna
Support threshold 50%

Apriori round 4: Find all frequent itemsets of size 4
Given frequent itemsets of size 3:
  {apple, corn, dates} {corn, dates, tuna}
Without counting support, what are the candidate frequent itemsets of size 4?
A. Nothing
B. {apple, corn, dates, tuna}
C. {apple, corn, dates, tuna}, {apple, corn, dates, rice}
Transaction  Items
T1           apple, dates, rice, corn
T2           corn, dates, tuna
T3           apple, corn, dates, tuna
T4           corn, tuna
Support threshold 50%

Apriori example: done!
The whole list of frequent itemsets for this example is:
  {apple} {corn} {dates} {tuna}
  {apple, corn} {apple, dates} {corn, dates} {corn, tuna} {dates, tuna}
  {apple, corn, dates} {corn, dates, tuna}
Transaction  Items
T1           apple, dates, rice, corn
T2           corn, dates, tuna
T3           apple, corn, dates, tuna
T4           corn, tuna
Support threshold 50%

Apriori example: done!
Frequent itemsets:
  {apple} {corn} {dates} {tuna}
  {apple, corn} {apple, dates} {corn, dates} {corn, tuna} {dates, tuna}
  {apple, corn, dates} {corn, dates, tuna}
Itemsets we counted support for:
  {apple} {corn} {dates} {rice} {tuna}
  {apple, corn} {apple, dates} {apple, tuna} {corn, dates} {corn, tuna} {dates, tuna}
  {apple, corn, dates} {corn, dates, tuna}
All possible itemsets:
  {apple} {corn} {dates} {rice} {tuna}
  {apple, corn} {apple, dates} {apple, rice} {apple, tuna} {corn, dates} {corn, rice} {corn, tuna} {dates, rice} {dates, tuna} {rice, tuna}
  {apple, corn, dates} {apple, corn, rice} {apple, corn, tuna} {apple, dates, rice} {apple, dates, tuna} {apple, rice, tuna} {corn, dates, rice} {corn, dates, tuna} {corn, rice, tuna} {dates, rice, tuna}
  {apple, corn, dates, rice} {apple, corn, dates, tuna} {apple, corn, rice, tuna} {apple, dates, rice, tuna} {corn, dates, rice, tuna}
  {apple, corn, dates, rice, tuna}
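To put numbers on the comparison: every non-empty subset of the five items is a possible itemset, so there are 2^5 - 1 = 31 of them, while the worked example counted support for only 5 + 6 + 2 = 13. A small sketch that enumerates them (variable names are chosen here for illustration):

```python
from itertools import combinations

ITEMS = ["apple", "corn", "dates", "rice", "tuna"]

# Every non-empty subset of the five items is a possible itemset.
all_itemsets = [
    frozenset(combo)
    for size in range(1, len(ITEMS) + 1)
    for combo in combinations(ITEMS, size)
]

# Candidates whose support was counted in rounds 1, 2 and 3 of the example.
counted = 5 + 6 + 2

print(len(all_itemsets))  # 31
print(counted)            # 13
```

The gap grows quickly with more items, since the number of possible itemsets doubles with each additional item.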

That's how the algorithm works. Let's see it written down, and see how it works on one more example.

Apriori algorithm
1. Set k to 0 [k keeps track of what round we're on]
2. Repeat
   a. Add 1 to k
   b. Set C_k to be the list of candidate itemsets of size k (those whose subsets of size k-1 are frequent)
   c. Calculate the support of itemsets in C_k
   d. Set F_k to be the list of frequent itemsets in C_k (those with support at least the threshold)
   Until F_k is empty
3. Output the union of all F_k
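The steps above can be sketched in Python as follows. This is a minimal, unoptimized illustration rather than a reference implementation: the function name `apriori` is chosen here, candidates are generated by plain enumeration, and an itemset counts as frequent when its support is at least the threshold, matching the definition of a frequent itemset used earlier in the slides.

```python
from fractions import Fraction
from itertools import combinations


def apriori(transactions, threshold):
    """Return {itemset: support} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    items = sorted(set().union(*transactions))

    def support(itemset):
        return Fraction(sum(1 for t in transactions if itemset <= t), n)

    frequent = {}   # union of all F_k found so far
    prev = set()    # F_(k-1)
    k = 0           # step 1
    while True:     # step 2
        k += 1      # step 2a
        # Step 2b: candidates are size-k sets whose (k-1)-subsets are all frequent.
        if k == 1:
            c_k = [frozenset([item]) for item in items]
        else:
            pool = sorted(set().union(*prev))
            c_k = [frozenset(combo) for combo in combinations(pool, k)
                   if all(frozenset(sub) in prev
                          for sub in combinations(combo, k - 1))]
        # Steps 2c and 2d: count support and keep the frequent candidates.
        f_k = {}
        for cand in c_k:
            s = support(cand)
            if s >= threshold:
                f_k[cand] = s
        if not f_k:  # until F_k is empty
            break
        frequent.update(f_k)
        prev = set(f_k)
    return frequent  # step 3


TABLE = [
    {"apple", "dates", "rice", "corn"},
    {"corn", "dates", "tuna"},
    {"apple", "corn", "dates", "tuna"},
    {"corn", "tuna"},
]
for itemset, s in apriori(TABLE, Fraction(3, 4)).items():
    print(sorted(itemset), s)
```

With a 75% threshold this prints five itemsets: {corn}, {dates}, {tuna}, {corn, dates} and {corn, tuna}. Lowering the threshold to 50% yields the eleven frequent itemsets from the first worked example.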

Apriori algorithm: Repeat loop round 1 (k=1 at step a)
Transaction  Items
T1           apple, dates, rice, corn
T2           corn, dates, tuna
T3           apple, corn, dates, tuna
T4           corn, tuna
Support threshold = 75%
Step 2b: C_1   Step 2c: Support   Step 2d: F_1
{apple}        2/4
{dates}        3/4                {dates}
{rice}         1/4
{corn}         4/4                {corn}
{tuna}         3/4                {tuna}
F_1: {dates}, {corn}, {tuna}

Apriori algorithm: Repeat loop round 2 (k=2 at step a)
Transaction  Items
T1           apple, dates, rice, corn
T2           corn, dates, tuna
T3           apple, corn, dates, tuna
T4           corn, tuna
Support threshold = 75%
F_1: {dates}, {corn}, {tuna}
Step 2b: C_2     Step 2c: Support   Step 2d: F_2
{corn, dates}    3/4                {corn, dates}
{corn, tuna}     3/4                {corn, tuna}
{dates, tuna}    2/4
F_2: {corn, dates}, {corn, tuna}

Apriori algorithm: Repeat loop round 3 (k=3 at step a)
Transaction  Items
T1           apple, dates, rice, corn
T2           corn, dates, tuna
T3           apple, corn, dates, tuna
T4           corn, tuna
Support threshold = 75%
F_1: {dates}, {corn}, {tuna}
F_2: {corn, dates}, {corn, tuna}
Step 2b: C_3   Step 2c: Support   Step 2d: F_3
Clicker question: What are the candidate sets in C_3?
A. nothing
B. {corn, dates, tuna}

Great! Your turn!
In a group: use the Apriori algorithm to find frequent itemsets with a support threshold of 3/7. Write down what sets you have at each step!
Transaction  Items
T1           cake, jam, rolls, tea
T2           cake, jam, tea
T3           cake, jam
T4           jam, rolls, tea
T5           jam, rolls
T6           rolls, tea
T7           jam, tea
Support threshold = 3/7

Apriori Algorithm: Clicker question
Which of the following are in F_3?
A. {cake, jam, rolls}
B. {cake, jam, tea}
C. {jam, rolls, tea}
D. All are in F_3
E. None are in F_3
Transaction  Items
T1           cake, jam, rolls, tea
T2           cake, jam, tea
T3           cake, jam
T4           jam, rolls, tea
T5           jam, rolls
T6           rolls, tea
T7           jam, tea
Support threshold = 3/7

Let's walk through the example
Support for candidate sets of size 1:
  {cake} = 3/7
  {jam} = 6/7
  {rolls} = 4/7
  {tea} = 5/7
F_1: {cake}, {jam}, {rolls}, {tea}
Transaction  Items
T1           cake, jam, rolls, tea
T2           cake, jam, tea
T3           cake, jam
T4           jam, rolls, tea
T5           jam, rolls
T6           rolls, tea
T7           jam, tea
Support threshold = 3/7

Let's walk through the example
Support for candidate sets of size 2:
  {cake, jam} = 3/7
  {cake, rolls} = 1/7
  {cake, tea} = 2/7
  {jam, rolls} = 3/7
  {jam, tea} = 4/7
  {rolls, tea} = 3/7
F_2: {cake, jam}, {jam, rolls}, {jam, tea}, {rolls, tea}
Support for candidate sets of size 3:
  {jam, rolls, tea} = 2/7
F_3 is empty, because 2/7 is below the support threshold of 3/7.
Transaction  Items
T1           cake, jam, rolls, tea
T2           cake, jam, tea
T3           cake, jam
T4           jam, rolls, tea
T5           jam, rolls
T6           rolls, tea
T7           jam, tea

The Apriori algorithm shook up the research world
It has over 20,000 citations! Why?
  It's something people really needed
  It scales really well
  It's easy to understand
  Lots to extend

Coming full circle: back to privacy issues
Massachusetts released anonymized medical records for state employees. They removed all identifiers but left birthdate (including year), gender, and zip code.
Group discussion: what percentage of people in the US could likely be uniquely identified by this information? (Note: there are ~7,500 people per zip code)
A. 0-19%
B. 20-39%
C. 40-59%
D. 60-79%
E. 80-100%

Group exercise
Is it a problem that we can tell that in one database one individual (we don't know the name, but we know the age, gender, and zip code) has a set of medical conditions?

Well... Okay, so we can uniquely determine that there exists some person with some medical visits. We still don't know who they are.
But there are other data sources, too. Publicly available voting records include name, zip code, birthdate and gender of voters.
So if you put the two together, you now have names and health records together.
Security researcher (and graduate student) Latanya Sweeney sent the Governor's full health records to his office.
http://arstechnica.com/tech-policy/2009/09/your-secrets-live-online-in-databases-of-ruin/
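"Putting the two together" amounts to joining the two tables on the fields they share. The sketch below shows the idea on entirely made-up records; every name, date, zip code and diagnosis in it is hypothetical, and the function `link` is invented for this illustration.

```python
# All records below are fabricated placeholders for illustration only.
medical = [
    {"birthdate": "1950-01-02", "gender": "M", "zip": "00001", "diagnosis": "condition A"},
    {"birthdate": "1975-06-30", "gender": "F", "zip": "00002", "diagnosis": "condition B"},
]
voters = [
    {"name": "Person One", "birthdate": "1950-01-02", "gender": "M", "zip": "00001"},
    {"name": "Person Two", "birthdate": "1980-03-15", "gender": "F", "zip": "00002"},
]


def key(record):
    """The fields both data sets share."""
    return (record["birthdate"], record["gender"], record["zip"])


def link(medical, voters):
    """Attach a name to each medical record that matches exactly one voter."""
    names_by_key = {}
    for v in voters:
        names_by_key.setdefault(key(v), []).append(v["name"])
    linked = []
    for m in medical:
        names = names_by_key.get(key(m), [])
        if len(names) == 1:  # a unique match re-identifies the record
            linked.append((names[0], m["diagnosis"]))
    return linked


print(link(medical, voters))  # [('Person One', 'condition A')]
```

A record is only re-identified when its combination of birthdate, gender and zip code matches a single voter, which is why the uniqueness question on the earlier slide matters.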

Learning goals revisited
[CT Building Block] Students will be able to demonstrate that they understand the Apriori algorithm by describing what the output would be for a small input.
[CT Building Block] Students will be able to create English language descriptions of algorithms to analyze data and show how their algorithms would work on an input data set.