Cocktail Preference Prediction


Linus Meyer-Teruel, Michael Parrott
Department of Computer Science, Stanford University

In this paper we approach the problem of rating prediction on cocktail data from a number of perspectives. First, we consider predicting personal user preferences using a multi-feature linear model. Second, we consider generating new recipes given constraints and preferences, and then predicting a personal rating for each new cocktail mix based on its features.

Introduction

On campus, we have a well-known problem with students drinking excessive amounts of hard liquor, which we believe stems from a lack of knowledge of, and appreciation for, alcohol in moderation. In response, we wanted to create a customized Cocktail Recommendation System to engender that appreciation. The system takes into account which ingredients you have available, your desired alcohol content, and your personal preferences, and recommends a cocktail recipe for you to make. Having tried the recipe, the user can rate the output, and the system learns from this feedback for future predictions. We separate the problem into three parts: dataset collection, user rating prediction, and custom recommendation based on rating prediction.

Data Collection - Datasets, datasets, datasets...

The biggest issue that we faced at the beginning of our project was obtaining reliable data. Unlike movie or product recommendations, there are few cocktail recipe and rating sites, and even fewer with cohesive data that includes quantities, reliable names, and ratings. We originally wanted to approach customized prediction using ratings scraped from websites, but the reviewers we found tended to have fewer than two reviews each, typically only of their favorite drinks, which made the data unusable. Additionally, the most reliable website that we found (1001cocktails.com) had blockers in place to prevent scraping.

We decided to break the problem into two parts: personal ratings and generalized ratings. To tackle personalized ratings, we organized several sampling sessions, each of approximately 10 people, who each sampled and rated 15 different drinks. For generalized ratings, we spent several weeks collecting datasets from several different websites, both manually and using BeautifulSoup, and then combined and collated them into one central cocktail database.

We began with three separate recipe databases, with a total of 9,800 recipes and about 600 ingredients. This database was full of repeat drinks with different names, one-off recipes, branded ingredients, and custom instructions. We eliminated renamed drinks and drinks with fewer than 10 reviews, mapped branded ingredients to generic alcohol types, homogenized the serving quantities, and found ABV values for all alcoholic drinks. We also eliminated drinks containing ingredients that appeared fewer than four times, to ensure that we had some reliability in the ratings.

Our final dataset consists of 1,273 drinks, each with ingredients, quantities, ratings, preparation and serving instructions, and ABV. There are a total of 178 ingredients included in these recipes. An example is shown below.
{"mint julep": {"glasstype": ..., "ingredients": {"ingredient": "bourbon", "quantity": "45mL", ...}, "method": ..., "rating": {"numratings": 21, "rating": 6.1}}}
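The two filtering rules described above (a minimum number of reviews per drink and a minimum number of appearances per ingredient) might be applied as in the following sketch. It is only an illustration: the function name `filter_drinks`, the assumption that each record stores its ingredients as a list of entries, and the choice to count ingredient appearances once, after the review filter, are ours and are not taken from the project's code.

```python
from collections import Counter


def filter_drinks(drinks, min_reviews=10, min_ingredient_count=4):
    """Drop sparsely reviewed drinks, then drinks that use rare ingredients.

    `drinks` maps a drink name to a record shaped like the example above,
    with "ingredients" assumed to be a list of {"ingredient", "quantity"} dicts.
    """
    # Step 1: keep only drinks with enough reviews.
    reviewed = {
        name: rec for name, rec in drinks.items()
        if rec["rating"]["numratings"] >= min_reviews
    }

    # Step 2: count in how many remaining recipes each ingredient appears.
    counts = Counter()
    for rec in reviewed.values():
        counts.update({item["ingredient"] for item in rec["ingredients"]})

    # Step 3: drop any drink that contains an ingredient seen too rarely.
    return {
        name: rec for name, rec in reviewed.items()
        if all(counts[item["ingredient"]] >= min_ingredient_count
               for item in rec["ingredients"])
    }
```

Because removing a drink can push another ingredient below the threshold, a stricter variant would repeat steps 2 and 3 until nothing changes; the single pass here is the simplest reading of the description.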

User Rating Prediction

1. Summary

Our primary goal in this project was to identify ways to predict a user's drink ratings based on their past drink ratings. To do so, we compiled a data set of user drink ratings and then split the data for each user into a training set and a test set. As each user had an average of 13 ratings, these sets were relatively small and left any ingredients the user had not yet tried as 0s. To ensure comprehensive coverage of the ingredients, and to handle the case of no prior knowledge for a new user, we began to consider warm starting our predictor with weights trained on the 1001cocktails database.

2. Baseline

For our baseline we simply return the average rating over all the drinks in the training set.

Cross Validation Baseline, K=5: MSE

3. Initial Results

All three of our predictors performed worse than the baseline and clearly over-fitted the data. None of these feature sets had nearly enough data to produce accurate predictions.

CV Single Ingredient Features, K=5: MSE, % worse
CV W/ ABV Ingredient Features, K=5: MSE, % worse
CV Pairs Ingredient Features, K=5: MSE, % worse

4. Results

By giving the weights a warm start we managed to prevent the over-training that occurred previously.

CV Single Ingredient Features, K=5: MSE, % improvement
CV W/ ABV Ingredient Features, K=5: MSE, % improvement
CV Pairs Ingredient Features, K=5: MSE, % improvement
CV Triples and Pairs of Ingredient Features, K=5: MSE, % improvement

Drink Rating Prediction for User Feature Learning

1. Summary

We needed the average rating data set in order to give our model somewhere to start. Here we try to predict the average rating of a new drink given only its ingredients.

2. Baseline

For our baseline we simply return the average rating over all previously rated drinks in the current data set. Using the database of cleaned user ratings, we found that

3. Results

We noted that the improvements over the baseline were small in all cases. However, the weights from these linear models were used to warm start the User Rating Predictions, which improved their performance significantly more.

Cross Validation Baseline, K=5: MSE
Cross Validation Ingredient Features, K=5: MSE, % improvement

Cross Validation Pair Features, K=5: MSE, % improvement
Cross Validation Triple Ingredient Features, K=5: MSE, % improvement

Custom Recipe Generation

Weighted CSP for Recipe Generation

Modeling

For custom recipe generation, we decided to model the problem as a weighted CSP. The variables $X_i$ are a set of 5 ingredients, with a domain of $\{0,\ i \in \text{all ingredients}\}$. The constraints and factors were such that:

1. The user preference for pairs of ingredients, on all pairs of variables, is given by the weight $w(\text{pair}, p)\,\phi(x)$, where $w(\text{pair}, p)$ is the personalized weight of that ingredient pair.
2. A potential is placed on the total quantity of alcohol in the assignment. We found typical serving sizes for each kind of alcohol, and from these calculated ABVs.
3. $X_2$, $X_3$, $X_4$ are constrained to be non-alcohols.
4. $X_0$, $X_1$ are constrained to be alcohols.

The ingredient pair potentials were trained on our recipe dataset using our predictor. If a pair did not exist in the dataset, a penalty is applied to the weight of that recipe.

Algorithm

Because of the large number of ingredients and ingredient combinations, we decided to use iterated conditional modes (ICM), both for ease of processing and to provide variety in the output, since ICM can settle in a local maximum rather than the absolute best possible recipe. As given in class, the algorithm for ICM is as follows:

    Initialize x to a random assignment
    Loop until the assignment no longer changes:
        For each variable X_i:
            For each value v in the domain of X_i:
                Compute the weight of the assignment with X_i = v
            Set X_i to the value with the highest weight

Results

We evaluated our output based on the ABV deviation from the desired amount, and used our linear predictor to estimate a rating for the produced drink. Our baseline consists of a random selection of ingredients that fit the constraints (two alcohols and three add-ins). The MSE of the baseline was , with an average rating across the drinks of 5.43, while our ICM MSE was with an average rating of

Error Analysis

1. Over-fitting due to lack of user data

At first, our predictions from linear models were worse than the baseline MSE of as linear regression tended to over-fit due to the sparsity of the feature vector. We therefore chose to extract general user preferences by looking at the average drink rating data set that we scraped from the 1001cocktails.com website. We trained a linear predictor with each set of features on those ratings, and then carried those weights over to the linear regression on user ratings. This helped to prevent the model from over-fitting, as it increased the chance that it would converge to a more representative local optimum. This led to the

2. Feature Selection

We initially tried to choose our features just by looking at the user ratings data set. However, given the sparsity of the features and the small size of our data set, there was little we could do to choose features there, as doing so would likely amount to data snooping and might not carry over to the more generalized data sets. To guide our feature selection, we ended up using the 1001cocktails average rating data set, as we believed that it would provide a better way to see which features appeared relevant in a larger data set. This let us identify that certain features such as ABV, total volume, and total alcohol content did not have linear correlations with the rating, as individuals can like both strong drinks like Martinis and sweeter drinks like Gin Rickeys. Although individual ingredients, once warm started, did improve upon the baseline by 9.9%, the best features we found were the ingredient pairings, which provided a 27% improvement over the baseline.

This problem as a whole was made very difficult by the quality and availability of the data, as well as the apparent difference between generalized ratings and true personal preferences. We originally attempted to learn and predict ratings based only on our collected data, but this was nearly impossible due to the large variability from person to person. Our predictor performance was terrible, and the weights gave no useful information for the drink outputs, which seemed to be just random collections, although the ABV accuracy was high. Once we were able to scrape enough drink recipes together, performance increased significantly because of the warm start on the linear predictor, and these improved weights led to significantly more cohesive new drink outputs that follow drink recipe trends much more closely.

Conclusion

From our results, we found that ingredient pairing features had the strongest influence on predicting a user's ratings and on providing a good recommendation for a new drink. Ingredient pairing also led to the most seemingly cohesive new recipe generation.
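The effect of warm starting a linear predictor on ingredient-pair features can be illustrated with a small sketch. It is not the authors' implementation: the paper does not say how the linear model was fitted, so plain stochastic gradient descent on squared error is an assumption here, and the names `pairs`, `predict`, and `fit` are hypothetical. The point it shows is that pairs a user has never rated receive no gradient, so they keep the weight learned from the general recipe ratings instead of staying at zero.

```python
from itertools import combinations


def pairs(ingredients):
    """All unordered ingredient pairs in a drink, as sorted tuples."""
    return set(combinations(sorted(ingredients), 2))


def predict(weights, ingredients):
    """Linear prediction: sum of the weights of the pairs present in the drink."""
    return sum(weights.get(p, 0.0) for p in pairs(ingredients))


def fit(recipes, w_init=None, lr=0.1, epochs=500):
    """Fit pair weights by SGD on squared error (assumed training method).

    `recipes` is a list of (ingredients, rating) tuples. Passing `w_init`
    warm starts the fit; otherwise every weight starts at zero.
    """
    w = dict(w_init or {})
    for _ in range(epochs):
        for ingredients, rating in recipes:
            error = predict(w, ingredients) - rating
            # Only pairs present in this drink are updated.
            for p in pairs(ingredients):
                w[p] = w.get(p, 0.0) - lr * error
    return w
```

A possible usage: call `fit` on the general average-rating recipes to obtain `w_general`, then call `fit(user_ratings, w_init=w_general, epochs=50)` on one user's small rating set. A drink built from pairs the user never rated is then scored from the general weights rather than receiving a prediction of zero.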


User Guide. VBT 200 Page 2 VBT 200/300/500. The easy and convenient way to burn fat, while toning and strengthening your muscles. Safety Information Vibrational Therapy Welcome to your new VibroTec! waiting inside this box is your way to better health. Be sure to read these instructions in detail to get the most out of your machine. Safety Information

More information

CSE 255 Assignment 9

CSE 255 Assignment 9 CSE 255 Assignment 9 Alexander Asplund, William Fedus September 25, 2015 1 Introduction In this paper we train a logistic regression function for two forms of link prediction among a set of 244 suspected

More information

SUPPLEMENTAL MATERIAL

SUPPLEMENTAL MATERIAL 1 SUPPLEMENTAL MATERIAL Response time and signal detection time distributions SM Fig. 1. Correct response time (thick solid green curve) and error response time densities (dashed red curve), averaged across

More information

Active Living with Arthritis Podcast #9 Being a Caregiver: Caring for Someone and Managing Your Arthritis

Active Living with Arthritis Podcast #9 Being a Caregiver: Caring for Someone and Managing Your Arthritis Active Living with Arthritis Podcast #9 Being a Caregiver: Caring for Someone and Managing Your Arthritis Karen: Welcome to another Active Living with Arthritis podcast, presented by ENACT center at Boston

More information

What Happened to Bob? Semantic Data Mining of Context Histories

What Happened to Bob? Semantic Data Mining of Context Histories What Happened to Bob? Semantic Data Mining of Context Histories Michael Wessel, Marko Luther, Ralf Möller Racer Systems DOCOMO Euro-Labs Uni Hamburg A mobile community service 1300+ users in 60+ countries

More information

How to use FitDay.com to track your calories (v1.0)

How to use FitDay.com to track your calories (v1.0) How to use FitDay.com to track your calories (v1.0) 2010 Bryne Carruthers -- http://eatfruitfeelgood.com/ Fit Day is a free, easy to use online program that allows you to monitor your intake of calories

More information

A Scoring Policy for Simulated Soccer Agents Using Reinforcement Learning

A Scoring Policy for Simulated Soccer Agents Using Reinforcement Learning A Scoring Policy for Simulated Soccer Agents Using Reinforcement Learning Azam Rabiee Computer Science and Engineering Isfahan University, Isfahan, Iran azamrabiei@yahoo.com Nasser Ghasem-Aghaee Computer

More information

Speech Processing / Speech Translation Case study: Transtac Details

Speech Processing / Speech Translation Case study: Transtac Details Speech Processing 11-492/18-492 Speech Translation Case study: Transtac Details Phraselator: One Way Translation Commercial System VoxTec Rapid deployment Modules of 500ish utts Transtac: Two S2S System

More information

Module 4 Introduction

Module 4 Introduction Module 4 Introduction Recall the Big Picture: We begin a statistical investigation with a research question. The investigation proceeds with the following steps: Produce Data: Determine what to measure,

More information

5 THINGS YOU CAN DO TODAY TO START YOUR FITNESS JOURNEY.

5 THINGS YOU CAN DO TODAY TO START YOUR FITNESS JOURNEY. 5 THINGS YOU CAN DO TODAY TO START YOUR FITNESS JOURNEY. 0405 121 931 INFO@TOTEMFITNESS.COM.AU 65 GILSTON STREET KEPERRA QLD totemfitnessau totemfitnessau totemfitness.com.au 1 Clean out the cupboards

More information

Evaluation of new technology-based tools for dietary intake assessment

Evaluation of new technology-based tools for dietary intake assessment Evaluation of new technology-based tools for dietary intake assessment Alison Eldridge, PhD, RD Nestlé Research Center, Lausanne On behalf of the ILSI Europe Expert Group Evaluation of new methods for

More information

Does chewing gum have an impact on student performance? : An analysis of quiz grades

Does chewing gum have an impact on student performance? : An analysis of quiz grades Does chewing gum have an impact on student performance? : An analysis of quiz grades Abstract: In this paper, we will discuss the results of an experiment measuring whether chewing gum during a quiz impacts

More information

The Logic of Data Analysis Using Statistical Techniques M. E. Swisher, 2016

The Logic of Data Analysis Using Statistical Techniques M. E. Swisher, 2016 The Logic of Data Analysis Using Statistical Techniques M. E. Swisher, 2016 This course does not cover how to perform statistical tests on SPSS or any other computer program. There are several courses

More information

A Guide to Help You Reduce and Stop Using Tobacco

A Guide to Help You Reduce and Stop Using Tobacco Let s Talk Tobacco A Guide to Help You Reduce and Stop Using Tobacco Congratulations for taking this first step towards a healthier you! 1-866-710-QUIT (7848) albertaquits.ca It can be hard to stop using

More information

Determining the Optimal Sampling Method to Estimate the Mean and Standard Deviation of Pig Body Weights Within a Population 1,2

Determining the Optimal Sampling Method to Estimate the Mean and Standard Deviation of Pig Body Weights Within a Population 1,2 Determining the Optimal Sampling Method to Estimate the Mean and Standard Deviation of Pig Body Weights Within a Population 1,2 C.B. Paulk, M.D. Tokach, S.S. Dritz 3, J.L. Nelssen, J.M. DeRouchey, and

More information

LC/MS/MS SOLUTIONS FOR LIPIDOMICS. Biomarker and Omics Solutions FOR DISCOVERY AND TARGETED LIPIDOMICS

LC/MS/MS SOLUTIONS FOR LIPIDOMICS. Biomarker and Omics Solutions FOR DISCOVERY AND TARGETED LIPIDOMICS LC/MS/MS SOLUTIONS FOR LIPIDOMICS Biomarker and Omics Solutions FOR DISCOVERY AND TARGETED LIPIDOMICS Lipids play a key role in many biological processes, such as the formation of cell membranes and signaling

More information

Identity Verification Using Iris Images: Performance of Human Examiners

Identity Verification Using Iris Images: Performance of Human Examiners Identity Verification Using Iris Images: Performance of Human Examiners Kevin McGinn, Samuel Tarin and Kevin W. Bowyer Department of Computer Science and Engineering University of Notre Dame kmcginn3,

More information

CHAPTER ONE CORRELATION

CHAPTER ONE CORRELATION CHAPTER ONE CORRELATION 1.0 Introduction The first chapter focuses on the nature of statistical data of correlation. The aim of the series of exercises is to ensure the students are able to use SPSS to

More information

My Fitness Pal Health & Fitness Tracker A User s Guide

My Fitness Pal Health & Fitness Tracker A User s Guide My Fitness Pal Health & Fitness Tracker A User s Guide By: Angela McCall Introduction My Fitness Pal is an online diet, health, and fitness tracker that allows you to track your nutrition and fitness goals

More information

WTC II Term 4 Notes & Assessments

WTC II Term 4 Notes & Assessments Term 4 Notes & Assessments Training Goals When training, it is important to understand what you want to accomplish. In other words, you must have a purpose for your training sessions and overall program.

More information

A Practical Guide to Getting Started with Propensity Scores

A Practical Guide to Getting Started with Propensity Scores Paper 689-2017 A Practical Guide to Getting Started with Propensity Scores Thomas Gant, Keith Crowland Data & Information Management Enhancement (DIME) Kaiser Permanente ABSTRACT This paper gives tools

More information

Alcohol Problems in Intimate Relationships

Alcohol Problems in Intimate Relationships Alcohol Problems in Intimate Relationships Multiple Choice Identify the choice that best completes the statement or answers the question. 1. Alcohol Problems and Your Practice An Alcohol Problems Framework

More information

Classifying Substance Abuse among Young Teens

Classifying Substance Abuse among Young Teens Classifying Substance Abuse among Young Teens Dylan Rhodes, Sunet: dylanr December 14, 2012 Abstract This project attempts to use machine learning to classify substance abuse among young teens. It makes

More information

The Long Tail of Recommender Systems and How to Leverage It

The Long Tail of Recommender Systems and How to Leverage It The Long Tail of Recommender Systems and How to Leverage It Yoon-Joo Park Stern School of Business, New York University ypark@stern.nyu.edu Alexander Tuzhilin Stern School of Business, New York University

More information

Chapter Eight: Multivariate Analysis

Chapter Eight: Multivariate Analysis Chapter Eight: Multivariate Analysis Up until now, we have covered univariate ( one variable ) analysis and bivariate ( two variables ) analysis. We can also measure the simultaneous effects of two or

More information

Test Driven Development (TDD)

Test Driven Development (TDD) Test Driven Development (TDD) Outline TDD Overview Test First vs. Test Last Summary Quotes Kent Beck said Test-first code tends to be more cohesive and less coupled than code in which testing isn t a part

More information

Healthy Delicious, healthy snack ideas. How to start walking for fitness. Exclusively for. September 2015 IN THIS ISSUE.

Healthy Delicious, healthy snack ideas. How to start walking for fitness. Exclusively for. September 2015 IN THIS ISSUE. Exclusively for September 2015 @yourservice Healthy Habits IN THIS ISSUE Delicious, healthy snack ideas page 3 How to start walking for fitness page 4 Improve your health at any age Let Us Help! No matter

More information

Enhanced Detection of Lung Cancer using Hybrid Method of Image Segmentation

Enhanced Detection of Lung Cancer using Hybrid Method of Image Segmentation Enhanced Detection of Lung Cancer using Hybrid Method of Image Segmentation L Uma Maheshwari Department of ECE, Stanley College of Engineering and Technology for Women, Hyderabad - 500001, India. Udayini

More information

INTRODUCTION. Study and Practice

INTRODUCTION. Study and Practice INTRODUCTION Study and Practice How can we alcoholics in recovery live happy, joyous, and free? (Alcoholics Anonymous, 133: 0) Alcoholics Anonymous is the life changing program formed by two desperate

More information

Predicting Diabetes and Heart Disease Using Features Resulting from KMeans and GMM Clustering

Predicting Diabetes and Heart Disease Using Features Resulting from KMeans and GMM Clustering Predicting Diabetes and Heart Disease Using Features Resulting from KMeans and GMM Clustering Kunal Sharma CS 4641 Machine Learning Abstract Clustering is a technique that is commonly used in unsupervised

More information

Exploring the Relationship Between Substance Abuse and Dependence Disorders and Discharge Status: Results and Implications

Exploring the Relationship Between Substance Abuse and Dependence Disorders and Discharge Status: Results and Implications MWSUG 2017 - Paper DG02 Exploring the Relationship Between Substance Abuse and Dependence Disorders and Discharge Status: Results and Implications ABSTRACT Deanna Naomi Schreiber-Gregory, Henry M Jackson

More information

Regression Including the Interaction Between Quantitative Variables

Regression Including the Interaction Between Quantitative Variables Regression Including the Interaction Between Quantitative Variables The purpose of the study was to examine the inter-relationships among social skills, the complexity of the social situation, and performance

More information

Table of Contents. Introduction. 1. Diverse Weighing scale models. 2. What to look for while buying a weighing scale. 3. Digital scale buying tips

Table of Contents. Introduction. 1. Diverse Weighing scale models. 2. What to look for while buying a weighing scale. 3. Digital scale buying tips Table of Contents Introduction 1. Diverse Weighing scale models 2. What to look for while buying a weighing scale 3. Digital scale buying tips 4. Body fat scales 5. Is BMI the right way to monitor your

More information

Gene expression analysis. Roadmap. Microarray technology: how it work Applications: what can we do with it Preprocessing: Classification Clustering

Gene expression analysis. Roadmap. Microarray technology: how it work Applications: what can we do with it Preprocessing: Classification Clustering Gene expression analysis Roadmap Microarray technology: how it work Applications: what can we do with it Preprocessing: Image processing Data normalization Classification Clustering Biclustering 1 Gene

More information

Minority Report: ML Fairness in Criminality Prediction

Minority Report: ML Fairness in Criminality Prediction Minority Report: ML Fairness in Criminality Prediction Dominick Lim djlim@stanford.edu Torin Rudeen torinmr@stanford.edu 1. Introduction 1.1. Motivation Machine learning is used more and more to make decisions

More information