Hypothesis Testing: Strategy Selection for Generalising versus Limiting Hypotheses


Thinking and Reasoning, 1999, 5 (1), 67

Barbara A. Spellman
University of Virginia, USA

Alejandro López
University of Hamburg, Germany

Edward E. Smith
University of Michigan, USA

Humans appear to follow normative rules of inductive reasoning in premise diversity tasks; that is, they know that dissimilar rather than similar evidence is better for generalising hypotheses. In three experiments, we use a hypothesis limitation task to examine a related inductive reasoning skill: knowing how to limit hypotheses by using a negative test strategy. Participants are told that one category member has some property (e.g. Dogs have a merocrine gland) and are asked what evidence they would test to ensure that either all (generalisation) or only (limitation) category members have that property (e.g. All/Only mammals have merocrine glands; tests: wolf, bull, crocodile). Despite participants' reluctance to use negative tests in the Wason 2-4-6 task and other reasoning tasks, participants do use normatively correct negative tests in the hypothesis limitation task as often as they use diverse positive tests in the premise diversity task. Moreover, when participants are given a hypothesis limitation task before a rule evaluation task (similar to the 2-4-6 task), their use of negative tests increases. Thus, when testing hypotheses, people can and do use the right kind of test strategy for the task.

Requests for reprints should be sent to: Barbara A. Spellman, Dept. of Psychology, University of Virginia, 102 Gilmer Hall, Charlottesville, VA 22903, USA. E-mail: spellman@virginia.edu

We would like to thank Lisa Stanton for assisting with data collection and analysis, and Patricia West for helpful discussions. Michael Doherty and Michael Gorman provided helpful comments on an earlier draft. Portions of this paper were presented as a paper at the Sixth Annual Meeting of the Southwest Cognition Conference (ARMADILLO), College Station, TX, May 1995; as a poster at the Third International Conference on Thinking, London, UK, August 1996; and as a paper at the 10th Annual Duck Conference on Social Cognition, Pine Island, NC, June.

Psychology Press Ltd

INTRODUCTION

During most of the last 30 years, it has been fashionable among psychologists to show that human reasoning skills do not measure up to the normative models of rationality proposed by economists, statisticians, and philosophers (see Kahneman, Slovic, & Tversky, 1982, for a collection of early papers and Nisbett & Ross, 1980, for the classic review). However, one area in which human reasoning does seem to live up to the philosophical ideal is in respecting the diversity principle (see Osherson et al., 1990): the notion that hypotheses are considered to be stronger when supported by more diverse rather than more similar evidence (Carnap, 1950; Hempel, 1966; Nagel, 1939; Popper, 1962).

In this article, we extend the investigation from how people use evidence to support or generalise their hypotheses, to another reasoning skill involving the relation between evidence and hypotheses: how people use evidence to narrow or limit their hypotheses. We believe that investigating this latter skill is relevant to comprehending performance in some of the so-called "scientific reasoning" tasks, including the Wason (1960) rule-discovery (2-4-6) task.

Premise Diversity in Generalisation

People demonstrate respect for the diversity principle in several different inductive reasoning tasks. One is the evaluation of category-based inductive arguments. Osherson et al. (1990) had participants read two pairs of premises about different category members, each followed by the same conclusion about all category members. The participants' task was to evaluate which was the better argument. For example, participants might have to choose between the following arguments:

Lions use norepinephrine as a neurotransmitter.
Tigers use norepinephrine as a neurotransmitter.
All mammals use norepinephrine as a neurotransmitter.

and

Lions use norepinephrine as a neurotransmitter.
Giraffes use norepinephrine as a neurotransmitter.
All mammals use norepinephrine as a neurotransmitter.

According to the diversity principle, the second argument is better because the conclusion is supported by more diverse evidence. About 75% of participants picked the (normatively correct) second argument.

Another task in which participants demonstrate respect for the diversity principle is the selection of evidence to support an argument (or hypothesis). In López's (1995) task, participants were told a fact about a given category member and asked what other category member they would examine in order to test the hypothesis that the fact is true of all category members. For example:

Suppose you know for a fact that:
Dogs have a merocrine gland.
What additional mammal would you examine to test whether:
All mammals have a merocrine gland?
Choose one of the following:
(a) wolf
(b) bull

According to the diversity principle, the normative answer is "bull"; that is, to generalise from dogs to all mammals one should choose to obtain information about other mammals that are as different from dogs as possible. López found that a majority of participants (62%) chose the normative answer in this task, and that the less similar a choice was to the initial category member, the more likely it was to be chosen. Thus, in both the evaluation of arguments and the selection of evidence to support those arguments, it seems that most participants do understand the value of diverse evidence in generalising a hypothesis.

The Hypothesis-limitation Task

The goal of a reasoner, however, may be to limit a hypothesis, not to generalise it. To illustrate using the previous (dog) example, whereas someone might want to know whether all mammals had merocrine glands, someone else might want to know whether only mammals had merocrine glands. In this case, imagine the possible answers are: (a) wolf, (b) bull, and (c) crocodile. If we knew that dogs had merocrine glands and wanted to know whether only mammals had them, we should test the one animal that could give us relevant evidence: the non-mammal crocodile. A major focus of this article is the issue of whether people are as good at selecting evidence for this kind of inductive reasoning (i.e. limiting their hypotheses) as they are for generalising their hypotheses.
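The contrast between the two goals in the dog example above can be made concrete with a short sketch. This is an illustration only, not the authors' procedure: the `Candidate` record, the function `relevant_tests`, and the similarity scores are hypothetical, chosen simply to mirror the wolf/bull/crocodile choices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    in_category: bool           # member of the hypothesised category (here: mammals)?
    similarity_to_known: float  # hypothetical similarity to the known case (dog), 0..1


# Hypothetical similarity values, for illustration only.
CANDIDATES = [
    Candidate("wolf", True, 0.9),
    Candidate("bull", True, 0.4),
    Candidate("crocodile", False, 0.1),
]


def relevant_tests(goal, candidates):
    """Return the names of the tests that bear on the hypothesis, best first.

    goal == "all":  generalising, so only category members are relevant,
                    and less similar members are ranked first (diversity).
    goal == "only": limiting, so only non-members are relevant
                    (a negative test).
    """
    if goal == "all":
        members = [c for c in candidates if c.in_category]
        return [c.name for c in sorted(members, key=lambda c: c.similarity_to_known)]
    if goal == "only":
        return [c.name for c in candidates if not c.in_category]
    raise ValueError(f"unknown goal: {goal!r}")


print(relevant_tests("all", CANDIDATES))
print(relevant_tests("only", CANDIDATES))
```

Under these assumed scores, the dissimilar mammal ranks first for the "all" goal, and the non-mammal is the only relevant choice for the "only" goal.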
The Wason Rule-discovery (2-4-6) Task

Previous research, particularly using the Wason (1960) rule-discovery (2-4-6) task, might suggest that participants would be bad at the hypothesis-limitation task because it involves the use of negative tests (as described by Klayman & Ha, 1987). The Wason task is a widely used experimental paradigm, the results of which were initially interpreted as showing that (untrained) human scientific reasoning is flawed because people try to confirm rather than disconfirm their hypotheses.

In the Wason (1960) task, participants are told that the experimenter has in mind a simple rule describing a particular relation between any three numbers, and that the triad 2-4-6 conforms to this rule. The participants' task is to discover the experimenter's rule by proposing other test triads (sometimes along with the reasons for proposing them). For each test triad, participants are told whether or not it conforms to the rule. Participants are asked to announce the rule once they are convinced of having discovered it. They are then told whether or not the announced rule is correct, and, if it is wrong, are allowed to continue proposing triads, reasons, and rules. In the canonical version of the task, the simple rule that the experimenter has in mind is "any ascending triad".

When conducting experiments using this task, Wason (1960, 1962, 1968a; Wason & Johnson-Laird, 1972) and many others (e.g. Farris & Revlin, 1989; Gorman, Stafford, & Gorman, 1987; Klayman & Ha, 1989; Mahoney & DeMonbreun, 1977; Tweney et al., 1980) find that most participants form an initial narrow hypothesis such as "the rule is triads ascending by two", and then test it by proposing other triads conforming to that hypothesised rule. Because those triads also happen to conform to the experimenter's actual rule, the participants obtain confirming evidence for their hypothesised rule, and then often make an incorrect rule announcement. Most participants (at least initially, before the first rule announcement) fail to test their hypothesised rule by proposing triads not conforming to it. Had they done so, then, given that many triads that fall outside the hypothesised rule also happen to conform to the actual rule, they would have obtained disconfirming evidence for their too-narrow hypothesised rule and, presumably, revised and expanded it rather than making an incorrect rule announcement.

Initially this tendency to test triads that conform to one's own hypothesised rule was referred to as "confirmation bias" (see Evans & Over, 1996, and Klayman, 1995, for reviews). The suspicion was that participants wanted to find data that confirmed their own initial hypotheses and that this bias was pervasive in their generation of test items. The normative method of scientific inquiry against which this tendency was evaluated was the method of falsification proposed by the philosopher Popper (1959) and adopted by the psychologists who were evaluating human hypothesis-testing performance (e.g. Wason, 1960).¹ According to Popper, in order to best test one's hypothesis, one should attempt to falsify it. If the resulting data then confirm the hypothesis, those are more valuable data than data obtained from experiments that were merely trying to confirm the hypothesis. The results from many experiments were therefore interpreted as showing that humans were non-normative reasoners because they failed to try to disconfirm their hypotheses and merely tried to confirm them.

¹ Popper (1959) believed that falsification both describes how science actually proceeds and is the way science ought to proceed. Many philosophers of science disagree with each of those ideas. For example, Kuhn (1970) argues that science proceeds through stages of slow accretion and rapid revolution. Quine (1961; Duhem, 1954/1906) and others believe that falsification is not merely non-normative, but that it is essentially impossible.

Klayman and Ha (1987) interpret these results in a different manner. Participants, they argue, are not necessarily trying to confirm their hypotheses; it just looks that way in this particular version of the rule-discovery task because the hypothesised rule usually is a subset of the experimenter's rule and therefore all triads that conform to the former also conform to the latter. However, given other relationships between the two rules, proposing triads that conform to the hypothesised rule could, in fact, disconfirm the hypothesised rule. For example, if the hypothesised rule is "triads ascending by two", but the experimenter's actual rule is the narrower rule of "even triads ascending by two", a proposed triad of odd numbers ascending by two would conform to the hypothesised rule but would also disconfirm it. Thus, although the participants' strategy could in theory lead to finding disconfirming cases, they are just not getting any disconfirming cases in the canonical version of the task.

Klayman and Ha (1987) argue that a better way of interpreting what participants are doing is that they are using a positive test strategy, in which cases conforming to a given hypothesis are tested, rather than a negative test strategy, in which cases not conforming to a given hypothesis are tested. Note that either strategy could lead to disconfirmation depending on the relation between the hypothesised and actual rules. The numerous results from the 2-4-6 task can be interpreted in light of this strategy distinction. (See Klayman & Ha, 1987, for a description of the use of the positive test strategy in other reasoning tasks.)

Task Differences

The fact that participants often fail to use negative tests in the Wason task suggests that they might also fail to use negative tests (e.g. fail to pick "crocodile") in a hypothesis-limitation task. However, there are several differences between the Wason rule-discovery task and our task that might lead to differing performance in these tasks.

First, the original task is a rule-discovery task and entails a dual generation process of producing both a plausible hypothesis (rule) and appropriate tests (triads) (see Klahr & Dunbar's, 1988, "scientific discovery as dual search" model of scientific reasoning). In López's (1995) premise diversity task and our hypothesis-limitation task, the hypothesis/rule is given (e.g. all mammals have a merocrine gland) and appropriate tests are provided (e.g. wolf, bull, crocodile); the participants need only evaluate which of those given tests is best for assessing the given hypothesis.

Second, successful rule discovery demands more evidence than either successful generalisation or limitation, in that for rule discovery every exemplar is relevant to assessing the rule. In contrast, if one merely wants to discover whether all mammals have a merocrine gland, at worst one needs to test all mammals. Whether or not other types of animals (e.g. fish, insects, birds) also have a merocrine gland is irrelevant. If one merely wants to discover whether only mammals have a merocrine gland, at worst one needs to test all non-mammals. Whether every different type of mammal (e.g. whales, bats, kangaroos) also has a merocrine gland is irrelevant. However, if one wants to discover the rule about which animals have a merocrine gland, one needs to test not only all mammals (to make sure that they do) but also all non-mammals (to make sure that they do not). Thus in the original task, testing the proposed hypothesis that "the rule is triads ascending by two" involves discovering not only that all triads ascending by two conform to the rule (promoting a positive test strategy) but also that only triads ascending by two conform to the rule (promoting a negative test strategy). Thus, both kinds of test and every exemplar are relevant to the rule-discovery task.

Wason (1960) was quite aware of the two-sided nature of the rule-discovery task. What astonished him was that intelligent young adults would only be concerned with verifying that, in fact, all the exemplars conforming to their hypothesis did conform to the experimenter's rule (i.e. they made positive tests), but did not attempt to verify that only the exemplars conforming to their hypothesis did conform to the experimenter's rule (i.e. they did not make negative tests).

Our experiments were intended to evaluate participants' intentional use of negative tests to limit hypotheses and to compare that to their ability to use diverse positive tests to generalise hypotheses. Because the choice of test is relevant to performance in Wason's task, we also included a rule-like condition in which both positive and negative tests would be necessary for assessing a hypothesis.

GENERAL METHOD

The three experiments presented in this article are somewhat similar and so the basic research strategy is described here first. Differences are summarised in Table 1.

Participants read a statement about evidence for an unknown rule (i.e. hypothesis). Depending on the condition (All, Only, or Rule), participants are then asked a particular question about what other evidence they would like to obtain to assess the rule. For example:

Suppose you know for a fact that:
The ordered triad of numbers conforms to an unknown rule about number triads called alpha.
What triad would you choose to test whether or not:
(All condition) All triads ascending by two conform to the alpha rule?
or
(Only condition) Only triads ascending by two conform to the alpha rule?
or
(Rule condition) The alpha rule is triads ascending by two?
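The logical difference between the three question types above can be sketched in a few lines of Python. This is an illustrative reading of the task, not part of the authors' materials: the function names `ascends_by_two` and `could_falsify` are hypothetical, and the triad 10-12-14 is an arbitrary example of a triad ascending by two.

```python
def ascends_by_two(triad):
    """Hypothesised rule H: each number is two more than the one before."""
    a, b, c = triad
    return b - a == 2 and c - b == 2


def could_falsify(question, triad, hypothesis=ascends_by_two):
    """Could testing `triad` in principle falsify the claim being asked about?

    "all":  'All H-triads conform' is refuted only by an H-triad that fails
            to conform, so only positive tests are informative.
    "only": 'Only H-triads conform' is refuted only by a non-H-triad that
            does conform, so only negative tests are informative.
    "rule": 'The rule is H' asserts both claims, so either kind of test
            could refute it.
    """
    fits = hypothesis(triad)
    if question == "all":
        return fits
    if question == "only":
        return not fits
    if question == "rule":
        return True
    raise ValueError(f"unknown question: {question!r}")


positive_test = (10, 12, 14)  # conforms to H (arbitrary illustrative triad)
negative_test = (1, 2, 3)     # does not conform to H

for question in ("all", "only", "rule"):
    print(question,
          could_falsify(question, positive_test),
          could_falsify(question, negative_test))
```

The sketch says nothing about which positive test is best for the All question; that is where the diversity principle comes in.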

TABLE 1
Differences Between Experiments

Experiment   Content                              Questions                   Task
1            Animals & Numbers (within-subject)   All, Only, Rule             Evaluation
2            Numbers                              All-Rule, Only-Rule, Rule   Evaluation
3            Numbers                              All-Rule, Only-Rule, Rule   Generation

The All condition is designed to be analogous to López's (1995) premise diversity task. Participants know the hypothesis in question, are given a piece of supporting evidence, and need to obtain other evidence to determine whether the hypothesis generalises to all triads ascending by two (other kinds of triads are irrelevant). The normative test strategy is a positive test strategy using a diverse test triad.

The Only condition is designed to reveal whether participants will use a negative test strategy when it is the single normatively correct strategy. Again, participants know the hypothesis in question and are given a piece of supporting evidence, but in this condition they need to determine whether the hypothesis is limited to only triads ascending by two, by ruling out other kinds of triads. The normative test strategy is therefore a negative test strategy; that is, testing a triad that does not ascend by two to make sure it does not conform to the rule. By comparing performance in the All and Only conditions, we can compare participants' understanding of the value of negative tests for limiting hypotheses to their understanding of the value of diverse positive tests for generalising hypotheses.

The Rule condition is meant to capture some of the difficulty of the Wason task. Of course, it is not a rule-discovery task because participants are provided with the rule to evaluate and do not have to propose it for themselves. In addition, in the present task participants get to make just one test whereas in the 2-4-6 task participants know they can make many tests. However, in the Rule condition, as in the original task, two kinds of evidence are relevant to verifying that a rule is correct: using positive tests to make sure that all triads ascending by two do conform and using negative tests to make sure that all triads not ascending by two do not conform. By comparing performance in the Rule and Only conditions, we can determine whether participants understand the value of negative tests but merely choose not to use them, at least as a first test, when both kinds of tests are required.

EXPERIMENT 1

In Experiment 1, as just described, participants were given a piece of evidence and a hypothesised rule (e.g. "the rule is triads ascending by two"), then, depending on condition (All, Only, or Rule), asked a question about what further evidence they would like to obtain. Participants were also provided with three test triads from which to choose. Of these three test triads, one was a similar positive test, one was a dissimilar (or diverse) positive test, and one was a negative test (e.g. 1-2-3) of the hypothesised rule.

In this experiment, we thought that providing the test triads (in addition to providing the hypothesised rule) might lead participants into using negative tests in the Rule condition. We are not the first researchers to use this tactic: Kareev and Halberstadt (1993) also modified the original task by providing test triads; they demonstrated that when given test choices people sometimes do appreciate the benefits of negative testing. In two experiments, they had participants assess the protocol of an imaginary problem-solver working on the 2-4-6 task. In their Experiment 1, participants were provided with an example triad, the problem-solver's hypothesised rule (e.g. "the rule is triads where the second number is twice the first number, and the third number is three times the first number"), and four possible test triads including two positive tests and two negative tests. The participants' task was to evaluate how useful each test triad would be for discovering the rule. Overall, participants judged positive tests to be more useful than negative tests for discovering the rule. Experiment 2 was similar, but this time the participants were also provided with the experimenter's rule (e.g. "the rule is triads of two-digit numbers"). For each of the two positive and two negative tests, one would falsify and the other would confirm the problem-solver's hypothesised rule. In this case, participants judged the negative tests that falsified the hypothesised rule to be the most valuable (with negative tests that confirmed the hypothesised rule judged the least valuable). Thus, people may appreciate the benefits of negative testing at least when they know the experimenter's actual rule and so know what the outcome of the test would be. In addition, Kareev and Avrahami (1995) have shown that people appreciate the benefits of negative tests and falsifying exemplars when creating a list of exemplars to teach someone else a rule.

However, evaluating the worth of a particular test when you know the outcome of it (i.e. whether it is confirming or falsifying) may be quite different from evaluating the worth of a particular kind of test independent of the outcome. It is the latter that is of interest in most of this literature, and in this article.
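The distinction between the kind of test (positive or negative) and its outcome (confirming or falsifying) can be illustrated with the two example rules quoted above from Kareev and Halberstadt's Experiment 2. The following is a minimal sketch under stated assumptions: the four triads are hypothetical, chosen only so that each combination of test type and outcome occurs once, and the function names are likewise invented for illustration.

```python
def hypothesised_rule(triad):
    """Problem-solver's hypothesis: second = 2 x first, third = 3 x first."""
    a, b, c = triad
    return b == 2 * a and c == 3 * a


def actual_rule(triad):
    """Experimenter's rule: every number in the triad has two digits."""
    return all(10 <= n <= 99 for n in triad)


def classify(triad):
    """Return (test type, outcome) for one proposed triad.

    Test type depends only on the hypothesis: a positive test is a triad
    the hypothesis says should conform; a negative test is one it says
    should not. The outcome depends on the actual rule: the test confirms
    the hypothesis when both rules agree about the triad, and falsifies
    it when they disagree.
    """
    predicted = hypothesised_rule(triad)
    observed = actual_rule(triad)
    test_type = "positive" if predicted else "negative"
    outcome = "confirms" if predicted == observed else "falsifies"
    return test_type, outcome


# Hypothetical triads, one per cell of the 2 x 2 design.
for triad in [(10, 20, 30), (3, 6, 9), (11, 12, 13), (1, 5, 7)]:
    print(triad, classify(triad))
```

Note that `classify` needs the actual rule to compute the outcome. Participants in the present experiments do not know it, so they can reason only about the test type.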

In addition to the differences between the present task and the canonical 2-4-6 task mentioned earlier, there is one other notable difference between the way the premise diversity tasks and the 2-4-6 task have been run in the past: the former have used familiar members of natural kind categories (e.g. mammals, fruits) whereas the latter has used numbers. Although it may seem odd to suggest that a mere difference in content could affect performance, there are many examples of formally equivalent reasoning tasks for which content does matter (most notably Wason's, 1966, 1968b, other famous task, the Wason selection task; see Holyoak & Spellman, 1993, for a review). Thus, in Experiment 1 we used both animals and numbers as content.

Method

Participants and Procedure

A total of 144 University of Texas undergraduates participated in partial fulfilment of a course requirement. (The data from one additional participant were discarded because two answers were circled in response to one question.) Participants were tested in small groups of varying sizes. Each participant received a booklet containing the problems of interest along with other short reasoning problems. Participants were encouraged to take as much time as they needed to answer all questions.

Design

There were two independent variables: condition and content. Condition was tested between subjects; each participant was given either All, Only, or Rule questions. Content was tested within subject; each participant received two animal problems and two number problems. Half did the animal problems first, half did the number problems first. Which particular version of the problem went first was also counterbalanced across participants. In each booklet, the first two similar-content problems were followed by a short unrelated (filler) reasoning task, then by the other two similar-content problems.

Materials

The materials were adapted from López's (1995) premise diversity task; examples are presented here. (The full set of materials is shown in Appendix A.) Each version was on a separate page. Participants were told a piece of evidence and then asked what further evidence they would like to obtain to assess a particular hypothesis. In each condition participants were given one hypothesis to test for each item. (Unlike the examples shown here, participants saw only one question per item and did not see any of the italicised condition or test labels.) They were then told to circle one of three possible answers, which in fact corresponded to: a positive test of the hypothesis that used evidence similar to that initially provided; a positive test of the hypothesis that used evidence dissimilar from that initially provided; or a negative test of the hypothesis.² For each version of each problem, the order of choices of answers was counterbalanced across participants.

Animal Versions:

Suppose you know for a fact that:
Dogs have a merocrine gland.
What organism would you examine to test whether or not:
(All condition) All mammals have a merocrine gland?
or
(Only condition) Only mammals have a merocrine gland?
or
(Rule condition) A merocrine gland is a mammal property?
Choose one of the following organisms (circle the answer):
(a) wolf (positive test, similar animal)
(b) bull (positive test, dissimilar animal)
(c) crocodile (negative test)

In the other animal version, the known fact was that hippopotamuses have an ulnar artery, the questions were about mammals having ulnar arteries, and the answer choices were: rhinoceros (similar), hamster (dissimilar), and robin (negative).

Number Versions:

Suppose you know for a fact that:
The ordered triad of numbers conforms to an unknown rule about number triads called alpha.
What triad would you examine to test whether or not:
(All condition) All triads ascending by two conform to the alpha rule?
or
(Only condition) Only triads ascending by two conform to the alpha rule?
or
(Rule condition) The alpha rule is triads ascending by two?
Choose one of the following triads (circle the answer):
(a) (positive test, similar triad)
(b) (positive test, dissimilar triad)
(c) (negative test)

In the other number version, the known fact was that the ordered triad of numbers conforms to an unknown rule about number triads called beta, the questions were about triads descending by four, and the answer choices were again a similar positive test, a dissimilar positive test, and a negative test.

² We did not give participants the choice between similar and dissimilar negative tests. It seems that whereas a dissimilar positive test is the normative test for the All question, a similar negative test would be the normative test for the Only condition.

Results and Discussion

Scoring Using Consistent Strategies

For each content, we determined which participants selected the same test strategy for the two versions (i.e. the two animal problems or the two number problems). These "consistent" participants were scored as "similar" if they picked the positive test/similar evidence answer for both versions; "dissimilar" if they picked the positive test/diverse evidence answer for both versions; and "negative" if they picked the negative test for both versions. We used this procedure to reduce the effects of guessing. The number of consistent participants in each condition is presented in the leftmost numerical column of Table 2. Overall, 81% of the time participants used a consistent strategy; the percentage did not vary much across conditions (77% to 85%).

TABLE 2
Experiment 1: Percentage of Consistent Participants Using Each Strategy in Each Condition

Columns: Condition; N; Positive test (similar); Positive test (dissimilar); Negative test.
Rows: Number problems (All, Only, Rule); Animal problems (All, Only, Rule).

Note. N = the number of participants out of 48 using a consistent strategy. Bold indicates normative test strategy in the All and Only conditions. For the number problems, based on actual frequencies, χ²(4, N = 118) = 36.79, P < .001. For the animal problems, based on actual frequencies, χ²(4, N = 115) = 37.13, P < .001.

Performance by Condition

In the All condition, most of the consistent participants selected the normative answer: positive test/dissimilar evidence. The percentages were nearly equal in the number and animal problems, suggesting that many participants understand the importance of premise diversity when generalising a hypothesis in either domain. (Of course, they were the same participants in both domains.)

In the Only condition, most of the consistent participants selected the normative answer: negative tests. Again, the percentages were similar in the number and animal problems, suggesting that participants understand the importance of negative tests when limiting a hypothesis in either domain. More importantly, the proportion of consistent participants using negative tests in the Only condition was not significantly different from the proportion using dissimilar tests in the All condition [for number, χ²(1, N = 78) = 0.32, ns; for animal, χ²(1, N = 76) = 1.39, ns]. Thus, many participants will use negative tests when they are the single normatively correct answer.

The Rule condition shows the most strategy variability: no strategy was used by a majority of the participants. (Although this variability may seem odd given the usual homogeneity described previously, the present heterogeneity may be attributable to the present method: here participants selected which of the given evidence to test, whereas usually participants must generate their own evidence to test.) For both contents, about 40% of the consistent participants selected negative tests; in the number problems most of the others selected similar tests whereas in the animal problems most of the others selected dissimilar tests. The proportion of consistent participants using negative tests in the Rule condition is significantly less than the proportion using negative tests in the Only condition [for number, χ²(1, N = 77) = 4.68, P < .05; for animal, χ²(1, N = 78) = 12.04, P < .001]. Therefore, the lack of negative tests in the Rule condition is not due to a failure to appreciate the evidentiary value of such tests (as shown by the results in the Only condition); rather, the variety in strategy selection in the Rule condition is our first piece of evidence that participants (correctly) interpret the Rule condition as requiring both kinds of tests.

Content Effects

Performance on the animal and number problems is not actually as similar as it appears to be in Table 2. When analysed by content order, the Rule condition shows some asymmetric transfer effects. Although interesting, these effects are not relevant to the current analysis; they are documented and described in Appendix B.

EXPERIMENT 2

Experiment 2 had two main goals. First, we wanted to replicate the results of Experiment 1 using more participants. Because of the asymmetric transfer effects, we really only had uncontaminated data from 24 participants in each condition. Therefore, in Experiment 2 we used just the number problems.

13 GENERALISING VERSUS LIMITING HYPOTHESES 79 Second, and more importantly, we wanted to further investigate the variability of answers in the Rule condition. As we have argued, when discovering or evaluating a rule, one needs to find both all of the items included by the rule and all excluded by the rule. The former suggests a positive test strategy (preferably one using dissimilar tests) whereas the later suggests a negative test strategy. Yet in the number problems, the plurality of the consistent participants in the Rule condition used a similar test that is, did not follow either optimal strategy even though many participants showed that they understood the value of dissimilar tests (in the All condition) and negative tests (in the Only condition). In Experiment 2 we thought we could increase the participants use of optimal tests in the Rule condition by priming them to realise the value of one of those (i.e. either dissimilar or negative) tests. Various attempts have been made to try to foster the use of negative tests on the original task. Because participants who propose more negative tests are more likely to discover the rule (e.g. Wason, 1960), researchers thought that fostering the use of negative tests would facilitate performance. In one line of research (e.g. Gorman et al., 1987; Tweney et al., 1980; Wharton, Cheng, & Wickens, 1993), participants are told that they have to guess two rules that the experimenter has in mind (the target rule being the rule). When participants are informed that the two rules are complementary (i.e. that all triads conform to exactly one of the rules), the number of correct first rule announcements doubles (to 43% in Wharton et al., 1993). Why? When participants make positive tests of one of the rules, they are making negative tests of the complementary rule. Those tests give them information that will allow them to disconfirm overly narrow hypotheses. 
A second line of research found that under some conditions, explicitly instructing participants to use negative tests will greatly increase the use of such tests and the chances of success at the task (Gorman & Gorman, 1984; see Evans, 1989, for a review). In a third line of research, more similar to the present, Kareev, Halberstadt, and Shafir (1993) found that participants who passively see negative test triads provided by the experimenters during a practice problem will use negative tests more often on a subsequent problem than participants who have to generate their own triads during the practice problem.

In Experiment 2 we wanted to find out whether we could foster the use of dissimilar and negative tests by having participants actively select such tests on an initial task (All or Only, respectively); participants might then transfer the selected strategy to the Rule task. Accordingly, in Experiment 2 one-third of the participants answered All questions before answering Rule questions; one-third answered Only questions before answering Rule questions; one-third answered just Rule questions. If the initial questions reveal the value of the test strategy for the Rule condition, then participants who answer All questions first should be more likely to use dissimilar tests when later answering Rule questions, whereas participants who answer Only questions first should be more likely to use negative tests when later answering Rule questions.

Method

Participants and Procedure

A total of 144 University of Texas undergraduates participated in partial fulfilment of a course requirement. (The data from five additional participants were discarded: four circled two answers in response to a single question; one claimed to not understand the experiment.) None had participated in the previous experiment. The procedure was the same as in Experiment 1.

Design

There were three between-subjects conditions: Rule, All-Rule, and Only-Rule. (Content was dropped as a variable.) In the Rule condition, participants answered just the Rule questions for both versions of the number problem (i.e and ). In the All-Rule and Only-Rule conditions, participants answered the All (or Only) questions for both versions, then did a short unrelated (filler) reasoning task, then answered the Rule questions for both versions.³ Half of the participants did the version or versions first; the other half did the version or versions first.

Materials

We used the two versions of the number problem from Experiment 1. The only change we made was that for the All and Only questions we put the words ALL and ONLY in bold capital letters so that participants in the All-Rule and Only-Rule conditions, respectively, would be sure to realise that those first questions were different from the following Rule questions.

Results and Discussion

Overall, 75% of the time participants used a consistent strategy. For the questions they answered first, 70% of the participants were consistent; for the Rule questions when answered second, 81% of the participants were consistent. (The number of consistent participants is given in the leftmost numerical column of Table 3.)
Participants who answered the Rule questions second were marginally more likely to be consistent than participants who answered the Rule questions first, χ²(1, N = 144) = 3.77, P = [...].⁴

³ Although it appears we are confounding condition and practice in that we are comparing Rule questions answered first (in the Rule condition) to Rule questions answered second (in the All-Rule and Only-Rule conditions), we are mostly interested in the difference between the latter conditions.
⁴ Note that there is no increase in consistency when participants answered Rule questions following other Rule questions in Experiment 1 (see the bottom of Table 5).

TABLE 3
Experiment 2 (Evaluation): Percentage of Consistent Participants Using Each Strategy in Each Condition

Columns: Condition; N; Positive test (similar); Positive test (dissimilar); Negative test
Rows, Questions Answered First: All; Only; Rule
Rows, Rule Questions When Second: All-Rule; Only-Rule

Note: N = the number of participants out of 48 using a consistent strategy. Bold indicates normative test strategy in the All and Only conditions. For the top three conditions, based on actual frequencies, χ²(4, N = 101) = 50.45, P < .001. For the bottom three conditions, based on actual frequencies, χ²(4, N = 110) = 37.53, P < .001; for the bottom two conditions, χ²(2, N = 78) = 6.88, P < .05.

Questions Answered First

The top of Table 3 contains the results for the questions answered first (All, Only, or Rule). These results amount to a replication of the All, Only, and Rule conditions of Experiment 1 when the number problems were done first (see Appendix B). Overall, participants showed different patterns of answers across the three types of questions (see Note to Table 3). As in Experiment 1, for the All questions, most of the consistent participants (54%) again selected the normative answer of the dissimilar positive test. For the Only questions, most of the consistent participants (68%) again selected the normative answer of the negative test. Again, the proportion using negative tests for the Only questions was not significantly different from the proportion using dissimilar tests for the All questions, χ²(1, N = 69) = 1.29, ns. For the Rule questions, when answered first, most of the consistent participants (76%) selected the positive similar test.
The smaller percentage of participants selecting the negative tests (relative to Experiment 1) is concordant with the Experiment 1 results from participants who did the number problems first (see Appendix B). Again, the proportion of consistent participants using negative tests in the Rule condition is significantly less than the proportion using negative tests for the Only questions, χ²(1, N = 66) = 20.74, P < .001. Thus, even though many participants will select negative tests in response to an Only question, far fewer do so for the Rule questions.

Rule Questions When Answered Second

Participants' performance on the Rule questions depends on whether the Rule questions are answered first or answered after the All or Only questions. The pattern of strategy use for the Rule questions varied overall across the Rule, All-Rule, and Only-Rule conditions; in particular, the pattern differed between the All-Rule and Only-Rule conditions (see Note to Table 3). Answering the All questions first significantly increased the use of dissimilar tests relative to the Rule baseline condition, χ²(1, N = 71) = 13.17, P < .001. Similarly, answering the Only questions first significantly increased the use of negative tests relative to the Rule baseline condition, χ²(1, N = 71) = 11.81, P < .001. The fact that people will change strategies when answering Rule questions when other appropriate strategies are primed (see also Kareev et al., 1993) is our second piece of evidence that participants (correctly) interpret the Rule condition as requiring both kinds of tests.

Summary of Experiments 1 and 2

In Experiments 1 and 2 we have seen that people will often use negative tests to limit a hypothesis when (a) they simply have to evaluate which presented evidence is best, and (b) the test is phrased in terms of an Only question. However, when the task is to assess a rule, people do not often use negative tests even when such tests are provided to them. Instead, for the Rule questions, people use a variety of test strategies, which can be altered by prior context. To the extent that our Rule task is similar to the original task, it could be argued from our data that at least part of the failure to use a negative test strategy in the original task may be due to the Rule condition requiring both kinds of tests.
However, the failure to use a negative test strategy in the original task may also be due (in part) to the fact that participants have to generate their own test triads. We address this issue in our next experiment.

EXPERIMENT 3

The purpose of our third experiment was to determine whether strategy use would be different if participants had to generate their own evidence (as they do in the original task) rather than merely evaluate given evidence (as in Experiments 1 and 2). To examine this, we designed Experiment 3 to be identical to Experiment 2 in all respects except for its generation format: participants were asked to produce a single triad themselves in order to test a (given) hypothesised rule.

It seems that the generation task should be harder than the evaluation task. Participants might be less likely to spontaneously think of the correct test strategy than they were when the assortment of answers suggested various potential test strategies. In addition, in the absence of alternative answers (i.e. potential evidence), participants might be more strongly primed by the initial evidence and so be more likely to generate something similar to it. As a result, we would expect fewer consistent participants and fewer normatively correct participants for the All and Only questions. For the Rule questions, if it is merely the necessity of both kinds of test that makes the Rule task difficult, participants should perform no differently in the generation format of Experiment 3 than they did in the evaluation format of Experiment 2. If, however, the generation format also adds to the difficulty of the Rule questions, we would surmise that it also adds to the difficulty of the original task.

Method

Participants and Procedure

A total of 144 University of Texas undergraduates participated in partial fulfilment of a course requirement. (The data from five additional participants were discarded: three skipped questions; one claimed to not understand the experiment; one wrote that the problems could not be solved by proposing only one more triad.) None had participated in either of the previous experiments. The procedure was the same as in the previous experiments.

Design

The design was identical to Experiment 2. The conditions again were: Rule, All-Rule, and Only-Rule.

Materials

The materials were identical to Experiment 2 with one change. As in Experiment 2, participants were asked what triad they would choose to test the particular question, but instead of being provided with a choice of triads they were told to generate one themselves and write it down. Three blank lines separated by dashes were supplied.
Results and Discussion

Scoring

For all questions, with respect to the hypothesised rule and the example triad (i.e ), producing a test triad ascending by two constitutes a positive test of the rule, whereas producing any other test triad constitutes a negative test. We distinguished similar from dissimilar positive tests in the following way: if the triad was even and ascended by two it was considered a similar test; all other triads ascending by two were considered dissimilar tests. Although the line between similar and dissimilar seems somewhat arbitrary, it is parallel to the answers provided in Experiment 2. Thus triads such as 4, 2, 0 or 1002, 1004, 1006, which could be thought of as dissimilar, were coded as similar, whereas triads like 1, 3, 5 were coded as dissimilar. Dissimilar triads also included triads like 108.5, 110.5, and x-2, x, x+2.⁵

Consistent Participants

Overall, 67% of the time participants used a consistent strategy. For the questions they answered first, 69% of the participants were consistent; for the Rule questions when answered second, 64% of the participants were consistent. The number of consistent participants is given in the leftmost numerical column of Table 4.

TABLE 4
Experiment 3 (Generation): Percentage of Consistent Participants Using Each Strategy in Each Condition

Columns: Condition; N; Positive test (similar); Positive test (dissimilar); Negative test
Rows, Questions Answered First: All; Only; Rule
Rows, Rule Questions When Second: All-Rule; Only-Rule

Note: N = the number of participants out of 48 using a consistent strategy. Bold indicates the normative test strategy in the All and Only conditions. For the top three conditions, based on actual frequencies, χ²(4, N = 99) = 32.43, P < .001. For the bottom three conditions, based on actual frequencies, χ²(4, N = 92) = 17.99, P < .01; for the bottom two conditions, χ²(2, N = 61) = 9.91, P < [...].

⁵ For the other question (beta rule; ), in all conditions, with respect to the hypothesised rule and the example triad, producing test triads descending by four constitutes a positive test of the rule, whereas producing any other test triad constitutes a negative test. We delineated similar from dissimilar positive tests again using odds and evens; for this rule, if the triad was odd and descended by four it was considered a similar test; all other triads descending by four were considered dissimilar tests.
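
The scoring rule described under Scoring can be written out as a short function. The sketch below is only an illustration, not the authors' scoring procedure: the function name `classify_triad`, its parameters, and the string labels are hypothetical. The rule's constant difference and the parity that counts as "similar" are passed in as arguments, so the same function covers both the first rule (ascending by two, even triads similar) and the beta rule (descending by four, odd triads similar). Symbolic answers such as x-2, x, x+2 are outside its scope and would still need hand coding.

```python
from fractions import Fraction


def classify_triad(triad, step=2, similar_parity=0):
    """Classify a generated numeric test triad against a hypothesised rule.

    Hypothetical helper. `step` is the constant difference the rule requires
    (+2 for "ascending by two", -4 for "descending by four");
    `similar_parity` is the remainder mod 2 that marks a similar test
    (0 = even, 1 = odd).
    """
    # Convert through str() so decimals such as 108.5 are compared exactly.
    a, b, c = (Fraction(str(v)) for v in triad)

    # Any triad that breaks the rule's constant difference is a negative test.
    if b - a != step or c - b != step:
        return "negative"

    # A positive test is "similar" only if every element is a whole number
    # with the required parity; every other positive test is "dissimilar".
    def has_parity(v):
        return v.denominator == 1 and v.numerator % 2 == similar_parity

    if all(has_parity(v) for v in (a, b, c)):
        return "positive (similar)"
    return "positive (dissimilar)"


print(classify_triad((1002, 1004, 1006)))      # positive (similar)
print(classify_triad((1, 3, 5)))               # positive (dissimilar)
# The third element 112.5 is assumed here; the text lists only 108.5 and 110.5.
print(classify_triad((108.5, 110.5, 112.5)))   # positive (dissimilar)
print(classify_triad((3, 6, 9)))               # negative
print(classify_triad((9, 5, 1), step=-4, similar_parity=1))  # positive (similar)
```

Passing the step and parity as parameters is a design choice made for the illustration; it keeps the even/odd boundary, which the text itself calls somewhat arbitrary, in one visible place.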

Questions Answered First

The top of Table 4 shows the results of Experiment 3 for the questions answered first. Overall, participants showed different patterns of answers across the three types of questions (see Note to Table 4). For the All questions, almost half of the consistent participants (47%) generated the normative answer of a dissimilar positive test. For the Only questions, just over half of the consistent participants (53%) generated the normative answer of a negative test. Again, the proportion using negative tests for the Only questions was not significantly different from the proportion using dissimilar tests for the All questions, χ²(1, N = 68) = .24, ns. For the Rule questions, when given first, most of the consistent participants (55%) generated a positive similar test. Again, the proportion of consistent participants using negative tests in the Rule condition is significantly less than the proportion using negative tests for the Only questions, χ²(1, N = 63) = 16.28, P < .001. These results, therefore, show the same general pattern as the previous experiments.

Rule Questions When Answered Second

Participants' performance on the Rule questions again depends on whether the Rule questions are answered first or after the All or Only questions. The pattern of strategy use for the Rule questions varied across the Rule, All-Rule, and Only-Rule conditions; again, in particular, the pattern differed between the All-Rule and Only-Rule conditions (see Note to Table 4). Unlike Experiment 2, however, answering the All questions first did not significantly increase the use of dissimilar tests relative to the Rule baseline condition, χ²(1, N = 65) = .002, ns. This result could be attributed to the imprecision in distinguishing similar from dissimilar tests (as discussed in the previous section entitled Scoring).
As in Experiment 2, answering the Only questions first significantly increased the use of negative tests relative to the Rule baseline condition, χ²(1, N = 58) = 13.09, P < .001. (There is no imprecision in deciding what counts as a negative test.)

Comparing Experiments 2 and 3

The generation version of the task looks more difficult (although mostly not significantly so) across many measures. When measuring consistency: for questions answered first, the same proportion of participants were consistent in Experiments 2 (70%) and 3 (69%), χ²(1, N = 228) = .07, ns; however, for the Rule questions when answered second, fewer participants were consistent in Experiment 3 (64%) than in Experiment 2 (81%), χ²(1, N = 192) = 7.53, P < .01. When measuring use of the normative strategy: for the All questions, consistent participants in Experiment 3 were less likely to use the normative dissimilar test strategy than those in Experiment 2 [47% vs. 54%; χ²(1, N = 71) = .35, ns]; similarly, for the Only questions, consistent participants in Experiment 3 were less likely to use the normative negative test strategy than those in Experiment 2 [53% vs. 68%; χ²(1, N = 66) = 1.46, ns]. For the Rule questions (summing across when they were answered first and second), consistent participants in the generation version were also less likely to use a negative test strategy than those in the evaluation version [21% vs. 30%; χ²(1, N = 202) = 2.29, P = .13, ns].

Note that although the generation task seems more difficult than the evaluation task, the increase in difficulty is comparable across question-type. In particular, in both Experiments 2 and 3, the consistent participants more often used negative tests for the Only questions than for the Rule questions, but the size of the advantage was not different across the two experiments. That is, given the question (i.e. Only or Rule), consistent participants did not differ across experiments in the proportion of times they used negative tests, χ²(2, N = 129) = 2.12, ns.⁶

CONCLUSIONS

Our three experiments demonstrate that many people will use a negative test strategy in the hypothesis-limitation (i.e. Only) task. That strategy is used both when evaluating and generating evidence. In fact, people use that strategy as frequently as they use a diversity strategy in the premise diversity (i.e. All) task. Of course, not all participants use the normative negative test strategy; however, the proportion who do is far greater than expected given the previous findings regarding the failure to use negative tests in the standard Wason task. In contrast to our other findings, in our Rule task participants did not often use negative tests, although the use of such tests is clearly one optimal strategy.
It seems that sometimes participants (correctly) interpret the Rule task as requiring both (diverse) positive and negative tests: one piece of evidence for that is the variability of test use in Experiment 1; a second is the effectiveness of priming test strategy in Experiment 2. Participants had even more difficulty coming up with negative tests when they had to generate such tests in Experiment 3. Both of these factors (the necessity for both kinds of tests and the difficulty of generation) may contribute to the lack of use of negative tests in the original Wason task. Of course, unlike the Wason task, our tasks involve only a single trial; thus, we cannot attempt to explain the lack of negative tests in subsequent trials after the first. However, it should also be recalled that the original Wason task is a rule discovery task, that is, it entails a dual generation process of producing both the hypothesised rule and the test triads, whereas in

⁶ The statistical test is for conditional independence with two degrees of freedom. See Wickens (1989).
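
The comparison of consistency on the Rule questions answered second in Experiments 2 and 3 can be checked by hand. The sketch below computes an uncorrected Pearson chi-square for a 2 x 2 frequency table. It rests on two assumptions that are not stated outright in the text: that the frequencies are those implied by the table notes (78 and 61 consistent participants in the combined All-Rule and Only-Rule conditions, out of 2 x 48 = 96 in each experiment), and that no continuity correction was applied. The function name `pearson_chi2_2x2` is hypothetical.

```python
def pearson_chi2_2x2(table):
    """Uncorrected Pearson chi-square statistic for a 2 x 2 frequency table.

    `table` is [[a, b], [c, d]]; the statistic has 1 degree of freedom.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of experiment and consistency.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2


# Assumed counts: [consistent, inconsistent] on the Rule questions answered
# second (All-Rule and Only-Rule combined, 96 participants per experiment).
experiment_2 = [78, 96 - 78]
experiment_3 = [61, 96 - 61]

chi2 = pearson_chi2_2x2([experiment_2, experiment_3])
print(f"chi2(1, N = 192) = {chi2:.2f}")  # chi2(1, N = 192) = 7.53
```

Under these assumptions the statistic matches the reported value of 7.53, which exceeds the 6.63 critical value for P < .01 with one degree of freedom.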

When Falsification is the Only Path to Truth

When Falsification is the Only Path to Truth When Falsification is the Only Path to Truth Michelle Cowley (cowleym@tcd.ie) Psychology Department, University of Dublin, Trinity College, Dublin, Ireland Ruth M.J. Byrne (rmbyrne@tcd.ie) Psychology Department,

More information

Facilitated Rule Discovery in Wason s Task: The Role of Negative Triples

Facilitated Rule Discovery in Wason s Task: The Role of Negative Triples Facilitated Rule Discovery in Wason s 2-4-6 Task: The Role of Negative Triples Maggie Gale (M.Gale@derby.ac.uk) Institute of Behavioural Sciences, University of Derby Western Road, Mickleover Derby DE3

More information

Wason's Cards: What is Wrong?

Wason's Cards: What is Wrong? Wason's Cards: What is Wrong? Pei Wang Computer and Information Sciences, Temple University This paper proposes a new interpretation

More information

Why Does Similarity Correlate With Inductive Strength?

Why Does Similarity Correlate With Inductive Strength? Why Does Similarity Correlate With Inductive Strength? Uri Hasson (uhasson@princeton.edu) Psychology Department, Princeton University Princeton, NJ 08540 USA Geoffrey P. Goodwin (ggoodwin@princeton.edu)

More information

Analysis of Human-Human and Human-Computer Agent Interactions from the Viewpoint of Design of and Attribution to a Partner

Analysis of Human-Human and Human-Computer Agent Interactions from the Viewpoint of Design of and Attribution to a Partner Analysis of - and -Computer Interactions from the Viewpoint of Design of and to a Partner Kazuhisa Miwa (miwa@is.nagoya-u.ac.jp) Hitoshi Terai (terai@cog.human.nagoya-u.ac.jp) Graduate School of Information

More information

Dual-goal facilitation in Wason s task: What mediates successful rule discovery?

Dual-goal facilitation in Wason s task: What mediates successful rule discovery? THE QUARTERLY JOURNAL OF EXPERIMENTAL PSYCHOLOGY 2006, 59 (5), 873 885 Dual-goal facilitation in Wason s 2 4 6 task: What mediates successful rule discovery? Maggie Gale University of Derby, Derby, UK

More information

D 7 A 4. Reasoning Task A

D 7 A 4. Reasoning Task A Reasoning Task A Imagine that each of the squares below represents a card. Suppose that you know from previous experience with these cards that every card has a letter on one side and a number of the other

More information

Effect of Positive and Negative Instances on Rule Discovery: Investigation Using Eye Tracking

Effect of Positive and Negative Instances on Rule Discovery: Investigation Using Eye Tracking Effect of Positive and Negative Instances on Rule Discovery: Investigation Using Eye Tracking Miki Matsumuro (muro@cog.human.nagoya-u.ac.jp) Kazuhisa Miwa (miwa@is.nagoya-u.ac.jp) Graduate School of Information

More information

UNIT. Experiments and the Common Cold. Biology. Unit Description. Unit Requirements

UNIT. Experiments and the Common Cold. Biology. Unit Description. Unit Requirements UNIT Biology Experiments and the Common Cold Unit Description Content: This course is designed to familiarize the student with concepts in biology and biological research. Skills: Main Ideas and Supporting

More information

On the diversity principle and local falsifiability

On the diversity principle and local falsifiability On the diversity principle and local falsifiability Uriel Feige October 22, 2012 1 Introduction This manuscript concerns the methodology of evaluating one particular aspect of TCS (theoretical computer

More information

Category Size and Category-Based Induction

Category Size and Category-Based Induction Category Size and Category-Based Induction Aidan Feeney & David R. Gardiner Department of Psychology University of Durham, Stockton Campus University Boulevard Stockton-on-Tees, TS17 6BH United Kingdom

More information

Hypothesis Generation and Testing in Wason's Task. J. Edward Russo Cornell University. Margaret G. Meloy Penn State

Hypothesis Generation and Testing in Wason's Task. J. Edward Russo Cornell University. Margaret G. Meloy Penn State Running Head: HYPOTHESIS GENERATION AND TESTING Hypothesis Generation and Testing in Wason's 2-4-6 Task J. Edward Russo Cornell University Margaret G. Meloy Penn State The authors wish to thank Josh Klayman

More information

A Comparison of Three Measures of the Association Between a Feature and a Concept

A Comparison of Three Measures of the Association Between a Feature and a Concept A Comparison of Three Measures of the Association Between a Feature and a Concept Matthew D. Zeigenfuse (mzeigenf@msu.edu) Department of Psychology, Michigan State University East Lansing, MI 48823 USA

More information

Relations between premise similarity and inductive strength

Relations between premise similarity and inductive strength Psychonomic Bulletin & Review 2005, 12 (2), 340-344 Relations between premise similarity and inductive strength EVAN HEIT University of Warwick, Coventry, England and AIDAN FEENEY University of Durham,

More information

Incorporating quantitative information into a linear ordering" GEORGE R. POTTS Dartmouth College, Hanover, New Hampshire 03755

Incorporating quantitative information into a linear ordering GEORGE R. POTTS Dartmouth College, Hanover, New Hampshire 03755 Memory & Cognition 1974, Vol. 2, No.3, 533 538 Incorporating quantitative information into a linear ordering" GEORGE R. POTTS Dartmouth College, Hanover, New Hampshire 03755 Ss were required to learn linear

More information

Convergence Principles: Information in the Answer

Convergence Principles: Information in the Answer Convergence Principles: Information in the Answer Sets of Some Multiple-Choice Intelligence Tests A. P. White and J. E. Zammarelli University of Durham It is hypothesized that some common multiplechoice

More information

1) Principle of Proactivity

1) Principle of Proactivity 1) Principle of Proactivity Principle of proactivity teaches us that we can influence our life to a much greater extend than we usually think. It even states that we are the cause of majority of things

More information

Reliability, validity, and all that jazz

Reliability, validity, and all that jazz Reliability, validity, and all that jazz Dylan Wiliam King s College London Introduction No measuring instrument is perfect. The most obvious problems relate to reliability. If we use a thermometer to

More information

Confirmation Bias in Software Development and Testing: An Analysis of the Effects of Company Size, Experience and Reasoning Skills

Confirmation Bias in Software Development and Testing: An Analysis of the Effects of Company Size, Experience and Reasoning Skills Confirmation Bias in Software Development and Testing: An Analysis of the Effects of Company Size, Experience and Reasoning Skills Gul Calikli Department of Computer Engineering, Software Research Laboratory,

More information

Sleeping Beauty is told the following:

Sleeping Beauty is told the following: Sleeping beauty Sleeping Beauty is told the following: You are going to sleep for three days, during which time you will be woken up either once Now suppose that you are sleeping beauty, and you are woken

More information

Audio: In this lecture we are going to address psychology as a science. Slide #2

Audio: In this lecture we are going to address psychology as a science. Slide #2 Psychology 312: Lecture 2 Psychology as a Science Slide #1 Psychology As A Science In this lecture we are going to address psychology as a science. Slide #2 Outline Psychology is an empirical science.

More information

Journal of Experimental Psychology: Learning, Memory, and Cognition

Journal of Experimental Psychology: Learning, Memory, and Cognition Journal of Experimental Psychology: Learning, Memory, and Cognition Conflict and Bias in Heuristic Judgment Sudeep Bhatia Online First Publication, September 29, 2016. http://dx.doi.org/10.1037/xlm0000307

More information

Observational Category Learning as a Path to More Robust Generative Knowledge

Observational Category Learning as a Path to More Robust Generative Knowledge Observational Category Learning as a Path to More Robust Generative Knowledge Kimery R. Levering (kleveri1@binghamton.edu) Kenneth J. Kurtz (kkurtz@binghamton.edu) Department of Psychology, Binghamton

More information

Statistics Mathematics 243

Statistics Mathematics 243 Statistics Mathematics 243 Michael Stob February 2, 2005 These notes are supplementary material for Mathematics 243 and are not intended to stand alone. They should be used in conjunction with the textbook

More information

ORIGINS AND DISCUSSION OF EMERGENETICS RESEARCH

ORIGINS AND DISCUSSION OF EMERGENETICS RESEARCH ORIGINS AND DISCUSSION OF EMERGENETICS RESEARCH The following document provides background information on the research and development of the Emergenetics Profile instrument. Emergenetics Defined 1. Emergenetics

More information

Is it possible to give a philosophical definition of sexual desire?

Is it possible to give a philosophical definition of sexual desire? Issue 1 Spring 2016 Undergraduate Journal of Philosophy Is it possible to give a philosophical definition of sexual desire? William Morgan - The University of Sheffield pp. 47-58 For details of submission

More information

T. Kushnir & A. Gopnik (2005 ). Young children infer causal strength from probabilities and interventions. Psychological Science 16 (9):

T. Kushnir & A. Gopnik (2005 ). Young children infer causal strength from probabilities and interventions. Psychological Science 16 (9): Probabilities and Interventions 1 Running Head: PROBABILITIES AND INTERVENTIONS T. Kushnir & A. Gopnik (2005 ). Young children infer causal strength from probabilities and interventions. Psychological

More information

Necessity, possibility and belief: A study of syllogistic reasoning

Necessity, possibility and belief: A study of syllogistic reasoning THE QUARTERLY JOURNAL OF EXPERIMENTAL PSYCHOLOGY, 2001, 54A (3), 935 958 Necessity, possibility and belief: A study of syllogistic reasoning Jonathan St. B.T. Evans, Simon J. Handley, and Catherine N.J.

More information

UPDATING SMALL WORLD REPRESENTATIONS IN STRATEGIC DECISION- MAKING UNDER EXTREME UNCERTAINTY

UPDATING SMALL WORLD REPRESENTATIONS IN STRATEGIC DECISION- MAKING UNDER EXTREME UNCERTAINTY This is the version of the article accepted for publication in Academy of Management Proceedings published by Academy of Management: https://doi.org/10.5465/ambpp.2018.141 Accepted version downloaded from

More information

From: AAAI Technical Report SS Compilation copyright 1995, AAAI ( All rights reserved.

From: AAAI Technical Report SS Compilation copyright 1995, AAAI (  All rights reserved. From: AAAI Technical Report SS-95-03. Compilation copyright 1995, AAAI (www.aaai.org). All rights reserved. MAKING MULTIPLE HYPOTHESES EXPLICIT: AN EXPLICIT STRATEGY COMPUTATIONAL 1 MODELS OF SCIENTIFIC

More information

Lassaline (1996) applied structural alignment directly to the issue of category-based inference. She demonstrated that adding a causal relation to an

Lassaline (1996) applied structural alignment directly to the issue of category-based inference. She demonstrated that adding a causal relation to an Wu, M., & Gentner, D. (1998, August). Structure in category-based induction. Proceedings of the Twentieth Annual Conference of the Cognitive Science Society, 1154-115$, Structure in Category-Based Induction

More information

Why do Psychologists Perform Research?

Why do Psychologists Perform Research? PSY 102 1 PSY 102 Understanding and Thinking Critically About Psychological Research Thinking critically about research means knowing the right questions to ask to assess the validity or accuracy of a

More information

QUESTIONING THE MENTAL HEALTH EXPERT S CUSTODY REPORT

QUESTIONING THE MENTAL HEALTH EXPERT S CUSTODY REPORT QUESTIONING THE MENTAL HEALTH EXPERT S CUSTODY REPORT by IRA DANIEL TURKAT, PH.D. Venice, Florida from AMERICAN JOURNAL OF FAMILY LAW, Vol 7, 175-179 (1993) There are few activities in which a mental health

More information

Science WILLIAM F. BREWER AND PUNYASHLOKE MISHRA

Science WILLIAM F. BREWER AND PUNYASHLOKE MISHRA 60 Science WILLIAM F. BREWER AND PUNYASHLOKE MISHRA The cognitive science of science studies the cognitive processes involved in carrying out science: How do scientists reason? How do scientists develop

More information

What is Science 2009 What is science?

What is Science 2009 What is science? What is science? The question we want to address is seemingly simple, but turns out to be quite difficult to answer: what is science? It is reasonable to ask such a question since this is a book/course

More information

Information and cue-priming effects on tip-of-the-tongue states

Information and cue-priming effects on tip-of-the-tongue states Information and cue-priming effects on tip-of-the-tongue states Psycholinguistics 2 Contents: Introduction... 1 Pilot Experiment... 2 Experiment... 3 Participants... 3 Materials... 4 Design... 4 Procedure...

More information

Between Micro and Macro: Individual and Social Structure, Content and Form in Georg Simmel

Between Micro and Macro: Individual and Social Structure, Content and Form in Georg Simmel Michela Bowman Amy LeClair Michelle Lynn Robert Weide Classical Sociological Theory November 11, 2003 Between Micro and Macro: Individual and Social Structure, Content and Form in Georg Simmel Through

Comment on McLeod and Hume, Overlapping Mental Operations in Serial Performance with Preview: Typing THE QUARTERLY JOURNAL OF EXPERIMENTAL PSYCHOLOGY, 1994, 47A (1) 201-205 Comment on McLeod and Hume, Overlapping Mental Operations in Serial Performance with Preview: Typing Harold Pashler University of

The Significance of Empirical Reports in the Field of Animal Science The Significance of Empirical Reports in the Field of Animal Science Addison Cheng ABSTRACT Empirical reports published in Animal Science journals have many unique features that may be reflective of the

The Role of Causal Models in Analogical Inference Journal of Experimental Psychology: Learning, Memory, and Cognition 2008, Vol. 34, No. 5, 1111 1122 Copyright 2008 by the American Psychological Association 0278-7393/08/$12.00 DOI: 10.1037/a0012581 The

Goodness of Pattern and Pattern Uncertainty 1 JOURNAL OF VERBAL LEARNING AND VERBAL BEHAVIOR 2, 446-452 (1963) Goodness of Pattern and Pattern Uncertainty 1 A visual configuration, or pattern, has qualities over and above those which can be specified

Content Effects in Conditional Reasoning: Evaluating the Container Schema Effects in Conditional Reasoning: Evaluating the Container Schema Amber N. Bloomfield (a-bloomfield@northwestern.edu) Department of Psychology, 2029 Sheridan Road Evanston, IL 60208 USA Lance J. Rips (rips@northwestern.edu)

Quality Digest Daily, March 3, 2014 Manuscript 266. Statistics and SPC. Two things sharing a common name can still be different. Donald J. Quality Digest Daily, March 3, 2014 Manuscript 266 Statistics and SPC Two things sharing a common name can still be different Donald J. Wheeler Students typically encounter many obstacles while learning

We Can Test the Experience Machine. Response to Basil SMITH Can We Test the Experience Machine? Ethical Perspectives 18 (2011): We Can Test the Experience Machine Response to Basil SMITH Can We Test the Experience Machine? Ethical Perspectives 18 (2011): 29-51. In his provocative Can We Test the Experience Machine?, Basil Smith

Technical Specifications Technical Specifications In order to provide summary information across a set of exercises, all tests must employ some form of scoring models. The most familiar of these scoring models is the one typically

Each copy of any part of a JSTOR transmission must contain the same copyright notice that appears on the screen or printed page of such transmission. How Implication Is Understood Author(s): P. N. Johnson-Laird and Joanna Tagart Source: The American Journal of Psychology, Vol. 82, No. 3 (Sep., 1969), pp. 367-373 Published by: University of Illinois

Design Methodology. 4th year 1 nd Semester. M.S.C. Madyan Rashan. Room No Academic Year College of Engineering Department of Interior Design Design Methodology 4th year 1 nd Semester M.S.C. Madyan Rashan Room No. 313 Academic Year 2018-2019 Course Name Course Code INDS 315 Lecturer in Charge

Chapter 2: Business Ethics and Social Responsibility Chapter 2: Business Ethics and Social Responsibility 1. CHAPTER OVERVIEW Chapter 2 explains the issues of right and wrong in business conduct. This explanation begins with the fundamentals of business

Chapter 11 Decision Making. Syllogism. The Logic Chapter 11 Decision Making Syllogism All men are mortal. (major premise) Socrates is a man. (minor premise) (therefore) Socrates is mortal. (conclusion) The Logic Mortal Socrates Men 1 An Abstract Syllogism

How Scientists Think in the Real World: Implications for Science Education How Scientists Think in the Real World: Implications for Science Education Kevin Dunbar McGill University Research on scientific thinking and science education is often based on introspections about what

Further Properties of the Priority Rule Further Properties of the Priority Rule Michael Strevens Draft of July 2003 Abstract In Strevens (2003), I showed that science s priority system for distributing credit promotes an allocation of labor

Chapter 11. Experimental Design: One-Way Independent Samples Design 11-1 Chapter 11. Experimental Design: One-Way Independent Samples Design Advantages and Limitations Comparing Two Groups Comparing t Test to ANOVA Independent Samples t Test Independent Samples ANOVA Comparing

Karl Popper is probably the most influential philosopher S T A T I S T I C S W I T H O U T T E A R S Hypothesis testing, type I and type II errors Amitav Banerjee, U. B. Chitnis, S. L. Jadhav, J. S. Bhawalkar, S. Chaudhury 1 Department of Community Medicine,

Examples of Feedback Comments: How to use them to improve your report writing. Example 1: Compare and contrast Examples of Feedback Comments: How to use them to improve your report writing This document contains 4 examples of writing and feedback comments from Level 2A lab reports, and 4 steps to help you apply

FAQ: Heuristics, Biases, and Alternatives Question 1: What is meant by the phrase biases in judgment heuristics? Response: A bias is a predisposition to think or act in a certain way based on past experience or values (Bazerman, 2006). The term

Neuroscience and Generalized Empirical Method Go Three Rounds Bruce Anderson, Neuroscience and Generalized Empirical Method Go Three Rounds: Review of Robert Henman s Global Collaboration: Neuroscience as Paradigmatic Journal of Macrodynamic Analysis 9 (2016): 74-78.

Chapter 7: Descriptive Statistics Chapter Overview Chapter 7 provides an introduction to basic strategies for describing groups statistically. Statistical concepts around normal distributions are discussed. The statistical procedures of

Negations Without Not : Alternative forms of Negation and Contrast Classes in Conditional Inference. James Richard Hazlett Vance Negations Without Not : Alternative forms of Negation and Contrast Classes in Conditional Inference James Richard Hazlett Vance Doctor of Philosophy Birkbeck, University of London 2018 1 Declaration I

Data and Statistics 101: Key Concepts in the Collection, Analysis, and Application of Child Welfare Data TECHNICAL REPORT Data and Statistics 101: Key Concepts in the Collection, Analysis, and Application of Child Welfare Data CONTENTS Executive Summary...1 Introduction...2 Overview of Data Analysis Concepts...2

Evaluating you relationships Evaluating you relationships What relationships are important to you? What are you doing today to care for them? Have you told those concerned how you feel? Most of us regularly inspect the health of our

However, as Sloman (1993) noted, there are other instances of nonmonotonicity that are not explainable by dilution of category coverage. His participa Blok, S. V., & Gentner, D. (2000, August). Reasoning from shared structure. Poster session presented at the Twenty-Second Annual Conference of the Cognitive Science Society, Philadelphia, PA. Reasoning

Checking the counterarguments confirms that publication bias contaminated studies relating social class and unethical behavior 1 Checking the counterarguments confirms that publication bias contaminated studies relating social class and unethical behavior Gregory Francis Department of Psychological Sciences Purdue University gfrancis@purdue.edu

Reduce Tension by Making the Desired Choice Easier Daniel Kahneman Talk at Social and Behavioral Sciences Meeting at OEOB Reduce Tension by Making the Desired Choice Easier Here is one of the best theoretical ideas that psychology has to offer developed

Study on Gender in Physics Listening Practice Study on Gender in Physics AUDIO - open this URL to listen to the audio: https://goo.gl/7xmlgh Questions 1-10 Choose the correct letter, A, B C. Study on Gender in Physics 1 The students

Chapter 5: Field experimental designs in agriculture Chapter 5: Field experimental designs in agriculture Jose Crossa Biometrics and Statistics Unit Crop Research Informatics Lab (CRIL) CIMMYT. Int. Apdo. Postal 6-641, 06600 Mexico, DF, Mexico Introduction

Generalization and Theory-Building in Software Engineering Research Generalization and Theory-Building in Software Engineering Research Magne Jørgensen, Dag Sjøberg Simula Research Laboratory {magne.jorgensen, dagsj}@simula.no Abstract The main purpose of this paper is

Experimental Testing of Intrinsic Preferences for NonInstrumental Information Experimental Testing of Intrinsic Preferences for NonInstrumental Information By Kfir Eliaz and Andrew Schotter* The classical model of decision making under uncertainty assumes that decision makers care

UNIT. First Impressions and Attraction. Psychology. Unit Description. Unit Requirements UNIT Psychology First Impressions and Attraction Unit Description Content: This course is designed to familiarize students with concepts in social psychology. Skills: Main Ideas and Supporting Details

The Psychology of Inductive Inference The Psychology of Inductive Inference Psychology 355: Cognitive Psychology Instructor: John Miyamoto 05/24/2018: Lecture 09-4 Note: This Powerpoint presentation may contain macros that I wrote to help

Underlying Theory & Basic Issues Underlying Theory & Basic Issues Dewayne E Perry ENS 623 Perry@ece.utexas.edu 1 All Too True 2 Validity In software engineering, we worry about various issues: E-Type systems: Usefulness is it doing what

Cognitive domain: Comprehension Answer location: Elements of Empiricism Question type: MC Chapter 2 1. Knowledge that is evaluative, value laden, and concerned with prescribing what ought to be is known as knowledge. *a. Normative b. Nonnormative c. Probabilistic d. Nonprobabilistic. 2. Most

Explaining an Explanatory Gap Gilbert Harman Princeton University Explaining an Explanatory Gap Gilbert Harman Princeton University Discussions of the mind-body problem often refer to an explanatory gap (Levine 1983) between some aspect of our conscious mental life and

66 Questions, 5 pages - 1 of 5 Bio301D Exam 3 A = True, B = False unless stated otherwise name (required) You must turn in both this hard copy (with your name on it) and your scantron to receive credit for this exam. One answer and only one answer

Forum. and Categorization. 8. Three Distinctions about Concepts EDWARD E. SMITH. Metaphysical versus Epistemological Accounts of Concepts Mind & Language Vol. 4 Nos. 1 and 2 Spring/Summer 1989 ISSN 0268-1064 © Basil Blackwell Forum 8. Three Distinctions about Concepts and Categorization EDWARD E. SMITH Some of the problems raised in the

Gender Differences Associated With Memory Recall. By Lee Morgan Gunn. Oxford May 2014 Gender Differences Associated With Memory Recall By Lee Morgan Gunn A thesis submitted to the faculty of The University of Mississippi in partial fulfillment of the requirements of the Sally McDonnell

Animal Cognition. Introduction to Cognitive Science Animal Cognition Introduction to Cognitive Science Intelligent Animals? Parrot Intelligence Crow Intelligence I Crow Intelligence II Cow Intelligence Orca Intelligence Dolphin Play Funny Animal Intelligence

EXPERIMENTAL DESIGN Page 1 of 11. relationships between certain events in the environment and the occurrence of particular EXPERIMENTAL DESIGN Page 1 of 11 I. Introduction to Experimentation 1. The experiment is the primary means by which we are able to establish cause-effect relationships between certain events in the environment

support support support STAND BY ENCOURAGE AFFIRM STRENGTHEN PROMOTE JOIN IN SOLIDARITY Phase 3 ASSIST of the SASA! Community Mobilization Approach support support support Phase 3 of the SASA! Community Mobilization Approach STAND BY STRENGTHEN ENCOURAGE PROMOTE ASSIST AFFIRM JOIN IN SOLIDARITY support_ts.indd 1 11/6/08 6:55:34 PM support Phase 3

Spectrum inversion and intentionalism Spectrum inversion and intentionalism phil 93507 Jeff Speaks September 15, 2009 1 What is a spectrum inversion scenario?..................... 1 2 Intentionalism is false because inverts could have in common.........

CHAPTER LEARNING OUTCOMES EXPERIMENTAL METHODOLOGY CHAPTER LEARNING OUTCOMES When you have completed reading this article you will be able to: Define what is an experiment Explain the role of theory in educational research Justify

The Impact of Relative Standards on the Propensity to Disclose. Alessandro Acquisti, Leslie K. John, George Loewenstein WEB APPENDIX The Impact of Relative Standards on the Propensity to Disclose Alessandro Acquisti, Leslie K. John, George Loewenstein WEB APPENDIX 2 Web Appendix A: Panel data estimation approach As noted in the main

Statisticians deal with groups of numbers. They often find it helpful to use Chapter 4 Finding Your Center In This Chapter Working within your means Meeting conditions The median is the message Getting into the mode Statisticians deal with groups of numbers. They often find it

TEACHING HYPOTHESIS TESTING: WHAT IS DOUBTED, WHAT IS TESTED? Australasian Journal of Economics Education Vol. 5. Numbers 3 & 4, 008 1 TEACHING HYPOTHESIS TESTING: WHAT IS DOUBTED, WHAT IS TESTED? Gordon Menzies 1 University of Technology, Sydney E-mail: gordon.menzies@uts.edu.au

The role of sampling assumptions in generalization with multiple categories The role of sampling assumptions in generalization with multiple categories Wai Keen Vong (waikeen.vong@adelaide.edu.au) Andrew T. Hendrickson (drew.hendrickson@adelaide.edu.au) Amy Perfors (amy.perfors@adelaide.edu.au)

Helping Your Asperger s Adult-Child to Eliminate Thinking Errors Helping Your Asperger s Adult-Child to Eliminate Thinking Errors Many people with Asperger s (AS) and High-Functioning Autism (HFA) experience thinking errors, largely due to a phenomenon called mind-blindness.

Overview of the Logic and Language of Psychology Research CHAPTER W1 Overview of the Logic and Language of Psychology Research Chapter Outline The Traditionally Ideal Research Approach Equivalence of Participants in Experimental and Control Groups Equivalence

Highlighting Effect: The Function of Rebuttals in Written Argument Highlighting Effect: The Function of Rebuttals in Written Argument Ryosuke Onoda (ndrysk62@p.u-tokyo.ac.jp) Department of Educational Psychology, Graduate School of Education, The University of Tokyo,

Use of Structure Mapping Theory for Complex Systems Gentner, D., & Schumacher, R. M. (1986). Use of structure-mapping theory for complex systems. Presented at the Panel on Mental Models and Complex Systems, IEEE International Conference on Systems, Man

Automaticity of Number Perception Automaticity of Number Perception Jessica M. Choplin (jessica.choplin@vanderbilt.edu) Gordon D. Logan (gordon.logan@vanderbilt.edu) Vanderbilt University Psychology Department 111 21 st Avenue South Nashville,

11. A Thumbnail Sketch of the Myers-Briggs Type Inventory (MBTI) Page 1 of 24 11. A Thumbnail Sketch of the Myers-Briggs Type Inventory (MBTI) Directions for Self-Assessment A Thumbnail Sketch of the Myers-Briggs Type Indicator Extraversion Introversion Sensing Intuition Thinking

Encoding of Elements and Relations of Object Arrangements by Young Children Encoding of Elements and Relations of Object Arrangements by Young Children Leslee J. Martin (martin.1103@osu.edu) Department of Psychology & Center for Cognitive Science Ohio State University 216 Lazenby

Value From Regulatory Fit E. Tory Higgins CURRENT DIRECTIONS IN PSYCHOLOGICAL SCIENCE Value From Regulatory Fit E. Tory Higgins Columbia University ABSTRACT Where does value come from? I propose a new answer to this classic question. People experience

Choice set options affect the valuation of risky prospects Choice set options affect the valuation of risky prospects Stian Reimers (stian.reimers@warwick.ac.uk) Neil Stewart (neil.stewart@warwick.ac.uk) Nick Chater (nick.chater@warwick.ac.uk) Department of Psychology,

Running head: How large denominators are leading to large errors 1 Running head: How large denominators are leading to large errors 1 How large denominators are leading to large errors Nathan Thomas Kent State University How large denominators are leading to large errors

EVALUATING AND IMPROVING MULTIPLE CHOICE QUESTIONS DePaul University INTRODUCTION TO ITEM ANALYSIS: EVALUATING AND IMPROVING MULTIPLE CHOICE QUESTIONS Ivan Hernandez, PhD OVERVIEW What is Item Analysis? Overview Benefits of Item Analysis Applications Main

Comments on David Rosenthal s Consciousness, Content, and Metacognitive Judgments Consciousness and Cognition 9, 215 219 (2000) doi:10.1006/ccog.2000.0438, available online at http://www.idealibrary.com on Comments on David Rosenthal s Consciousness, Content, and Metacognitive Judgments

Relational Versus Attributional Mode of Problem Solving? Relational Versus Attributional Mode of Problem Solving? Svetoslav Bliznashki (valsotevs@gmail.com) New Bulgarian University, 2 Montevideo Street Sofia 1618, Bulgaria Boicho Kokinov (bkokinov@nbu.bg) New

ADD Overdiagnosis. It s not very common to hear about a disease being overdiagnosed, particularly one ADD Overdiagnosis Introduction It s not very common to hear about a disease being overdiagnosed, particularly one that mostly affects children. However, this is happening on quite a large scale with attention

Critical Thinking Assessment at MCC. How are we doing? Critical Thinking Assessment at MCC How are we doing? Prepared by Maura McCool, M.S. Office of Research, Evaluation and Assessment Metropolitan Community Colleges Fall 2003 1 General Education Assessment

HUMAN CIRCULATORY SYSTEM COGNITIVE TASK ANALYSIS. Darnel Degand Teachers College, Columbia University HUMAN CIRCULATORY SYSTEM COGNITIVE TASK ANALYSIS HUMAN CIRCULATORY SYSTEM COGNITIVE TASK ANALYSIS Darnel Degand Teachers College, Columbia University MSTU 4133 Cognition and Computers Instructor Sui-Yee
