Nature of Statistics

The study of statistics involves:
Data collection, summarizing data (organization and analysis of data) and interpretation of data.
The drawing of inferences about a population from a sample taken from the population.

Data are numerical facts and figures from which conclusions can be drawn. In effect, data are the raw material of statistics. Data can be categorized into two types:

Numerical - For the purpose of this study, we define data as numbers. The two kinds of numbers used in statistics are those that result from taking a measurement and those that result from counting. For example, the temperature of a patient is obtained by measurement, while the number of patients in a hospital is obtained by counting.

Non-numerical - These kinds of data are qualitative or categorical.

Stat 101
Population: This is the set of all existing units or objects under study.
Sample: This is a subset of the units in the population.
Variable: This is any characteristic of a population unit.

Types of variables

Quantitative variables
A quantitative variable is one that can take numerical values. Quantitative variables may be categorized further into discrete and continuous. If the measurements of a variable correspond to equally spaced values on the real number line, the variable is said to be quantitative. Examples are the temperature of patients in a hospital and the scores of students in a test.

Qualitative variables
These are variables that cannot take numerical values. Qualitative variables can neither be measured nor counted. If we simply record which category each population unit falls into, the variable is called a qualitative variable. Examples are a person's gender and whether or not an event will happen.
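The definitions above can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the hospital population, the field names (`temperature_c`, `visits`, `gender`) and the helper `variable_type`, whose rule of thumb (text means qualitative, whole numbers mean discrete, other numbers mean continuous) is an illustrative assumption and not a general test.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: every patient currently in one hospital.
# Each unit carries a continuous quantitative variable (temperature),
# a discrete quantitative variable (number of visits) and a
# qualitative variable (gender).
population = [
    {
        "temperature_c": round(random.uniform(36.0, 40.0), 1),
        "visits": random.randint(1, 10),
        "gender": random.choice(["female", "male"]),
    }
    for _ in range(200)
]

# A sample is a subset of the units in the population.
sample = random.sample(population, k=20)


def variable_type(values):
    """Classify a variable from its recorded values (illustrative rule only)."""
    if all(isinstance(v, (str, bool)) for v in values):
        return "qualitative"
    if all(isinstance(v, int) for v in values):
        return "quantitative (discrete)"
    return "quantitative (continuous)"


for name in ("temperature_c", "visits", "gender"):
    print(name, "->", variable_type([unit[name] for unit in sample]))
```

Here each dictionary is one population unit, each key is a variable, and `random.sample` draws the subset without replacement.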
Measurement scales
Variables can further be classified according to the following four levels of measurement: nominal, ordinal, interval and ratio.

In general, statistics can be classified into two main branches: descriptive statistics and inferential statistics.
Descriptive statistics is concerned with the collection and description of important features in the data.
Inferential statistics deals with making decisions about a population based on a sample from the population.
Statistical inference is the science of using a sample of measurements to make generalizations about the important aspects of a population of measurements.

Uses of Statistics
Financial planners use recent trends in stock market prices to make investment decisions.
Businesses and other organizations often employ statistical analysis of data to help improve their processes.
Production companies use statistical analysis to predict the sales of their products.
Statistical analysis of population growth advises the government on how to manage the economy.
Physicians and hospitals use data on the effectiveness of drugs and surgical procedures to provide patients with the best possible treatment.
Politicians utilize data from public-opinion polls to formulate legislation and to devise campaign strategies.
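The public-opinion poll mentioned above is a convenient way to see how a sample is used to say something about a population. The sketch below is only an illustration under stated assumptions: the population of 10,000 voters, the 55% level of support and the sample size of 500 are all made up, and in a real poll the population proportion would be unknown.

```python
import random

random.seed(7)  # fixed seed so the sketch is repeatable

# Hypothetical population of 10,000 voters: 1 = supports a proposal, 0 = does not.
population = [1 if random.random() < 0.55 else 0 for _ in range(10_000)]

# Draw a simple random sample, as a poll would.
sample = random.sample(population, k=500)

# Describing the data in hand: the proportion of supporters in the sample.
sample_proportion = sum(sample) / len(sample)

# Generalizing to the population: the sample proportion serves as the
# estimate of the population proportion.
population_proportion = sum(population) / len(population)

print(f"sample proportion (estimate): {sample_proportion:.3f}")
print(f"population proportion:        {population_proportion:.3f}")
```

The first calculation only summarizes the 500 responses that were observed. Treating that figure as a statement about all 10,000 voters is the inferential step.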
Opportunities for Statistics
In industry, statisticians design and analyse experiments to improve the safety, reliability and performance of products of all types.
Statisticians work with social scientists to survey attitudes and opinions.
In education, statisticians are involved with the assessment of educational aptitude and achievement and with experiments designed to measure the effectiveness of curricular innovations.
In hospitals, medical schools and government agencies, statisticians study the control, prevention, diagnosis and treatment of diseases, injuries and other health abnormalities. Statisticians also investigate the efficiency of health delivery systems and practices.
In the pharmaceutical industry, statisticians design experiments to measure the efficacy of drugs in the treatment of illness and to assess the likelihood of undesirable side effects.
In the insurance and pension industries, actuaries use statistical models to formulate policies, assess the level of risk in the policies and set premium rates.
Sources of statistical data
The source of statistical data depends on its originality. There are two main categories: primary and secondary sources of data.

Primary source of data
Primary data are obtained when researchers collect the data themselves by designing an experiment or conducting a survey.

Survey
In a survey, the aim of the researcher is to obtain information from individuals or respondents. A survey conducted on the whole population of interest is called a census, and a survey conducted on a sample from the population is called a sample survey. In a survey, questionnaires are used to obtain the desired information from respondents. Questionnaires may be administered by post, telephone, e-mail or in person.

Secondary source of data
These are data that were not originally collected under the researcher's supervision but are obtained from existing sources such as libraries, government agencies and the internet.
Cross-sectional Data
These are data collected at the same or approximately the same point in time.

Time series Data
These are data collected over several time periods.

Note...
Statistics : The art and science of collecting, presenting and interpreting data.
Data : The facts and figures collected, analysed and summarized for presentation and interpretation.
Data Set : All the data collected in a particular study.
Element : The entity on which the data are collected.
Variable : A characteristic of interest for an element.
Observation : The set of measurements obtained for a particular element.
Nominal Scale : The scale of measurement for a variable when the data are labels or names used to identify an attribute of an element. Nominal values may be non-numeric or numeric.
Ordinal Scale : The scale of measurement for a variable if the variable exhibits the properties of nominal data and the order or rank of the data is also meaningful. Ordinal values may be numeric or non-numeric.
Interval Scale : The scale of measurement for a variable if the variable exhibits the properties of ordinal data and the interval between values is expressed in terms of a fixed unit of measure. Interval data are numeric.
Ratio Scale : The scale of measurement for a variable if the variable demonstrates all the properties of interval data and the ratio of two values is meaningful. Ratio data are always numeric.
Data mining : The process of using procedures from statistics and computer science to extract useful information from a large database.
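The four scales of measurement defined above differ in which operations on the data are meaningful. The following Python sketch shows one typical operation per scale. The variables and values (`blood_group`, `pain_level`, `temperature_c`, `weight_kg`) are hypothetical examples chosen for illustration and do not come from any real data set.

```python
import statistics

# Hypothetical observations, one variable per scale of measurement.
blood_group = ["A", "O", "B", "O", "AB", "O"]                # nominal: labels only
pain_level = ["mild", "severe", "moderate", "mild", "mild"]  # ordinal: labels with a rank
temperature_c = [36.5, 37.0, 38.5, 39.0]                     # interval: fixed unit of measure
weight_kg = [50.0, 75.0, 100.0]                              # ratio: ratios are meaningful

# Nominal: categories can only be counted, so the most frequent label is meaningful.
most_common = statistics.mode(blood_group)

# Ordinal: order is meaningful, so the data can be ranked and a middle value found.
rank = {"mild": 1, "moderate": 2, "severe": 3}
ordered = sorted(pain_level, key=rank.get)
median_pain = ordered[len(ordered) // 2]

# Interval: the difference between two values is meaningful.
temp_diff = temperature_c[-1] - temperature_c[0]

# Ratio: the ratio of two values is meaningful (100 kg is twice 50 kg).
weight_ratio = weight_kg[2] / weight_kg[0]

print(most_common, median_pain, temp_diff, weight_ratio)
```

Each scale inherits the operations of the one before it, so ratio data also support differences, ranking and counting. The reverse does not hold: a ratio of two Celsius temperatures or a difference between two pain labels carries no meaning.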