Pesticide Monitoring Program: Design Assessment¹


EFSA Journal 2015;13(2):4005

SCIENTIFIC REPORT OF EFSA

European Food Safety Authority², ³

European Food Safety Authority (EFSA), Parma, Italy

ABSTRACT

The objective of the EU-coordinated Multi Annual Pesticide Control Program (MACP) is to assess MRL exceedance (above 1%) in food items available for consumption in the EU market, as well as consumer exposure. The 30 food items currently included in the survey represent 70% of the consumption of adults and around 74% of the consumption of children. Probability-based sampling methods minimize selection bias, since every element in the population of food items has a non-zero probability of being selected. Using a stratified sampling framework based on food consumption, MRL exceedance could be estimated with a margin of error of 0.0075 (in terms of raw proportion) by selecting 683 sample units for each of 32 different food items. The participating countries can be considered to constitute strata, since sampling is organized within the countries themselves. For each food item, the 683 sample units can be proportionally allocated on the basis of either the resident population or the food item consumption. The two bases produce different allocations, and the latter is the recommended option. A sample size of 683 is also sufficient to ensure that the assessment of consumer exposure is achieved with the same margin of error and confidence. Based on a review of the characteristics of the EU-coordinated multiannual control programme and the use of probability-based sampling methods, 21,856 samples would need to be taken. This could be split over a three-year period. A survey sample design approach requires high-quality food consumption data at the level of individual food items for all reporting countries. Such data should be collected, and the selection of food items and the allocation of samples per Member State should then be revised according to the new consumption figures.

European Food Safety Authority, 2015

KEY WORDS

pesticide monitoring, multi stage sampling design

¹ On request from the European Commission, Question No EFSA-Q-2013-00308, approved on 23 January 2015.

² Correspondence: amu@efsa.europa.eu

³ Acknowledgement: EFSA wishes to thank the members of the Pesticide Monitoring Network and EFSA staff: José Cortiñas Abrahantes, Anna Zuliani, Jane Richardson, Daniela Brocca and Giuseppe Triacchini for the support provided to this scientific output and Dr. Henk van der Schee for peer reviewing the publication.

Suggested citation: European Food Safety Authority, 2015; Pesticide Monitoring Program: Design Assessment. EFSA Journal 2015;13(2):4005, 52 pp. doi:10.2903/j.efsa.2015.4005. Available online: www.efsa.europa.eu/efsajournal

SUMMARY

Regulation (EC) No 396/2005⁴ requires Member States to collect and analyse samples under an EU-coordinated multiannual pesticide control programme and to submit the data annually to EFSA. Currently 27 Member States and two EFTA countries (Iceland and Norway) participate in the exercise. The purpose of the EU-coordinated programme is to provide statistically representative data regarding pesticide residues in food available to European consumers.

Data representativeness refers to a dataset obtained from a survey or study (a sample) which accurately reflects the population under study. Sample survey design entails all the processes and considerations concerned with obtaining inferential statistics for a population of interest by studying a portion of the population instead of the whole population. It is therefore important not to introduce bias when selecting the portion of the population to be surveyed. The use of a well-designed probability-based sampling method minimizes the risk of selection bias, since every element in the population has a non-zero probability of being selected, thereby minimizing subjectivity.

Considering the principles of sample survey design, the EU-coordinated multiannual pesticide control programme is reviewed focusing on: a critical assessment of the commodities included in the monitoring program, an assessment of the ability of the monitoring program to ensure representativeness, and an evaluation of the sample size needed to assess EU Maximum Residue Level (MRL) compliance considering EU consumption patterns. These survey design principles are universal and could be used for other monitoring programmes, for their specific objectives.

The first stage in designing a sample survey is a clear definition of the targeted population and objectives. In the specific case of pesticide monitoring, the elements of the targeted population are food items. Objectives can broadly be divided into two groups: estimation and inferential. Estimation objectives mainly involve the production of quantitative and numerical descriptions (estimation) of relevant aspects of a targeted population. Inferential objectives, on the other hand, are about testing a particular hypothesis about the population of interest. An important difference between the two is that the inferential objective requires specification of the power of the test, in addition to the level of type I error required in the estimation objective. The importance of selecting a sample that will achieve the pre-specified goals cannot be overemphasized. For the purposes of this report the objective is defined as the assessment of MRL exceedance, to detect an exceedance rate of at least 1% (inferential type), in food items available for consumption in the EU market, as well as of exposure (estimation type). Other exceedance targets could be selected; a lower exceedance target would require a larger sample size.

All probabilistic methods assume the existence of a sampling frame from which elements can be selected. This can be in the form of a list of all elements in the population or some equivalent procedure identifying the elements in the population. It would be impractical to list all apples available for consumption in the EU Member States. However, information from food consumption surveys recording the quantity of different food items eaten, combined with population registers (number of people residing in a country), could serve as a suitable proxy. Two food consumption databases were available: the PRIMo consumption dataset, which contains consumption figures for 318 agricultural commodities but is only available for 12 Member States (MS), and the EFSA comprehensive food consumption database, which is available for 19 MS but contains fewer food items (most of which are processed food items).
The review of the EU-coordinated multiannual pesticide control programme indicated that the 30 food items currently included represent 70% of the total food consumption of adults and around 74% of the total food consumption of children. A simulation study demonstrated possible drawbacks of selecting food items by ranking them on consumption level when assessing MRL compliance, if MRL exceedance depends on the consumption level of the food items. The simulation demonstrated that the overall exceedance rate is underestimated when the effect of consumption level on exceedance rate is negative and overestimated when it is positive. Although biased results were observed when the ranking method was used, it should be highlighted that the level of underestimation relative to the true MRL exceedance rate was around 10%. A balance needs to be struck between making sure that consumption is taken into account, on the one hand, and ensuring that the selected items allow estimation of overall exceedance with minimal bias, on the other hand.

The margin of error is the potential variation around the value of interest that will be considered negligible. Selecting a lower margin of error in the design of a survey means that there can be increased confidence that the results will be close to the true value for the targeted population. A reduction in the margin of error requires not only a larger sample size but also a wider range of food items to be sampled within each category. The margin of error needs to be agreed by the survey designers before the survey starts. A margin of error of 0.0075 (in terms of raw proportion) has been used in the report, but other values have also been used to illustrate the effect on sample size.

A three-step survey sample design is proposed based on the characteristics observed in the EU-coordinated programme for the 2010 pesticide monitoring study⁵. First, the number of food items needed to estimate exceedance and the allocation of the "overall" sample size to broader food categories (strata) are computed. For a margin of error of 0.0075 using the consumption figures from the PRIMo dataset, 32 food items would need to be selected, split between 10 food categories. The specific food items to be sampled then need to be randomly selected from within the strata based on the consumption proportion they represent. The design can be enhanced to ensure that the sample is representative of different consumer groups, for example children or the elderly.

⁴ Regulation (EC) No 396/2005 of the European Parliament and of the Council of 23 February 2005 on maximum residue levels of pesticides in or on food and feed of plant and animal origin and amending Council Directive 91/414/EEC. OJ L 070, 16.3.2005, p. 1.
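The first two steps of the proposed design (allocating the number of food items to strata in proportion to consumption, then drawing items within each stratum with probability proportional to consumption) can be sketched as follows. This is a minimal illustration, not the report's implementation: the strata, the consumption figures, the total of six items and the helper names `allocate_proportionally` and `draw_items` are all hypothetical, and the largest-remainder rounding rule is an assumption made here so that the allocation sums to the requested total.

```python
import random

def allocate_proportionally(total, shares):
    """Split `total` units across strata in proportion to `shares`,
    using largest-remainder rounding so the parts sum to `total`."""
    weight_sum = sum(shares.values())
    exact = {k: total * v / weight_sum for k, v in shares.items()}
    alloc = {k: int(x) for k, x in exact.items()}
    leftover = total - sum(alloc.values())
    # Give the remaining units to the strata with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda k: exact[k] - alloc[k], reverse=True)
    for k in by_remainder[:leftover]:
        alloc[k] += 1
    return alloc

def draw_items(items, n, rng):
    """Draw up to n distinct items with probability proportional to consumption
    (sequential weighted sampling without replacement)."""
    pool = dict(items)
    chosen = []
    for _ in range(min(n, len(pool))):
        names = list(pool)
        pick = rng.choices(names, weights=[pool[x] for x in names], k=1)[0]
        chosen.append(pick)
        del pool[pick]
    return chosen

# Hypothetical consumption figures (g/person/day); NOT taken from PRIMo.
strata = {
    "fruit": {"apples": 60.0, "oranges": 40.0, "pears": 15.0, "bananas": 25.0},
    "vegetables": {"potatoes": 120.0, "tomatoes": 45.0, "carrots": 20.0},
    "cereals": {"wheat": 200.0, "rice": 30.0, "maize": 15.0},
}

# Step 1: number of items per stratum, proportional to stratum consumption.
shares = {name: sum(items.values()) for name, items in strata.items()}
n_per_stratum = allocate_proportionally(6, shares)

# Step 2: random selection of items within each stratum.
rng = random.Random(2015)  # fixed seed only to make the sketch repeatable
selection = {name: draw_items(strata[name], n_per_stratum[name], rng)
             for name in strata}
print(n_per_stratum)
print(selection)
```

Because every item has a positive weight, every item keeps a non-zero chance of selection, which is the property that distinguishes this approach from simply taking the top-ranked items.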
Based on adult consumption figures, a survey sampling the following food items would represent 83% of the diet for Europe: Grapefruits, Oranges, Apples, Pears, Table grapes, Wine grapes, Bananas, Potatoes, Carrots, Tomatoes, Aubergines (egg plants), Melons, Broccoli, Cauliflower, Head cabbage, Lettuce, Beans, Other oilseeds, Barley, Buckwheat, Maize, Rice, Wheat, Other cereal, Sugar beet (root), Chicory roots, Other sugar plants, Swine: Meat, Bovine: Meat, Poultry: Meat and Milk and milk products: Cattle. Based on adult and child consumption figures, a survey sampling the following food items would represent 88% of the diet: Oranges, Apples, Pears, Table grapes, Strawberries, Kiwi, Bananas, Potatoes, Onions, Tomatoes, Broccoli, Cultivated fungi, Beans, Linseed, Rice, Rye, Wheat, Sugar beet (root), Sugar cane, Chicory roots, Other sugar plants, Swine: Meat, Bovine: Meat, Sheep: Meat, Sheep: Fat, Poultry: Meat, Milk and milk products: Cattle and Eggs: Chicken.

This approach could be modified using a stratified cluster sampling framework in cases where there is an indication that the pesticide residue profiles would be similar for different food items included within the food categories or food sub-categories. The degree of similarity is captured by the intra-class (food sub-category) correlation (ρ), which ranges from 0 to 1. An intra-class correlation of 1 implies that, in terms of pesticide usage and residues found, the food items are exactly the same and contribute exactly the same information, so retaining both would be unnecessary. The intra-class correlation within food sub-categories was estimated to be 0.02 for the EU-coordinated programme for the 2010 pesticide monitoring study. Since this correlation is low, the three-step sampling framework is sufficient, and it may be efficient to fix a small number of clusters to be selected and to select a large number of items within the same cluster.
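The role of the intra-class correlation can be illustrated with the standard design-effect approximation for equal-sized clusters, deff = 1 + (m − 1)ρ, where m is the number of units sampled per cluster. This formula is a common survey-sampling approximation and is an assumption here; this section does not state how the report itself accounts for clustering. The cluster size of 5 and the function names are hypothetical, while ρ = 0.02 and n = 683 come from the text.

```python
def design_effect(cluster_size, rho):
    """Approximate variance inflation from sampling `cluster_size` units
    per cluster when units within a cluster have correlation `rho`."""
    return 1.0 + (cluster_size - 1) * rho

def effective_sample_size(n, cluster_size, rho):
    """Number of independent units that n clustered units are worth."""
    return n / design_effect(cluster_size, rho)

rho_2010 = 0.02   # intra-class correlation reported for the 2010 study
n = 683           # sample units per food item, from the report
m = 5             # hypothetical number of units taken per cluster

print(design_effect(m, rho_2010))             # close to 1: little information lost
print(effective_sample_size(n, m, rho_2010))
print(effective_sample_size(n, m, 1.0))       # rho = 1: each cluster counts as one unit
```

With ρ near 0 the design effect stays close to 1, which is consistent with the statement that taking many items from few clusters costs little precision. With ρ = 1 the effective sample size collapses to the number of clusters, matching the remark that fully correlated items contribute the same information.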
The participating countries can be considered to constitute strata, since sampling is organized within the countries themselves. The allocation to each country of the total number of sample units (n=683) to be taken for each food item should also be proportional. Currently, proportional allocation is based on population size (number of people residing in a country), but it could be adapted to proportions based on food consumption. In the EU-coordinated programme for the 2010 pesticide monitoring study, the average variance around the estimated proportion of samples exceeding the MRL was 0.01, and this was used for the reported sample allocations. Noticeable differences in the number of samples allocated to a specific country were observed depending on the allocation approach used. Within each country the allocation will need to be further split according to market share. Since the targeted population consists of the food items in the European market, proportional allocation based on food consumption figures should preferably be used. This would resolve the problems of obtaining samples in countries where a specific food item is rarely consumed.

To assess consumer exposure with the same margin of error (level of precision) and confidence, and using the mean and variance of pesticide residue concentrations estimated from the 2010 monitoring report, 210 samples are required. Collecting the 683 samples needed for compliance assessment will therefore achieve a higher level of precision for the exposure assessment.

A survey sample design approach requires high-quality food consumption data at the level of individual food items for all reporting countries. The sampling frame needs to be comprehensive for the elements to be selected in the survey and routinely updated. However, total sample sizes are determined by the objectives of the survey in terms of the exceedance rate to be detected and by decisions on the acceptable margin of error. Based on a review of the characteristics of the EU-coordinated multiannual control programme and the use of probability-based sampling methods, 21,856 samples would need to be taken. This could be split over a three-year period.

⁵ Commission Regulation (EC) No 901/2009 of 28 September 2009 concerning a coordinated multiannual Community control programme for 2010, 2011 and 2012 to ensure compliance with maximum levels of and to assess the consumer exposure to pesticide residues in and on food of plant and animal origin (Text with EEA relevance). OJ L 256, 29.9.2009, p. 14-22.
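The sample sizes quoted above can be checked with the usual normal-approximation formula n = z² · σ² / e², where σ² is the variance around the estimated proportion and e is the margin of error. The variance (0.01), the margin of error (0.0075) and the 32 food items come from the text. The 95% confidence level is an assumption made for this sketch, since this section does not state it explicitly, and `sample_size` is a hypothetical helper name.

```python
import math
from statistics import NormalDist

def sample_size(variance, margin_of_error, confidence=0.95):
    """Smallest n such that z * sqrt(variance / n) <= margin_of_error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided quantile
    return math.ceil(z ** 2 * variance / margin_of_error ** 2)

n_per_item = sample_size(variance=0.01, margin_of_error=0.0075)
total_samples = n_per_item * 32   # 32 food items in the proposed design

print(n_per_item)      # 683
print(total_samples)   # 21856

# A tighter margin of error requires more samples per food item.
print(sample_size(variance=0.01, margin_of_error=0.005))
```

Under these assumptions the calculation reproduces the 683 sample units per food item and the overall total of 21,856. The 210 samples quoted for the exposure objective depend on the mean and variance of residue concentrations, which are not given in this section, so they are not reproduced here.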

TABLE OF CONTENTS

Abstract
Summary
Background as provided by the European Commission
Terms of reference as provided by the European Commission
Context of the scientific output
Pesticide Monitoring Program: Design Assessment
1. Introduction
2. Commodity Assessment
2.1. Assessment of Representativeness
2.1.1. Consumption Habits
2.1.1.1. Possible Bias in Selection by Ranking Food Items
2.1.1.2. Design of Simulation Study
2.1.1.3. Simulation Results
2.1.2. Market Share
2.2. Enhancements to Monitoring Program to Ensure Representativeness
2.2.1. Selection of Food Items by Age Groups
2.3. Assessing Representation of Food Items with Similar Residues
2.3.1. Selection of Food Items by Age Groups
3. Assessment of Sample Size to Check for EU Maximum Residue Levels Compliance
4. Assessment of Sample Size to Assess Consumer Exposure
Conclusions
Recommendations
References
Appendices
Appendix A. Consumption Information for all the Food Items
Appendix B. Number of Food Items Available in the Consumption Table for Each Member State
Appendix C. Mean and variance for the combination and precision achieved when 683 samples are collected

BACKGROUND AS PROVIDED BY THE EUROPEAN COMMISSION

Regulation (EC) 396/2005 requires Member States to participate in and collect samples under an EU-coordinated multiannual control programme and to submit the data annually to EFSA. The purpose of the EU-coordinated programme is to provide statistically representative data regarding pesticide residues in food available to European consumers. Currently 27 Member States and two EFTA countries (Iceland and Norway) participate in the exercise. The programme aims to generate data which can be used to estimate the acute and chronic dietary exposure of consumers to pesticide residues and to assess the application of current legislation. The details of the coordinated multiannual Community control programme for the reference period 2013-2015 have been established in Commission Regulation (EC) No 788/2012.

The 2009 European Union Report on Pesticide Residues in Food made the following recommendation: "To revise the general design of the EU-coordinated multiannual control programme, taking into account the increased number of reporting countries. In particular, a new calculation of the total number of necessary samples to be analysed for each commodity and the allocation to the individual Member States and reporting countries should be performed." It is therefore appropriate that EFSA reviews the design of the EU-coordinated multiannual programme. Besides the coordinated programme, Member States have also established national control programmes, which are not subject to this mandate.

The Commission requests EFSA, in the framework of Art. 31 of Regulation 178/2002, to review the design of the EU-coordinated multiannual pesticide control programme as regards its appropriateness for the dual purpose to a) assess consumer exposure and b) assess the application of current legislation.
TERMS OF REFERENCE AS PROVIDED BY THE EUROPEAN COMMISSION

In particular EFSA should assess

- whether the commodities included in the programme are representative for consumption habits of European consumers, taking into account specific vulnerable subgroups (e.g. children),
- whether the numbers of samples taken per commodity and country are representative
  o for the European market, taking into account the market share of domestic production as well as trade within the EU and imports from third countries,
  o as well as the market share of conventional and organically produced foods,
- whether the numbers of samples taken per commodity and country are sufficient to allow to check compliance with EU maximum residue levels in a statistically significant way.

In case deficiencies or weaknesses are identified in the current design of the programme, recommendations on possible improvements should be made.

CONTEXT OF THE SCIENTIFIC OUTPUT

This scientific report provides an overview of the statistical considerations taken when designing and analysing the data collected in the framework of the Pesticide Monitoring Program (EFSA, 2013). It provides:

- General concepts when designing a monitoring program.
- A critical assessment of the commodities included in the monitoring program as well as a sampling proposal considering several design options.
- An assessment of the ability of the monitoring program to ensure representativeness of different market shares.
- An evaluation of the sample size needed to assess EU maximum residue levels compliance considering the different Member States and EU consumption patterns.

PESTICIDE MONITORING PROGRAM: DESIGN ASSESSMENT

1. Introduction

The quality of data used to produce statistics and inferences is crucial in ensuring dissemination of reliable and accurate information (Working Group, 2003). The European Food Safety Authority (EFSA) is engaged in the collection of data to support risk assessment, and therefore needs to ensure that the quality of information is appropriate for such assessments, which are essential to inform policy making. In general, data representativeness refers to a dataset obtained from a survey or study (a sample) which accurately reflects the population under study. Assessment of data representativeness is only possible after clearly stipulating the targeted population and the purpose for collecting the data (Groves et al., 2004; Ramsey and Hewitt, 2005).

Having a large sample does not imply representativeness; the manner in which the sample was collected plays an important role in ensuring representativeness. For instance, if sample units are selected because they have the characteristic of interest, or because they have similar characteristics, then even a generously large sample will not deliver representative data. In general, introducing bias when collecting data should be avoided, and it can be avoided by employing the principles of sampling design, which aim to minimize the risk of bias (Knottnerus, 2003). The use of a well-designed probability sample minimizes the risk of selection bias. This is the greatest advantage of probability sampling compared to non-probability sampling.
In instances where the sample has already been obtained and modifications of the design are no longer possible, correction approaches can be considered. This would, however, require information regarding the existence and nature of the bias in question.

Sample survey design is briefly described here in order to set out the basic principles on which this assessment is based. Sample survey design entails all the processes and considerations concerned with obtaining descriptive or inferential statistics for a population of interest by studying a portion of the population instead of the whole population (Barnett, 1991; Foreman, 1991; Kalton, 1983). Compared to studying the whole population (a census), a survey has several advantages, of which cost-effectiveness is one: studying the whole population requires more financial and human resources than concentrating on a part of it. Moreover, a sample survey requires less time than a census, hence the required statistics are likely to be timely and relevant. All these advantages apply when the survey is designed in adherence to scientific guidelines, which help to control some of the errors that may arise from studying part of the population instead of the whole population (Stopher and Meyburg, 1979). The guidelines are a collection of interrelated decisions on factors such as the mode of data collection, the method of processing the data and the sample design (Kalton, 1983, p. 6). It is vital that every decision is made with the aim of designing a sample survey that is representative of the population under study.

The first stage in designing a sample survey is a clear definition of the targeted population and objectives. Regulation (EC) No 396/2005 requires Member States to establish national control programmes and to carry out regular official controls on pesticide residues in food commodities, in order to check compliance with the MRLs for pesticide residues and to assess the consumer's exposure.
It is important to identify the elements which compose the targeted population, i.e. the units that make up the population from which information is sought. In the specific case of pesticide monitoring, the elements of the targeted population are food items. In addition to recognizing the elements, a clear definition of the population has to be stated. In the pesticide monitoring study the population can be defined as, for example, all the apples (food commodity) available for consumption in the EU Member States in a specific year, or all the apples on the market in the EU Member States. Note that while the former definition includes apples that are still on the farms in the specific year, the latter does not; hence a careful and specific definition of the targeted population is a crucial starting point in designing a survey.

Logically, the definition of the population should be linked to the objectives of the sample survey. Objectives can broadly be divided into two groups: estimation and inferential. Estimation objectives mainly involve the production of quantitative and numerical descriptions (estimation) of relevant aspects of a targeted population, for example the population mean or the population total, the mean difference between two groups of the same population, or the proportion of the population with a characteristic of interest. Inferential objectives, on the other hand, are about testing a particular hypothesis about the population of interest; examples include testing that the population mean is greater (or less) than a certain value, or that the means of groups within the same population are not equal. An important difference between the two is that the inferential objective requires specification of the power of the test, in addition to the level of type I error required in the estimation objective.

When a survey is conducted with the aim of estimating a parameter of interest in a population, some level of certainty (usually expressed as a confidence or credible interval) is associated with the estimate. An interval gives a range of values in which the true parameter value is believed to lie; if the true value does not lie in this range, a type I error is committed. The probability of committing this error is pre-specified and incorporated in the sample size calculation during survey design so as to keep it under control. Similarly, when a survey's objective is to test an alternative against a null hypothesis, a type I error is committed when the null hypothesis is rejected although it is true.
The power of a test is the probability of correctly rejecting a false null hypothesis. Power affects the sample size needed, which therefore differs between the two objectives (estimation or hypothesis testing).

Once the targeted population and the goals of the survey are clearly defined, the portion of the population that needs to be included in the survey can be addressed. Such issues are collectively referred to as sample design. A choice has to be made between probabilistic and non-probabilistic sampling methods. The main characteristic of non-probabilistic sampling methods is that elements are chosen arbitrarily and it is not possible to associate each element with a probability of being selected. Examples include: (i) convenience sampling, where elements are selected if they can be easily and conveniently accessed; (ii) volunteer sampling, where elements are included upon volunteering; (iii) judgement sampling, where the researcher decides which elements are likely to be representative of the population and hence included in the survey; and (iv) quota sampling, where sampling is done until a specific number of units (quotas) for various sub-populations has been selected. Non-probabilistic methods are in general prone to subjectivity, which may affect the representativeness of the realized sample. Due to the arbitrariness in the selection of elements, it is difficult to quantify the impact that these methods of sample selection have on survey results. Nevertheless, in some instances non-probabilistic methods may be the only option.

With probabilistic methods, on the other hand, every element in the population has a non-zero probability of being selected, which minimizes subjectivity, and several choices exist that ensure representativeness of the sample. The focus of this report is therefore on these methods. All probabilistic methods assume the existence of a sampling frame from which elements can be selected.
This can be in the form of a list of all elements in the population or some equivalent procedure identifying the elements in the population. In pesticide monitoring, it would be impractical to list all apples available for consumption in the EU Member States; instead, a sampling frame can be defined as all places that can hold apples, e.g. supermarkets, farms, open markets, warehouses, etc. Alternatively, information from food consumption surveys recording the quantity of different food items consumed, combined with population registers (number of people residing in a country), could serve as a suitable proxy. Within the sampling frame, sampling units also have to be defined. These are the units that will actually be selected, and they might be the individual elements or groups that contain the population elements. The definition and organization of the sampling frame and sampling units is one of the factors that influence the choice of the sample design.

Other factors that need to be considered in choosing the sample design are the objectives for which the survey is being launched, measurability, practicality and cost. The importance of selecting a sample that will achieve the pre-specified goals cannot be overemphasized. Measurability refers to whether the sample design allows computation of valid estimates, or approximations, of its sampling variability. These are necessary for statistical inference and also allow assessment of the gap between the values inferred from the sample and the true values for the whole population. Practicality of the design is essential to ensure the correct execution of the whole survey. For instance, for a chosen survey design one should be able to give clear and realistic guidelines on how, when and where to collect the sample. The cost of conducting a survey is a major factor in many of the decisions involved in survey design. Factors such as the objectives, the desired precision and/or the power for testing a pre-specified hypothesis can be altered in order to stay within the available budget. Some designs are more costly than others, and usually the more expensive designs reach a higher level of precision than their less costly counterparts; however, a poorly designed expensive survey will never be successful.

In general, choosing a sample design requires input from several interested parties, and trade-offs are inevitable. These trade-offs should be well documented and, if possible, integrated in the reports describing the results of the survey. Note that estimates of the population characteristics and of the sampling variability depend on the sample design. Thus, a survey is essentially identified by its sampling design. The basic principles of sample survey design have been discussed above, but pragmatic and practical considerations are intrinsic to every survey conducted, and these might introduce bias in the inference process.
When the source of bias is unknown during the analysis, no correction-based approaches to the inference process are possible. On the other hand, for surveys conducted regularly, previous surveys provide a good platform to identify possible causes of data non-representativeness or sources of bias. Using data collected by EFSA through the pesticide monitoring program, this work aims at assessing the representativeness of the data EFSA uses in risk assessments. For the purposes of this report, the objective is defined as the assessment of MRL exceedance, designed to detect an exceedance rate of at least 1% in food items available for consumption in the EU market (see footnote 6), as well as the assessment of exposure. The principles are nevertheless universal and could be applied to other monitoring programmes and their specific objectives.

2. Commodity Assessment

To estimate the food items available for consumption in the EU market, information is needed on individual food consumption patterns (expressed per kg body weight), average body weights and age-stratified population figures (number of people) for all Member States (MS). This can be obtained from food consumption databases combined with population registers (number of people residing in a country, as well as potential age structures). The availability of suitable data was assessed. The data on consumption (in g/kg body weight/day) from Denmark (adults and children), Spain (adults and children), Finland (adults), France (all population, infants and toddlers), Ireland (adults), Italy (adults, kids/toddlers), Lithuania (adults), the Netherlands (general population and children), Poland (general population), Portugal (general population), Sweden (general population, 90th percentile), the United Kingdom (adults, vegetarians, infants and toddlers) and Germany (children) were extracted from the chronic consumption figures in the EFSA Pesticide Residue Intake Model (PRIMo; EFSA, 2007).
Within the PRIMo dataset there are twelve food categories and 318 food items (raw commodities), based on the food categories used for MRL legislation. The dataset contains average consumption values for each food item reported in each of the surveys above. Information on the age (groups) and average body weight covered by the above populations, the years of the surveys, and the consumption quantity reported (whether the mean or otherwise) was also obtained from EFSA, 2007 (see footnote 7). Where the age was specified and mean consumption was reported, information on the number of persons (the population) in these age ranges during the corresponding survey years was extracted from the EUROSTAT website (Eurostat, 2013). Where the survey age groups did not match the age groups available from the EUROSTAT by-year, by-five-year and by-broad-age-group information, the closest approximation of the age group was applied in extracting its population (number of people by age group residing in a Member State). For instance, the population for an age group of 18 months-4 years was approximated by the sum of the numbers of 1, 2, 3 and 4 year olds. Where the survey for a Member State covered multiple years, the population was computed as the average over those years. The extracted information is displayed in Table 1.

6 Commission Implementing Regulation (EU) No 788/2012 of 31 August 2012, concerning a coordinated multiannual control programme of the Union for 2013, 2014 and 2015 to ensure compliance with maximum levels of pesticides and to assess the consumer exposure to pesticide residues in and on food of plant and animal origin (OJ L 235, 1.9.2012, p. 8).
7 Reasoned Opinion on the Potential Chronic and Acute Risk to Consumers' Health Arising from Proposed Temporary EU MRLs. According to Regulation (EC) No 396/2005 on Maximum Residue Levels of Pesticides in Food and Feed of Plant and Animal Origin. 15 March 2007.

Table 1. Information for Converting Consumption Figures

Survey Population                   Average Body Weight (kg)   Number of people
Denmark, adults                     74                         3845379
Denmark, adults-animal products     74                         3798131
Spain, adults                       68.5                       26090850
Finland, adults                     77.1                       2821124
Ireland, adults                     75.2                       2225474
Italy, adults                       66.5                       36971900
Lithuania, adults                   70                         2086715
Netherlands, general population     63                         15610650
Poland, general population          62.8                       38263303
United Kingdom, adults              76                         56060921
Germany, children                   16.15                      2385082
Denmark, children                   22                         210472
Spain, children                     34.5                       411693
France, infants                     8.8                        750302
France, toddlers                    10.6                       739212
Netherlands, children               17.1                       1180617
United Kingdom, toddlers            14.6                       3099830
United Kingdom, infants             8.7                        744751

The information was then used to transform the consumption figures from g/kg body weight/day to g/day. The consumption for each food item $i$ was calculated as

$$ C_i = \sum_{j} c_{ij} \times bw_j \times N_j $$

where $j$ refers to the group surveyed, $c_{ij}$ is the reported consumption of food item $i$ in group $j$ (g/kg body weight/day), $bw_j$ is the average body weight of the group (kg) and $N_j$ is the number of people in the group.
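The conversion from g/kg body weight/day to g/day can be sketched in a few lines. This is a minimal illustration, assuming that the total for a food item is the sum over surveyed groups of consumption times average body weight times number of people. The body weights and head counts are taken from Table 1, while the per-group consumption values and the function name `total_consumption_g_per_day` are hypothetical.

```python
def total_consumption_g_per_day(groups):
    """Sum consumption * body weight * head count over the surveyed groups.

    Each entry is (consumption in g/kg bw/day, average body weight in kg,
    number of people). Assumes the groups do not overlap.
    """
    return sum(c * bw * n for c, bw, n in groups)


# Body weights and head counts come from Table 1; the consumption
# values (first element) are invented for illustration only.
apple_groups = [
    (1.2, 74.0, 3845379),  # Denmark, adults
    (2.5, 22.0, 210472),   # Denmark, children
]

apples_g_per_day = total_consumption_g_per_day(apple_groups)
print(f"Apples: {apples_g_per_day:.1f} g/day")
```

Summing over groups in this way is what makes the totals depend on population size: a large Member State with moderate per-capita consumption can dominate a small one with high per-capita consumption.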
In Table 2, the consumption for each food category (rounded to whole numbers) is presented, as well as the consumption of each food category as a proportion of the total consumption across all food categories. Vegetables, fruits, cereals and animal products (terrestrial) made up the top 4 consumed categories. Consumption data for two categories (fish, fish products, shell fish, molluscs and other marine and freshwater food products; and crops exclusively used for animal feed) were not available in the PRIMo dataset and were therefore excluded from the table.

Table 2. Total Consumption per Food Category

Food Category                                      Consumption (g/day)   Proportion
Fruit fresh or frozen; nuts *                      44969272203           0.24320
Vegetables fresh or frozen *                       56979078978           0.30820
Pulses, dry                                        2036197074            0.01100
Oilseeds and oil fruits                            1459448273            0.00790
Cereals *                                          33178882325           0.17950
Tea, coffee, herbal infusions and cocoa            1208520697            0.00650
Hops (dried), including hop pellets and...         28027760              0.00020
Spices                                             5071612               0.00002
Sugar plants                                       18131655533           0.09810
Products of animal origin-terrestrial animals *    26876109467           0.14540

Food categories marked with * (bold in the original) were included in the Multi Annual Control Program (MACP).

2.1. Assessment of Representativeness

2.1.1. Consumption Habits

To illustrate the selection of food items by simple ranking, the total consumption of each of the 318 food items available in the PRIMo consumption dataset was computed, and the items were listed from highest to lowest consumption. Table 3 provides the first 12 items, which represent around 71% of the total consumption: wheat, potatoes, sugar beet (root), milk and milk products: cattle, apples, tomatoes, oranges, wine grapes, bananas, rice, carrots and lettuce.

Table 3. Ranking of Food Items by Consumption

Food Item                        Consumption (g/day)   Rank
Wheat                            25432472681.5         1
Potatoes                         22070513454.9         2
Sugar beet (root)                18131655532.6         3
Milk and milk products: Cattle   17814021768.9         4
Apples                           12611756490.1         5
Tomatoes                         9131159024.4          6
Oranges                          7919847277.1          7
Wine grapes                      6652726250.2          8
Bananas                          3902172819.2          9
Rice                             2754406590.8          10
Carrots                          2695099215.4          11
Lettuce                          2586158524.1          12

Table 4 presents some of the commodities that were of specific interest in terms of inclusion in, and/or exclusion from, the MACP. Rye ranked much higher than oats, while liver ranked 84 at best, with bovine topping the liver group. Oranges ranked much higher than mandarins, and cauliflower ranked better than broccoli.
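Ranking by consumption is simple to reproduce. The sketch below ranks items from highest to lowest consumption and gives tied items the average of the ranks they span. The tie rule is an assumption, suggested by the fractional rank reported for a zero-consumption item in Table 4. The first three values are from Table 3, the two zero-consumption items and the function name `rank_by_consumption` are hypothetical.

```python
def rank_by_consumption(consumption):
    """Return {item: rank}; rank 1 is the most consumed item.

    Items with identical consumption share the average of the ranks
    they would otherwise occupy (an assumed tie rule).
    """
    ordered = sorted(consumption.items(), key=lambda kv: kv[1], reverse=True)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j to the last item tied with position i.
        while j + 1 < len(ordered) and ordered[j + 1][1] == ordered[i][1]:
            j += 1
        average_rank = ((i + 1) + (j + 1)) / 2
        for k in range(i, j + 1):
            ranks[ordered[k][0]] = average_rank
        i = j + 1
    return ranks


consumption = {
    "Wheat": 25432472681.5,             # Table 3
    "Potatoes": 22070513454.9,          # Table 3
    "Sugar beet (root)": 18131655532.6, # Table 3
    "Hypothetical item A": 0.0,         # illustrative zero-consumption item
    "Hypothetical item B": 0.0,         # illustrative zero-consumption item
}
ranks = rank_by_consumption(consumption)
```

Here the two zero-consumption items would occupy ranks 4 and 5, so both receive rank 4.5.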

Table 4. Ranking of Specific Food Items of Interest

Food Item         Consumption (g/day)   Rank
Oranges           7919847277.12         7
Mandarins         1375239732.74         23
Cauliflower       1017686196.97         29
Rye               675849999.49          38
Broccoli          540386474.31          44
Oats              214942108.46          67
Bovine: Liver     114781712.21          84
Swine: Liver      61200006.18           106
Sheep: Liver      37833058.00           114
Poultry: Liver    1054700.67            185
Goat: Liver       0.00                  264.5

The 30 food items selected in the 2010-2012 MACP cycle are shown in Table 5, together with their consumption figures obtained from the food consumption data described in Section 2. Note that milk and eggs were considered single items in the MACP, but different types of each food item were available in the consumption data. Therefore, all the different types of each food item are listed, as provided in the consumption data. Consumption information for butter and orange juice was not available in the database. Also note that although rye and oats, and oranges and mandarins, are listed as individual items, each pair was considered a single item in the MACP.

Table 5 provides the consumption for each food item and its consumption as a proportion of the consumption of all 318 food items available. Together the selected items represent about 70% of the total consumption. The 30 food items included in the MACP also represent 70% of the consumption of adults and around 74% of the consumption of children. The items are grouped into the food categories introduced earlier and ranked within each food category. Lower ranks correspond to higher consumption, and vice versa. Interesting trends are observed. In category 1, out of 72 items in total, the selected items came from the top 10. In category 2, out of 95 items in total, 7 out of the 13 selected items came from the top 10. In category 5, if rye and oats are combined (as in the MACP), then, out of 10 items in total, the selected items came from the top 5.
Finally, in category 10, out of a total of 56 items, if all milk products are combined, and egg types likewise, then all the selected items came from the top 5. It appears that preference was placed on selecting food items that are highly consumed, and only 4 food categories are represented in the MACP, which might affect the assessment of compliance with the MRLs set. For completeness, Appendix 0 provides a table with the consumption values and proportions for all 318 food items, together with their ranks. The consumption values are provided both for the entire population and by age group.

Table 5. Food items selected for the 2010-2012 cycle of MACP

Food Category*             Food Items                        Consumption (g/day)   Proportion   Rank in the category
Fruits                     Apples                            12611756490.06        0.06822      1
                           Oranges                           7919847277.12         0.04284      2
                           Bananas                           3902172819.23         0.02111      4
                           Pears                             2512826703.15         0.01359      5
                           Peaches                           1764763514.62         0.00955      6
                           Table grapes                      1757302800.53         0.00951      7
                           Mandarins                         1375239732.74         0.00744      8
                           Strawberries                      676997974.03          0.00366      10
Vegetables                 Potatoes                          22070513454.87        0.11938      1
                           Tomatoes                          9131159024.36         0.04939      2
                           Carrots                           2695099215.38         0.01458      3
                           Lettuce                           2586158524.14         0.01399      4
                           Head cabbage                      1640906984.42         0.00888      6
                           Peas (without pods)               1132846356.11         0.00613      8
                           Cauliflower                       1017686196.97         0.00550      10
                           Peppers                           968733621.70          0.00524      11
                           Cucumbers                         909391386.19          0.00492      13
                           Spinach                           642476576.05          0.00348      16
                           Leek                              503563069.38          0.00272      19
                           Aubergines (egg plants)           459569790.56          0.00249      21
                           Beans (without pods)              252307369.41          0.00136      24
Cereals                    Wheat                             25432472681.47        0.13757      1
                           Rice                              2754406590.84         0.01490      2
                           Rye                               675849999.49          0.00366      5
                           Oats                              214942108.46          0.00116      7
Animal origin-terrestrial  Milk and milk products: Cattle    17814021768.92        0.09636      1
                           Swine: Meat                       2463243022.81         0.01332      2
                           Poultry: Meat                     1637870931.37         0.00886      4
                           Eggs: Chicken                     1286262720.75         0.00696      5
                           Bovine: Liver                     114781712.21          0.00062      9
                           Swine: Liver                      61200006.18           0.00033      12
                           Sheep: Liver                      37833058.00           0.00020      13
                           Milk and milk products: Goat      18467693.05           0.00010      19
                           Milk and milk products: Sheep     4344501.75            0.00002      24
                           Eggs: Goose                       1679126.65            0.00001      25
                           Poultry: Liver                    1054700.67            0.00001      28
Total                                                        129049749503.64       0.69805

*Category long names shortened.

2.1.1.1. Possible Bias in Selection by Ranking Food Items

If the variable of interest (exceedance of MRLs) depends on the consumption levels of food items, there is a high likelihood of overestimating or underestimating the parameters of interest.
For illustration, consider the additional samples collected by Member States in the framework of their national programmes together with the samples submitted for the EU-coordinated programme for the 2010 pesticide monitoring study. Combining samples from these programmes yields exceedance estimates for 536 food items. Exceedance rates (percentage of samples with residues above the MRL) for food items analysed within the national programmes should not be generalized to the EU, as the objectives of the surveys differ between Member States and the surveys are designed to be representative of the targeted population at the national level. Nevertheless, to illustrate a possible relationship between exceedance and consumption level, it is assumed that the exceedance rates for the different food items are comparable.

First, the mean exceedance rate of each food item was computed by averaging the exceedance rates over all the countries that analysed the particular food item. Second, the consumption levels for food items available in the combined data were extracted from the chronic consumption data introduced earlier. Not all food items in the combined data were available in the chronic consumption database. Appendix 0 provides the number of food items that are available in the consumption database for each Member State. It can be seen that for three Member States the food items included in the PRIMo dataset covered less than 69% of the food items surveyed in their national programmes. In total, 178 items were available in both the combined EU-national programme data and the chronic consumption database. Finally, the 178 food items were ranked based on their consumption levels. Figure 1 plots the consumption-based ranks against exceedance rates. All the food items with non-compliance rates above zero were ranked below 50 (the lowest rank implies the highest consumption).

[Figure 1: y-axis, Percentage above MRL; x-axis, Consumption Based Rank]

Figure 1. Bubble plot of exceedance rates against consumption based ranks for food items analysed in both the EU program and national programs, where the size of the bubble is proportional to the number of samples used to calculate the exceedance rates.

These food items had high consumption levels and thus may have represented a reasonable percentage of consumed food items. On the other hand, their representativeness in relation to estimation of the overall exceedance rate might be questionable. For example, if only the top 50 items were to be selected, the overall exceedance rate could be overestimated.
To illustrate this phenomenon more generally, some simulations were designed.

2.1.1.2. Design of Simulation Study

The simulation study was conducted to demonstrate possible drawbacks of selecting food items according to ranks based on their consumption levels when assessing MRL compliance levels. Two scenarios were considered: (i) food items with high consumption levels had higher exceedance rates than food items with low consumption levels, and (ii) food items with high consumption levels had lower exceedance rates than food items with low consumption levels. The first scenario may arise when, for instance, the highly consumed food items are mostly imported from third world countries where the use of pesticides might not be controlled. A good example of the second scenario is when food items with high residues are specific to the diet of minority ethnic groups.

For the simulation dataset a population of 178 food items was used. Of the 178 food items, 33 were assigned zero consumption and the remaining 145 food items were assigned random consumption values generated once from a lognormal distribution. The scale and shape parameters of this distribution were chosen to mimic the distribution of consumption observed for the 178 food items in the database combining samples from the EU-coordinated programme and the national programmes. The distribution of consumption is heavily right-skewed, with most values concentrated near zero and a long tail of large values, as shown in Figure 2.

[Figure 2: Histogram for Consumption; x-axis, Consumption; y-axis, Frequency]

Figure 2. Distribution of Consumption.

The exceedance rate ($\pi_i$) for the $i$-th food item was then obtained as

$$ \pi_i = \frac{\exp(\beta_0 + \beta_1 c_i)}{1 + \exp(\beta_0 + \beta_1 c_i)} $$

where $\beta_0$ is the parameter associated with the exceedance rate ($\pi = \exp(\beta_0) / (1 + \exp(\beta_0))$) when there is no effect of consumption, $c_i$ is the consumption level of food item $i$ and $\beta_1$ is the effect of consumption level on the exceedance rate. Note that $\beta_1 \neq 0$ implies that the magnitude of the deviation of the exceedance rate for the $i$-th food item from $\pi$ depends on the consumption level of the food item. For a positive $\beta_1$, food items with high consumption levels will have higher exceedance rates than those with low consumption levels (positive effect). A negative $\beta_1$ implies lower exceedance rates for food items with high consumption levels than for those with low consumption levels (negative effect). For practical reasons, the large consumption values were re-scaled to a range between 0 and 1.
This was achieved by simply using the proportion of consumption for each food item relative to the total consumption, i.e. $c_i = C_i / \sum_{k} C_k$, where $C_i$ denotes the raw consumption of food item $i$. Mean overall exceedance rates ($\pi = \exp(\beta_0) / (1 + \exp(\beta_0))$) were set to 0.002 (according to the results obtained from the 2010 pesticide monitoring data) and 0.5 (for illustration purposes), corresponding to intercepts $\beta_0 = (-6.2, 0)$. The former was the overall exceedance for the food items in the combined database (EU and national programmes), and the latter was included to assess the impact at high exceedance rates. The values considered for the effect of consumption on the exceedance rate were $\beta_1 = (-10, 10)$, representing the negative and positive effect, respectively. Figure 1 illustrates a positive effect, while a negative effect could occur when off-label use in a minor crop results in a high exceedance rate. After re-scaling the consumption figures to the (0,1) range, many values of $c_i$ were close to zero; hence the large value of $\beta_1$ was necessary to ensure a detectable effect of consumption, which will depend on the level of MRL non-compliance (see the fourth column, $\pi$, in Table 6).

After generating the data, two methods were used to select the food items used for estimating the overall exceedance rate. In the first method (ranking), the items were ranked according to their consumption levels, with the most consumed food item having the lowest rank. The top 30 food items (the number used in the MACP) were then selected. The overall exceedance rate was computed over all 178 food items, and it was estimated from the top 30 food items only.

In the second method (ad hoc stratification), the food items were randomly assigned to five categories, with the restriction that each category should have at least one food item with non-zero consumption. The categories can be considered as strata, hence the need for some variability within each category. The weights for the categories were then computed as the proportion of consumption in each category relative to the total consumption. Finally, 30 food items were selected through a stratified simple random sampling scheme with categories as strata, and allocation was done according to the previously computed weights. The true exceedance was computed as in the first method, and it was estimated from the 30 selected food items as

$$ \hat{\pi} = \sum_{h=1}^{5} W_h \bar{\pi}_h $$

where $W_h$ and $\bar{\pi}_h$ are the weight and the average exceedance rate for category $h$.

Simulation Results

Results from the simulations are provided in Table 6.
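The two selection methods can be compared with a short simulation. The sketch below is not the authors' code and rests on several stated assumptions: the lognormal parameters `mu` and `sigma` are placeholders (the values used in the report are not given here), the true overall rate is taken as the simple mean of the item-level rates, the logistic form links consumption to exceedance, and the allocation helper is one simple way of making a roughly proportional allocation sum to 30. All function names are hypothetical.

```python
import math
import random


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def allocate(weights, sizes, n):
    """Roughly proportional allocation of n items, at least one per stratum."""
    alloc = [max(1, min(size, int(n * w))) for w, size in zip(weights, sizes)]
    while sum(alloc) < n:  # give leftovers to the most under-served stratum
        h = max((h for h in range(len(alloc)) if alloc[h] < sizes[h]),
                key=lambda h: n * weights[h] - alloc[h])
        alloc[h] += 1
    while sum(alloc) > n:  # take back from the most over-served stratum
        h = max((h for h in range(len(alloc)) if alloc[h] > 1),
                key=lambda h: alloc[h] - n * weights[h])
        alloc[h] -= 1
    return alloc


def simulate(beta0, beta1, mu=0.0, sigma=2.0, seed=1,
             n_items=178, n_zero=33, n_select=30, n_strata=5):
    rng = random.Random(seed)
    # mu and sigma are placeholders, not the values used in the report.
    raw = [0.0] * n_zero + [rng.lognormvariate(mu, sigma)
                            for _ in range(n_items - n_zero)]
    total = sum(raw)
    c = [x / total for x in raw]                  # consumption proportions
    pi = [expit(beta0 + beta1 * ci) for ci in c]  # item-level exceedance rates
    true_rate = sum(pi) / n_items                 # assumption: simple mean

    # Method 1 (ranking): keep the n_select most consumed items.
    top = sorted(range(n_items), key=lambda i: c[i], reverse=True)[:n_select]
    est_ranking = sum(pi[i] for i in top) / n_select

    # Method 2 (ad hoc stratification): random strata weighted by consumption.
    # With the defaults every stratum holds more than n_zero items, so each
    # one contains at least one item with non-zero consumption.
    idx = list(range(n_items))
    rng.shuffle(idx)
    strata = [idx[h::n_strata] for h in range(n_strata)]
    weights = [sum(c[i] for i in s) for s in strata]
    alloc = allocate(weights, [len(s) for s in strata], n_select)
    est_strat = 0.0
    for s, w, k in zip(strata, weights, alloc):
        sample = rng.sample(s, k)                 # without replacement
        est_strat += w * sum(pi[i] for i in sample) / k
    return true_rate, est_ranking, est_strat


def relative_bias(estimate, truth):
    return abs(estimate - truth) / truth
```

Calling `simulate(-6.2, 10.0)` returns the true rate, the ranking estimate and the stratified estimate for one simulated population. Because the top 30 items are by construction those with the largest consumption, the ranking estimate is above the true rate whenever the consumption effect is positive and below it whenever the effect is negative.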
As expected, the overall exceedance rate is underestimated when $\beta_1$ is negative and overestimated when it is positive. Importantly, the drawback of ranking as a method for selecting the food items used in estimating overall exceedance is clear from the relative bias. For the setting $\beta_1 = -10$ (with $\pi = 0.002$), the relative bias was as high as 10% for the ranking method compared to 0% for the ad hoc stratification method. While both methods exhibit large relative bias for the setting $\beta_1 = 10$, the bias from the ranking method is much higher than the bias from ad hoc stratification (133% versus 40%). The relative bias seems to be more pronounced when $\pi$ is small than when it is large.

Table 6. Relative Bias on Estimation of Overall Exceedance Rate for the ranking and ad hoc stratification methods.

Method                  pi      beta_0   beta_1   Overall rate (true)   Overall rate (estimated)   Relative bias
Ranking                 0.002   -6.2     -10      0.0019                0.0017                     0.1053
                                         10       0.0027                0.0063                     1.3333
                        0.5     0        -10      0.4893                0.4447                     0.0912
                                         10       0.5107                0.5553                     0.0873
Ad hoc Stratification   0.002   -6.2     -10      0.0019                0.0019                     0.0000
                                         10       0.0027                0.0038                     0.4074
                        0.5     0        -10      0.4893                0.4849                     0.0090
                                         10       0.5107                0.5196                     0.0174

Note that the ad hoc stratification method does not fully adhere to the principles of the sampling framework proposed below. For example, a sample size of 30 was fixed beforehand (to make the results comparable to the ranking method) and was not chosen to attain a certain level of precision. In practice, the true exceedance rate is unknown; hence it becomes problematic to assess the precision of the estimates if the selection was not done within a well-defined sampling framework. Although biased results were observed when the ranking method was used, it should be highlighted that the level of underestimation relative to the true MRL exceedance level was around 10%. A more pronounced overestimation could be observed (up to 133%), but under precautionary principles this errs on the conservative side, since the reported levels of compliance would then be lower than the true ones.

In Section 2.2, a probability sampling framework is proposed to select the required number of food items while taking into account the consumption levels, and at the same time ensuring a certain minimum level of precision. As illustrated above, there are possible shortfalls in focusing entirely on consumption when selecting food items for inclusion. A balance needs to be struck between making sure that consumption is taken into account, on the one hand, and ensuring that the selected items allow estimation of overall exceedance with minimal bias, on the other.

2.1.2. Market Share

To ensure that different market shares are represented, the number of samples to be taken should be distributed proportionally to the market share of domestic production, trade within the EU and imports from third countries, each subdivided by the market share of conventionally and organically produced foods. These market share requirements result in six potential groups (Figure 3). The number of samples to be taken for each food item in the different market share groups should be allocated proportionally according to the EU market.
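Proportional allocation of samples across the six market share groups can be sketched as follows. The market volumes are invented for illustration, the function name `allocate_by_market_share` is hypothetical, and the largest-remainder rounding is only one reasonable way to obtain whole numbers of samples that add up to the planned total.

```python
def allocate_by_market_share(n_samples, market_volumes):
    """Split n_samples across groups in proportion to their market volume.

    Uses largest-remainder rounding so the allocation sums to n_samples.
    """
    total = sum(market_volumes.values())
    exact = {g: n_samples * v / total for g, v in market_volumes.items()}
    alloc = {g: int(x) for g, x in exact.items()}  # round down first
    leftover = n_samples - sum(alloc.values())
    # Give the remaining samples to the groups with the largest fractions.
    by_fraction = sorted(exact, key=lambda g: exact[g] - alloc[g], reverse=True)
    for g in by_fraction[:leftover]:
        alloc[g] += 1
    return alloc


# Hypothetical market volumes (arbitrary units) for one food item.
volumes = {
    ("organic", "domestic production"): 4,
    ("organic", "trade within EU"): 3,
    ("organic", "imports from third countries"): 1,
    ("conventional", "domestic production"): 50,
    ("conventional", "trade within EU"): 30,
    ("conventional", "imports from third countries"): 12,
}
allocation = allocate_by_market_share(200, volumes)
```

With these illustrative volumes, small groups such as organic imports receive very few samples, which shows why strictly proportional allocation may leave minor market segments thinly covered.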
Reporting Country Market Share (N)
  Organically Produced Food
    Domestic Production (Proportion N1/N)
    Trades Within EU (Proportion N2/N)
    Imports from Third Countries (Proportion N3/N)
  Conventionally Produced Food
    Domestic Production (Proportion N4/N)
    Trades Within EU (Proportion N5/N)
    Imports from Third Countries (Proportion N6/N)

Figure 3. Market share strategy to ensure representation for Pesticide Monitoring.

2.2. Enhancements to Monitoring Program to Ensure Representativeness

If all food items consumed in Europe are considered to be the sampling frame, then food categories/groups can be considered as strata, with food items being the population elements. This structure is represented in Figure 4. The next step is to calculate the number of food items required in order to estimate exceedance with a given margin of error. This is a sample size calculation problem, and it is solved in conjunction with determining the allocation of the total number of items to the various food categories. Different allocation schemes for stratified sampling designs exist that could be used for this purpose, but here the focus is on proportional allocation. The final stage is the actual selection of the food items. This is achieved by obtaining a stratified random sample of the food items, of the size given by equation (1) below, without replacement (meaning that once a food item has been selected, it is no longer available for sampling). The entire process is represented in Figure 5.

Population of all Food Items of Interest (The Sampling Frame)
  Fruit fresh or frozen; nuts (Stratum 1)
    Food Item: Oranges (Sampling Unit)
    ...
    Food Item: Apples (Sampling Unit)
  ...
  Products of animal origin-terrestrial animals (Stratum 10)
    Food Item: Poultry: Meat (Sampling Unit)
    ...
    Food Item: Eggs: Chicken (Sampling Unit)

Figure 4. Stratified Sampling Framework for Food Item Selection.

1. Calculate the number of food items needed to estimate non-compliance with a given margin of error. This is a sample size calculation problem.
2. In conjunction with (1), determine the allocation of the "overall" sample size to the various categories (strata). This is sample allocation in a stratified design.
3. From the collection of all food items of interest (earlier referred to as the sampling frame), use a stratified sampling design to randomly select the required number of food items, using the consumption proportion each item represents as its selection probability.

Figure 5. Procedure for selecting the food items.

The inclusion and exclusion of food items using the framework introduced above is discussed next. The 12 food groups mentioned in Section 2 were taken as the strata. The proportions of the consumption of each food category, provided in Table 2, were adopted as the stratum weights.
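Step 3 of the procedure, the stratified selection itself, can be sketched as follows. This is an illustrative reading of the procedure rather than the authors' implementation: items are drawn one at a time within each stratum, without replacement, with probability proportional to their consumption share. The proportions are taken from Table 5, while the allocation and the function name `select_items` are hypothetical.

```python
import random


def select_items(strata, allocation, seed=0):
    """Draw items within each stratum, without replacement,
    with probability proportional to consumption share."""
    rng = random.Random(seed)
    selected = {}
    for name, items in strata.items():
        pool = dict(items)
        # If more items are requested than exist, take the whole stratum.
        k = min(allocation.get(name, 0), len(pool))
        chosen = []
        for _ in range(k):
            names = list(pool)
            weights = [pool[item] for item in names]
            if sum(weights) <= 0:
                pick = rng.choice(names)  # only zero-consumption items left
            else:
                pick = rng.choices(names, weights=weights, k=1)[0]
            chosen.append(pick)
            del pool[pick]                # no replacement
        selected[name] = chosen
    return selected


# Consumption proportions from Table 5 (a subset of each category).
strata = {
    "Cereals": {"Wheat": 0.13757, "Rice": 0.01490,
                "Rye": 0.00366, "Oats": 0.00116},
    "Fruits": {"Apples": 0.06822, "Oranges": 0.04284,
               "Bananas": 0.02111, "Pears": 0.01359},
}
allocation = {"Cereals": 3, "Fruits": 2}  # hypothetical allocation
sample = select_items(strata, allocation)
```

Changing `seed` gives another valid selection. Highly consumed items such as wheat are very likely, though not certain, to be included, which is how this design takes consumption into account while keeping every item's selection probability known.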

To illustrate the calculation of the required number of food items, data from the 2010 monitoring programme were used to construct some realistic settings. In the 2010 programme, in which 12 items were included, there was an overall exceedance of 0.0161 (based on a simple average of the exceedance rates of the specific food items). Based on this, three different margins of error were set: 0.01, 0.0075 and 0.005 (1%, 0.75% and 0.5%). These are absolute percentage points that are considered not relevant from a risk manager's viewpoint; they are not relative to the 1% MRL exceedance rate to be detected. The margin of error should be read as the potential variation around the value of interest (a survey designed to detect MRL non-compliance at 1%) that will be considered negligible. Selecting a lower margin of error in the design of a survey means that there can be increased confidence that the results will be close to the true value for the targeted population.

An additional input required is the expected variance when estimating the exceedance rate within the strata. The 12 food items included in the 2010 monitoring programme were from 4 categories: cereals (oats and rye); fruit fresh or frozen, nuts (apples, peaches, pears, strawberries); vegetables fresh or frozen (head cabbage, leek, lettuce, tomato); and products of animal origin, terrestrial animals (swine meat, milk and milk products). The variances ($S_h^2$) within each of these categories were computed, resulting in the following values: 0.0013, 0.00004, 0.0001 and 0, respectively. The non-zero values were used as variance inputs for the corresponding 3 categories, while for the other categories the average of these values was used. For the stratum allocation scheme, proportional allocation based on the stratum weights in Table 2 was used. Table 7 shows the number of items that would be required to estimate exceedance with the margins of error mentioned above, as well as the respective allocation to the various strata.
The two categories omitted in Table 2 are excluded here as well; they have an allocation of 0. As the specified margin of error decreases, the required number of food items increases. Note that, when allocating the number of samples per food category, the calculated sample size is provided together with the actual sum (presented in brackets in Table 7), because the number is rounded up whenever a fraction (above 0.1) of a sample would need to be taken from a food category. Overall sample size and allocation are determined simultaneously and, due to rounding up of the allocation figures, the total of the allocation figures may differ from the calculated sample size. A reduction in the margin of error requires not only a larger sample size but also a wider range of food items within each category to be sampled. It should be highlighted that whenever the number of items to be sampled within a food category is higher than the actual number of food items available, all items in the food category will be included in the selection.

Table 7. Total Number of Food Items and Allocation to Each Food Category

                                                                Margin of Error
Food Category                                                   0.01      0.0075    0.005
Fruit fresh or frozen; nuts                                     4         7         16
Vegetables fresh or frozen                                      5         9         20
Pulses, dry                                                     1         1         1
Oilseeds and oil fruits                                         1         1         1
Cereals                                                         3         5         12
Tea, coffee, herbal infusions and cocoa                         1         1         1
Hops (dried), including hop pellets and unconcentrated powder   0         0         0
Spices                                                          0         0         0
Sugar plants                                                    2         3         7
Products of animal origin-terrestrial animals                   3         4         10
Total Food Items                                                16 (20)   28 (31)   63 (68)

Figures in brackets represent the actual overall computed number after rounding up.
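The calculation behind Table 7 can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' code: it assumes that Equation (1) takes the usual proportional-allocation form n = z² · Σ W_h S_h² / d² without a finite population correction, that z = 1.96, and that "rounded up" means a category receives one extra item whenever the fractional part of its share exceeds 0.1. The variance values are those quoted in the text; the stratum weights are hypothetical placeholders because Table 2 is not reproduced here, so the output is not expected to match Table 7. All function names are illustrative.

```python
import math

Z_95 = 1.96  # standard normal quantile for a two-sided 5% significance level


def variance_inputs(observed, categories):
    """Use non-zero observed variances; impute their average elsewhere."""
    nonzero = [v for v in observed.values() if v > 0]
    fallback = sum(nonzero) / len(nonzero)
    return {c: observed[c] if observed.get(c, 0) > 0 else fallback
            for c in categories}


def required_items(weights, variances, margin, z=Z_95):
    """Overall number of food items under proportional allocation (assumed form)."""
    weighted_variance = sum(weights[c] * variances[c] for c in weights)
    return math.ceil(z ** 2 * weighted_variance / margin ** 2)


def allocate(n_total, weights, available=None, threshold=0.1):
    """Proportional allocation; fractions above `threshold` are rounded up."""
    allocation = {}
    for category, weight in weights.items():
        exact = n_total * weight
        whole = math.floor(exact)
        n_cat = whole + 1 if exact - whole > threshold else whole
        if available is not None:
            # a category cannot contribute more items than it contains
            n_cat = min(n_cat, available[category])
        allocation[category] = n_cat
    return allocation


# Variances from the 2010 programme as quoted in the text.
observed = {"Cereals": 0.0013, "Fruit": 0.00004, "Vegetables": 0.0001, "Animal": 0.0}
# HYPOTHETICAL stratum weights (the real ones are in Table 2).
weights = {"Fruit": 0.20, "Vegetables": 0.30, "Cereals": 0.25, "Animal": 0.25}

variances = variance_inputs(observed, weights)
n_items = required_items(weights, variances, margin=0.01)
allocation = allocate(n_items, weights)
```

With these placeholder weights the allocated total exceeds the calculated sample size, which is the same effect that produces the bracketed figures in Table 7.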

The final stage is the actual selection of the food items. This is achieved using the following procedure. Under proportional allocation, the proportions of the samples (food items) in each stratum are set equal to the proportions of the consumption in the stratum (food categories, see Table 2). The formula to calculate the sample size is then

$$ n = \frac{z_{\alpha/2}^{2} \sum_{h} W_h S_h^{2}}{d^{2}} \qquad (1) $$

where $z_{\alpha/2}$ is the quantile of the standard normal distribution for a significance level of 5% (95% confidence), $d$ represents the margin of error as previously explained, $W_h$ is the proportion of consumption for food category $h$, and $S_h^{2}$ is the variance for food category $h$, as previously explained. A stratified random sample, without replacement, is then obtained.

As an example, one set of the 20 items required for the 0.01 margin of error is selected. The selected items are provided in Table 8. This is just one of the possible sets of size 20 that could be generated, satisfying the allocation condition above and with selection probabilities that reflect food item consumption. This sample of 20 items already represents around 67% of the total consumption in Europe according to the available consumption data (the proportions in Table 8 sum to 0.6673).

Table 8. An Example Set of 20 Food Items to Provide a 0.01 Margin of Error.

Food Category                                   Food Item                        Proportion   Rank
Fruit fresh or frozen; nuts                     Apples                           0.0682       1
                                                Oranges                          0.0428       2
                                                Pears                            0.0136       5
                                                Grapefruits                      0.0034       11
Vegetables fresh or frozen                      Potatoes                         0.1194       1
                                                Carrots                          0.0146       3
                                                Head cabbage                     0.0089       6
                                                Cauliflower                      0.0055       10
                                                Broccoli                         0.0029       18
Pulses, dry                                     Beans                            0.0073       1
Oilseeds and oil fruits                         Other oilseeds                   0.0005       4
Cereals                                         Wheat                            0.1376       1
                                                Rice                             0.0149       2
                                                Other cereal                     0.0094       3
Tea, coffee, herbal infusions and cocoa         Coffee beans                     0.0021       1
Sugar plants                                    Sugar beet (root)                0.0981       1
                                                Other sugar plants               0.0000       4
Products of animal origin-terrestrial animals   Milk and milk products: Cattle   0.0964       1
                                                Bovine: Meat                     0.0128       3
                                                Poultry: Meat                    0.0089       4

2.2.1. Selection of Food Items by Age Groups

In Section 2.2 the selection of items was performed by pooling the different populations (adults, children, general population, infants, and toddlers) together. A possible concern is how to make sure that children and infants are well represented. Here, the idea of selecting the items by age group is explored. Based on the survey populations represented in Table 1, it was deemed feasible to consider adults as one group, and children, infants, and toddlers as the second group. In Table 9, the consumption of each of the two groups is presented as a percentage of the total. The two food categories omitted earlier are excluded here as well.

Table 9. Consumption Percentage by Age Group

                                                                Percentage of Total
Food Category                                                   Adults    Children*
Fruit fresh or frozen; nuts                                     96.0      4.0
Vegetables fresh or frozen                                      98.0      2.0
Pulses, dry                                                     98.0      2.0
Oilseeds and oil fruits                                         98.0      2.0
Cereals                                                         98.0      2.0
Tea, coffee, herbal infusions and cocoa                         98.0      2.0
Hops (dried), including hop pellets and unconcentrated powder   99.6      0.4
Spices                                                          48.0      52.0
Sugar plants                                                    94.0      6.0
Products of animal origin-terrestrial animals                   93.0      7.0

*Children=children, infants, and toddlers.

Within each of these groups, the consumption of each category was used to provide stratum weights, as earlier. Based on the same margins of error as used earlier, the number of food items required for each group was determined, as well as the allocation to the various categories. This is provided in Table 10.

Table 10. Selection and Allocation of Items by Adults and Children*

                             Margin of Error 0.01    Margin of Error 0.0075   Margin of Error 0.005
Food Category*               Adults     Children     Adults     Children      Adults     Children
Fruits                       4          4            7          7             16         16
Vegetables                   5          3            9          5             20         11
Pulses                       1          1            1          1             1          1
Oilseeds                     1          0            1          1             1          1
Cereals                      3          2            6          3             12         7
Tea, coffee                  1          0            0          0             1          1
Hops                         0          0            0          0             0          0
Spices                       0          0            0          0             0          0
Sugar plants                 2          3            3          5             6          11
Animal origin-terrestrial    3          5            4          8             9          17
Total Food Items             20 (16)    18 (15)      32 (28)    30 (27)       66 (63)    65 (61)

*Category-long names shortened. *Children=children, infants, and toddlers. Figures in brackets represent the actual overall computed number.

Two separate stratified random selections of food items are conducted, each random selection being done as described in Figure 5. The resulting lists are given in Table 11 for the margins of error 0.01 and 0.0075. Such lists of items could be generated, and decisions made on the food items to include in the MACP. Note that for the children list, 4 items are selected from the sugar plants category, as opposed to 5 as in Table 10.
This is because the sugar plants category contains only 4 items.

It should be highlighted that, in order to obtain an unbiased estimate of the overall MRL exceedance rate for the EU, it is important to include not only food items that are expected to exceed the MRLs, but also those that might not. If only food items that are expected to exceed the MRLs are included in the sample, the overall exceedance rate will certainly be overestimated.

When the margin of error is 0.01 (a potential variation of 1% around an exceedance of 1% that will be considered negligible), the selected food items represent 67% of the total consumption for adults and 81% for the children group. When the margin of error is 0.0075, the selected food items represent 83% and 88% of the adult and children consumption, respectively.
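The point about overestimation can be illustrated with a toy calculation. The sketch below uses entirely hypothetical item names and exceedance rates, and it simplifies the design to equal-probability selection with a simple average (the report's selection uses probabilities that reflect consumption). It compares the simple average over all items, the average over only the items expected to exceed, and the average estimate obtained over every possible random selection of three items.

```python
from itertools import combinations


def simple_average(rates, items):
    """Simple average exceedance rate over the given food items."""
    return sum(rates[item] for item in items) / len(items)


# HYPOTHETICAL exceedance rates per food item (not taken from the report).
rates = {
    "item_a": 0.040, "item_b": 0.025, "item_c": 0.002,
    "item_d": 0.000, "item_e": 0.001, "item_f": 0.000,
}
all_items = list(rates)

# Value the survey is trying to estimate: average over the whole frame.
overall = simple_average(rates, all_items)

# Targeted selection: only items expected to exceed (here: rate above 1%).
targeted_items = [item for item in all_items if rates[item] > 0.01]
targeted_estimate = simple_average(rates, targeted_items)

# Random selection: average the estimate over every possible sample of 3 items.
random_estimates = [simple_average(rates, sample)
                    for sample in combinations(all_items, 3)]
mean_random_estimate = sum(random_estimates) / len(random_estimates)
```

Here `targeted_estimate` is larger than `overall`, while `mean_random_estimate` coincides with `overall` up to floating-point error, which is what unbiasedness means for this simplified design.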

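Table 11. An Example Set of Food Items by Age Group to Provide 0.01 and 0.0075 Margins of Error.

Fruits
  Margin of Error 0.01, Adults: Grapefruits, Oranges, Apples, Pears
  Margin of Error 0.01, Children*: Oranges, Apples, Pears, Kiwi
  Margin of Error 0.0075, Adults: Grapefruits, Oranges, Apples, Pears, Table grapes, Wine grapes, Bananas
  Margin of Error 0.0075, Children*: Oranges, Apples, Pears, Table grapes, Strawberries, Kiwi, Bananas
Vegetables
  Margin of Error 0.01, Adults: Potatoes, Broccoli, Cauliflower, Head cabbage, Lettuce
  Margin of Error 0.01, Children*: Potatoes, Tomatoes, Broccoli
  Margin of Error 0.0075, Adults: Potatoes, Carrots, Tomatoes, Aubergines (egg plants), Melons, Broccoli, Cauliflower, Head cabbage, Lettuce
  Margin of Error 0.0075, Children*: Potatoes, Onions, Tomatoes, Broccoli, Cultivated fungi
Pulses
  Margin of Error 0.01, Adults: Beans
  Margin of Error 0.01, Children*: Beans
  Margin of Error 0.0075, Adults: Beans
  Margin of Error 0.0075, Children*: Beans
Oilseeds
  Margin of Error 0.01, Adults: Other oilseeds
  Margin of Error 0.01, Children*: none
  Margin of Error 0.0075, Adults: Other oilseeds
  Margin of Error 0.0075, Children*: Linseed
Cereals
  Margin of Error 0.01, Adults: Rice, Wheat, Other cereal
  Margin of Error 0.01, Children*: Rice, Wheat
  Margin of Error 0.0075, Adults: Barley, Buckwheat, Maize, Rice, Wheat, Other cereal
  Margin of Error 0.0075, Children*: Rice, Rye, Wheat
Tea, coffee
  Margin of Error 0.01, Adults: Coffee beans
  Margin of Error 0.01, Children*: none
  Margin of Error 0.0075, Adults: none
  Margin of Error 0.0075, Children*: none
Sugar plants
  Margin of Error 0.01, Adults: Sugar beet (root), Other sugar plants
  Margin of Error 0.01, Children*: Sugar beet (root), Chicory roots, Other sugar plants
  Margin of Error 0.0075, Adults: Sugar beet (root), Chicory roots, Other sugar plants
  Margin of Error 0.0075, Children*: Sugar beet (root), Sugar cane, Chicory roots, Other sugar plants
Animal origin-terrestrial
  Margin of Error 0.01, Adults: Bovine: Meat, Poultry: Meat, Milk and milk products: Cattle
  Margin of Error 0.01, Children*: Swine: Meat, Bovine: Meat, Poultry: Meat, Milk and milk products: Cattle, Eggs: Chicken
  Margin of Error 0.0075, Adults: Swine: Meat, Bovine: Meat, Poultry: Meat, Milk and milk products: Cattle
  Margin of Error 0.0075, Children*: Swine: Meat, Bovine: Meat, Sheep: Meat, Sheep: Fat, Poultry: Meat, Milk and milk products: Cattle, Eggs: Chicken

*Category-long names shortened. *Children=children, infants, and toddlers. Food items in bold indicate that they were also selected in the MACP.

2.3. Assessing Representation of Food Items with Similar Residues

The framework used to determine whether to keep food items that are expected to provide similar information regarding the exceedance rate for a particular pesticide is a slight modification of the framework presented in Section 2.2, as illustrated in Figure 6.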
Similar to the stratified framework, food categories were considered as strata, and an intermediate stage (highlighted in bold) was added to address the question of whether to keep or drop similar food items. Note that this is presented for illustration purposes; insight into what could be considered a cluster is crucial and needs further elaboration and information about pesticide usage.

[Figure 6. Stratified Cluster Sampling Framework for Food Item Selection. The population of all food items of interest (the sampling frame) is divided into strata, from "Fruit fresh or frozen; nuts" (Stratum 1) to "Products of animal origin (terrestrial animals)" (Stratum 10). Each stratum is divided into clusters, and each cluster contains the sampling units: Citrus Fruit (Grapefruit, Limes), Stone Fruit (Apricots, Plums), Meat (Swine: Meat, Swine: Liver), Bird's eggs (Eggs: Duck, Eggs: Quail).]

The selection can therefore be done in two stages: first, clusters (a composition of two or more similar food items) are selected from each stratum; in the second stage, food items are selected from the previously selected clusters. In this design, cluster refers to a group of food items sharing similar characteristics. The general idea is that food sub-categories should be composed of food items that are suspected to have some similarities; similarities can be in terms of residues found, pesticides used, or any other grouping characteristic deemed relevant by experts.

The degree of similarity is captured by the intra-class correlation, which ranges from 0 to 1. It can be estimated with a mixed logistic regression that includes an extra parameter, in general a random intercept shared by all samples within the cluster, which induces association between them. An intra-class correlation of 1 implies that the food items are exactly the same in terms of, for instance, residues found or pesticide usage, i.e., they contribute exactly the same information; hence retaining both would be unnecessary.
On the other hand, zero intra-class correlation implies complete lack of similarity, i.e., each food item contributes different vital information towards estimation of relevant parameters (e.g. exceedance rate). The proposed framework incorporates information on the intra-class correlation and consumption levels to compute the required number of food items to be selected, and a sampling scheme with a random component is utilised to select food items that will be representative of the food item population. It is sufficient to know the degree of similarity between food items, and the selection process accounts for these similarities. This avoids the scenario of having to decide case by case which food items to drop or keep. Figure 7 shows the detailed procedure for obtaining the number of food items and how to get the food items within each cluster.
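The two limiting cases of the intra-class correlation can be made concrete with a short numerical sketch. Neither formula below is taken from the report. The first is the common latent-variable expression for the intra-class correlation of a random-intercept logistic model, and the second is the standard design-effect approximation from survey sampling for the "effective" number of independent items in a cluster. Function and parameter names are illustrative.

```python
import math


def icc_from_random_intercept(var_intercept):
    """Latent-scale ICC for a random-intercept logistic model.

    Assumes the usual convention that the residual variance on the
    logistic scale is pi^2 / 3.
    """
    return var_intercept / (var_intercept + math.pi ** 2 / 3)


def effective_items(cluster_size, icc):
    """Effective number of independent items in a cluster of similar items.

    Uses the standard design effect 1 + (m - 1) * icc (an assumption here,
    not a formula from the report).
    """
    return cluster_size / (1 + (cluster_size - 1) * icc)


# Two items that carry exactly the same information count as one ...
same_information = effective_items(2, icc=1.0)
# ... while two completely dissimilar items count as two.
no_similarity = effective_items(2, icc=0.0)
# A random-intercept variance equal to pi^2 / 3 corresponds to an ICC of 0.5.
halfway_icc = icc_from_random_intercept(math.pi ** 2 / 3)
```

Under these assumptions, intermediate values of the intra-class correlation give an effective number of items between 1 and the cluster size, which is the sense in which similar items add less information than dissimilar ones.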