When Marker Study Designs Fail the Markers Follow

Cliff Spiegelman, Texas A&M University; Maria Cuellar, Carnegie Mellon University; and Lucas Mentch, SAMSI, University of Pittsburgh

"Always be strong enough to let go, and be smart enough to wait for what you deserve." (Anonymous quote)

The Search for Biomarkers Is a Major Scientific and Medical Undertaking
There is a large literature, and a large number of research projects, devoted to finding markers for early-stage cancer and other diseases.
Thousands of cancer biomarker papers are published each year, but only about two cancer biomarkers per year are granted FDA approval.
Forensic biomarkers do not undergo an FDA approval process.

We Examine Some of the Poor Study Designs Used for Non-FDA-Approved Biomarkers
We discuss poor sensitivity, the inability to detect a condition, that follows from inadequate study designs.
We discuss poor specificity, the inability to detect only the targeted condition, that follows from inadequate study designs.
We discuss inadequate forensic markers that are promulgated based on anecdotal experience.
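As a minimal sketch of the two quantities named on this slide, the function below computes sensitivity and specificity from the four cells of a confusion matrix. The function name and the counts are hypothetical and chosen only for illustration; they do not come from any study discussed here.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp: condition present, marker positive
    fn: condition present, marker negative
    tn: condition absent, marker negative
    fp: condition absent, marker positive
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one subject with and one without the condition")
    sensitivity = tp / (tp + fn)  # fraction of true cases the marker detects
    specificity = tn / (tn + fp)  # fraction of non-cases the marker correctly clears
    return sensitivity, specificity


# Hypothetical counts, for illustration only.
sens, spec = sensitivity_specificity(tp=80, fn=20, tn=90, fp=10)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Both estimates are only as good as the subjects behind the counts: if the cases or the non-cases are unrepresentative, the ratios describe the study sample rather than the target population.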

There Is a Large Literature on Flawed Biomarker Designs and Studies
Freidlin (2010) and Polley (2013) are two of many examples of papers that discuss study designs for medical biomarkers.
This is not the case for forensic markers.

Common Study Designs for Cancer Biomarkers Use Genetically Altered Mice (see Kelly-Spratt et al., 2008; Zhang et al., 2015) and Are Commonly Referred to as Mouse Models
The design uses pairs of littermates that were genetically altered to have a specified cancer when fed tetracycline or other antibiotics.
When the mice are taken off tetracycline, the cancer temporarily goes into remission.
In the absence of tetracycline, the mice do not develop cancer.

Design Flaws
Neither the cancer groups nor the non-cancer groups are representative of their respective target populations.
The cancerous mice do not have colds, medicines, broken bones, bruises, or sore throats.
The normal mice are also too normal: they do not have colds, medicines, broken bones, bruises, or sore throats either.
When searching for biomarkers to indicate breast cancer, it is important to select the patient pool from women who have had suspicious mammograms but have not yet had biopsies. Otherwise, women who are told that their biopsies are negative will have different chemistry than those who are told that their biopsies are positive, largely due to different anxiety levels (see http://cssi.cancer.gov/pdf/program-evaluations/2009-%20CPTAC-%20Program%20Update.pdf, page 73).
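The cost of a "too normal" comparison group can be illustrated with a small calculation. The sketch below assumes, purely for illustration, that a marker has one specificity among perfectly healthy subjects and a lower one among subjects with unrelated conditions (colds, bruises, medication). The function name and all numbers are hypothetical. In a real study of this kind the second value is never observed, which is exactly the problem.

```python
def population_specificity(spec_healthy, spec_comorbid, comorbid_fraction):
    """Specificity in a mixed population of non-cases.

    Computed as the weighted average of the specificity among fully healthy
    non-cases and the specificity among non-cases with unrelated conditions.
    """
    if not 0.0 <= comorbid_fraction <= 1.0:
        raise ValueError("comorbid_fraction must be between 0 and 1")
    return (1.0 - comorbid_fraction) * spec_healthy + comorbid_fraction * spec_comorbid


# Hypothetical values, for illustration only.
# A study whose controls have no other conditions sees only the "healthy" value.
study_spec = population_specificity(0.98, 0.70, comorbid_fraction=0.0)
# A target population in which 30% of non-cases have some unrelated condition.
real_spec = population_specificity(0.98, 0.70, comorbid_fraction=0.3)
print(f"study controls: {study_spec:.3f}, realistic population: {real_spec:.3f}")
```

Under these assumed numbers the study would report a specificity near 0.98 while the marker would perform closer to 0.90 in practice. The same argument applies to sensitivity when the case group lacks realistic variation.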

Multiple Comparisons and False Discovery Rate Issues
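The slide names the issue without giving details. As one hedged illustration, the sketch below applies the Benjamini-Hochberg step-up rule, a standard way to control the false discovery rate when many candidate markers are tested at once. It is offered as an example of the general problem, not as the method used in any study cited here, and the p-values are made up.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return sorted indices of hypotheses rejected by the Benjamini-Hochberg rule.

    Sort the p-values, find the largest rank k with p_(k) <= k * q / m,
    and reject the hypotheses with the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k_max = rank
    return sorted(order[:k_max])


# Hypothetical p-values for ten candidate markers, for illustration only.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]

naive = [i for i, pv in enumerate(p) if pv <= 0.05]  # no correction at all
bh = benjamini_hochberg(p, q=0.05)                   # FDR controlled at 5%
print("uncorrected:", naive)
print("Benjamini-Hochberg:", bh)
```

With these made-up values, five markers pass an uncorrected 0.05 threshold but only two survive the correction. When thousands of candidate markers are screened, the gap between the two lists is typically far larger.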

Young and Karr Demonstrate Bad Markers Due to Selection

Forensic Markers
For certain studies of Abusive Head Trauma (AHT) (Maguire et al. 2009, Maguire et al. 2011, and Cowley et al. 2015), the designs have possible bias, as described in Maria's talk.
The first evident source of bias is that the population in these studies is nonrandom, since it was selected from children's hospitals with physicians and child abuse specialists who were aware of and interested in AHT.
This is akin to looking for markers of domestic abuse using only women in shelters for battered women.

Forensic Markers
Another source of bias is introduced by the responses used in the studies.
Specifically, the authors of these studies determine that a child was abused if the child fulfills either of these two criteria:
(1) abuse confirmed at case conference or civil, family, or criminal court proceedings, or admitted by the perpetrator, or independently witnessed; or
(2) abuse confirmed by stated criteria, including multi-disciplinary assessment.
Thus, it is possible and even likely that some children recorded as not having been abused were in fact abused, but this could not be confirmed according to these criteria.
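The effect of recording unconfirmed cases as "not abused" can be sketched with expected counts. The example below adds an assumption that is not stated on the slide: that confirmation is more likely when the candidate marker is present than when it is absent. The function name, parameter names, and every number are hypothetical.

```python
def apparent_rates(n_abused, n_not_abused, sens, spec, confirm_if_pos, confirm_if_neg):
    """Apparent (sensitivity, specificity) when only confirmed cases count as abuse.

    sens, spec: the marker's true sensitivity and specificity (assumed known here).
    confirm_if_pos / confirm_if_neg: assumed probability that true abuse is
    confirmed when the marker is present / absent.
    """
    # Truly abused children, split by marker status.
    abused_pos = n_abused * sens
    abused_neg = n_abused * (1.0 - sens)
    # Only confirmed cases are recorded as abused.
    rec_abused_pos = abused_pos * confirm_if_pos
    rec_abused_neg = abused_neg * confirm_if_neg
    # Unconfirmed abuse cases are recorded alongside the true non-cases.
    rec_not_pos = n_not_abused * (1.0 - spec) + (abused_pos - rec_abused_pos)
    rec_not_neg = n_not_abused * spec + (abused_neg - rec_abused_neg)
    apparent_sens = rec_abused_pos / (rec_abused_pos + rec_abused_neg)
    apparent_spec = rec_not_neg / (rec_not_pos + rec_not_neg)
    return apparent_sens, apparent_spec


# Hypothetical inputs, for illustration only.
app_sens, app_spec = apparent_rates(
    n_abused=100, n_not_abused=100,
    sens=0.8, spec=0.9,
    confirm_if_pos=0.9, confirm_if_neg=0.3,
)
print(f"apparent sensitivity = {app_sens:.3f}, apparent specificity = {app_spec:.3f}")
```

Under these assumed inputs the study would report a sensitivity of roughly 0.92 against a true value of 0.80, and a specificity of roughly 0.85 against a true value of 0.90. The direction and size of the distortion depend entirely on the confirmation probabilities, which the studies cannot observe.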

Forensic Markers
Natural and immediate parallels can be drawn between the ongoing search for informative biomarkers and the development of arson science in the United States.
The last several decades have seen drastic changes in the way fire scenes are processed and analyzed for arson indicators.
A number of indicators, such as large alligatoring (deep patterns of severe scorching) and crazed glass (unusual fracture patterns in glass surfaces), once thought to definitely indicate the presence of accelerants, and many even supported at the time by the National Bureau of Standards (NBS, now the National Institute of Standards and Technology (NIST)) (1980), are now largely recognized as no more than myth.

Forensic Markers
In 1992, a committee appointed by the National Fire Protection Association (NFPA) released the first NFPA 921 investigation guide, helping to put to bed many of these myths.
More problematic, however, is that in addition to these patently wrongful interpretations, there are several other indicators that may often be present both in accidental fires and in cases of arson.
Lines of demarcation (well-defined patterns of differing char intensity), for example, can be the result of liquid accelerants but may also be caused by clothing or falling drywall (Lentini 2013).
Since heat rises, the presence of low burn areas or burn holes in the floor is sometimes taken as proof that the fire started on the floor and likely involved accelerants (Lentini 2013). However, NFPA 921 now cautions that such effects may also be the result of flashover, an effect common in both intentional and accidental fires.

Poor Sensitivity
Since mouse models with cancer do not have enough variation in conditions (colds, bruises, pneumonia, etc.), the sensitivity of the resulting biomarkers for cancer in the general population is unknown and unknowable unless better designs are used.
Since AHT studies used biased selection methods that likely missed some AHT cases, the sensitivity of the resulting markers for AHT in the general population is unknown and unknowable unless better designs are used.
Since arson studies used anecdotal designs, the sensitivity of the resulting markers for arson in the general population is unknown and unknowable unless better designs are used.

Poor Specificity
Since mouse models with cancer do not have enough variation in conditions (colds, bruises, pneumonia, etc.), the specificity of the resulting biomarkers for cancer in the general population is unknown and unknowable unless better designs are used.
Since AHT studies used biased selection methods that likely missed many non-AHT cases, and some cases labeled as AHT were improperly labeled, the specificity of the resulting markers for AHT in the general population is unknown and unknowable unless better designs are used.
Since arson studies used anecdotal designs, the specificity of the resulting markers for arson in the general population is unknown and unknowable unless better designs are used.