Validation of Risk Matrix 2000 for Use in Scotland

Report Prepared for the Risk Management Authority
Don Grubin, Professor of Forensic Psychiatry, Newcastle University
don.grubin@ncl.ac.uk
January 2008

ACKNOWLEDGEMENTS

The Scottish Prison Service and the Scottish Criminal Records Office were both fundamental to the successful implementation of this project. In spite of other priorities, a number of staff in both these agencies made great efforts to secure the data we sought, for which we thank them warmly. We would also like to thank the Association of Directors of Social Work for assistance in tracking down missing information, and of course the individual Social Workers who provided this material.

EXECUTIVE SUMMARY

Background

Risk Matrix 2000 is a statistically derived risk assessment instrument for use with convicted male sex offenders. It comprises two scales, Risk Matrix Sex and Risk Matrix Violence, which provide an estimate of the long term likelihood of reconviction for a sexual or a non-sexual violent offence, assigning individuals to Low, Medium, High and Very High risk categories. It is a fundamental component of the systematic sex offender risk assessments carried out in England and Wales by the prison, probation and police services, and is used by police forces throughout the UK, including Scotland. Other agencies in Scotland have also in recent years begun to include Risk Matrix 2000 in their sex offender risk assessment protocols.

In spite of its widespread and officially sanctioned use in the United Kingdom, Risk Matrix 2000 has not been subject to any form of rigorous evaluation, and its empirical foundation is thin. The original validation studies, mainly carried out in respect of a cohort of sex offenders released from prisons in England and Wales in 1979 and another in 1980, have not been peer reviewed, their methodology and analyses have not been published, and only limited data from them is available. There is also a paucity of other studies examining the performance of the instrument; these generally report poorer outcome than that described in the validation studies, but they suffer from small sample sizes and selective study populations. Furthermore, Risk Matrix 2000 has not been validated in a Scottish setting.

According to the validation studies carried out in England and Wales, the accuracy of Risk Matrix is in the moderate range, similar to that reported for other, similar types of risk assessment instrument used with sex offenders. However, because of variations in the base rate of reconviction in different jurisdictions, more information than just accuracy data is needed to determine whether findings from one setting can be generalised to another. In particular, measures such as Likelihood Ratios (which are an assessment of the likelihood that a recidivist will be placed in a particular risk category compared with the likelihood that a non-recidivist will be placed in that same category) allow for risk categories to be compared across populations, regardless of base rates of reconviction. This type of consideration is particularly pertinent for present purposes, as it is relevant to the issue of whether the findings of Risk Matrix 2000 evaluations in England can be readily applied in Scotland.

Aims of the study

The study described in this report examines the reliability, validity and interpretation of findings when Risk Matrix 2000 is used in a large Scottish sample. More specifically, it is intended to:
- determine the association between Risk Matrix risk levels and reconviction rates for sex offenders in a Scottish setting;
- establish whether the properties of Risk Matrix 2000, when applied to a Scottish sex offender population, are similar to its properties as described in the England and Wales validation studies reported in Thornton et al (2003).

To achieve these goals requires:
- an assessment of how well Risk Matrix ranks offenders in terms of their levels of risk;
- establishing the probability of reconviction associated with each Risk Matrix category;
- describing the properties of the scale in a manner which can be compared between populations independent of the base rate of reconviction.

Taken together, these factors address the overall objective of the study, which is to establish the extent to which, and indeed whether, Risk Matrix 2000 can contribute to the systematic risk assessment of sex offenders in Scotland, and thereby assist in their management.

Study population

The study cohort comprises all sex offenders released from Scottish prisons between 1996 and 2001, amounting to 1029 individuals. Using records obtained from the Scottish Prison Service and the Scottish and English Criminal Records Offices, Risk Matrix ratings and criminal conviction follow-up data were obtained for 771 individuals (75%) in respect of Risk Matrix Sex (RMS), and for 974 individuals (95%) in respect of Risk Matrix Violence (RMV); absence of information from the missing cases is not thought to have biased the findings reported here. Average length of follow-up was approximately 8.5 years. There was a minimum five year follow-up for all offenders.

Reliability

Although all of the data was collected by a single researcher, 40 cases were scored independently by a second rater. For both RMS and RMV there was complete agreement in risk categories in 36 of 40 cases (90%). Kappa was 0.84 for RMS and 0.85 for RMV, indicating a high degree of inter-rater reliability.

Risk Categories

Offenders were distributed across the four risk categories as follows:

Risk category    Risk Matrix Sex     Risk Matrix Violence
                 n      %            n      %
Low              279    36.2         390    40.0
Medium           312    40.5         322    33.1
High             117    15.2         176    18.1
Very High        63     8.2          86     8.8
Total            771    100          974    100

Compared with the England and Wales 1979 validation study, the Scottish RMS sample contained a higher proportion of men in the Low risk category, while the England and Wales sample had proportionally more offenders in the High and Very High risk groups. For RMV, the distribution of offenders between categories was similar in the two cohorts.

Reconviction rates

Risk Matrix Sex

Of the 771 offenders in the RMS sample, 116 (15.0%) were reconvicted of a sexual offence at any time following their release from prison, while 83 (10.8%) were reconvicted of a sexual crime within 5 years of their prison release. This compares with a 19.6% five year sexual reconviction rate for the 1979 England and Wales cohort.

The five year reconviction rate for each RMS category is shown in the table below. There is a significant increase in reconviction rates from Low to Medium to High categories, with no overlap in confidence intervals in terms of both the proportions of men reconvicted and the higher odds of reconviction. The difference between the High and Very High groups, although in this same direction, is not statistically significant because of the relatively small number of offenders in the latter category. Survival analyses showed that the distinction between risk groups was maintained throughout the entire period of follow-up, again with the exception of the High and Very High risk groups, between which there was no clear difference. Odds Ratios show the increase in the odds of reconviction for each ascending risk category.

RMS category   % (n)       95% CI          Comparison          Odds Ratio   95% CI
Low            2.9 (8)     1.2 to 5.6
Medium         9.9 (31)    6.9 to 13.8     Medium v Low        3.7**        1.7 to 8.2
High           21.4 (25)   14.3 to 29.9    High v Medium       2.5*         1.4 to 4.4
Very High      30.2 (19)   19.2 to 43.0    Very High v High    1.6          0.8 to 3.2
Total          10.8 (83)   8.7 to 13.2
** p = .001   * p < .01

Differences between risk groups were apparent by one year. Although the actual numbers reconvicted were low, a significantly higher proportion of men in the Very High risk group had been reconvicted of a sexual offence within a year of prison release compared with those rated as High risk, and similarly, significantly more men in the High risk group had been reconvicted of a sexual offence within this time compared with men in the Medium risk group. These differences remained significant at two years.

accuracy (RMS)

In terms of predictive accuracy, the AUC was 0.73 (95% CI 0.68 to 0.78), well within the moderate range typically described for risk assessment instruments of this type. This compared with an AUC of 0.75 in the England and Wales 1979 sample.

seriousness of reconvictions (RMS)

Although the likelihood of reconviction varied between the risk categories, the seriousness of reoffence did not, basing this judgement on sentences received. Sentencing information was available for 103 of the 116 sexual reconvictions that took place over the entire follow-up period. Sentence severity did not differ significantly between the four risk groups, but when the Low and Medium groups are collapsed into a single category and the High and Very High groups into another, the lower risk category is found to have received a significantly higher proportion of more severe sentences.

Risk Matrix Violence

At five years follow-up 120 of the 974 offenders (12.3%) in the RMV sample had recidivated with a non-sexual violent offence, while 176 (18.1%) were reconvicted for a non-sexual violent crime during the entire follow-up period. There are no reports of five year violent reconviction rates for RMV in the literature with which to compare.

The five year reconviction rate for each RMV risk category is shown in the table below. There is a significant increase in reconviction rates between Low, Medium and High risk categories, and only a small overlap in the confidence intervals between the High and Very High groups; there is no overlap in confidence intervals in respect of the higher odds of reconviction for ascending risk categories. Survival analyses showed that the distinction between risk groups was maintained throughout the entire follow-up period.

RMV category   % (n)        95% CI          Comparison          Odds Ratio   95% CI
Low            3.1 (12)     1.6 to 5.3
Medium         10.2 (33)    7.2 to 14.1     Medium v Low        3.6***       1.8 to 7.1
High           23.9 (42)    17.8 to 30.9    High v Medium       2.7***       1.7 to 4.5
Very High      38.4 (33)    28.3 to 49.5    Very High v High    2.0*         1.1 to 3.5
Total          12.3 (120)   10.3 to 14.6
*** p < .0001   * p = .01
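
The Odds Ratios in the RMS and RMV tables above can be checked against the counts reported in this summary. The following is a minimal Python sketch, not the analysis code used for the report: the names `odds_ratio` and `adjacent_odds_ratios` and the dictionary layout are illustrative assumptions. Category sizes are taken from the Risk Categories table and five year reconviction counts from the two reconviction tables.

```python
# (reconvicted within five years, total in category), lowest to highest risk.
RMS = {"Low": (8, 279), "Medium": (31, 312), "High": (25, 117), "Very High": (19, 63)}
RMV = {"Low": (12, 390), "Medium": (33, 322), "High": (42, 176), "Very High": (33, 86)}


def odds_ratio(higher, lower):
    """Odds of reconviction in the higher category divided by odds in the lower one."""
    r_hi, n_hi = higher
    r_lo, n_lo = lower
    odds_hi = r_hi / (n_hi - r_hi)  # reconvicted : not reconvicted
    odds_lo = r_lo / (n_lo - r_lo)
    return odds_hi / odds_lo


def adjacent_odds_ratios(table):
    """Odds Ratio for each category against the category immediately below it."""
    names = list(table)  # dicts preserve insertion order, so this is Low -> Very High
    return {
        f"{hi} v {lo}": round(odds_ratio(table[hi], table[lo]), 1)
        for lo, hi in zip(names, names[1:])
    }


print(adjacent_odds_ratios(RMS))
print(adjacent_odds_ratios(RMV))
```

Rounded to one decimal place, the values computed this way agree with the Odds Ratios shown in the two tables. The confidence intervals are not reproduced here because the report does not state which interval method was used.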

As with RMS, differences in reconviction rates between risk categories were apparent from year one, with significant differences emerging between High and Medium risk groups by this time, which were maintained at two years follow-up.

accuracy (RMV)

Regarding predictive accuracy, the AUC was 0.76 (95% CI 0.71 to 0.80). This compared with an AUC of 0.78 for the 1980 England and Wales 10 year follow-up, and an AUC of 0.80 for the 1979 England and Wales 19 year follow-up (although independent calculation suggests that the AUC in the latter study was in fact 0.76).

seriousness of reconvictions (RMV)

Disposals were available for 162 of the 176 non-sexual violent reconvictions that took place during the follow-up period. Just 15 (9.2%) resulted in prison sentences of a year or more, suggesting that most of these offences were not of a serious nature. There were five life sentences, four of which were received by men in the Medium risk group and one by a Low risk offender.

Comparison with the 1979 England and Wales cohort

In spite of a significant difference in the five year base rate of reconvictions between the 1996-2001 Scottish and the 1979 England and Wales cohorts, reconviction rates for individual RMS categories were consistent between the two groups, with the exception of a lower sexual recidivism rate found in the Scottish Very High risk category. This difference is likely to have been a function of the lower base rate of reconviction in the Scottish cohort. Likewise, two year reconviction rates for individual RMV categories were similar between the Scottish and a 1990s England and Wales cohort, with the exception of a much higher rate of reconviction in the Scottish High risk category (although in this case the difference may be genuine).
In respect of RMS, the Likelihood Ratios (LR) for each category were also in the same range between the cohorts, with the exception of the High risk group (and, nearly, the Medium risk group); in the case of RMV, there was a difference between Medium risk categories when the Scottish sample was compared with a 10 year England and Wales follow-up from 1980, and between High risk categories when compared with the two year 1990s follow-up. These results suggest that it is in this middle area of Medium and High risk offenders that Risk Matrix may be less stable. Overall, however, the instrument appeared to perform similarly across the two settings.

The Odds Ratios between adjacent risk categories in respect of reconviction were broadly similar in the two cohorts. The exception was in the High versus Medium RMV Scotland-1980 England and Wales comparison, which may be the result of the RMV Medium group varying between the two cohorts, as shown by the different Likelihood Ratios.
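
The Likelihood Ratios discussed above follow the definition given in the Background: the likelihood that a recidivist is placed in a category, divided by the likelihood that a non-recidivist is placed in that same category. The following is a minimal Python sketch of that calculation using the five year counts reported in this summary. The function name `likelihood_ratios` and the data layout are illustrative assumptions, and the report's own Likelihood Ratio tables are not reproduced here, so small differences from them are possible.

```python
# (reconvicted within five years, total in category), lowest to highest risk.
RMS = {"Low": (8, 279), "Medium": (31, 312), "High": (25, 117), "Very High": (19, 63)}
RMV = {"Low": (12, 390), "Medium": (33, 322), "High": (42, 176), "Very High": (33, 86)}


def likelihood_ratios(table):
    """LR per category = P(category | recidivist) / P(category | non-recidivist)."""
    recidivists = sum(r for r, _ in table.values())
    non_recidivists = sum(n - r for r, n in table.values())
    ratios = {}
    for name, (r, n) in table.items():
        p_given_recidivist = r / recidivists
        p_given_non_recidivist = (n - r) / non_recidivists
        ratios[name] = round(p_given_recidivist / p_given_non_recidivist, 2)
    return ratios


for label, table in (("RMS", RMS), ("RMV", RMV)):
    print(label, likelihood_ratios(table))
```

Because each ratio is built from proportions within the recidivist and non-recidivist groups separately, it does not depend on how many offenders in the sample were reconvicted overall, which is why it can be compared between cohorts with different base rates. On these counts the Very High ratio is above 3 for RMS and above 4 for RMV, and the reciprocal of the Low ratio is a little over 4 for RMS and about 4.4 for RMV.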

Conclusion

The aim of this study was to determine how well Risk Matrix 2000 classifies sex offenders according to their risk of reconviction in Scotland. In brief, it found that Risk Matrix 2000 is indeed valid for use in Scotland. It was effective in classifying sex offenders in Scotland in terms of their risk of recidivism for both sexual and non-sexual violent offending. Risk categories were distinct from each other (although the boundary between High and Very High risk individuals was less clear because of the relatively low numbers of offenders in the latter group), and the four risk categories successfully ranked offenders according to their recidivism risk. The predictive accuracy of the two primary scales, Risk Matrix Sex and Risk Matrix Violence, was in the moderate range, with AUCs in the mid-seventies, similar to that reported for other, more complicated risk assessment instruments of a similar type.

Risk Matrix is probably best viewed as a screening tool, identifying individuals who require further assessment because of their increased risk of reconviction. Other approaches will then be necessary to determine current, as opposed to long term, risk, as well as the potential consequences of a reoffence; these include structured dynamic risk assessments, guided clinical judgment, and psychometrics, amongst others. This overall process will in turn help advise strategies for managing individual offenders, whether for sentencing (including considerations for Orders of Lifelong Restriction), release from custody, or for community management by way of protocols developed through multi-agency public protection arrangements (MAPPA). Risk Matrix 2000, therefore, should be seen as the first step in an assessment process, not a substitute for the assessment process itself; to be effective, it must be seen as part of a wider package.

Interpretation of Risk Matrix outcomes

Because of the large and comprehensive nature of the study population, which encompasses a high proportion of all sex offenders released from Scottish prisons between 1996 and 2001, and because of the discriminative capacity of the scale as demonstrated in this report, the findings reported here can be used to interpret the meaning of Risk Matrix assessments in Scotland when used with released prisoners, and by extension with sex offenders in Scotland generally, in the following manner:

Regarding numbers of offenders per risk category:
- for RMS, about three quarters of offenders would be expected to score as Low or Medium risk, about 15% as High risk, and less than 10% as Very High risk;
- for RMV, about three quarters of offenders would again be expected to score as Low or Medium risk, from 15-20% as High risk, and again less than 10% as Very High risk.

In terms of reconviction risk, reasonable approximations of five year reconviction rates are:

RMS
Low          less than 5% (1 in 20)
Medium       10% (1 in 10)
High         20-25% (1 in 5 to 1 in 4)
Very High    33% (1 in 3)
- The odds of a Medium risk offender recidivating are about 4 times that of a Low risk offender;
- The odds of a High risk offender recidivating are about 2.5 times that of a Medium risk offender;
- The odds of a Very High risk offender recidivating are about 1.5 times that of a High risk offender.

RMV
Low          less than 5% (1 in 20)
Medium       10% (1 in 10)
High         25% (1 in 4)
Very High    40% (2 in 5)
- The odds of a Medium risk offender recidivating are about 3.5 times that of a Low risk offender;
- The odds of a High risk offender recidivating are about 3 times that of a Medium risk offender;
- The odds of a Very High risk offender recidivating are about twice that of a High risk offender.

In respect of the general specificity and sensitivity of the scale:
- for RMS, an offender who recidivates is over three times as likely to be rated as Very High risk compared with an offender who does not, and over two times as likely to be rated as High risk;
- for RMS, an offender who does not recidivate is over four times as likely to be rated as Low risk compared with an offender who does;
- for RMV, an offender who recidivates is over four times as likely to be rated as Very High risk compared with an offender who does not, and over two times as likely to be rated as High risk;
- for RMV, an offender who does not recidivate is about four and a half times as likely to be rated as Low risk compared with an offender who does.

Recommendations

1. Although Risk Matrix is a reasonably straightforward instrument, training in its use is essential if it is to be scored accurately. There should be a requirement that those carrying out Risk Matrix assessments receive appropriate training.
2. A means of quality assuring Risk Matrix scores is necessary and should be put in place if it is not already established.
3. Relevant information needed to score Risk Matrix should be routinely included in reports prepared by criminal justice social workers, and included in prison files.
4. The base rate of sexual and non-sexual violent reconvictions in sex offenders should be monitored in order to ensure that the reconviction approximations reported here remain valid.
5. Consideration should be given to continuing the follow-up of the data set used in this study to obtain 10 and 15 year reconviction rates for Risk Matrix categories.

TABLE OF CONTENTS

Executive summary
Introduction
  Background
  Risk Matrix 2000
  The evaluation of risk assessment instruments
  Aims of the study
Method
  Identification of offenders
  Power calculation
  Data collection
  Missing information and final sample size
    Risk Matrix Sex
    Risk Matrix Violence
  Reliability
Results
  Distribution of RMS and RMV categories
  Reconviction rates
    Risk Matrix Sex
      seriousness of reconvictions
      missing reconviction data
    Risk Matrix Violence
      seriousness of reconvictions
Discussion
  Summary of findings
  Reliability in scoring Risk Matrix
  Comparison of Risk Matrix performance in England and Wales
  Risk Matrix 2000 as a screening tool
  Risk Matrix 2000 and Orders for Lifelong Restriction
  Limitations
Conclusion and Recommendations
References

List of Figures and Tables

Figures

Figure 1: The Risk Matrix Sex study population
Figure 2: The Risk Matrix Violence study population
Figure 3: ROC curves based on RMS categories and 5 year conviction rates for sexual offences in the 1996-2001 Scotland and 1979 England and Wales cohorts
Figure 4: Sexual reconvictions per RMS categories during the entire follow-up period available for each offender in the sample
Figure 5: ROC curve based on RMV categories and 5 year reconviction rates for violent offences in the 1996-2001 Scotland five year follow-up, the 1979 England and Wales 19 year follow-up, and the 1980 England and Wales 10 year follow-up
Figure 6: Violent reconvictions for each RMV category during the entire follow-up period available for each offender in the sample
Figure 7: Comparison between number of offenders in combined lower and upper RMS risk categories, and the number of reconvictions for the combined groups

Tables

Table 1: Number of sex offenders released per year between 1996 and 2001
Table 2: Comparison of Step One categories between the 771 offenders in the study population and the 235 offenders for whom full RMS information was not available (missing cases)
Table 3: Distribution of RMS risk categories in the 1996-2001 Scotland sample and in the 1979 England and Wales cohort
Table 4: Distribution of RMV risk categories in the 1996-2001 Scotland sample and in the 1979 England and Wales cohort
Table 5: Five year sexual reconviction rates by RMS category, and Odds Ratios comparing reconvictions between adjacent categories
Table 6: Likelihood Ratios in respect of an offender being reconvicted for a sexual offence within five years for each RMS category
Table 7: Comparison of sexual reconviction rates per RMS risk category between the 1979 England and Wales and the 1996-2001 Scotland cohorts, five year follow-up
Table 8: Comparison of the Odds Ratios for sexual reconviction in adjacent RMS risk categories between the 1979 England and Wales and 1996-2001 Scotland cohorts, five year follow-up
Table 9: Comparison of the Likelihood Ratios between the 1979 England and Wales and 1996-2001 Scotland cohorts for each RMS category, five year follow-up
Table 10: One and two year sexual reconviction rates by RMS category
Table 11: Comparison of reconviction sentence length between RMS risk categories
Table 12: Five year violent reconviction rates by RMV category, and Odds Ratios comparing reconvictions between adjacent categories
Table 13: Likelihood Ratios in respect of an offender being reconvicted for a non-sexual violent offence within five years for each RMV category
Table 14: Comparison of non-sexual violence reconviction rates per RMV risk category between the 1996-2001 Scotland (5 year follow-up) and the 1980 England and Wales cohort (10 year follow-up)
Table 15: Comparison of two year non-sexual violence reconviction rates per RMV risk category between the 1996-2001 Scotland and a 1990s England and Wales cohort
Table 16: Comparison of the Odds Ratios for non-sexual violence reconviction in adjacent RMV risk categories between the 1980 England & Wales cohorts and 1996-2001 Scotland
Table 17: Comparison of the Likelihood Ratios between the 1979 England and Wales and 1996-2001 Scotland cohorts for each RMV category, five year follow-up

INTRODUCTION

Background

Risk Matrix 2000 is a statistically derived risk assessment instrument for use with convicted male sex offenders. It provides an estimate of their long term likelihood of reconviction for a sexual or a non-sexual violent offence, assigning individuals to Low, Medium, High and Very High risk categories. It is a fundamental component of the systematic sex offender risk assessments carried out in England and Wales by the prison, probation and police services, and it forms the basis of initial management decisions within Multi-agency Public Protection Arrangement (MAPPA) protocols. Risk Matrix 2000 categories are also included on the Violent and Sex Offenders Register (ViSOR), a national police intelligence database used by police forces throughout the UK, including Scotland, where it provides an easily assessable quantification of an offender's risk. Other agencies in Scotland have also recently begun to include Risk Matrix 2000 in their sex offender risk assessment protocols.

In spite of its widespread and officially sanctioned use in the United Kingdom, however, Risk Matrix 2000 has not been subject to any form of rigorous evaluation. The main validation study used to support it, a 19 year follow-up of 429 sex offenders released from prisons in England and Wales in 1979, has not been subject to peer review, nor has its methodology or analysis been more than scantily described (Risk Matrix Scoring Guide (Thornton, unpublished), and Thornton et al (2003)). Furthermore, although the sample is said to be nationally representative, it in fact includes only offenders who could be successfully traced, with no information provided about either the numbers of or reasons for missing cases.
While two other studies are also referred to in Thornton et al (2003), one involving 647 sex offenders released from prisons in England and Wales in the early 1990s and followed for two years, the other of 311 sex offenders released in 1980 and followed for four years, again little information is provided except for basic outcome and accuracy data.

Risk Matrix 2000 has not been validated in a Scottish setting. Indeed, apart from the three studies referred to above, there are only a small number of other evaluations reported in the literature, all of which have significant limitations. They also generally report poorer outcomes than those described in Thornton et al (2003):

Craig et al (2007) carried out a study of 85 sex offenders referred to an English forensic psychiatry service between 1992 and 1995 and followed-up for between two and ten years. Although they found a high level of predictive accuracy in respect of non-sexual violent reconvictions (AUCs of 0.86 and 0.87; the meaning of AUC is explained under "The evaluation of risk assessment instruments" below), outcome in respect of sexual reconviction was much less impressive (AUCs of between 0.59 and 0.68, compared with 0.75 to 0.85 reported by Thornton et al (2003)). However, the number of subjects in this study is so low that one must be extremely cautious in interpreting or accepting its findings, with the confidence intervals around them (which are not reported) likely to be excessively wide. Indeed, although they report results for a 10 year follow-up, just four offenders appear to have been followed for that long. In addition, no data is provided regarding numbers of offenders or outcome in respect of the different risk categories.

Craissati and Beech (2004), in the only other published English study, evaluated Risk Matrix 2000 in 310 sex offenders managed by the probation service and resident in two London boroughs, followed-up for an average of four years (although only 235 could be rated). Just nine individuals were reconvicted for sexual offences and four for non-sexual violent offences, making any attempt to assess predictive accuracy impossible. Risk Matrix categories, however, were associated with any failure of community supervision (AUC = 0.70), and with what is referred to as "sexually risky behaviour" (AUC = 0.65), although the reliability of these soft measures and their relationship to the reconviction outcome for which Risk Matrix is designed is unclear.

Knight and Thornton (2007) evaluated Risk Matrix 2000, as well as a number of other risk assessment instruments, in a sample of 566 sex offenders who had either been assessed or treated at the Massachusetts Treatment Center, a facility for the detention of men defined as sexually dangerous. Follow-up was reported for three, ten and fifteen years. The ability of Risk Matrix 2000 to predict recidivism was limited (AUCs for sexual reconviction were between 0.63 and 0.67, and for non-sexual violent reconviction they were no better than chance, in the 0.50s). This data, however, was based on referrals made to the Treatment Center between 1959 and 1984, and the extent to which the study population resembles modern sex offenders is unclear. In addition, the nature of the sample is such that most of those in it will have fallen at the high end of the risk spectrum, which means that the evaluation may relate more to being able to distinguish between high risk sex offenders than to the more heterogeneous group of sex offenders that is typically encountered in practice.

An unpublished Canadian study (Kingston et al) followed-up 280 convicted child molesters for an average of 11.2 years. Although moderate predictive accuracy was reported (the AUC for sexual reconviction was 0.65 and for non-sexual violent reconviction 0.71), 54% of the sample were incest offenders, limiting the extent to which its findings can be generalised. More importantly, however, because of limitations in the information available to the researchers, one of the seven variables that contribute to the scoring of Risk Matrix (convictions for non-contact sex offences) was not used in the determination of Risk Matrix categories, which, though discounted by the authors, will have meant that the Risk Matrix scores are likely to be inaccurate.

Based on the above, it is clear that the empirical foundation for Risk Matrix 2000 is thin. In addition to a paucity of studies, the Risk Matrix evidence base suffers from small sample sizes, selective study populations, and a lack of published data of the type that would allow for independent review. From a Scottish perspective, it remains to be demonstrated that Risk Matrix functions as expected in this setting.

Risk Matrix 2000

Risk Matrix 2000 is for use with males aged 18 and over who have been convicted of, or cautioned for, at least one sexual offence committed after the age of 16. It is composed of two main scales: Risk Matrix Sex (RMS), designed to predict sexual reconvictions, and Risk Matrix Violence (RMV), used in the prediction of convictions for non-sexual violence. A third scale, Risk Matrix C, combines the RMS and RMV scales to provide a prediction for sexual or non-sexual violence reconviction.

Risk Matrix Sex is composed of seven variables that relate to an offender's history. The determination of RMS categories is a two step procedure:

Step One combines information regarding:
- age
- previous sex offence sentencing occasions
- previous sentencing occasions for all criminal offences
to reach a preliminary risk rating;

Step Two modifies the Step One preliminary rating depending on the presence of four "aggravating factors":
- any male victim ever
- any stranger victim ever
- any non-contact sex offence ever
- whether or not the offender has ever lived in a cohabiting relationship for more than two years (referred to both as "never married" and "single")

Risk Matrix Violence is composed of just three variables, which are combined in a single step:
- age
- previous non-sexual violence offence sentencing occasions
- any convictions for burglary

The variables referred to above are defined in the Risk Matrix Scoring Guide (Thornton, unpublished).

Both RMS and RMV are divided into four risk categories. The Scoring Guide describes these as ordinal groupings along the risk continuum with the higher numbered categories representing relatively higher levels of risk, but which for heuristic purposes are labelled as Low, Medium, High and Very High risk. Based on the 1979 England and Wales prison cohort referred to above (Thornton et al, 2003), the Scoring Manual provides the following estimation of recidivism per risk level:


RMS risk category    5 year %    10 year %    15 year %
Low                  3           6            7
Medium               13          16           19
High                 26          31           36
Very High            50          55           59

RMV risk category    5 year %    10 year %    15 year %
Low                  4           5            5
Medium               12          14           19
High                 27          34           39
Very High            47          57           59

No confidence intervals are given, and it is not possible to determine from this data the extent to which the individual risk categories are distinct from each other, and if so by what margin. Because of variations in the base rate of reconviction in different settings, without further analysis it is not possible to generalise from this data to other jurisdictions, or to compare studies. This is recognised in the Scoring Guide, which notes that the recidivism rates it describes "reflect the jurisdiction, the era in which these offenders were at risk, and the duration of the follow-up," adding that "Varying any of these parameters would likely lead to different reconviction rates." However, even given different base rates, it is still possible to determine an instrument's discriminative properties, independent of the offender group on which it is tested, for example by examining more closely the relative differences between categories and the degree of precision within them (Mossman, 2006). This is discussed in more detail below.

The evaluation of risk assessment instruments

Research evaluations have repeatedly demonstrated that statistically derived (i.e., actuarial) assessment measures outperform both clinical judgement and structured risk assessment in determining the longer term likelihood of recidivism when considering populations of offenders (Grove & Meehl, 1996; Monahan, 1996; Grove et al, 2000; Doren, 2002; Hanson & Morton-Bourgon, 2007).
In addition to Risk Matrix, there are a number of other measures of this type that have been applied to sexual offending, the best known of which are Static 99, the Sex Offender Risk Appraisal Guide (SORAG), the Minnesota Sex Offender Screening Tool-Revised (MnSOST-R), and the Rapid Risk Assessment for Sex Offense Recidivism (RRASOR).

The accuracy of these instruments is typically assessed using Receiver Operating Characteristic (ROC) statistics (Mossman, 1994). This is to avoid the potential for illusory accuracy when the base rate of a targeted outcome is low. For example, if the base rate of recidivism in a sample of offenders is 10%, then simply predicting that no offender will recidivate results in correct predictions 90% of the time: impressive accuracy, but useless in practice. By taking into account correct predictions of both recidivism and non-recidivism, ROC analyses deal with this problem.

An ROC graph plots the true positive rate for a test or instrument (that is, its sensitivity, or for present purposes its success in detecting recidivists) against its false positive rate (1 minus its specificity, or its mistaken identification of individuals as recidivists when they are not). The resulting Area Under the Curve (AUC) of this graph provides a measure of accuracy that is independent of the base rate of the targeted outcome in the sample, in our case the reconviction rate. An AUC of 0.5 amounts to chance accuracy (i.e., the test has no predictive value), AUCs of less than 0.5 are indicative of worse than chance performance, while an AUC of 1 represents perfect prediction. AUCs in the range of 0.60 to 0.80 are considered to represent moderate predictive accuracy. The risk assessment instruments referred to above are typically found to fall within this 0.60 to 0.80 range (Barbaree et al, 2001; Hanson & Morton-Bourgon, 2007; Langton et al, 2007).

As Mossman (2006) demonstrates well, however, AUCs, although good at describing the accuracy of an assessment tool in a particular population and its ability to rank subjects in it according to their relative risk of recidivism, do not provide sufficient information to determine whether or not an instrument functions in a similar manner in different populations.
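The two ideas above, the misleading accuracy of a blanket prediction at a low base rate and the AUC as a base-rate-independent measure, can be illustrated with a minimal Python sketch. It is not the analysis code used in this study. It relies on the standard result that the AUC equals the probability that a randomly chosen recidivist scores higher than a randomly chosen non-recidivist, with ties counted as one half. The function names and all data are hypothetical, and risk categories are assumed to be coded 1 (Low) to 4 (Very High).

```python
def accuracy_if_all_predicted_safe(outcomes):
    """Proportion of correct predictions when every offender is
    predicted NOT to recidivate (outcome 1 = reconvicted, 0 = not)."""
    return sum(1 for o in outcomes if o == 0) / len(outcomes)


def auc(scores, outcomes):
    """Area under the ROC curve, computed as the proportion of
    recidivist/non-recidivist pairs in which the recidivist has the
    higher score (ties count as half a pair)."""
    recidivists = [s for s, o in zip(scores, outcomes) if o == 1]
    non_recidivists = [s for s, o in zip(scores, outcomes) if o == 0]
    if not recidivists or not non_recidivists:
        raise ValueError("both outcome groups must be present")
    wins = 0.0
    for r in recidivists:
        for n in non_recidivists:
            if r > n:
                wins += 1.0
            elif r == n:
                wins += 0.5
    return wins / (len(recidivists) * len(non_recidivists))


# Hypothetical sample with a 10% base rate: one recidivist in ten.
outcomes_low_base_rate = [0] * 9 + [1]
blanket_accuracy = accuracy_if_all_predicted_safe(outcomes_low_base_rate)

# Giving everyone the same score (no discrimination at all) yields
# chance-level AUC despite the high blanket accuracy.
chance_auc = auc([1] * 10, outcomes_low_base_rate)

# Hypothetical risk categories (1-4) and outcomes for ten offenders.
example_scores = [1, 1, 1, 2, 2, 2, 3, 3, 4, 4]
example_outcomes = [0, 0, 0, 0, 0, 1, 0, 1, 0, 1]
example_auc = auc(example_scores, example_outcomes)
```

In this invented example the blanket prediction is right 90% of the time yet has an AUC of 0.5, whereas the four-category scores give an AUC within the moderate range described above.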
Equally, a finding that an instrument is accurate in one population does not mean that translations can be made regarding reconviction rates for specific risk categories to other populations. For example, because Very High risk individuals recidivate at a rate of 60% in one population does not mean they will do so at a similar rate in another. This is because differing base rates of recidivism between populations (which are influenced by the nature of the offenders in the population as well as by external factors such as detection rates and prosecution policies) make judgements about the stability of an instrument's performance between populations difficult.

Mossman (2006) argues that in addition to accuracy rates, it is important to look at other measures indicative of a scale's performance in order to be able to interpret its outcome meaningfully. In particular, he recommends the use of Likelihood Ratios, which are a measure of the likelihood that a recidivist will be placed in a particular risk category compared with the likelihood that a non-recidivist will be placed in that same category [2]. Thus, one would expect the Likelihood Ratio for individual risk categories to be similar across populations, regardless of base rates of reconviction, if the assessment instrument is functioning in a similar way between them. In other words, a consideration of Likelihood Ratios allows for a determination of whether or not risk categories are stable across populations. These considerations are particularly pertinent for present purposes, as they are relevant to the issue of whether the findings of Risk Matrix 2000 evaluations in England can be readily applied in Scotland.

[2] The Likelihood Ratio as applied in the current study is equal to the number of recidivists in a risk category as a proportion of total recidivists in the population, divided by the number of non-recidivists in that risk category as a proportion of the total number of non-recidivists in the population.

Aims of the study

The study described in this report sets out to examine the reliability, validity and interpretation of findings when Risk Matrix 2000 is used in a large Scottish sample. More specifically, it is intended to:
- determine the association between Risk Matrix risk levels and reconviction rates for sex offenders in a Scottish setting;
- establish whether the properties of Risk Matrix 2000, when applied to a Scottish sex offender population, are similar to its properties as described in the England and Wales validation studies reported in Thornton et al (2003).

To achieve these goals requires:
- an assessment of how well Risk Matrix ranks offenders in terms of their levels of risk;
- establishing the probability of reconviction associated with each Risk Matrix category;
- describing the properties of the scale in a manner which can be compared between populations independent of the base rate of reconviction.

Taken together, these factors address the overall objective of the study, which is to establish the extent to which, and indeed whether, Risk Matrix 2000 can contribute to the systematic risk assessment of sex offenders in Scotland, and thereby assist in their management.
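The last of the requirements listed above rests on the Likelihood Ratio as defined in footnote 2. The following Python sketch implements that definition. The function name and all counts are hypothetical, chosen only to show that two populations with different base rates can share the same category Likelihood Ratios; none of the figures come from this study.

```python
def likelihood_ratios(recidivists, non_recidivists):
    """Likelihood Ratio per risk category, following footnote 2:
    (recidivists in category / all recidivists) divided by
    (non-recidivists in category / all non-recidivists).
    Both arguments map category name -> count and share the same keys."""
    total_r = sum(recidivists.values())
    total_n = sum(non_recidivists.values())
    ratios = {}
    for category, r_count in recidivists.items():
        share_r = r_count / total_r
        share_n = non_recidivists[category] / total_n
        # An empty non-recidivist cell gives an undefined (infinite) ratio.
        ratios[category] = share_r / share_n if share_n > 0 else float("inf")
    return ratios


# Hypothetical population A: 50 recidivists among 550 offenders.
recid_a = {"Low": 5, "Medium": 20, "High": 15, "Very High": 10}
non_recid = {"Low": 200, "Medium": 200, "High": 75, "Very High": 25}

# Hypothetical population B: twice as many recidivists in every category,
# so a higher base rate (100 of 600) but the same category profile.
recid_b = {category: 2 * count for category, count in recid_a.items()}

lr_a = likelihood_ratios(recid_a, non_recid)
lr_b = likelihood_ratios(recid_b, non_recid)
```

In this invented example the reconviction rate within each category differs between the two populations, but the Likelihood Ratios are identical, which is the sense in which they allow comparison independent of base rate.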

METHOD

Identification of offenders

From its database, the Scottish Prison Service (SPS) identified 1223 offenders released from Scottish prisons between 1996 and 2001 who had either been convicted of a sex offence, or whose index offences were considered to have a clear sexual motivation to them. An end-date of 2001 for prison release was chosen to allow sufficient time for there to be a minimum follow-up of five years for every offender. Because the computerised database used by the SPS commenced in 1996, it was not possible to systematically identify sex offenders released before that year.

Of the 1223 prisoners, 194 were removed from the sample for the following reasons:
- no evidence of sexual offending was found in 158 men when their records were examined
- neither prison nor criminal records could be located for 18 men
- the sexual offences of 9 individuals were committed when they were under the age of 16 (Risk Matrix applies only to men with sex offence convictions from 16 years of age or older)
- 5 had been released from prison outside the 1996-2001 study period
- 4 were known to have died within the minimum five year follow-up period

This left 1029 men in the study population, which we believe represents the entire cohort of sex offenders released from Scottish prisons between 1996 and 2001. The number of sex offenders released from prison each year is shown in Table 1. It can be seen that many fewer offenders were released in 2001 compared with the preceding years. We are unable to determine whether this is an anomaly or whether it reflects an error in case identification by the SPS.

Table 1: Number of sex offenders released per year between 1996 and 2001 (where an offender reoffended and was released twice during this period, only the first prison release is counted).

Year      Number      %
1996        157     15.3
1997        209     20.3
1998        220     21.4
1999        197     19.1
2000        191     18.6
2001         55      5.3
Total      1029    100

Ethnicity could not be determined in 202 cases (19.6%), although we believe that most of these 202 men were white Scottish. In the 827 cases for whom ethnicity was recorded, 820 (99%) were Caucasian, of whom 95% were born in Scotland.

Power calculation

Assuming recidivism rates and a distribution of offenders across risk categories similar to those described in the 1979 England and Wales validation study (Thornton et al, 2003), it was calculated that a sample size of 1000 would be needed to achieve confidence intervals under ±10% in respect of reconviction rates, and ±3.5-5% in respect of estimates of sensitivity and specificity; a sample size of 500 would expand the confidence interval around the smallest category, Very High risk, to ±15% (the others would be ±6-8%), and confidence intervals for sensitivity and specificity to ±5-8%. A sample size of 500 offenders was therefore considered to be the minimum required to ensure reasonably narrow confidence intervals.

The original aim of the study was to evaluate the accuracy of Risk Matrix 2000 over follow-up periods of both five and ten years. It can be seen from Table 1 above, however, that ten year follow-up is possible for many fewer than the required 500 offenders, making the intended ten year evaluation non-viable. Instead, only a five year follow-up period is examined in detail, although one and two year follow-up periods are also described, and follow-up longer than five years is taken into account using survival analysis techniques.

Data collection

Data with which to calculate Risk Matrix scores was obtained by a single research worker from prison records provided by the SPS. Where insufficient information was available from the files, attempts were made to obtain further data from Criminal Justice Social Work departments, although in the event the amount of material collected from this source was limited. Criminal records were obtained from the Scottish and English Criminal Records Offices.
These records were used both to determine reconvictions, and as a check on information regarding offending history contained in the prison records. In cases where there was disagreement between the two sources the Criminal Record was preferred.

Missing information and final sample size

Risk Matrix Sex (RMS)

Of the 1029 offenders in the study population, RMS risk categories were calculated for 803 men (78%). It was not possible to do so for the remaining 226 individuals because of missing information relating to one or more variables as follows (the relevant variables are listed in the description of Risk Matrix 2000 above):

- lack of basic Step 1 information: 23
- lack of sufficient Step 2 information: 203

In terms of missing information needed to complete Step 2, in 183 cases (90%) no Step 2 data at all was available, and in 15 (7%) just one of the four variables could be ascertained. Whether or not the offender met the criteria for Single, the only non-criminal variable, was missing in 201 of the 203 cases. Although there were an additional six cases in which Step 2 information was missing, five of these men already scored as Very High risk on Step 1, which meant that the absence of Step 2 information had no effect on risk category, and one scored as High risk on Step 1 but had two known aggravating factors, making the two that were unknown redundant.

Follow-up reconviction data was available for 771 of the offenders for whom Risk Matrix Sex categories could be calculated (96% of those with RMS scores and 75% of the study population). Arrival at the final sample size for Risk Matrix Sex is illustrated in Figure 1.

Figure 1: The Risk Matrix Sex study population.
Sex offenders released from prison between 1996 and 2001: 1029
Insufficient data to calculate RMS category: 226
RMS category calculated: 803 (78.0%)
Lack of reconviction data: 32
Final RMS sample: 771 (74.9%)

In order to examine whether offenders included in the study differed significantly from those for whom either RMS could not be calculated or for whom reconviction follow-up data was not available, Step One scores for the 771 offenders in the study population were compared with Step One scores in the 235 men for whom this information was available (that is, 91% of the 258 sex offenders for whom missing information meant that they could not be included in the RMS study population). The results are shown in Table 2. No cases score as Very High risk in the missing group because a Very High risk rating at Step 1 automatically results in a full RMS rating of Very High risk. Excluding the Very High risk Step One cases, the two groups were

found to differ, with more cases in the missing group rated as Medium and fewer as Low on Step 1 (chi square = 5.95, df = 2, p = 0.05).

Table 2: Comparison of Step One categories between the 771 offenders in the study population and the 235 offenders for whom full RMS information was not available (missing cases). See text for discussion of differences between the two groups.

Step One Category    Study Population    Missing Cases
                       n       %           n       %
Low                   300      39          74      32
Medium                350      45         129      55
High                  106      14          32      14
Very High              15       2           -       -
Total                 771     100         235     100

Based on the figures in Table 2, it may be that offenders for whom data was not available were of slightly higher risk than the sample population. However, the men in the study group had been sentenced to longer periods of imprisonment for the offences preceding their release: 612 (79.5%) served sentences of one year or more compared with 143 of the 249 (57.4%) missing cases for whom sentence length was known (chi square = 46.52, df = 1, p < .0001), suggesting that the index offences of many of those in the missing group were less serious than they were for the study population. The ages of the two groups did not differ significantly, with a mean of 41.7 (sd 13.8) in the study population and a mean of 40.0 (sd 15.5) in the missing cases. As described in the Results section of this report, reconviction rates did not differ significantly between the study population and the missing cases.

Risk Matrix Violence (RMV)

Of the 1029 offenders in the study population, an RMV risk category was calculated for 1004 men (98%). Because RMV comprises just three variables (see the description of Risk Matrix 2000 above), less information is required for it than for RMS, and hence there are many fewer missing cases than there were with RMS. A risk category could not be determined for the remaining 25 individuals because of missing information regarding their criminal histories.
Follow-up reconviction data was available for 974 of the offenders for whom Risk Matrix Violence categories could be calculated (97% of those with RMV scores and 95% of the study population).

Arrival at the final sample size for Risk Matrix Violence is illustrated in Figure 2.

Figure 2: The Risk Matrix Violence study population.
Sex offenders released from prison between 1996 and 2001: 1029
Insufficient data to calculate RMV category: 25
RMV category calculated: 1004 (97.6%)
Lack of reconviction data: 32
Final RMV sample: 974 (94.7%)

Because of the small number of missing RMV cases, no further analyses were undertaken comparing the RMV study population with the RMV missing cases.

Reliability

All of the data was collected by a single researcher. Forty cases were scored independently by a second rater (DG). For RMS, there was complete agreement in risk categories in 36 of 40 cases (90%), and similarly, there was full agreement in 36 of the 40 cases for RMV. Kappa for RMS was 0.84, and for RMV 0.85, indicating a high degree of inter-rater reliability.
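As an illustration of the Step One comparison reported with Table 2 earlier in this section, the chi-square statistic can be recomputed from the published Low, Medium and High counts (Very High cases excluded, as in the text). This is a sketch of a standard Pearson chi-square test of independence, not the software actually used for the study. The function name is hypothetical, and the closed-form p-value used here is valid only for 2 degrees of freedom.

```python
import math


def pearson_chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a
    contingency table given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Step One counts from Table 2 (columns: Low, Medium, High).
step_one_counts = [
    [300, 350, 106],  # study population
    [74, 129, 32],    # missing cases
]

chi2, dof = pearson_chi_square(step_one_counts)

# For exactly 2 degrees of freedom the upper-tail probability of the
# chi-square distribution has the closed form exp(-x / 2).
p_value = math.exp(-chi2 / 2) if dof == 2 else None
```

Run on the Table 2 counts, this gives a statistic of about 5.95 on 2 degrees of freedom and a p-value of about 0.05, in line with the figures reported in the text.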

RESULTS

Analysis of reconviction rates was approached in two ways: first, looking at reconviction up to 5 years after release, and second, using techniques from survival analysis to describe time to reconviction taking into account variations in follow-up time. As referred to in the Methods section above, there was an insufficient number of offenders in the sample to allow for a meaningful ten year follow-up.

The time at risk of reconviction has been calculated as the time between release from prison and the date of a first reconviction for a sexual (or violent) offence, or the time between the prison release date and 30 June 2007 if no reconvictions are recorded. As referred to in the Methods section, four men died after release from prison (two within a few days of release and two within about 3 years of release), and they have not been included in the analysis.

The findings are complicated by the fact that an offender reconvicted for a general or a violent offence and sentenced to prison is not at risk of committing a further sexual offence during this period of imprisonment (or similarly, if reconvicted and sentenced for a sexual offence, he is not at risk of committing a further violent offence), and this not-at-risk period should be taken into account.
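The time-at-risk definition given above can be sketched in Python as follows. The 30 June 2007 cut-off is taken from the text. The function names, the choice of days as the unit, and the handling of a 29 February release date are assumptions made for illustration, and the sketch makes no adjustment for the not-at-risk periods just mentioned.

```python
from datetime import date

# End of follow-up for offenders with no recorded reconviction (from the text).
CENSOR_DATE = date(2007, 6, 30)


def time_at_risk_days(release, first_reconviction=None, censor=CENSOR_DATE):
    """Days from prison release to the first relevant reconviction,
    or to the censoring date if no reconviction is recorded."""
    if first_reconviction is None:
        end = censor
    else:
        end = min(first_reconviction, censor)
    return (end - release).days


def reconvicted_within(release, first_reconviction, years=5):
    """True if a first reconviction falls on or before the given
    anniversary of release (hypothetical helper for fixed follow-up)."""
    if first_reconviction is None:
        return False
    try:
        cutoff = release.replace(year=release.year + years)
    except ValueError:
        # Release on 29 February: assume 28 February as the anniversary.
        cutoff = release.replace(year=release.year + years, day=28)
    return first_reconviction <= cutoff


# Hypothetical offenders, not drawn from the study data.
censored_days = time_at_risk_days(date(2001, 1, 1))
event_days = time_at_risk_days(date(1998, 3, 1), date(1999, 3, 1))
early = reconvicted_within(date(1998, 3, 1), date(1999, 3, 1))
late = reconvicted_within(date(1996, 3, 1), date(2001, 3, 2))
```

The censored durations produced in this way are the kind of input that survival analysis techniques take, alongside an indicator of whether a reconviction occurred.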
In the 771 men included in the RMS analysis:
- 69 received prison sentences of less than 12 months for a non-sexual offence
- 13 received prison sentences of between 12 and 36 months for a non-sexual offence
- 9 received prison sentences of over 36 months for a non-sexual offence

In the 974 men included in the RMV analysis:
- 65 received prison sentences of less than 12 months for a non-violent offence
- 29 received prison sentences of between 12 and 36 months for a non-violent offence
- 17 received prison sentences of over 36 months for a non-violent offence

To take account of this information requires reliable data on the lengths of imprisonment actually served by the reoffenders rather than the sentences imposed, but this data was not available to us. However, while this issue needs to be borne in mind when interpreting the findings reported below, the impact is unlikely to be large, as in the case of RMS only 22 men (3%) received prison sentences of over a year, and just 46 men (5%) did so in the case of RMV. It should also be noted that virtually all of the risk assessment studies in the literature suffer from this same limitation, including the 1979 England and Wales study on which Risk Matrix 2000 is based.

Distribution of RMS and RMV categories

Table 3 shows the distribution of RMS categories across the cohort, comparing it with the 1979 cohort of sex offenders released from prison in England and Wales used in the Risk Matrix validation study (Thornton et al, 2003). It can be seen that the