ETH Zurich. Model calibration and Bayesian methods for probabilistic projections. Reto Knutti, IAC ETH
Toy model Model: obs = linear trend + noise(variance, spectrum). Topics: 1) short-term predictability, 2) separation of trend and noise, 3) model structure / model evaluation, 4) calibration/probability
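The toy model can be sketched in a few lines: generate obs = linear trend + autocorrelated noise, then recover the trend by least squares. Trend, noise parameters and seed are arbitrary illustrative choices, not values from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model: observations = linear trend + AR(1) ("red") noise
n_years = 100
t = np.arange(n_years)
true_trend = 0.02          # assumed warming rate, °C per year
phi = 0.6                  # assumed AR(1) autocorrelation of the noise

# Generate AR(1) noise with innovation standard deviation 0.1
noise = np.zeros(n_years)
for i in range(1, n_years):
    noise[i] = phi * noise[i - 1] + rng.normal(0, 0.1)

obs = true_trend * t + noise

# Separate trend from noise with a least-squares fit
slope, intercept = np.polyfit(t, obs, 1)
residual = obs - (slope * t + intercept)

print(f"estimated trend: {slope:.4f} °C/yr (true: {true_trend})")
```

Varying the noise variance and autocorrelation shows how quickly the trend/noise separation degrades, which is the point of items 1) and 2) above.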
Questions What is probabilistic modeling? What is it useful for? Why is it difficult? What is model calibration in the climate model context? Why is it helpful? Why is it hard? Is it ok to tune a model? Why yes, why not?
The EU, UNFCCC and 2 °C [...] the Council believes that global average temperatures should not exceed 2 degrees above pre-industrial level and that therefore concentration levels lower than 550 ppm CO2 should guide global limitation and reduction efforts.[...] (1939th Council meeting, Luxembourg, 25 June 1996) The European Council calls upon all Parties to embrace the 2 °C objective and to agree to global emission reductions of at least 50%, and aggregate developed country emission reductions of at least 80-95%, as part of such global emission reductions, by 2050 compared to 1990 levels. (EU Council, 2009) The Paris agreement aims at holding the increase in the global average temperature to well below 2 °C above pre-industrial levels and to pursue efforts to limit the temperature increase to 1.5 °C above pre-industrial levels (UNFCCC Paris, 2015)
Why 2 °C? 2 °C has been agreed formally as a climate target. Science alone cannot defend 2 °C. 2 °C may be the worst we can tolerate and the best we can hope for. Even 2 °C will have significant adverse impacts that require adaptation. Warming without intervention is likely to have serious negative and potentially irreversible impacts. 2 °C serves here as an illustration of a mitigation scenario; the following ideas apply equally to any other target.
How do we get there? 2 °C? Similar to our toy model idea, to answer this we need: a model to make predictions, calibration to observations, and a way to take uncertainties into account
Definitions Verification/Validation: any model is a simplification of the target system and cannot strictly be verified, i.e. proven to be true, nor can it be validated in the sense of being shown to accurately represent all the relevant processes. Evaluation: testing a model and comparing model results with observations. Calibration: estimating values for unknown or uncertain parameters. Tuning: calibration, but with a negative undertone of being dishonest, of adjusting parameters to get the right effect for the wrong reason. Agreement between model and data does not imply that the modelling assumptions accurately describe the processes producing the observed climate system behaviour; it merely indicates that the model is one (of maybe several) that is plausible, meaning it is empirically adequate for a particular purpose.
Frequentist vs Bayes The probability of throwing a 6 with a die is 1/6: frequency of occurrence in a repeated experiment. Global temperature change in 2100 will be in the range of 2 to 3 °C with 66% probability: single outcome, true but unknown value. Probability is a degree of belief; it measures the degree to which different outcomes are supported by current evidence (data, models) and quantifies my judgement of what will happen.
Bayesian inference Thomas Bayes (1702-1761, Tunbridge Wells, Kent). Joint probability: P(A,B) = P(A) P(B|A) = P(B) P(A|B), hence Bayes' rule: P(A|B) = P(B|A) P(A) / P(B). Applied to model parameters: P(par|data) = P(par) P(data|par) / P(data), where P(par|data) is what we want to know (the posterior), P(par) is the prior, P(data|par) is the likelihood (often from a model), and P(data) is the normalization.
Example Prior: P(sun) = 0.8, P(rain) = 0.2. Likelihoods: P(swim|sun) = 0.5, P(swim|rain) = 0.1. Thomas went swimming. What is the probability that the sun was shining given that additional information? P(sun|swim) = P(swim|sun) P(sun) / P(swim) = ?
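The answer follows from the law of total probability (for the normalization) and Bayes' rule:

```python
# Worked Bayes example from the slide: was the sun shining, given Thomas swam?
p_sun, p_rain = 0.8, 0.2                # prior
p_swim_sun, p_swim_rain = 0.5, 0.1      # likelihoods

# Normalization: total probability of swimming (law of total probability)
p_swim = p_swim_sun * p_sun + p_swim_rain * p_rain   # = 0.42

# Posterior by Bayes' rule
p_sun_swim = p_swim_sun * p_sun / p_swim

print(f"P(sun|swim) = {p_sun_swim:.3f}")   # 0.4 / 0.42 ≈ 0.952
```

The swimming evidence raises the probability of sunshine from the prior 0.8 to about 0.95.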
Bayesian inference Bayesian inference uses a numerical estimate of the degree of belief in a hypothesis before evidence has been observed and calculates a numerical estimate of the degree of belief in the hypothesis after evidence has been observed. Bayesian inference usually relies on degrees of belief, or subjective probabilities, in the induction process and does not necessarily claim to provide an objective method of induction (Source: Wikipedia)
Energy (im)balance Q = F − λT. [Schematic: GHG and aerosol forcing F; transient warming T1 with ocean heat uptake Q > 0, equilibrium warming T2 with Q = 0.] Equilibrium: Q = 0, so F = λT2. Climate sensitivity: global equilibrium temperature change for doubled CO2 forcing, S = F_2x/λ, i.e. S ∝ 1/λ. Commitment warming: T2 − T1
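A minimal numerical sketch of these relations, adding an effective heat capacity C to the slide's balance so the transient response can be stepped forward. All parameter values are illustrative assumptions, not calibrated estimates.

```python
# Energy balance Q = F - lambda*T, with assumed illustrative numbers
lam = 1.2          # feedback parameter, W m^-2 K^-1 (assumed)
F_2x = 3.7         # forcing for doubled CO2, W m^-2 (assumed)
C = 40.0           # effective ocean heat capacity, W yr m^-2 K^-1 (assumed)

# Equilibrium (Q = 0): climate sensitivity S = F_2x / lambda
S = F_2x / lam

# Transient response: C dT/dt = F - lambda*T, simple Euler stepping
dt = 0.1
T = 0.0
for _ in range(int(70 / dt)):          # 70 years at constant 2xCO2 forcing
    T += dt * (F_2x - lam * T) / C

commitment = S - T                     # warming still "in the pipeline"
print(f"S = {S:.2f} °C, T(70 yr) = {T:.2f} °C, commitment = {commitment:.2f} °C")
```

The gap between the transient warming T1 and the equilibrium warming T2 = S is the commitment warming of the slide.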
How much warming and CO2 is dangerous? This involves value judgments. Equilibrium warming depends only on climate sensitivity. 450 ppm CO2-equivalent gives a ~50% probability of keeping warming below 2 °C (Knutti and Hegerl 2008, based on IPCC 2007)
Compensating effects of forcing and feedbacks With Q = F − λT, a model with high sensitivity S (small λ) and small forcing F can match the observed warming as well as a model with low S (large λ) and large F.
Climate sensitivity IPCC AR4: Climate sensitivity is most likely near 3 °C, likely (>66%) in the range 2-4.5 °C, very likely (>90%) larger than 1.5 °C. Disturbingly long tail at the upper end. IPCC AR5: likely 1.5-4.5 °C, less fat tail (Knutti and Hegerl 2008)
Why the long tail? (1) Equilibrium climate sensitivity is not well constrained by the transient climate response (Knutti et al. 2005)
Why the long tail? (2) Sensitivity S ∝ 1/(1−f), where f is the total feedback (gain). This explains much of the shape of most PDFs if the distribution of the feedbacks f is normal. Shown by Roe and Baker 2007, but the concept is 25 years old. A 30% reduction in feedback uncertainty would bring the 95% level from 8.5 °C down to 6 °C. (Knutti and Hegerl 2008, modified from Roe and Baker 2007)
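This mapping can be illustrated with a short Monte Carlo sketch. The no-feedback sensitivity S0 and the feedback mean and spread are illustrative assumptions, chosen here so the percentiles land near the slide's 8.5 °C and 6 °C figures.

```python
import numpy as np

rng = np.random.default_rng(1)
S0 = 1.2                                   # assumed no-feedback sensitivity, °C

def sensitivity_samples(f_sd, n=100_000):
    """Map a normal feedback distribution through S = S0 / (1 - f)."""
    f = rng.normal(0.65, f_sd, n)          # assumed feedback mean 0.65
    f = f[f < 1]                           # discard unphysical runaway cases
    return S0 / (1 - f)

S_wide = sensitivity_samples(0.13)
S_narrow = sensitivity_samples(0.7 * 0.13)  # 30% less feedback uncertainty

print(f"95th percentile: {np.percentile(S_wide, 95):.1f} °C -> "
      f"{np.percentile(S_narrow, 95):.1f} °C")
```

The symmetric uncertainty in f produces a strongly right-skewed sensitivity distribution, and shrinking the feedback uncertainty by 30% pulls the upper tail in much more than the median.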
Does the fat tail of climate sensitivity matter? There is a small probability that climate change may be very large. Allen and Frame 2007: the upper bound of sensitivity is inherently hard to find but does not matter; we simply have to adapt our stabilization CO2 target as we go along and observe. For discount rates of a few percent, anything beyond 50 or 100 years is irrelevant, so we should not care about sensitivity and stabilization. Martin Weitzman: the tail of the sensitivity is not only long but fat (polynomial). If the damage function is exponential, the expected damage is infinite. The somewhat strange conclusion is that we should allocate all resources to preventing the tiny probability of a true catastrophe.
Probabilistic projections with simple models?
Probabilistic projections with simple models Economic uncertainties are reflected in scenarios Bayesian methods use observations to yield probabilistic projections.
Probabilistic attribution of warming Assume priors on model parameters. Use a Bayesian method to constrain the parameters based on observations. Run single-forcing simulations with the constrained parameter distributions (Huber and Knutti, 2011)
Probabilistic attribution of warming Observed warming and ocean heat uptake, combined with prior knowledge about forcing and a model, allow for attribution of the observed warming to causes. Very likely more than 75% of the warming is externally forced. Warming due to greenhouse gases is very likely larger than the observed total, and partly compensated by aerosol cooling. (Huber and Knutti, 2011)
The hard bits For Bayesian studies, PDFs depend on the assumed prior distribution. Uniform priors in climate sensitivity have been used as uninformative priors but have been criticized and may be overly pessimistic. Having no prior knowledge is impossible. Expert priors are unlikely to be independent of the data. Alternatives include fuzzy PDFs, classes of priors, etc. In theory, combining independent constraints should lead to narrower constraints. In practice, there is no formal way to combine different PDFs. We don't know how to properly account for structural uncertainty.
Bayesian methods Decide on a model that describes the qualitative behavior of your system. Decide on a prior distribution for uncertain parameters that reflects your belief before using the data. Calculate the posterior PDF of the parameters given the data. Use the posterior PDFs to make predictions, understand the system, etc.
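The four steps above can be sketched on a toy problem: infer the feedback parameter λ from noisy "observed" warming, with equilibrium warming T = F/λ as the model. The forcing, the true λ, the noise level and the prior range are all assumptions made for this sketch.

```python
import numpy as np

rng = np.random.default_rng(2)
F = 3.7                                        # forcing, W m^-2 (assumed)
true_lam = 1.2
obs = F / true_lam + rng.normal(0, 0.3, 10)    # synthetic "observations"

# Step 1) model: equilibrium warming T = F / lambda
# Step 2) prior: uniform over a plausible range of lambda
lam = np.linspace(0.5, 3.0, 500)
prior = np.ones_like(lam)

# Step 3) posterior ~ prior * likelihood, assuming Gaussian obs errors (sd 0.3)
log_like = np.array([-0.5 * np.sum((obs - F / l) ** 2) / 0.3 ** 2 for l in lam])
post = prior * np.exp(log_like - log_like.max())
post /= post.sum() * (lam[1] - lam[0])         # normalize to integrate to 1

# Step 4) use the posterior
lam_map = lam[np.argmax(post)]
print(f"MAP estimate of lambda: {lam_map:.2f} (truth: {true_lam})")
```

Rerunning with a different prior (step 2) shows directly how much the posterior depends on it when the data constraint is weak, which is the open issue on the next slide.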
Open issues Results depend strongly on prior distribution if the data constraint is weak. (Frame et al. 2005)
Thoughts on prior distributions There is usually no correct prior; every prior is subjective. There is no uninformative prior. MY PDF rather than THE PDF, because it is conditional on prior, data, model, assumptions, etc. Avoid double counting: prior knowledge and the model must not be influenced by the data (or the amount of data). Avoid priors that give zero probability to things that may be possible. Priors are particularly problematic if the data constraint is weak.
Expert prior distributions of climate sensitivity A uniform prior on the 1-10 K (1-100 K) range assumes >50% (>95%) probability that sensitivity is higher than in any of the CMIP models. Expert priors may be an alternative to flat priors. But which expert should we ask? And how do we ensure the data has not been used in generating the prior? (Morgan and Keith, 1995)
Constructing a likelihood Weighting, for example, by a Gaussian weight exp(−(model − observations)²/(2σ²)), i.e. by the RMS error relative to the internal variability σ
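A minimal sketch of such a weighting. The model output, the observations and the internal-variability scale sigma are made-up numbers for illustration.

```python
import numpy as np

obs = np.array([0.2, 0.4, 0.5, 0.8])          # "observed" anomalies (made up)
models = np.array([[0.1, 0.5, 0.4, 0.9],      # model A: close to obs
                   [0.6, 0.9, 1.1, 1.5],      # model B: biased warm
                   [0.2, 0.3, 0.6, 0.7]])     # model C: close to obs
sigma = 0.2                                   # assumed internal variability

# Gaussian weight from the RMS error of each model against observations
rmse = np.sqrt(((models - obs) ** 2).mean(axis=1))
w = np.exp(-rmse ** 2 / (2 * sigma ** 2))
w /= w.sum()                                  # normalize weights to sum to 1

print(dict(zip("ABC", np.round(w, 3))))
```

The warm-biased model B ends up with negligible weight, while A and C share nearly all of it; how sharply the weights discriminate depends entirely on the assumed sigma.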
Open issues All models are wrong, and structural error is usually not taken into account. Adding data might show that the model is inadequate. Sometimes all models have near-zero likelihood, but we just scale the weights up such that the PDF integrates to one. A remedy is to add a discrepancy term. (Sanderson et al. 2008)
Computational aspects Analytical solutions are rarely possible. Simple brute-force Monte Carlo sampling is often infeasible: imagine a climate model with 20 parameters and testing 10 values for each parameter. How long would this take? Importance sampling, Markov chain Monte Carlo sampling etc. are powerful methods to approximate the posterior, but they are still expensive, and some are sequential and therefore slow. The acceptable parameter space may be very small: imagine that two of the ten values for each of the 20 parameters give a good fit. How big is the hit rate for a good fit with random sampling?
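Both back-of-envelope numbers from this slide are easy to compute:

```python
# Brute force: 20 parameters, 10 values each
n_params, n_values = 20, 10
combinations = n_values ** n_params        # 10^20 model runs for a full grid

# Hit rate if only 2 of the 10 values per parameter give a good fit
hit_rate = (2 / 10) ** n_params            # 0.2^20, about 1e-14

print(f"{combinations:.0e} combinations, random-sampling hit rate {hit_rate:.1e}")
```

Even at a (very optimistic) million model runs per second, 10^20 runs would take millions of years, and random sampling would need on the order of 10^14 draws per good fit, which is why smarter samplers are needed.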
Do people understand probabilities?
Do we need PDFs, and when should we stop? PDF of change in summer mean daily maximum temperature (°C) over a particular 25 km square by the 2080s under the High emissions scenario. (UKCP09) Given the acknowledged systematic errors in all current climate models, treating model outputs as decision-relevant probabilistic forecasts can be seriously misleading. This casts doubt on our ability, today, to make trustworthy, high-resolution predictions out to the end of this century (Frigg et al. 2013)
Do we need PDFs, and when should we stop? PDFs in principle convey the maximum information. In an ideal quantifiable world, with shared values and attitudes towards risk and an agreed goal, they allow for an optimal decision. Probabilistic methods are not fully objective, but they help formalize uncertainty assessments by being explicit about priors, data and methods. But they are technically very hard, and they are hard to communicate. In the presence of deep uncertainties, they may depend sensitively on subjective choices (structure, prior, model discrepancy) and imply an accuracy that is not justified, in particular in the tails. Robust decisions are not optimal but perform well under a broad range of outcomes.