L03 Volatility Modeling & Multivariate Risk Models

Managing Volatility Risk

Historical Volatility & Historical Correlation

Volatility

  • Can define as the standard deviation of returns

  • Helpful to standardise on an annualised basis

  • If volatility is constant, we can derive longer-period volatility from the (in)famous 'square-root rule': $\sigma_T = \sigma_1\sqrt{T}$

  • Important for derivatives pricing, hedging, and risk measurement

Historical Vol Forecasts

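  • One forecasting rule is the historical moving average (MA) over the last $n$ observations: $\sigma_t^2 = \frac{1}{n-1}\sum_{i=1}^{n}(r_{t-i} - \bar{r})^2$

  • This gives an unbiased estimate
  • If data period is daily, mean return will be low and we can set it to zero: $\sigma_t^2 = \frac{1}{n-1}\sum_{i=1}^{n} r_{t-i}^2$

  • This usually reduces variance
  • Change from $n-1$ to $n$ in denominator makes little difference
  • These approaches give each observation the same weight while it is inside the window, then no weight once it drops out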

  • Easy to use

  • Problems

    • If true vol is constant, any differences are due only to sampling error
    • Can't allow for changes in true vol
    • More distant events in sample period have same weight as more recent ones
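The equal-weighted estimator and the square-root rule can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names and the return series are made up, and the zero-mean variant divides by $n$ (one of the two denominator conventions mentioned above).

```python
import math

def historical_vol(returns, n, zero_mean=True):
    """Equal-weighted volatility estimate from the last n returns."""
    window = returns[-n:]
    if len(window) < n:
        raise ValueError("not enough observations for the chosen window")
    mean = 0.0 if zero_mean else sum(window) / n
    denom = n if zero_mean else n - 1          # n vs n-1 matters little for large n
    variance = sum((r - mean) ** 2 for r in window) / denom
    return math.sqrt(variance)

def annualise(daily_vol, trading_days=252):
    """Square-root rule; only valid if volatility is constant over the horizon."""
    return daily_vol * math.sqrt(trading_days)

# Made-up daily returns, for illustration only.
daily_returns = [0.004, -0.012, 0.007, 0.001, -0.003, 0.015, -0.009, 0.002]
vol_5d = historical_vol(daily_returns, n=5)
print(f"5-day window daily vol: {vol_5d:.4%}, annualised: {annualise(vol_5d):.2%}")
```

Every return inside the window counts equally, so a large return keeps the estimate elevated until it leaves the window, at which point the estimate drops abruptly.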

Some illustrative historical vols

  • Figure gives two alternative historical estimators, using two different window lengths $n$
  • For large $n$, vol estimate is smoother and less responsive
  • When a shock occurs, both jump and remain high until the shock drops out of the window
  • For high $n$, plateau is lower but longer lasting

EWMA

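  • Can ameliorate some of these problems using an exponential weight: $\sigma_t^2 = (1-\lambda)\sum_{i=1}^{\infty}\lambda^{i-1} r_{t-i}^2$, with $0 < \lambda < 1$

  • Weight attached to observation decays over time

  • Can also be rewritten as updating rule: $\sigma_t^2 = \lambda\sigma_{t-1}^2 + (1-\lambda) r_{t-1}^2$

  • Higher $\lambda$ means weight declines slowly
  • For daily data, RiskMetrics uses $\lambda = 0.94$
  • Figure shows how the low-$\lambda$ estimate rises most after a shock, but decays faster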
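A minimal sketch of the EWMA recursion $\sigma_t^2 = \lambda\sigma_{t-1}^2 + (1-\lambda)r_{t-1}^2$ follows. The function name and the seeding rule (start from the first squared return unless an initial variance is supplied) are assumptions made for the illustration.

```python
def ewma_variance(returns, lam=0.94, initial_variance=None):
    """Run the EWMA updating rule over a return series and return the variance path."""
    if not returns:
        raise ValueError("need at least one return")
    # Assumption: seed with the first squared return when no starting value is given.
    sigma2 = returns[0] ** 2 if initial_variance is None else initial_variance
    path = []
    for r in returns:
        sigma2 = lam * sigma2 + (1.0 - lam) * r ** 2
        path.append(sigma2)   # path[i] is the forecast made after observing returns[i]
    return path

# A quiet day followed by a 5% shock: the low-lambda estimate reacts more strongly.
shock = [0.0, 0.05]
fast = ewma_variance(shock, lam=0.80, initial_variance=1e-4)[-1]
slow = ewma_variance(shock, lam=0.97, initial_variance=1e-4)[-1]
print(fast ** 0.5, slow ** 0.5)
```

Only the previous variance and the latest return need to be stored, which is why the updating form is convenient in practice.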

EWMA Forecasting

  • Can use EWMA to forecast:

$$E(\sigma_{t+k}^2) = \sigma_t^2 \quad \text{for any } k > 0$$

  • EWMA forecast is the same as the current value at every horizon

  • But flat vol forecast not very appealing

    • Ignores other information; not plausible

GARCH models

  • EWMA models also take $\lambda$ to be constant

  • This is implausible and not appealing

  • Popular alternative is a GARCH

  • This fits nicely with some stylised features of returns

    • Returns show vol clustering
    • Returns show leptokurtosis – heavy tails
  • GARCH can accommodate both of these

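  • Basic GARCH($p$, $q$) model

$$\sigma_t^2 = \omega + \sum_{i=1}^{p}\alpha_i\varepsilon_{t-i}^2 + \sum_{j=1}^{q}\beta_j\sigma_{t-j}^2$$

for $\omega > 0$ and $\alpha_i, \beta_j \ge 0$, where $\varepsilon_t$ is the return innovation

  • Errors are often normal, in which case returns are conditionally normal
  • Returns can also be conditionally Student-$t$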

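GARCH(1,1)

  • Most popular is GARCH(1,1): $\sigma_t^2 = \omega + \alpha\varepsilon_{t-1}^2 + \beta\sigma_{t-1}^2$

  • High $\beta$ implies that vol is persistent and takes a long time to change

  • High $\alpha$ means that vol is spiky and quick to react

  • Often $\beta$ is over 0.7 and $\alpha$ is less than 0.25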

Properties of GARCH

  • GARCH depends on same variables as EWMA, but has three parameters not 1

  • EWMA is special case with $\omega = 0$, $\alpha = 1 - \lambda$ and $\beta = \lambda$

  • GARCH with positive intercept allows vol to be mean-reverting

    • This is appealing
    • Long-run vol tends to revert to $\sqrt{\omega/(1-\alpha-\beta)}$

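Worked example: GARCH(1,1) Volatility Forecasting

Given: S&P 500 daily returns. Estimated GARCH(1,1) parameters: $\omega$, $\alpha$, $\beta$.

Current conditional variance: $\sigma_t^2 = 0.000196$ (daily vol = 1.4%)

Step 1: Long-run (unconditional) variance

$$V_L = \frac{\omega}{1 - \alpha - \beta}$$

Long-run daily volatility = 0.817% (annualized: 13.0%)

Step 2: $k$-period ahead forecast

$$E_t(\sigma_{t+k}^2) = V_L + (\alpha + \beta)^k(\sigma_t^2 - V_L)$$

| Horizon $k$ | $E_t(\sigma_{t+k}^2)$ | Daily vol (%) | Annualized vol (%) |
|---|---|---|---|
| 1 | 0.000184 | 1.357 | 21.5 |
| 5 | 0.000143 | 1.194 | 19.0 |
| 10 | 0.000107 | 1.034 | 16.4 |
| 20 | 0.000081 | 0.902 | 14.3 |
| $\infty$ | 0.000067 | 0.817 | 13.0 |

A persistence parameter $\alpha + \beta$ close to 1 implies very slow mean reversion.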
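The two steps can be sketched in Python as follows. The parameter values in the snippet are hypothetical placeholders chosen for illustration; they are not the estimates behind the table above, and the function names are made up.

```python
import math

def garch_long_run_variance(omega, alpha, beta):
    """Unconditional variance omega / (1 - alpha - beta) of a GARCH(1,1)."""
    if alpha + beta >= 1.0:
        raise ValueError("alpha + beta must be below 1 for a finite long-run variance")
    return omega / (1.0 - alpha - beta)

def garch_forecast(current_var, omega, alpha, beta, k):
    """Expected variance k periods ahead: V_L + (alpha + beta)^k (sigma_t^2 - V_L)."""
    v_long = garch_long_run_variance(omega, alpha, beta)
    return v_long + (alpha + beta) ** k * (current_var - v_long)

# Hypothetical parameters (assumption), with a current daily vol of 1.4%.
omega, alpha, beta = 2.0e-6, 0.08, 0.90
current_var = 0.014 ** 2
for k in (1, 5, 10, 20, 250):
    var_k = garch_forecast(current_var, omega, alpha, beta, k)
    daily_pct = math.sqrt(var_k) * 100
    annual_pct = math.sqrt(var_k * 252) * 100
    print(k, round(daily_pct, 3), round(annual_pct, 1))
```

With these placeholder values the long-run daily volatility is 1%, and the forecast decays towards it at rate $\alpha + \beta = 0.98$ per day.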

Other GARCH models

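  • IGARCH ($\alpha + \beta = 1$) is applicable when returns are not stationary

  • EWMA is special case where $\omega = 0$
  • Components GARCH
    • Lets vol converge to a long-term vol that changes over time
  • Factor GARCH
    • This links returns from $n$ assets to one or more underlying factors (e.g., as with CAPM): $r_{i,t} = \alpha_i + \beta_i r_{M,t} + \varepsilon_{i,t}$

    • This leads to $\sigma_{i,t}^2 = \beta_i^2\sigma_{M,t}^2 + \sigma_{\varepsilon_i,t}^2$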

Covariances and correlations

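  • Covariance between $x$ and $y$ is $\mathrm{cov}(x, y) = E[(x - \mu_x)(y - \mu_y)]$

  • Correlation is $\rho_{xy} = \mathrm{cov}(x, y)/(\sigma_x\sigma_y)$

  • Correlation only good measure of dependency if returns are elliptical
    • Only defined if vols exist, so need to check for this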
  • Estimate correlation
    • Equal-weighted covariance/correlation
    • EWMA covariances/correlation
    • GARCH covariances/correlation

Implied Volatility & Implied Correlation

Implied Volatility

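  • Black-Scholes model

$$c = S e^{-y\tau}N(d_1) - K e^{-r\tau}N(d_2)$$

where $d_1 = \frac{\ln(S/K) + (r - y + \sigma^2/2)\tau}{\sigma\sqrt{\tau}}$ and $d_2 = d_1 - \sigma\sqrt{\tau}$, with $S$ the spot price, $K$ the strike, $r$ the risk-free rate, $y$ the income yield on the asset and $\tau$ the time to expiration

  • Implied volatility, or implied standard deviation (ISD): the $\sigma$ that sets the model price equal to the observed option price

  • Quote ISD vs premium

    • ISD is more intuitive (similar to quoting bonds with yield instead of price)
    • reflects the market's view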
    • ISD is a risk-neutral volatility
  • These are volatilities generated from option prices

  • Given that other variables (price, etc.) are observable, can infer implied vol from option price (e.g., Black-Scholes)

  • Implied vol is a forward-looking estimator
    • Takes account of all information, not just historical backward-looking information
  • Implied vols generally regarded as better than historical ones
  • Implied vols are dependent on option-pricing model
    • Possible problems due to holes in Black-Scholes, volatility smiles/smirks, etc.
  • Also, implied vols only exist for assets on which options have been written
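Backing out an implied volatility from an option price can be sketched as below. This is a minimal illustration assuming a European call priced with Black-Scholes and a plain bisection search; the function names, search bounds and tolerance are assumptions.

```python
import math
from statistics import NormalDist

N = NormalDist().cdf  # standard normal CDF

def bs_call(S, K, r, tau, sigma, y=0.0):
    """Black-Scholes price of a European call with continuous income yield y."""
    d1 = (math.log(S / K) + (r - y + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * math.exp(-y * tau) * N(d1) - K * math.exp(-r * tau) * N(d2)

def implied_vol(price, S, K, r, tau, y=0.0, lo=1e-6, hi=5.0, tol=1e-8):
    """Bisection on sigma; works because the call price is increasing in sigma."""
    if not bs_call(S, K, r, tau, lo, y) <= price <= bs_call(S, K, r, tau, hi, y):
        raise ValueError("price is outside the range attainable by the model")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bs_call(S, K, r, tau, mid, y) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Round trip: price an at-the-money call at 20% vol, then recover the 20%.
price = bs_call(100.0, 100.0, 0.05, 1.0, 0.20)
print(round(price, 4), round(implied_vol(price, 100.0, 100.0, 0.05, 1.0), 6))
```

The recovered number is only as good as the pricing model: repeating the exercise across strikes with market prices generally gives different ISDs, which is the smile.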

VIX

  • VIX is a popular measure of the implied volatility of S&P 500 index options

  • Properties about VIX

    • average value: on the order of 21%; daily change: about 2.4%
    • assuming normal distribution: nearly all daily movements should be within about three standard deviations, $\pm 3 \times 2.4\% = \pm 7.2\%$
    • the actual distribution is far from normal

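Case study: the VIX "volatility flash crash" (5 February 2018)

Background: On 5 February 2018 the VIX index surged 115% in a single day (from 17 to 37), a record. The move wiped out several inverse-volatility products.

Timeline:

| Time | VIX | XIV price | Event |
|---|---|---|---|
| 2 February | 17.3 | $99 | Normal trading |
| 5 February, after the close | 37.3 | ~$15 | VIX up 115% |
| 5 February, evening | -- | Acceleration event triggered | XIV liquidated |

XIV (VelocityShares Daily Inverse VIX Short-Term ETN):

  • Tracks the inverse performance of the S&P 500 VIX Short-Term Futures Index
  • Gains when VIX falls, loses when VIX rises
  • Lost more than 80% on 5 February, triggering the acceleration (early redemption) event in the product terms

Key lessons:

  • Gamma/vega risk: the nonlinearity of inverse-volatility products accelerates losses in extreme events
  • Liquidity spiral: when the VIX futures market is illiquid, delta-hedging trades amplify price moves
  • Product design flaw: the acceleration clause may fail to protect investors in extreme markets
  • Tail risk cannot be ignored: the "steady income" of volatility products is in fact the premium from selling tail insurance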

Implied Volatility Surface

  • If B-S were correct, the ISDs should be constant across strike prices and maturities
  • If we plot ISDs against strike prices and maturities we obtain the implied volatility surface
    • volatility smile (ISDs vs. strike prices)
    • term structure of volatility
    • spot & forward volatility

Prediction of the Volatility Surface

  • Sticky strike
    • there is no structural change in the volatility curve
    • price movement is largely temporary
  • Sticky moneyness
    • permanent shift in the volatility curve

Implied Correlation

  • These are directly analogous to implied volatilities
  • But can only use if relevant options exist
  • Need two or more options, or spread options
  • Have same pros and cons as implied vols, and can also be unstable
  • Exchange option involves the surrender of an asset (B) in exchange for acquiring another (A). The payoff on such a call is $c_T = \max(S_T^A - S_T^B, 0)$

  • The valuation formula (Margrabe Model)
    • replace $K$ by the price of asset B ($S^B$)
    • replace the risk-free rate by the yield on asset B ($y^B$)
    • the volatility is that of the difference between the two assets: $\sigma_{AB}^2 = \sigma_A^2 + \sigma_B^2 - 2\rho_{AB}\sigma_A\sigma_B$
  • Rainbow Option: exposed to two or more sources of uncertainty
    • e.g. An option on a basket

Currency Implied Correlation: A Triplet of Currency Options

  • Exchange rates are expressed relative to a base currency (usually USD)

  • The cross rate is the exchange rate between two currencies other than the reference currency

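  • Example: let $S_1$ be the dollar price of GBP and $S_2$ be the dollar/euro rate. Then the euro/pound rate is $S_3 = S_1/S_2$, and

the volatility of the cross rate is $\sigma_3^2 = \sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2$, so the three implied volatilities together determine the implied correlation $\rho_{12}$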
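Solving the cross-rate relation $\sigma_3^2 = \sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2$ for the correlation gives a one-line calculation. The sketch below uses made-up implied volatilities; the function name is hypothetical.

```python
def implied_cross_correlation(sigma_1, sigma_2, sigma_3):
    """Correlation between two dollar rates implied by a triplet of option vols.

    sigma_1, sigma_2: implied vols of the two dollar exchange rates
    sigma_3: implied vol of the cross rate S_3 = S_1 / S_2
    """
    rho = (sigma_1 ** 2 + sigma_2 ** 2 - sigma_3 ** 2) / (2.0 * sigma_1 * sigma_2)
    if not -1.0 <= rho <= 1.0:
        raise ValueError("the three volatilities are mutually inconsistent")
    return rho

# Illustrative (made-up) quotes: 10% and 12% on the dollar rates, 8% on the cross.
print(implied_cross_correlation(0.10, 0.12, 0.08))
```

A result outside $[-1, 1]$ signals that the three quotes cannot all be right at once, which is one reason implied correlations can be unstable.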

Portfolio Average Correlation

  • Expressing portfolio ISD by the ISD of its components through the average correlation $\rho$: $\sigma_p^2 = \sum_i w_i^2\sigma_i^2 + 2\rho\sum_{i<j} w_i w_j\sigma_i\sigma_j$

  • the average correlation is a summary measure of diversification benefits across the portfolio

  • all else equal, an increasing correlation increases the total portfolio risk

  • A dispersion trade takes a short position in index volatility, which is offset by long position in the volatility of the index components
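The average implied correlation can be backed out from an index ISD and the component ISDs by solving $\sigma_p^2 = \sum_i w_i^2\sigma_i^2 + 2\rho\sum_{i<j} w_i w_j\sigma_i\sigma_j$ for $\rho$. The sketch below is illustrative only; the function name and inputs are assumptions.

```python
def average_implied_correlation(index_vol, weights, component_vols):
    """Single correlation that reconciles index vol with component vols."""
    own = sum((w * s) ** 2 for w, s in zip(weights, component_vols))
    # (sum_i w_i s_i)^2 - sum_i (w_i s_i)^2 equals 2 * sum_{i<j} w_i w_j s_i s_j
    cross = sum(w * s for w, s in zip(weights, component_vols)) ** 2 - own
    return (index_vol ** 2 - own) / cross

# Made-up two-stock index: weights 50/50, component vols 20% and 30%.
index_vol = 0.0445 ** 0.5   # consistent with a correlation of 0.4
print(average_implied_correlation(index_vol, [0.5, 0.5], [0.20, 0.30]))
```

A dispersion trade is in effect a position on this number: short index volatility and long component volatility gains when realized correlation turns out lower than the implied average.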

Covariance Matrices

Forecasting covariance matrices

  • Estimated cov or corr matrices must be positive definite or positive semi-definite
    • This imposes constraints on how we can estimate them
    • Can't estimate parameters independently, and then hope for this condition to be satisfied
  • Historical covariance matrices
    • Straightforward to estimate:
      • Choose window size, and estimate parameters simultaneously
    • Drawbacks
      • Only accurate if true matrix constant
      • Can suffer from ghost effects
  • Multivariate EWMA
    • This is more flexible and has smaller ghost effects
    • Must choose same decay factor for all terms
    • If we don't, there is no guarantee that matrix will be PD or PSD
  • Multivariate GARCH
    • This can be difficult
      • Can need a lot of parameters
      • High dimensions hard to handle
      • Problems of convergence of routines, etc.
    • Can use methods such as orthogonal GARCH to get around some of these problems

Generating PD or PSD covariance matrices

  • One way to ensure PD or PSD matrices is to adjust eigenvalues, and then recover matrix from adjusted eigenvalues
    • If we want PD, eigenvalues must be positive
    • If we want PSD, eigenvalues must be non-negative
  • Obtain eigenvalues, adjust any negative (and maybe zero) ones using some rule
  • Adjusted matrix satisfies our requirements
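One simple adjustment rule is to clip the eigenvalues at a floor and rebuild the matrix. The sketch below assumes NumPy is available; clipping is only one of several possible rules, and the function names and the example matrix are made up.

```python
import numpy as np

def nearest_psd_by_clipping(matrix, floor=0.0):
    """Rebuild a symmetric matrix after raising eigenvalues below `floor` to `floor`."""
    sym = 0.5 * (matrix + matrix.T)                 # enforce exact symmetry
    eigenvalues, eigenvectors = np.linalg.eigh(sym)
    clipped = np.clip(eigenvalues, floor, None)
    return eigenvectors @ np.diag(clipped) @ eigenvectors.T

def rescale_to_correlation(cov):
    """Divide by the standard deviations so the diagonal is 1 again."""
    d = np.sqrt(np.diag(cov))
    return cov / np.outer(d, d)

# Pairwise correlations that cannot all hold at once (made-up example).
bad = np.array([[1.0, 0.9, -0.9],
                [0.9, 1.0, 0.9],
                [-0.9, 0.9, 1.0]])
fixed = rescale_to_correlation(nearest_psd_by_clipping(bad, floor=1e-8))
print(np.linalg.eigvalsh(bad).min(), np.linalg.eigvalsh(fixed).min())
```

Use `floor=0.0` for PSD and a small positive floor for PD. Clipping changes the off-diagonal entries, so the adjusted correlations differ from the original estimates.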

Computational problems

  • Even if true matrix is PD (or PSD), estimated matrix might not be

  • Risk factors might be highly correlated

  • This can produce zero or negative estimated eigenvalues

  • These problems can be aggravated if covariance matrix is used for trading or risk management

  • Possible answers:

    • Choose risk factors that are not too highly correlated
    • Don't choose too many risk factors
    • Alternatively, can adjust eigenvalues

Variance Swaps

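  • A variance swap is a forward contract on the realized variance. The payoff is $V_T = N(\sigma^2 - K_V)$, where $N$ is the notional amount, $\sigma^2$ the realized variance and $K_V$ the strike (the forward price of variance)

  • it can be written on any asset (usually equities or equity indices)
  • A correlation swap is similar to a variance swap; however, its payoff is tied to the realized average correlation in a portfolio over the selected period
  • e.g. A one-year contract on the S&P 500 index with strike $K_V$ and notional $N$: if the realized variance over the year is $\sigma^2$, the payoff to the long position is $N(\sigma^2 - K_V)$

  • Market value of a variance swap at time $t$, with a fraction $w$ of the contract life elapsed, realized variance $\sigma_{t_0,t}^2$ so far and current forward price $K_t$ for the remaining period: $V_t = N e^{-r\tau}[w(\sigma_{t_0,t}^2 - K_V) + (1 - w)(K_t - K_V)]$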

  • Correlation trading
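The payoff on a variance swap, notional times the gap between realized variance and the strike, is easy to check numerically. The contract terms below are hypothetical numbers chosen for illustration, with volatilities quoted in percentage points.

```python
def variance_swap_payoff(realized_vol, strike_vol, notional_per_variance_point):
    """Payoff to the long side: notional x (realized variance - strike variance).

    Volatilities are in percentage points, so variances are in (vol points)^2.
    """
    return notional_per_variance_point * (realized_vol ** 2 - strike_vol ** 2)

# Hypothetical contract: strike 15 vol points, $100,000 per squared vol point.
print(variance_swap_payoff(17.0, 15.0, 100_000))   # realized vol 17
print(variance_swap_payoff(13.0, 15.0, 100_000))   # realized vol 13
```

Because the payoff is linear in variance, it is convex in volatility: the gain from volatility ending 2 points above the strike is larger than the loss from ending 2 points below.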

Dynamic Trading

Dynamic Option Replication

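Holding a call option is equivalent to holding a fraction $\Delta_c = N(d_1)$ of the underlying asset, financed by borrowing $K e^{-r\tau}N(d_2)$: $c = S N(d_1) - K e^{-r\tau}N(d_2)$

Dynamic replication of a put: a short position of $N(-d_1) = 1 - N(d_1)$ in the asset plus lending, $p = -S N(-d_1) + K e^{-r\tau}N(-d_2)$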

Static Option Replication

Another approach to replication is to use a portfolio of options that is rebalanced infrequently. Static replication is achieved by matching the value of the target option with the portfolio of options at selected boundaries and dates.

Consider, for example, an up-and-out call option that expires in a year with a strike price of 100 and a barrier of 120. The current stock price is at 100. If it hits 120 at any time before expiration, the option dies.

For boundaries, we choose

and

To replicate the payoff at maturity, we could choose one long call option with plus two short calls with , which represent the loss in value if . With more options, the value of the replicating portfolio converges to the desired pattern.

Implications for Trading

  • Dynamic replication of a long option is bound to lose money
    • it buys the asset after the price has gone up (too late)
    • the losses on each transaction accumulate to an option premium, which is driven by the realized volatility
  • Selling an option and dynamically hedging it using the underlying instrument
    • the strategy is delta-neutral
    • revenue from selling an option: a function of implied volatility
    • cost from dynamic hedging: a function of realized volatility
    • in equity markets, implied volatility tends to be greater than the realized volatility
  • More general implications
    • large-scale automatic trading systems have the potential to be destabilizing
    • selling an asset after its price has gone down is similar to prudent risk-management practices

Advanced Risk Models: Multivariate

The Big Idea

Components of a Multivariate Risk Modeling System

  • Risk system
    • portfolio position system
    • risk factor modeling system
    • aggregation system
  • Describe joint movements in the risk factors
    • specify an analytical distribution
    • take the joint distribution from empirical observations
  • Aggregation: VaR methods
    • delta-normal method
    • historical simulation method
    • Monte Carlo simulation method

Risk Mapping

Introduction

  • Have assumed so far that each position has its own risk factor, which we model directly
    • Distinguish between positions and risk factors
  • However, it is not always possible or desirable to model each position as having its own risk factor
  • Might wish to map our positions onto some smaller set of risk factors
    • Might wish to map $n$ positions to $m$ ($m < n$) risk factors

Reasons for mapping

  • Might not have enough data on our positions
    • E.g., might have small runs of Emerging Market data
    • Map to risk factors for which we do have data
  • Might wish to cut down on the dimensionality of our covariance matrices
    • This is important!
    • With $n$ positions, covariance matrix has $n(n+1)/2$ separate terms
    • As $n$ rises, covariance matrix becomes more unwieldy
  • Need to keep dimensionality down to avoid computational problems too – rank problems, etc.

Stages of mapping

  • Construct a set of benchmark instruments or factors
    • Might include key bonds, equities, etc.
  • Collect data on their volatilities and correlations
  • Derive synthetic substitutes for our positions, in terms of these benchmarks
    • This substitution is the actual mapping
  • Construct VaR/CVaR of mapped portfolio
  • Take this as a measure of the VaR/CVaR of actual portfolio

Selecting Core Instruments

  • Usual approach to select key core instruments
    • Key equity indices, key zero bonds, key currencies, etc.
  • Want to have a rich enough set of these proxies, but don't want so many that we run into covariance matrix problems
  • RiskMetrics core instruments
    • Equity positions represented by equivalent amounts in key equity indices
    • Fixed income positions represented by combinations of cashflows of a limited number of maturities
    • FX positions represented by relevant amounts in 'core' currencies
    • Commodity positions represented by amounts of selected standardised futures positions

Mapping with Principal Components

  • Can use PCA to identify key factors
  • Small number of PCs will explain most movement in our data set
  • PCA can cut down dramatically on dimensionality of our problem, and cut down on number of covariance terms
    • E.g., with 50 original variables, have $50 \times 51/2 = 1275$ separate covariance terms
    • With 3 PCs, have only 3 separate covariance terms (the PC variances, since PCs are uncorrelated)

Mapping Positions to Risk Factors

  • Most positions can be decomposed into primitive building blocks

  • Instead of trying to map each type of position, we can map in terms of portfolios of building blocks

  • Building blocks are

    • Basic FX
    • Basic equity
    • Basic fixed-income
    • Basic commodity

A General Example of Risk Mapping

  • Replace each of the $N$ positions with a $K$-exposure on the risk factors. Define $x_{i,k}$ as the exposure of instrument $i$ to risk factor $k$
  • Aggregate the exposures across the positions in the portfolio, $x_k = \sum_{i=1}^{N} x_{i,k}$
  • Derive the distribution of the portfolio return from the exposures and movements in risk factors, $\Delta f$, using one of the three VaR methods

Example: Mapping with Factor Models

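  • Decompose stock return: $R_i = \alpha_i + \beta_i R_M + \epsilon_i$
    • a constant term $\alpha_i$ (not important for risk management purposes)
    • a component due to the market, $\beta_i R_M$
    • a residual term $\epsilon_i$, uncorrelated with the market and across stocks
  • The portfolio return: $R_p = \sum_{i=1}^{N} w_i R_i = \alpha_p + \beta_p R_M + \sum_{i=1}^{N} w_i\epsilon_i$, with $\beta_p = \sum_i w_i\beta_i$
    • Mean: $E(R_p) = \alpha_p + \beta_p E(R_M)$
    • Variance: $\sigma_p^2 = \beta_p^2\sigma_M^2 + \sum_{i=1}^{N} w_i^2\sigma_{\epsilon_i}^2$
    • For equally weighted portfolio: $\sigma_p^2 = \beta_p^2\sigma_M^2 + \bar{\sigma}_\epsilon^2/N \to \beta_p^2\sigma_M^2$ as $N$ grows, where $\bar{\sigma}_\epsilon^2$ is the average residual variance
    • The mapping: $x_i$ on stock $i$ $\to$ $x_i\beta_i$ on index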
  • This approach is useful especially when there is no return history
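The split between market and residual risk under a single-index model can be illustrated numerically. The sketch assumes residuals are uncorrelated across stocks; the function name and all inputs are made up.

```python
def single_index_portfolio_variance(weights, betas, resid_vols, market_vol):
    """Return (systematic, specific) variance under R_i = a_i + b_i R_M + e_i."""
    beta_p = sum(w * b for w, b in zip(weights, betas))
    systematic = beta_p ** 2 * market_vol ** 2
    # Assumption: residuals are uncorrelated across stocks.
    specific = sum((w * s) ** 2 for w, s in zip(weights, resid_vols))
    return systematic, specific

# Equally weighted portfolios of growing size: beta 1, 30% residual vol, 20% market vol.
for n in (1, 10, 100, 1000):
    sys_var, spec_var = single_index_portfolio_variance(
        [1.0 / n] * n, [1.0] * n, [0.30] * n, 0.20)
    print(n, round(sys_var, 6), round(spec_var, 6))
```

The specific term shrinks like $1/N$ while the systematic term does not, which is why mapping a well-diversified portfolio onto the index alone loses little.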

Example: Mapping with Fixed-Income Portfolios

  • Risk-free bond portfolio
    • maturity mapping: replace the current value of each bond by a position on a risk factor with the same maturity
    • duration mapping: maps the bond on a zero-coupon risk factor with a maturity equal to the duration of the bond
    • cash flow mapping: maps the current value of each bond payment on a zero-coupon risk factor with maturity equal to the time to wait for each cash flow

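  • Corporate bond portfolio
    • Decomposition: the yield on bond $i$ is the risk-free yield for its maturity bucket plus the credit spread for its rating class plus an idiosyncratic term, $y_i = z_j + s_k + \epsilon_i$
    • the movement in the value of bond price $P_i$, with $\mathrm{DVBP}_i$ the dollar value of a basis point: $\Delta P_i = \mathrm{DVBP}_i\,\Delta z_j + \mathrm{DVBP}_i\,\Delta s_k + \mathrm{DVBP}_i\,\Delta\epsilon_i$

    • the portfolio: $\Delta V = \sum_i x_i\,\Delta P_i$

    • aggregation: $\Delta V = \sum_j \mathrm{DVBP}_j^z\,\Delta z_j + \sum_k \mathrm{DVBP}_k^s\,\Delta s_k + \sum_i x_i\,\mathrm{DVBP}_i\,\Delta\epsilon_i$

    • Variance: $\sigma^2(\Delta V) = \text{general risk} + \sum_i x_i^2\,\mathrm{DVBP}_i^2\,\sigma^2(\Delta\epsilon_i)$

    • The mapping: $x_i$ on bond $i$ $\to$ $\mathrm{DVBP}_j^z$ on $z_j$ and $\mathrm{DVBP}_k^s$ on $s_k$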

Choice of Risk Factors

It should be driven by the nature of the portfolio:

  • portfolio of stocks that have many small positions well dispersed across sectors

  • portfolios with a small number of stocks concentrated in one sector

  • an equity market-neutral portfolio

Mapping Complex Positions

  • Complex positions are handled by applying financial engineering theory
  • Reverse-engineer complex positions into portfolios of simple positions
  • Map complex positions in terms of collections of synthetic simple positions
  • Some examples, using FE/FI theory:
    • Coupon-paying bonds: can regard as portfolios of zeros
    • FRAs: equivalent to spreads in zeros of different maturities
    • FRNs: equivalent to a zero with maturity equal to period to next coupon payment (because it reprices at par)
    • Vanilla IR swaps: equivalent to portfolio long a fixed-coupon bond and short a FRN
    • Structured notes: equivalent to combinations of IR swaps and conventional FRNs
    • FX forwards: equivalent to spread between foreign currency bond and domestic currency bond
    • Commodity, equity and FX swaps: combinations of spread between forward/futures and bond position

Dealing with Optionality

  • All these positions can be mapped with linear mapping systems because they are (close to) linear

  • These approaches not so good with optionality

    • Non-linearity of options positions can lead to major errors in mapping
  • With non-linearity, need to resort to more sophisticated methods, e.g., delta-gamma and duration-convexity

Joint Distribution of Risk Factors

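  • Copula is a function of the values of the marginal distributions plus some parameters, $\theta$, that are specific to this function. For example, in the bivariate case, $c_{12}[F_1(x_1), F_2(x_2); \theta]$

  • Sklar's theorem: For any joint density there exists a copula that links the marginal densities: $f_{12}(x_1, x_2) = f_1(x_1)\,f_2(x_2)\,c_{12}[F_1(x_1), F_2(x_2); \theta]$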

  • This result enables us to construct joint density functions from the marginal density functions and the copula function

  • Takes account of dependence structure

  • To model joint density function, specify marginals, choose copula, and then apply copula function

Common Copulas

  • Independence (product) copula: $C(u, v) = uv$
    • Good for independent random variables
  • Minimum copula: $C(u, v) = \min(u, v)$
    • Good for comonotonic variables
  • Maximum copula: $C(u, v) = \max(u + v - 1, 0)$
    • Good for countermonotonic variables
  • Gaussian copulas
    • For multivariate normality; does not have closed-form copula functions
  • t-copulas
    • For multivariate $t$ distributions
  • Gumbel copulas, Archimedean copulas
  • Extreme value copulas
    • Arising from EVT

Tail Dependence

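  • Upper & lower conditional probabilities: $P_U(c) = P[X_2 > F_2^{-1}(c) \mid X_1 > F_1^{-1}(c)]$ and $P_L(c) = P[X_2 \le F_2^{-1}(c) \mid X_1 \le F_1^{-1}(c)]$

  • When $P_U(c)$ approaches zero as $c \to 1$ (or $P_L(c)$ as $c \to 0$), a copula is said to exhibit tail independence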

  • Gives an idea of how one variable behaves in limit, given high value of another

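Worked example: Gaussian Copula, Simulating Joint Default

Given: Two firms, each with 1-year default probability $p = 5\%$. Asset return correlation $\rho$.

Step 1: Convert default probabilities to standard normal thresholds: $z^* = \Phi^{-1}(0.05) = -1.645$

Step 2: Simulate correlated standard normal variables from the Gaussian copula: $Z_1 = \epsilon_1$, $Z_2 = \rho\epsilon_1 + \sqrt{1 - \rho^2}\,\epsilon_2$, with $\epsilon_1, \epsilon_2$ independent standard normals

Step 3: Default occurs if $Z_i < z^*$.

| Scenario | Joint outcome | Probability (approx.) |
|---|---|---|
| $Z_1 \ge z^*$, $Z_2 \ge z^*$ | Neither defaults | 91.1% |
| $Z_1 < z^*$, $Z_2 \ge z^*$ | Only Firm 1 defaults | 3.9% |
| $Z_1 \ge z^*$, $Z_2 < z^*$ | Only Firm 2 defaults | 3.9% |
| $Z_1 < z^*$, $Z_2 < z^*$ | Joint default | 1.1% |

Key insight: Joint default probability (1.1%) exceeds the independence case ($0.05^2 = 0.25\%$) by a factor of 4.4x. The Gaussian copula with this correlation generates significant default clustering.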
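The three steps can be simulated with the standard library alone. In the sketch below the correlation of 0.45 is an assumed value picked for illustration (it is not taken from the example), and the function name, seed and number of draws are likewise assumptions.

```python
import random
from statistics import NormalDist

def simulate_joint_default(p, rho, n_sims=200_000, seed=42):
    """Estimate scenario probabilities for two firms linked by a Gaussian copula."""
    threshold = NormalDist().inv_cdf(p)        # Step 1: default threshold
    rng = random.Random(seed)
    counts = {"neither": 0, "only_1": 0, "only_2": 0, "both": 0}
    for _ in range(n_sims):
        e1, e2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        z1 = e1                                # Step 2: correlated normals
        z2 = rho * e1 + (1.0 - rho ** 2) ** 0.5 * e2
        d1, d2 = z1 < threshold, z2 < threshold   # Step 3: default indicators
        key = "both" if d1 and d2 else "only_1" if d1 else "only_2" if d2 else "neither"
        counts[key] += 1
    return {k: v / n_sims for k, v in counts.items()}

probs = simulate_joint_default(p=0.05, rho=0.45)   # rho = 0.45 is an assumption
for name, value in probs.items():
    print(f"{name}: {value:.2%}")
```

Setting `rho=0.0` brings the joint default frequency back to roughly $p^2$, which makes the clustering effect of the correlation easy to see.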

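Case study: the LTCM crisis (1998), copulas and correlation risk

Background: Long-Term Capital Management (LTCM) was a hedge fund whose partners included Nobel laureates Myron Scholes and Robert Merton. It came close to collapse in September 1998.

Core strategy and risk assumptions:

  • Swap-spread arbitrage: long undervalued spreads and short overvalued spreads, expecting the spreads to converge
  • Reliance on a Gaussian copula assumption: extreme events in different markets were taken to be only weakly correlated
  • High leverage (notional assets above $100 billion against capital of only about $4.7 billion)

Chain reaction triggered by the Russian default in August 1998:

| Market | LTCM's assumption | What happened |
|---|---|---|
| US Treasury vs. OIS spread | Narrows to normal levels | Kept widening |
| Emerging vs. developed market correlation | Low | Rose sharply to nearly 1 |
| Liquidity | Ample | Evaporated; bid-ask spreads widened by dozens of times |
| Volatility | Stable | Implied volatility soared |

Single-day loss on 21 September: $553 million (about 15% of capital)

Root causes:

  • Model risk: the Gaussian copula understated tail dependence. Under extreme market stress, all positions tended to move in the same direction
  • Liquidity risk: leverage was too high to unwind positions once liquidity dried up
  • Crowded trades: several institutions held similar positions, and the rush to unwind them in the crisis deepened the losses

Lesson: The choice of copula is critical for risk management; models with tail dependence, such as the t-copula, are more realistic in extreme markets.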

Extreme Value Theory

Peaks Over Threshold Approach & the GP Distribution

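The limit distribution for values beyond a cutoff point ($u$) belongs to the following family, with $y = x - u$ the excess over the cutoff:

$$F(y) = 1 - (1 + \xi y/\beta)^{-1/\xi} \ \ (\xi \ne 0), \qquad F(y) = 1 - \exp(-y/\beta) \ \ (\xi = 0)$$

  • It is called the generalized Pareto (GP) distribution
    • $\beta > 0$ is the scale parameter
    • $\xi$ is the shape parameter that determines the speed at which the tail disappears
    • EVT distribution is only asymptotically valid (i.e., as $u$ grows large)
  • It subsumes other distributions as special cases
    • $\xi = 0$: normal distribution (tails disappear at exponential speed)
    • $\xi = 0$: Gumbel
    • $\xi > 0$: Fréchet (fat tails)
    • $\xi < 0$: Weibull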

Block Maxima vs. Peaks over Threshold

  • Block maxima approach: group the sample into successive blocks, from which each maximum is identified.
  • Generalized extreme value (GEV) distribution

EVT vs. Normal Densities

VaR and EVT

Closed-form solutions for VaR and CVaR rely heavily on the estimation of $\beta$ and $\xi$.

  • Maximum likelihood
    • define a cutoff point $u$ (e.g. to include 5% of the data in the tail)
    • only consider losses beyond $u$ and maximize the likelihood of the observations over the two parameters $\beta$ and $\xi$
  • Method of moments
    • fitting the parameters so that the GP moments equal the observed moments
  • Hill's estimator
    • sort all observations from highest to lowest, $X_1 \ge X_2 \ge \dots \ge X_n$
    • tail index is estimated from $\hat{\xi} = \frac{1}{k}\sum_{i=1}^{k}\ln X_i - \ln X_{k+1}$

    • no theory tells how to choose $k$
    • we may plot $\hat{\xi}$ against $k$ and choose the value in a flat area
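Hill's estimator, $\hat{\xi} = \frac{1}{k}\sum_{i=1}^{k}\ln X_i - \ln X_{k+1}$ on losses sorted from highest to lowest, takes only a few lines. The sketch below checks it on an artificial sample built from exact Pareto quantiles with tail index 0.5; the sample and the function name are assumptions for illustration.

```python
import math

def hill_estimator(losses, k):
    """Hill estimate of the tail index from the k largest observations."""
    ordered = sorted(losses, reverse=True)          # highest to lowest
    if not 1 <= k < len(ordered):
        raise ValueError("k must satisfy 1 <= k < number of observations")
    if ordered[k] <= 0:
        raise ValueError("the k+1 largest observations must be positive")
    # ordered[k] is the (k+1)-th largest observation, X_{k+1}
    return sum(math.log(x) for x in ordered[:k]) / k - math.log(ordered[k])

# Artificial losses: exact quantiles of a Pareto tail with xi = 0.5 (assumption).
sample = [(i / 2001) ** -0.5 for i in range(1, 2001)]
for k in (50, 100, 200, 400):                       # crude "Hill plot" in text form
    print(k, round(hill_estimator(sample, k), 4))
```

On real data the estimates typically drift with $k$, which is why the choice is made by eye from a flat region of the plot.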

Problems with EVT

  • Estimates are sensitive to changes in the sample

  • Results depend on assumptions and estimation method

  • It relies on historical data

VaR Methods

Delta-Normal

  • Assumption
    • portfolio exposures are linear
    • risk factors are jointly normally distributed
  • The VaR
    • portfolio return is normally distributed
    • the portfolio variance: $\sigma_p^2 = x'\Sigma x$, with $x$ the vector of exposures and $\Sigma$ the covariance matrix of the risk factors
    • VaR is directly obtained from the standard normal deviate $\alpha$ that corresponds to the confidence level $c$: $\mathrm{VaR} = \alpha\sqrt{x'\Sigma x}$

    • diversified VaR vs. undiversified VaR
  • Advantages & Drawbacks
    • advantages: simple, closed-form, more precise, less sampling variability
    • drawbacks: cannot account for nonlinear effects (options), underestimates the occurrence of large observations (normal assumption)
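A delta-normal VaR is the normal deviate times the square root of $x'\Sigma x$. The sketch below uses plain lists instead of a matrix library; the positions, volatilities and correlation are made-up inputs, and the square-root-of-time scaling assumes constant volatility.

```python
from statistics import NormalDist

def delta_normal_var(exposures, vols, corr, confidence=0.99, horizon_days=1):
    """VaR = alpha * sqrt(x' Sigma x), scaled by sqrt(horizon)."""
    n = len(exposures)
    variance = sum(
        exposures[i] * exposures[j] * vols[i] * vols[j] * corr[i][j]
        for i in range(n) for j in range(n)
    )
    alpha = NormalDist().inv_cdf(confidence)
    return alpha * variance ** 0.5 * horizon_days ** 0.5

def undiversified_var(exposures, vols, confidence=0.99, horizon_days=1):
    """Sum of stand-alone VaRs, i.e. the VaR if all correlations were 1."""
    alpha = NormalDist().inv_cdf(confidence)
    return alpha * sum(abs(x) * s for x, s in zip(exposures, vols)) * horizon_days ** 0.5

# Made-up book: $1m and $2m positions, daily vols 1% and 2%, correlation 0.3.
x, s = [1_000_000.0, 2_000_000.0], [0.01, 0.02]
rho = [[1.0, 0.3], [0.3, 1.0]]
div = delta_normal_var(x, s, rho)
undiv = undiversified_var(x, s)
print(round(div), round(undiv), round(undiv - div))
```

The gap between the two figures is the diversification benefit; it vanishes when all correlations equal 1.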

Historical Simulation

  • The Idea: replays a "tape" of history to current positions
    • go back in time (e.g. over the past 250 days)
    • project hypothetical factor values using the factor movements
    • derive the portfolio values
  • The VaR
    • current portfolio value as function of current risk factors: $V_t = V(f_{1,t}, \dots, f_{N,t})$
    • sampling factor movements from the historical distribution: $\Delta f_i^k \in \{\Delta f_{i,1}, \dots, \Delta f_{i,t}\}$
    • construct hypothetical factor values: $f_i^k = f_{i,t} + \Delta f_i^k$
    • hypothetical portfolio value: $V^k = V(f_1^k, \dots, f_N^k)$
    • portfolio return: $R^k = (V^k - V_t)/V_t$
    • VaR is obtained from the difference between the average and the $c$-th quantile: $\mathrm{VaR} = \mathrm{AVE}[R^k] - Q[R^k, c]$
  • Advantages & Drawbacks
    • advantages: no specific distributional assumption, intuitive
    • drawbacks: its reliance on a short historical moving window to infer movements in market prices
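A bare-bones historical simulation for linear positions might look like the sketch below. It measures losses relative to zero rather than relative to the average, uses the simple convention that VaR is the $m$-th worst loss with $m$ the tail count, and all names and data are made up.

```python
def replay_history(positions, factor_returns):
    """Apply each historical day's factor returns to today's dollar positions."""
    return [sum(x * r for x, r in zip(positions, day)) for day in factor_returns]

def historical_var_cvar(pnl, confidence=0.95):
    """VaR = m-th worst loss, CVaR = average of the m worst losses."""
    losses = sorted((-x for x in pnl), reverse=True)        # largest loss first
    n_tail = max(1, int(round(len(losses) * (1.0 - confidence))))
    var = losses[n_tail - 1]
    cvar = sum(losses[:n_tail]) / n_tail
    return var, cvar

# Made-up history of daily returns for two risk factors.
history = [[0.010, -0.004], [-0.020, 0.006], [0.003, 0.001], [-0.015, -0.012],
           [0.007, 0.009], [-0.001, -0.003], [0.012, 0.004], [-0.030, -0.010],
           [0.005, -0.002], [0.002, 0.008]]
pnl = replay_history([600_000.0, 400_000.0], history)
print(historical_var_cvar(pnl, confidence=0.90))
```

With only ten observations the 90% tail is a single point, which illustrates why a short window gives imprecise estimates.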

Monte Carlo Simulation

  • The Idea
    • is similar to the historical simulation method
    • the movements in risk factors are generated from a prespecified distribution: $\Delta f^k \sim g(\theta)$
    • the risk manager needs to specify the marginal distribution of risk factors as well as their copula
  • Advantages & Drawbacks
    • advantages: most flexible
    • drawbacks: computational burden, subject to model risk, sampling variability
  • It should converge to the delta-normal VaR if all risk factors are normal and exposures are linear

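Worked example: Monte Carlo VaR, Complete Workflow

Given: A portfolio with three assets: $400,000 in Stock A, $300,000 in Stock B, and $300,000 in a call option on Stock A (delta = 0.65, gamma = 0.02).

Step 1: Specify joint distribution of risk factors

  • Stock A: return volatility $\sigma_A$, daily
  • Stock B: return volatility $\sigma_B$, daily

Step 2: Generate scenarios
For each scenario $k = 1, \dots, 10{,}000$:

  1. Draw $(z_A, z_B)$ from a bivariate standard normal with correlation $\rho$
  2. Compute factor changes: $\Delta S_A = S_A\sigma_A z_A$, $\Delta S_B = S_B\sigma_B z_B$
  3. Compute option P&L: $\Delta c \approx \delta\,\Delta S_A + \frac{1}{2}\gamma\,(\Delta S_A)^2$
  4. Portfolio P&L: sum of the P&L on Stock A, Stock B and the option

Step 3: Sort portfolio P&L values; 99% VaR = 100th worst loss

Step 4: 99% CVaR = average of worst 100 losses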

Comparison with Delta-Normal: the delta-normal VaR keeps only the delta term. With a long option position (positive gamma), the MC VaR is lower than the delta-normal VaR, because the gamma term adds convexity: the resulting positive skewness in P&L is not captured by the linear method.
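The workflow can be sketched with the standard library as follows. Every numerical input beyond the two stock positions, delta and gamma is a hypothetical placeholder (stock price, number of options, volatilities, correlation), and the option is valued with a delta-gamma approximation rather than full repricing.

```python
import random

def mc_delta_gamma_var(n_sims=10_000, confidence=0.99, seed=1):
    """Monte Carlo VaR/CVaR for two stocks plus a long call, delta-gamma valued."""
    x_a, x_b = 400_000.0, 300_000.0              # dollar positions in stocks A and B
    s_a = 100.0                                  # assumed price of stock A
    n_opt, delta, gamma = 10_000, 0.65, 0.02     # assumed option count; greeks per option
    sigma_a, sigma_b, rho = 0.02, 0.015, 0.5     # assumed daily vols and correlation
    rng = random.Random(seed)
    full, linear = [], []
    for _ in range(n_sims):
        z_a = rng.gauss(0.0, 1.0)
        z_b = rho * z_a + (1.0 - rho ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
        r_a, r_b = sigma_a * z_a, sigma_b * z_b
        d_s = s_a * r_a                          # price change of stock A
        stocks = x_a * r_a + x_b * r_b
        delta_pnl = n_opt * delta * d_s
        gamma_pnl = n_opt * 0.5 * gamma * d_s ** 2
        full.append(stocks + delta_pnl + gamma_pnl)
        linear.append(stocks + delta_pnl)
    n_tail = max(1, int(round(n_sims * (1.0 - confidence))))
    full.sort()
    linear.sort()                                # most negative P&L first
    return {
        "var_full": -full[n_tail - 1],
        "cvar_full": -sum(full[:n_tail]) / n_tail,
        "var_linear": -linear[n_tail - 1],
    }

print(mc_delta_gamma_var())
```

In this sketch the gamma term is non-negative in every scenario, so the delta-gamma VaR cannot exceed the delta-only VaR computed on the same draws; a short option position would reverse the sign of the effect.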

Comparison of Methods

| Features | Delta-Normal | Historical Simulation | Monte Carlo Simulation |
|---|---|---|---|
| Valuation | Linear | Full | Full |
| Distribution: shape | Normal | Actual | General |
| Distribution: extreme events | Low probability | In recent data | Possible |
| Implementation: ease of computation | Yes | Average | No |
| Implementation: communicability | Average | Easy | Difficult |
| Implementation: VaR precision | Excellent | Poor with short window | Good with many iterations |
| Major pitfalls | Nonlinearities, fat tails | Time variation in risk, unusual events | Model risk |

Limitations of Risk Systems

  • Illiquid Assets

  • Losses Beyond VaR

  • Issues with Mapping

  • Reliance on Recent Historical Data

  • Procyclicality

  • Crowded Trades

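In-class exercise

The weekly returns of stocks A and B over the most recent 30 weeks are as follows (in %):

A: -3,2,4,5,0,1,17,-13,18,5,10,-9,-2,1,5,-9,6,-6,3,7,5,10,10,-2,4,-4,-7,9,3,2;

B: 4,3,3,5,4,2,-1,0,5,-3,1,-4,5,4,2,1,-6,3,-5,-5,2,-1,3,4,4,-1,3,2,4,3.

A financial institution builds a portfolio of A and B in a 1:1 ratio.

(a) Compute the portfolio's VaR and CVaR at the 90% confidence level.
(b) Verify by calculation whether VaR and CVaR are coherent measures of risk.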

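In-class exercise

A trading portfolio consists of a $300,000 investment in gold and a $500,000 investment in silver. Assume the daily volatilities of the two assets are 1.8% and 1.2% respectively, and the correlation between their returns is 0.6.

(a) What is the portfolio's 97.5% VaR over a 10-day horizon?
(b) By how much does diversification reduce the VaR?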

In-class exercise

A trader holds a portfolio with two positions: $500,000 in a stock index and a short position in 500 ATM call options (delta = 0.52, gamma = 0.018 per option). The stock index has daily volatility of 1.5%.

(a) Compute the 1-day 95% VaR using the delta-normal method (ignore gamma).
(b) Explain why the delta-normal method may underestimate risk in this case.
(c) Describe how you would use the Monte Carlo method to obtain a more accurate VaR estimate.

End of L03

Key takeaways:

  • Volatility forecasting: EWMA and GARCH models capture volatility clustering; GARCH with a positive intercept also captures mean reversion
  • Implied volatility is forward-looking but model-dependent; the volatility surface reveals B-S shortcomings
  • Copulas separate marginal distributions from dependence structure — critical for portfolio risk
  • EVT provides a framework for modeling tail behavior beyond normal assumptions
  • Three VaR methods (delta-normal, historical simulation, MC) trade off speed, flexibility, and model risk
  • Real-world failures (LTCM, VIX flash crash) illustrate the consequences of model misspecification

Next lecture: L04 — Credit Risk Management (default risk, credit exposures, CDS, credit portfolio models)