Lecture 07

Machine Learning in Empirical Asset Pricing

Financial Machine Learning · Lecture 07

Outline


Part 1 · Empirical Asset Pricing w/o ML

  • Introduction to Empirical Asset Pricing
  • Theoretical Foundation: Stochastic Discount Factor (SDF)
  • Historical Development within SDF Framework
  • GMM and Time-Varying Risk
  • Current Developments and Challenges
  • Summary and Key Takeaways

What is Empirical Asset Pricing?

  • Empirical asset pricing studies why different assets earn different average returns and how much of that difference is compensation for risk.
  • It examines the risk-return trade-off: the relationship between an asset's expected return and its exposure to risk.
  • The Stochastic Discount Factor (SDF) is the unifying concept: most asset pricing models can be written as a particular specification of the SDF.
  • Importance: Understanding empirical asset pricing helps investors and asset managers make informed decisions based on risk assessments.

The Core Problem: Cross-Section of Returns

  • The cross-section of returns refers to differences in average returns across assets; asset pricing models attribute these differences to differences in exposure to risk factors.
  • Key questions include:
    • Why do certain assets outperform others?
    • How is risk assessed and priced in financial markets?
  • Understanding these dynamics is crucial for effective investment strategies.

Theoretical Foundation: The Stochastic Discount Factor (SDF)

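  • Definition: The SDF, denoted $M_{t+1}$, satisfies the equation:

    $$E\left[ M_{t+1} R_{i,t+1} \mid \mathcal{F}_t \right] = 1,$$

    where $R_{i,t+1}$ is the gross return on asset $i$ and $\mathcal{F}_t$ is the information set at time $t$.
  • The SDF encapsulates the trade-off between risk and expected return: assets that pay off poorly when $M_{t+1}$ is high must offer higher expected returns. This restriction forms the backbone of asset pricing models.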
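
As a minimal numerical sketch of this pricing equation, one can build an SDF from state prices in a two-state economy and check that every asset satisfies $E[MR] = 1$. The probabilities, state prices, and payoffs below are made-up illustration values, not taken from the lecture.

```python
import numpy as np

# Hypothetical two-state economy (illustrative numbers only).
prob = np.array([0.5, 0.5])            # physical state probabilities
state_price = np.array([0.40, 0.55])   # price today of 1 unit paid in each state

# SDF realization in each state: state price per unit of probability.
M = state_price / prob

# Payoffs of three assets across the two states (rows = assets).
payoff = np.array([
    [1.0, 1.0],   # risk-free bond
    [2.0, 0.5],   # pays mostly in state 1
    [0.2, 1.5],   # pays mostly in state 2
])

price = payoff @ state_price           # no-arbitrage prices
R = payoff / price[:, None]            # gross returns in each state

# E[M R] for each asset; the pricing equation says this equals 1.
pricing_check = (prob * M * R).sum(axis=1)

# The risk-free gross return is the reciprocal of E[M].
risk_free = 1.0 / (prob * M).sum()

print(pricing_check)
print(risk_free)
```

The same SDF prices all three assets, which is the sense in which a single random variable summarizes the pricing of risk.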

Interpreting the SDF: Economic Intuition

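  • The SDF is the investor's intertemporal marginal rate of substitution: the discounted ratio of marginal utility of consumption tomorrow to marginal utility today. It links asset prices to consumption choices and investor preferences.
  • Expected excess returns are determined by covariance with the SDF: an asset whose return covaries negatively with the SDF (it pays off when marginal utility is low) earns a positive risk premium.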

Deriving the SDF from Consumption: CCAPM

  • The Consumption Capital Asset Pricing Model (CCAPM) derives the SDF from the investor's intertemporal consumption choice, so the SDF becomes a function of consumption growth.
  • Key implication: Asset returns reflect investors' time preference and risk aversion. Assets that pay off poorly when consumption is low must offer higher expected returns, which connects macroeconomic conditions to asset prices.
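
The derivation behind this slide can be sketched as follows, assuming a representative investor with time-separable utility $u$ and subjective discount factor $\delta$. These are standard textbook assumptions; the symbols $\delta$, $C_t$, and the risk-aversion coefficient $\rho$ are introduced here for illustration and do not appear in the original slide.

```latex
\begin{aligned}
&\max_{\{C_{t+j}\}} \; E_t \sum_{j \ge 0} \delta^{j}\, u(C_{t+j})
\;\;\Longrightarrow\;\;
u'(C_t) = E_t\!\left[ \delta\, u'(C_{t+1})\, R_{i,t+1} \right]
\quad \text{(Euler equation)} \\[4pt]
&\Longrightarrow\;\;
E_t\!\left[ M_{t+1} R_{i,t+1} \right] = 1,
\qquad
M_{t+1} = \delta\, \frac{u'(C_{t+1})}{u'(C_t)} \\[4pt]
&\text{CRRA utility } u(C) = \frac{C^{1-\rho}}{1-\rho}:
\qquad
M_{t+1} = \delta \left( \frac{C_{t+1}}{C_t} \right)^{-\rho}
\end{aligned}
```

With CRRA utility the SDF is high when consumption growth is low, so assets that do badly in those states need a higher expected return.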

CAPM within the SDF Framework

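  • The Capital Asset Pricing Model (CAPM) can be expressed within the SDF approach as an SDF that is linear in the market return:

    $$M_{t+1} = a - b\, R_{m,t+1},$$

    where $R_{m,t+1}$ is the return on the market portfolio and $a$ and $b > 0$ are constants.
  • This model implies that expected returns are driven by market risk alone: only an asset's market beta is priced.
  • Limitations: While foundational, the CAPM fails to explain several patterns in average returns documented in empirical studies.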

Limitations of CAPM and the Birth of Multifactor Models

  • CAPM's singular focus on market risk does not adequately capture asset return dynamics.
  • The introduction of multifactor models arose to address CAPM's limitations, incorporating additional factors impacting expected returns.
  • Empirical findings indicate that multiple factors beyond market risk are significant in explaining asset returns.

Arbitrage Pricing Theory (APT): A Multifactor Extension

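  • APT presents a framework where the SDF is expressed as a linear combination of multiple risk factors:

    $$M_{t+1} = a - \sum_{k=1}^{K} b_k f_{k,t+1} = a - b^\top f_{t+1}.$$

  • $f_{t+1} = (f_{1,t+1}, \dots, f_{K,t+1})^\top$ represents the systematic risk factors influencing asset returns.
  • Because APT does not fix the number or identity of the factors, it gives asset pricing models the flexibility to accommodate several sources of priced risk.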
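
The link between a linear SDF and the familiar beta-pricing form can be sketched as below. This is a standard textbook argument written unconditionally for an excess return $R^e_i$; the symbols $\beta_i$, $\lambda$, and $\Sigma_f$ are notation introduced here, not taken from the slide.

```latex
\begin{aligned}
&M = a - b^\top f, \qquad E\left[ M R^e_i \right] = 0 \\[4pt]
&0 = E[M]\, E[R^e_i] + \operatorname{Cov}(M, R^e_i)
\;\;\Longrightarrow\;\;
E[R^e_i] = -\frac{\operatorname{Cov}(M, R^e_i)}{E[M]}
         = \frac{\operatorname{Cov}(R^e_i, f)^\top b}{E[M]} \\[4pt]
&\text{Define } \beta_i = \Sigma_f^{-1} \operatorname{Cov}(f, R^e_i),
\quad \lambda = \frac{\Sigma_f\, b}{E[M]}
\;\;\Longrightarrow\;\;
E[R^e_i] = \beta_i^\top \lambda
\end{aligned}
```

Setting $K = 1$ with $f$ equal to the market return recovers the CAPM as a special case.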

The Fama-French 3-Factor Model: Factors and SDF Representation

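  • The Fama-French model incorporates three factors: the market excess return, size (SMB, small minus big), and value (HML, high minus low book-to-market):

    $$M_{t+1} = a - b_1 R^{e}_{m,t+1} - b_2\, \mathrm{SMB}_{t+1} - b_3\, \mathrm{HML}_{t+1}.$$

  • It addresses documented empirical regularities, namely the size and value premiums, and so provides a more comprehensive view of expected returns than the CAPM.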

Carhart Model: Adding Momentum

  • The Carhart (1997) model expands on Fama-French by adding a momentum factor as a fourth factor.
  • It reflects the empirical observation that stocks with high returns over the past year tend to keep outperforming in the short run, a pattern the three-factor model does not capture.

Empirical Testing: The Challenge of Testing Asset Pricing Models

  • Testing asset pricing models presents challenges, including limitations in available data and common issues related to model misspecification.
  • Researchers strive to empirically validate theoretical models to enhance their predictive capabilities in real-world scenarios.

GMM: A Framework for Estimation and Testing

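  • The Generalized Method of Moments (GMM), developed by Hansen (1982), estimates the parameters of an asset pricing model by making the sample counterparts of its moment conditions, such as $E[M_{t+1} R_{i,t+1} - 1] = 0$, as close to zero as possible.
  • With more moment conditions than parameters, the remaining pricing errors yield a test of overidentifying restrictions, so GMM both estimates and tests SDF models against observed data.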

Time-Varying Risk: Conditional Models

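  • Conditional SDF models account for variations in risk over time by letting the SDF coefficients depend on information available at time $t$:

    $$M_{t+1} = a_t - b_t^\top f_{t+1},$$

    where $a_t$ and $b_t$ are functions of variables in the time-$t$ information set.
  • These models allow risk exposures and prices of risk to change with market conditions, which matters most in volatile environments.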

Factor Proliferation: The Zoo of Factors

  • The growing number of identified factors presents challenges in effective asset pricing.
  • The SDF framework provides context for understanding these factors, connecting them to fundamental economic risks.

Current Challenges in Classical Approaches

  • Researchers face challenges in integrating diverse empirical findings into cohesive asset pricing models.
  • The adequacy of traditional asset pricing models in contemporary markets is an ongoing debate among scholars and practitioners.

Setting the Stage for Machine Learning

  • Recent advances in econometrics and machine learning have the potential to significantly enhance empirical asset pricing.
  • Machine learning techniques present promising avenues for improving both model estimation and testing methodologies.

Summary and Key Takeaways

  • The SDF framework unifies various asset pricing models under a cohesive theoretical structure.
  • It incorporates vital insights from traditional models while allowing for adaptability to evolving market dynamics.
  • Grasping the nuances of the SDF is essential for navigating the complexities of asset pricing in today's financial landscape.

References

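  • Hansen, L. P. (1982). Large Sample Properties of Generalized Method of Moments Estimators. Econometrica, 50(4), 1029-1054.
  • Fama, E. F., & French, K. R. (1992). The Cross-Section of Expected Stock Returns. The Journal of Finance, 47(2), 427-465.
  • Carhart, M. M. (1997). On Persistence in Mutual Fund Performance. The Journal of Finance, 52(1), 57-82.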

Part 2 · Factor Models, Machine Learning, and Asset Pricing

Giglio, S., Kelly, B., & Xiu, D. (2022). Factor Models, Machine Learning, and Asset Pricing. Annual Review of Financial Economics, 14(1), 337-368.


Introduction

  • Factor Models: The workhorse for modeling equity returns; they provide a parsimonious statistical description of the cross-sectional dependence of returns.

  • APT Foundation: Arbitrage Pricing Theory offers a robust economic basis for understanding risk exposures and risk premia.

  • Challenges in Estimating Asset Risk Premia:

    • Low signal-to-noise ratio
    • Small sample size
    • Multiple correlated predictors
    • Ambiguity in functional forms
  • Machine Learning Applications: Variable selection and dimension reduction have long been part of empirical asset pricing; machine learning extends these tools and makes estimates more robust in high dimensions.

  • Recent Advancements: New methodologies from machine learning facilitate rigorous empirical discoveries, complementing traditional economic theories.

  • Objectives of the Paper:

    1. Review recent methodological contributions categorized by purpose.
    2. Discuss asymptotic theory related to data dimensions and compare methodologies.

Model Specifications: Static Factor Models

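  • Basic Equation:

    $$r_t = E(r_t) + \beta v_t + u_t$$

    • $r_t$: $N \times 1$ vector of excess returns of test assets (e.g., sorted portfolios)
    • $\beta$: $N \times K$ factor exposure matrix
    • $v_t$: $K \times 1$ vector of factor innovations
    • $u_t$: $N \times 1$ vector of idiosyncratic errors
  • Expected Return Decomposition:

    $$E(r_t) = \alpha + \beta \gamma$$

    • $\gamma$: $K \times 1$ risk premia vector
    • $\alpha$: $N \times 1$ pricing errors vector
  • No-Arbitrage Condition: exact pricing corresponds to $\alpha = 0$; the APT only requires the pricing errors to remain small in aggregate as $N$ grows.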


Frameworks for Factor Models

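  1. Observable Factors:

    • $v_t = f_t - E(f_t)$ for observed factors $f_t$; $\gamma = E(f_t)$ if factors are tradable portfolios.
  2. Latent Factors and Exposures:

    • Following Connor & Korajczyk (1986), all factors and exposures are treated as latent and estimated from returns.
  3. Observable Exposures but Latent Factors:

    • The MSCI Barra model uses observable characteristics as exposures, which allows exposures to vary over time (Rosenberg, 1974).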

Model Specifications: Conditional Factor Models

  • Need for Conditional Models:

    • Static models are inadequate for assets with variable risk exposures and nonlinear payoff structures.
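  • Conditional Factor Model Specification:

    $$r_t = \alpha_{t-1} + \beta_{t-1} \gamma_{t-1} + \beta_{t-1} v_t + u_t,$$

    where the pricing errors $\alpha_{t-1}$, exposures $\beta_{t-1}$, and risk premia $\gamma_{t-1}$ are determined by information available at $t-1$.
  • Model Requirements:

    • $r_t$: Excess returns vector.
    • $u_t$: Idiosyncratic errors vector.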

Model Frameworks

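  • Rosenberg (1974):

    • $\beta_{t-1} = b_{t-1} \beta$, where $b_{t-1}$ is an observable $N \times L$ characteristics matrix and $\beta$ is an $L \times K$ parameter matrix.
  • Instrumented PCA (IPCA):

    • Keeps the characteristics-based exposures but treats the factors as latent: $r_t = b_{t-1} \beta f_t + \varepsilon_t$, with $\beta$ and $f_t$ estimated jointly.
  • Time-Varying Risk Premia:

    • Gagliardini et al. (2016): risk premia $\gamma_{t-1}$ are modeled as functions of lagged instruments.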
  • Nonlinear Extensions:

    • Gu et al. (2021): Introduced a conditional autoencoder model for nonlinear dynamics.
  • Challenges:

    • Black-box nature of deep learning models.
    • Need for rigorous theoretical justification.

Methodologies: Measuring Expected Returns

  • Objective:

    • Understand the behavior of expected returns amidst noise from unforeseeable influences.
  • Challenges:

    • Expected returns are difficult to measure because of noise in asset prices due to unpredictable news.
  • Importance of Improved Measurement:

    • Better measurement aids in refining economic theories to explain expected returns.

Empirical Literature on Stock Return Prediction

  • Three Basic Strands:

    1. Cross-sectional regressions (Fama & French, 2008; Lewellen, 2015):

      • Focus on stock-level characteristics affecting expected returns.
    2. Time series regression of portfolio returns on predictors:

      • Challenges include handling high-dimensional predictors (Welch & Goyal, 2008; Koijen & Van Nieuwerburgh, 2011; Rapach & Zhou, 2013).
    3. Machine Learning Approaches:

      • Increased relevance for empirical asset pricing, leveraging variable selection and dimension reduction.
  • Limitations of Traditional Methods:

    • Inability to handle large numbers of predictors effectively.
    • Risk of overfitting and challenges with multiple comparisons.

Methodologies: Estimating Factors and Exposures

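  • Factor Model Variance:

    • Total variance decomposed into systematic risk and idiosyncratic risk: $\Sigma_r = \beta \Sigma_v \beta^\top + \Sigma_u$.
  • TSR (Time Series Regression)

    • Factor exposure estimates when factors are observable:

      $$\hat\beta = \bar R\, \bar V^\top \left( \bar V \bar V^\top \right)^{-1},$$

      where $\bar R$ ($N \times T$) and $\bar V$ ($K \times T$) are the demeaned returns and factors.
    • Asset-by-asset time series regressions yield $\hat\beta$ row by row.
  • CSR (Cross-Sectional Regression)

    • When exposures are observable, factors are estimated period by period:

      $$\hat f_t = \left( b_{t-1}^\top b_{t-1} \right)^{-1} b_{t-1}^\top r_t,$$

      where $b_{t-1}$ is the matrix of exposures (characteristics) observed at $t-1$.
    • Commonly used for individual stocks, accommodating time-varying characteristics.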
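
The two regressions can be sketched on simulated data as follows. This is a minimal illustration, assuming static exposures and a constant mean return for simplicity; the dimensions, seed, and noise levels are arbitrary, and in the CSR step the true exposures stand in for observed characteristics.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, K = 25, 600, 3                       # assets, periods, factors (illustrative)

beta_true = rng.normal(0.0, 1.0, size=(N, K))
V = rng.normal(0.0, 1.0, size=(K, T))      # factor innovations v_t
U = rng.normal(0.0, 0.5, size=(N, T))      # idiosyncratic errors u_t
R = 0.5 + beta_true @ V + U                # N x T returns with a constant mean

# Demean returns and factors over time.
R_bar = R - R.mean(axis=1, keepdims=True)
V_bar = V - V.mean(axis=1, keepdims=True)

# TSR: factors are observed, exposures are estimated.
beta_hat = (R_bar @ V_bar.T) @ np.linalg.inv(V_bar @ V_bar.T)

# CSR: exposures are observed, factor innovations are estimated period by period.
# Solving all T cross-sectional regressions at once is possible here because
# the exposure matrix is constant over time in this sketch.
V_hat = np.linalg.solve(beta_true.T @ beta_true, beta_true.T @ R_bar)

max_beta_error = np.abs(beta_hat - beta_true).max()
factor_corr = np.corrcoef(V_hat[0], V_bar[0])[0, 1]
print(max_beta_error, factor_corr)
```

With time-varying characteristics the CSR step would be run separately for each period using that period's exposure matrix.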

Principal Component Analysis (PCA)

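  • Used when neither factors nor loadings are known; both are estimated by least squares:

    $$\min_{\beta, V} \left\| \bar R - \beta V \right\|_F^2,$$

    where $\bar R$ is the $N \times T$ matrix of demeaned returns, $\beta$ is $N \times K$, and $V$ is $K \times T$.
  • The singular value decomposition (SVD) of $\bar R$ provides estimates of the latent factors and their loadings, identified only up to an invertible rotation.
  • PCA requires no prior choice of factors, which gives flexibility in research, but the estimated factors are hard to interpret economically.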
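
A minimal sketch of factor extraction by SVD on simulated returns is given below. The data-generating process and the normalization $\hat V \hat V^\top / T = I_K$ are assumptions chosen for illustration; other normalizations give the same fitted common component.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T, K = 50, 400, 2                       # illustrative dimensions

beta_true = rng.normal(size=(N, K))
V_true = rng.normal(size=(K, T))
R = beta_true @ V_true + 0.3 * rng.normal(size=(N, T))

# Demean each asset's return series.
R_bar = R - R.mean(axis=1, keepdims=True)

# Thin SVD: R_bar = U_svd @ diag(s) @ Vt.
U_svd, s, Vt = np.linalg.svd(R_bar, full_matrices=False)

# Keep the top K components, normalized so that V_hat @ V_hat.T / T = I_K.
V_hat = np.sqrt(T) * Vt[:K]                      # K x T latent factors
beta_hat = U_svd[:, :K] * s[:K] / np.sqrt(T)     # N x K loadings

# Fitted common component and share of variation it explains.
common_hat = beta_hat @ V_hat
explained_share = (s[:K] ** 2).sum() / (s ** 2).sum()

# Factors are identified only up to rotation, so compare common components.
common_true = beta_true @ (V_true - V_true.mean(axis=1, keepdims=True))
rel_error = np.linalg.norm(common_hat - common_true) / np.linalg.norm(common_true)
print(explained_share, rel_error)
```

Comparing fitted common components rather than the factors themselves sidesteps the rotation indeterminacy.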

Risk Premia PCA

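  • Extracts latent factors from a matrix that combines realized return second moments with average returns (Lettau & Pelger, 2020):

    $$\frac{1}{T} R R^\top + \lambda\, \bar r\, \bar r^\top,$$

    where $R$ is the $N \times T$ return matrix, $\bar r$ is the vector of average returns, and $\lambda$ is a tuning parameter; $\lambda = -1$ recovers standard PCA on the covariance matrix.
  • Provides a framework that generalizes PCA for risk-premia estimation: putting weight on average returns helps detect factors that matter for risk premia even when they explain little return variation.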

Instrumented PCA

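  • Estimates conditional factor models in which exposures are functions of observable characteristics (Kelly et al., 2019):

    $$\min_{\beta, \{f_t\}} \sum_{t=1}^{T} \left\| r_t - b_{t-1} \beta f_t \right\|^2,$$

    where $b_{t-1}$ is the matrix of characteristics observed at $t-1$.
  • Handles dynamic, unobservable exposures by instrumenting them with observable characteristics.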

Autoencoder Learning

  • Introduced by Gu et al. (2021):

    • Proposes a conditional autoencoder structure for modeling risk-return trade-offs.
  • Architecture:

    • Structure allows betas to depend on stock characteristics non-linearly.
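  • Mathematical Representation:

    • The beta network is initialized with the characteristics, $z^{(0)}_{i,t-1} = z_{i,t-1}$, and propagates them through hidden layers, $z^{(l)}_{i,t-1} = g\left( b^{(l-1)} + W^{(l-1)} z^{(l-1)}_{i,t-1} \right)$ for $l = 1, \dots, L$, with a linear output layer giving $\beta_{i,t-1}$.
    • Returns are then modeled as $r_{i,t} = \beta_{i,t-1}^\top f_t + u_{i,t}$, where the factors $f_t$ are themselves a learned function of returns. Stacking layers is what gives the model its flexibility in feature extraction.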

Methodologies: Estimating Risk Premia

  • Risk Premium of a Factor:

    • Informs about the compensation investors require for holding associated risk.
    • Essential for understanding asset pricing models.
  • Estimating Risk Premia:

    • Simple average return for a factor provides a starting point.
    • Models often formulated for non-tradable factors, emphasizing the need for tradable counterparts.

Classical Two-pass Regressions

Methodology

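  1. First-pass Regression (Time Series):

    • Estimates $\hat\beta$ via individual asset regressions of returns on the factors.

  2. Second-pass Regression (Cross-Section):

    • Estimates risk premia with OLS of average returns $\bar r$ on the estimated $\hat\beta$:

      $$\hat\gamma = \left( \hat\beta^\top \hat\beta \right)^{-1} \hat\beta^\top \bar r.$$

Generalized Least Squares (GLS) Version

  • Replaces OLS in the second pass, weighting by the inverse of the estimated idiosyncratic covariance matrix $\hat\Sigma_u$:

    $$\hat\gamma_{GLS} = \left( \hat\beta^\top \hat\Sigma_u^{-1} \hat\beta \right)^{-1} \hat\beta^\top \hat\Sigma_u^{-1} \bar r.$$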
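
The two passes, in both OLS and GLS form, can be sketched on simulated data as below. The simulation assumes zero pricing errors and factors whose means equal their risk premia; all numbers are illustrative choices, not values from the lecture.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T, K = 30, 2000, 2                       # illustrative dimensions
gamma_true = np.array([0.5, 0.2])           # assumed risk premia

beta_true = rng.normal(1.0, 0.5, size=(N, K))
F = gamma_true[:, None] + rng.normal(size=(K, T))   # factors with mean gamma
U = 0.5 * rng.normal(size=(N, T))
R = beta_true @ F + U                       # excess returns with alpha = 0

# First pass: time series regressions of returns on factors.
R_bar = R - R.mean(axis=1, keepdims=True)
F_bar = F - F.mean(axis=1, keepdims=True)
beta_hat = (R_bar @ F_bar.T) @ np.linalg.inv(F_bar @ F_bar.T)

# Residual covariance matrix, needed for the GLS version.
resid = R_bar - beta_hat @ F_bar
Sigma_u = resid @ resid.T / T

# Second pass (OLS): regress average returns on estimated betas.
r_mean = R.mean(axis=1)
gamma_ols = np.linalg.solve(beta_hat.T @ beta_hat, beta_hat.T @ r_mean)

# Second pass (GLS): weight by the inverse residual covariance.
Si_beta = np.linalg.solve(Sigma_u, beta_hat)          # Sigma_u^{-1} beta_hat
gamma_gls = np.linalg.solve(beta_hat.T @ Si_beta, Si_beta.T @ r_mean)

print(gamma_ols, gamma_gls)
```

Both estimates should land near the assumed premia; standard errors are omitted here and would need a correction for the estimated betas.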


Factor Mimicking Portfolios

  • Building Factor Mimicking Portfolios:
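    • Fama & MacBeth (1973) suggest regressing realized returns $r_t$ onto $\hat\beta$ period by period; the resulting slope coefficients are returns on portfolios that mimic the factors.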

  • Mimicking Non-Tradable Factors:
    • Goal: Estimate risk premia of non-tradable factors by constructing tradable proxies.

Three-pass Regressions

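  • Approach (Giglio & Xiu, 2021), for an observable and possibly non-tradable factor $g_t$:
    1. First Pass: SVD of the demeaned returns $\bar R$ for latent factor estimates $\hat V$ and loadings $\hat\beta$.
    2. Second Pass: Run OLS of average returns $\bar r$ on $\hat\beta$ to obtain the risk premia $\hat\gamma$ of the latent factors.
    3. Third Pass: Projects $g_t$ onto $\hat V$ to obtain loadings $\hat\eta$; the risk premium of $g_t$ is $\hat\eta \hat\gamma$.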
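
The three passes can be sketched on simulated data as follows. This is an illustrative sketch, assuming the number of latent factors is known and the observable factor is a noisy linear function of them; the parameter values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)
N, T, K = 100, 1000, 3                        # illustrative dimensions
gamma_true = np.array([0.6, 0.3, 0.1])        # assumed latent-factor premia
eta_true = np.array([[1.0, -0.5, 0.0]])       # assumed loading of g on latent factors

beta_true = rng.normal(size=(N, K))
V = rng.normal(size=(K, T))
R = (beta_true @ gamma_true)[:, None] + beta_true @ V + rng.normal(size=(N, T))
g = eta_true @ V + 0.5 * rng.normal(size=(1, T))   # noisy observable factor

# Pass 1: PCA of demeaned returns for latent factors and loadings.
R_bar = R - R.mean(axis=1, keepdims=True)
U_svd, s, Vt = np.linalg.svd(R_bar, full_matrices=False)
V_hat = np.sqrt(T) * Vt[:K]
beta_hat = U_svd[:, :K] * s[:K] / np.sqrt(T)

# Pass 2: cross-sectional OLS of average returns on estimated loadings.
gamma_hat = np.linalg.solve(beta_hat.T @ beta_hat, beta_hat.T @ R.mean(axis=1))

# Pass 3: time series regression of the observable factor on latent factors.
g_bar = g - g.mean(axis=1, keepdims=True)
eta_hat = (g_bar @ V_hat.T) @ np.linalg.inv(V_hat @ V_hat.T)

# Risk premium of g; the product is invariant to rotation of the latent factors.
premium_g = (eta_hat @ gamma_hat).item()
premium_true = (eta_true @ gamma_true).item()
print(premium_g, premium_true)
```

Neither $\hat\eta$ nor $\hat\gamma$ is identified on its own, but their product is, which is why the estimator does not require knowing the true factors.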


Weak Factors

  • Identifying Weak Factors:

    • Kan & Zhang (1999): Risk premia estimates become distorted when weak factors are included.
    • Kleibergen (2009): Proposes valid test statistics across various β values.
  • Addressing Weak Factors:

    • Jegadeesh et al. (2019): Proposed multivariate variable selection to improve measurement accuracy.

Test Assets

  • Importance of Test Assets:
    • Critical for empirical asset pricing analysis.
  1. Standard Approach: Use standard portfolios based on characteristics.
  2. Expanded Approach: Include broad asset clusters or characteristics-sorted portfolios.
  3. Targeted Test Assets: Focus on specific factors of interest.

Methodologies: Estimating the SDF and its Loadings

  • Risk Premium and SDF:

    • A factor's risk premium equals its (negative) covariance with the stochastic discount factor (SDF).
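    • In the setup of the model, the SDF is expressed as:

      $$m_t = 1 - b^\top v_t, \qquad b = \Sigma_v^{-1} \gamma,$$

      where $b$ is the vector of SDF loadings and $\Sigma_v$ is the covariance matrix of the factor innovations $v_t$.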


Generalized Method of Moments (GMM)

SDF Loadings Estimation

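  • Moment Conditions:

    • Formulate a set of moment conditions, one per test asset:

      $$E\left[ g_t(b) \right] = 0, \qquad g_t(b) = \left( 1 - b^\top v_t \right) r_t.$$

  • Optimization:

    • Minimize a quadratic form in the sample moments $\hat g_T(b) = \frac{1}{T} \sum_{t=1}^{T} g_t(b)$ for a weighting matrix $\hat W$:

      $$\hat b = \arg\min_b \; \hat g_T(b)^\top \hat W\, \hat g_T(b).$$

  • GMM Estimator:

    • The sample moments are linear in $b$, $\hat g_T(b) = \bar r - \hat C b$ with $\hat C = \frac{1}{T} \sum_t r_t \hat v_t^\top$ and $\hat v_t$ the demeaned factors, so the solution has a closed form:

      $$\hat b = \left( \hat C^\top \hat W \hat C \right)^{-1} \hat C^\top \hat W\, \bar r.$$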
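
A minimal sketch of this closed-form estimator is given below, using an identity weighting matrix by default. The function name `gmm_sdf_loadings` and the simulated data are hypothetical; the simulation assumes zero pricing errors so that the true loadings are $\Sigma_v^{-1} \gamma$.

```python
import numpy as np

def gmm_sdf_loadings(R, F, W=None):
    """Closed-form GMM estimate of SDF loadings b.

    R: N x T excess returns, F: K x T factors, W: N x N weighting matrix
    (identity if omitted). Hypothetical helper written for illustration.
    """
    N, T = R.shape
    r_mean = R.mean(axis=1)
    F_bar = F - F.mean(axis=1, keepdims=True)   # estimated factor innovations
    C = R @ F_bar.T / T                         # N x K covariance of returns and factors
    if W is None:
        W = np.eye(N)
    CW = C.T @ W
    return np.linalg.solve(CW @ C, CW @ r_mean)

rng = np.random.default_rng(4)
N, T, K = 20, 5000, 2                           # illustrative dimensions
gamma_true = np.array([0.5, 0.3])
Sigma_v = np.array([[1.0, 0.3], [0.3, 0.5]])
b_true = np.linalg.solve(Sigma_v, gamma_true)   # b = Sigma_v^{-1} gamma

beta_true = rng.normal(1.0, 0.5, size=(N, K))
V = np.linalg.cholesky(Sigma_v) @ rng.normal(size=(K, T))
F = gamma_true[:, None] + V                     # factors with mean gamma
R = beta_true @ F + 0.5 * rng.normal(size=(N, T))

b_hat = gmm_sdf_loadings(R, F)
print(b_hat, b_true)
```

Passing an estimate of the inverse moment covariance as `W` would give the efficient second-step estimator; the identity matrix is used here only to keep the sketch short.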


PCA-based Methods

  • PCA for SDF:

    • Strong covariation suggests that SDF can be represented as a function of dominant sources from asset returns.

  • SDF Estimation:

    • Can be achieved without relying on knowledge of factor identities.

Penalized Regressions

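  • Parameterization of SDF:

    • The SDF is linear in a potentially large set of candidate factors, $m_t = 1 - b^\top \left( f_t - E(f_t) \right)$, and is regularized so that it is effectively represented by a small number of linear combinations of factors.
  • Estimation Approach:

    • Minimize pricing errors plus a penalty on the SDF loadings, for example the elastic-net objective of Kozak et al. (2020):

      $$\hat b = \arg\min_b \; \left( \bar r - \hat\Sigma b \right)^\top \hat\Sigma^{-1} \left( \bar r - \hat\Sigma b \right) + \lambda_1 \| b \|_1 + \lambda_2 \| b \|_2^2,$$

      where $\bar r$ and $\hat\Sigma$ are the sample mean and covariance matrix of the factor returns.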
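
As a minimal sketch of the penalized approach, the ridge-only special case has a closed form: minimizing $(\bar r - \hat\Sigma b)^\top \hat\Sigma^{-1} (\bar r - \hat\Sigma b) + \lambda\, b^\top b$ gives $\hat b = (\hat\Sigma + \lambda I)^{-1} \bar r$. The helper name `ridge_sdf_loadings`, the simulated factor returns, and the penalty value are assumptions made for illustration; a lasso term would require an iterative solver and is not shown.

```python
import numpy as np

def ridge_sdf_loadings(F, lam):
    """Ridge-penalized SDF loadings from a K x T matrix of factor returns.

    Hypothetical helper: solves (Sigma + lam * I) b = mean(F).
    """
    mu = F.mean(axis=1)
    Sigma = np.cov(F)                    # rows of F are treated as variables
    K = F.shape[0]
    return np.linalg.solve(Sigma + lam * np.eye(K), mu)

rng = np.random.default_rng(8)
K, T = 10, 240                           # illustrative: 10 factors, 240 periods
F = 0.05 + rng.normal(size=(K, T))       # simulated factor excess returns

b_unpenalized = ridge_sdf_loadings(F, lam=0.0)   # sample mean-variance weights
b_ridge = ridge_sdf_loadings(F, lam=1.0)         # shrunk loadings

print(np.linalg.norm(b_unpenalized), np.linalg.norm(b_ridge))
```

With a short sample the unpenalized loadings are dominated by estimation noise in the means, and the penalty shrinks them toward zero.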


Double Machine Learning

  • Applying DML:

    • Framework proposed to mitigate biases from numerous factors.
  • Example Framework:

    • Identify factors of interest and control for expected returns through respective regressions.

Core Idea

  • Double Machine Learning (DML) is a framework that aims to estimate causal parameters in the presence of high-dimensional covariates.
  • It combines machine learning techniques with traditional econometric methods to control for confounding variables and minimize bias.

Principles

  • Orthogonalization: DML uses two machine learning models to predict the outcome and the treatment from the confounders, then works with the two sets of residuals, so that the estimated treatment effect is orthogonal to the confounders.
  • Two-Step Estimation: It employs a two-step procedure where the first step involves predicting the nuisance parameters, and the second step focuses on estimating the causal effect.

Applicable Scenarios

  • DML is particularly useful in settings with:
    • High-dimensional data: Many predictors compared to the number of observations.
    • Complex relationships: Non-linear and interaction effects among variables.
    • Causal inference: When aiming to derive treatment effects or causal relationships.

Step-by-Step Procedure

  1. Model Specification: Specify the model with the treatment variable (e.g., policy intervention), the outcome variable (e.g., economic output), and the high-dimensional covariates to control for.
  2. Nuisance Parameter Estimation:
    • Fit machine learning models to predict the outcome and the treatment from the covariates, using sample splitting so that predictions are out-of-sample.
    • Obtain the residuals of both, which are orthogonal to the covariates.
  3. Estimation of Treatment Effects:
    • Regress the outcome residuals on the treatment residuals obtained in step 2.
    • The slope estimates the causal effect of the treatment variable on the outcome variable.
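
The steps above can be sketched for a partially linear model with cross-fitting as follows. Ridge regression stands in for the machine learning step to keep the example dependency-free; the function names, the simulated data, and the tuning values are hypothetical choices for illustration.

```python
import numpy as np

def ridge_fit_predict(X_train, y_train, X_test, lam=1.0):
    """Fit ridge regression on the training fold and predict on the test fold."""
    p = X_train.shape[1]
    coef = np.linalg.solve(X_train.T @ X_train + lam * np.eye(p), X_train.T @ y_train)
    return X_test @ coef

def dml_plr(y, d, X, n_folds=5, lam=1.0, seed=0):
    """Cross-fitted partialling-out estimate of theta in y = theta*d + g(X) + e."""
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    y_res = np.empty(n)
    d_res = np.empty(n)
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(n), test_idx)
        # Step 2: out-of-fold predictions of outcome and treatment from covariates.
        y_res[test_idx] = y[test_idx] - ridge_fit_predict(X[train_idx], y[train_idx], X[test_idx], lam)
        d_res[test_idx] = d[test_idx] - ridge_fit_predict(X[train_idx], d[train_idx], X[test_idx], lam)
    # Step 3: regress outcome residuals on treatment residuals.
    theta = (d_res @ y_res) / (d_res @ d_res)
    eps = y_res - theta * d_res
    se = np.sqrt(np.sum(d_res ** 2 * eps ** 2)) / (d_res @ d_res)  # robust standard error
    return theta, se

# Simulated example: the covariates drive both treatment and outcome.
rng = np.random.default_rng(6)
n, p, theta_true = 2000, 20, 0.5
X = rng.normal(size=(n, p))
a = np.zeros(p)
a[:5] = 0.5
d = X @ a + rng.normal(size=n)
y = theta_true * d + X @ a + rng.normal(size=n)

theta_hat, se = dml_plr(y, d, X)
theta_naive = (d @ y) / (d @ d)        # ignores the confounders
print(theta_hat, se, theta_naive)
```

The naive regression absorbs the effect of the covariates into the treatment coefficient, while the residual-on-residual regression recovers a value close to the assumed effect.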

Parametric Portfolios and Deep Learning SDFs

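  • Optimizing SDF:

    • Parameterize portfolio weights as a function of characteristics, $w_{t-1} = w(z_{t-1}; \theta)$, set $m_t = 1 - w_{t-1}^\top r_t$, and choose $\theta$ to optimize an economic objective such as the Sharpe ratio of the portfolio or the size of the pricing errors.
  • Neural Network Approaches:

    • Use networks to incorporate past performance and characteristics for SDF estimation.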

Methodologies: Model Specification Tests and Model Comparison

  • Purpose:
    • Economic theory aids in identifying the best model but leads to numerous candidate models.
    • Recent notable models with observable portfolios include those by Fama & French (2015), Hou et al. (2015), and others.

GRS Test and Extensions

  • Assessment of Factor Pricing Models:

    • Formalized as statistical hypothesis testing problems.
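    • Common focus on zero alpha condition:

      $$H_0: \alpha_1 = \alpha_2 = \dots = \alpha_N = 0.$$

  • Estimation:

    • With tradable factors $f_t$, alphas are the intercepts of time series regressions of excess returns on the factors: $\hat\alpha = \bar r - \hat\beta \bar f$.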


GRS Test Statistics

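  • Quadratic Test Statistic:

    $$\mathrm{GRS} = \frac{T - N - K}{N} \cdot \frac{\hat\alpha^\top \hat\Sigma_u^{-1} \hat\alpha}{1 + \bar f^\top \hat\Sigma_f^{-1} \bar f} \sim F_{N,\, T-N-K} \quad \text{under } H_0,$$

    where $\hat\Sigma_u$ and $\hat\Sigma_f$ are the sample covariance matrices (divided by $T$) of the regression residuals and of the factors; the exact $F$ distribution assumes normally distributed errors.
  • Limitations:

    • Requires $T > N + K$; asymptotic properties may be compromised when $N$ is large relative to $T$.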
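
A minimal sketch of the GRS statistic is given below, assuming tradable factors and using only NumPy. The function name `grs_statistic` and the simulated data are hypothetical; a p-value would come from comparing the statistic with an $F_{N,\,T-N-K}$ distribution, which is not computed here.

```python
import numpy as np

def grs_statistic(R, F):
    """GRS statistic for H0: all alphas are zero.

    R: N x T excess returns of test assets, F: K x T tradable factor returns.
    Hypothetical helper written for illustration.
    """
    N, T = R.shape
    K = F.shape[0]
    if T <= N + K:
        raise ValueError("GRS requires T > N + K")
    X = np.vstack([np.ones(T), F])                 # (K+1) x T regressors with intercept
    coef = R @ X.T @ np.linalg.inv(X @ X.T)        # N x (K+1): alphas, then betas
    alpha = coef[:, 0]
    resid = R - coef @ X
    Sigma_u = resid @ resid.T / T                  # residual covariance, divided by T
    f_mean = F.mean(axis=1)
    F_bar = F - f_mean[:, None]
    Sigma_f = F_bar @ F_bar.T / T                  # factor covariance, divided by T
    quad_alpha = alpha @ np.linalg.solve(Sigma_u, alpha)
    quad_f = f_mean @ np.linalg.solve(Sigma_f, f_mean)
    stat = (T - N - K) / N * quad_alpha / (1.0 + quad_f)
    return stat, alpha

rng = np.random.default_rng(5)
N, T, K = 10, 600, 2                               # illustrative dimensions
F = 0.5 + rng.normal(size=(K, T))
beta = rng.normal(1.0, 0.3, size=(N, K))
U = rng.normal(size=(N, T))

R_null = beta @ F + U                              # alphas are zero
R_alt = 0.5 + beta @ F + U                         # every alpha equals 0.5

stat_null, _ = grs_statistic(R_null, F)
stat_alt, alpha_alt = grs_statistic(R_alt, F)
print(stat_null, stat_alt)
```

The explicit check on $T > N + K$ mirrors the limitation noted above: with too few periods the residual covariance matrix cannot be inverted.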

Model Comparison Tests

  • Comparative Analysis:

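    • Testing a single model against the zero-alpha null is often less informative than comparing models with each other.
    • The classical GRS statistic can be written in terms of squared Sharpe ratios: it compares the maximum squared Sharpe ratio attainable from the test assets and factors together with that attainable from the factors alone.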
  • Testing Insights:

    • Useful in assessing risk premiums across asset portfolios using factors.

Bayesian Approach

  • Model Expansion:

    • Probabilistic methods allow for more robust model comparison.
  • Priors:

    • Utilize Spike-and-Slab priors to increase model selection robustness.

Conclusion

  • Importance of Testing:
    • Essential for verifying model resilience against various factors and ensuring accurate pricing.
    • Both statistical tests and economic theory must guide model specification to enhance asset pricing predictions.

Methodologies: Alphas and Multiple Testing

  • Definition of Alphas:

    • Portions of expected returns unexplained by risk factors, often termed anomalies.
    • Significant findings question the effectiveness of traditional asset pricing models.
  • Data-Snooping Concerns:

    • The prevalence of multiple testing leads to potential false discoveries in alpha estimation.
    • Examples of early proposals include Lo & MacKinlay (1990) and Sullivan et al. (1999).

Hypothesis Testing Framework

Null Hypotheses for Alpha Testing

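  • Pose one null hypothesis for each alpha in a collection of $N$ alphas:

    $$H_0^i: \alpha_i = 0, \qquad i = 1, \dots, N.$$

  • Testing Statistics:

    • $t_i$: test statistic for $H_0^i$ (often the t-statistic).

Components of Testing

  • Let $\mathcal{F}$ be the number of false rejections and $\mathcal{R}$ the total number of rejections.
  • Both $\mathcal{F}$ and $\mathcal{R}$ are random variables, and multiple testing procedures are designed to limit $\mathcal{F}$ relative to $\mathcal{R}$, for example by controlling the false discovery rate $E\left[ \mathcal{F} / \max(\mathcal{R}, 1) \right]$.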

Multiple Testing Procedures

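  • Naïve Procedure: Tests individual hypotheses at a predetermined level $\tau$, ignoring the number of tests performed.
  • Bonferroni Correction: Adjusts the level to $\tau / N$, which controls the family-wise error rate (the probability of at least one false rejection) at the cost of low power.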
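
The two procedures can be compared on simulated t-statistics as below. The Benjamini-Hochberg step-up rule is added as a standard example of false discovery rate control; it is not named in the slides. The p-values use a normal approximation, and the mix of 900 true nulls and 100 genuine alphas is an arbitrary illustration.

```python
import math
import numpy as np

def two_sided_pvalues(t_stats):
    """Normal-approximation p-values: 2 * (1 - Phi(|t|)) = erfc(|t| / sqrt(2))."""
    return np.array([math.erfc(abs(t) / math.sqrt(2.0)) for t in t_stats])

def naive_reject(p, tau):
    """Reject every hypothesis whose p-value is below tau."""
    return p <= tau

def bonferroni_reject(p, tau):
    """Reject at level tau / N to control the family-wise error rate."""
    return p <= tau / len(p)

def benjamini_hochberg_reject(p, tau):
    """Step-up rule: reject the k smallest p-values, k = max{i: p_(i) <= tau*i/N}."""
    n = len(p)
    order = np.argsort(p)
    passed = p[order] <= tau * np.arange(1, n + 1) / n
    reject = np.zeros(n, dtype=bool)
    if passed.any():
        k = np.max(np.nonzero(passed)[0])
        reject[order[:k + 1]] = True
    return reject

rng = np.random.default_rng(7)
t_stats = np.concatenate([rng.normal(0.0, 1.0, 900),    # true nulls
                          rng.normal(4.0, 1.0, 100)])   # genuine alphas
p = two_sided_pvalues(t_stats)
tau = 0.05

n_naive = int(naive_reject(p, tau).sum())
n_bonf = int(bonferroni_reject(p, tau).sum())
n_bh = int(benjamini_hochberg_reject(p, tau).sum())
print(n_naive, n_bh, n_bonf)
```

By construction the Bonferroni rejections are a subset of the Benjamini-Hochberg rejections, which are in turn a subset of the naive ones.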

Enhanced Testing Examples

  • Giglio et al. (2021a):
    • Develops asymptotic guarantees for valid p-values and t-statistics.
  • Bayesian Methods:
    • Joint modeling of factors and improvements in alpha estimation.

Bayesian Hierarchical Model

  • Treats alphas as draws from a common cross-sectional distribution, focusing on improving power while controlling for false discoveries.
  • Shrinking individual alpha estimates toward this distribution can raise statistical power without increasing error rates.

Conclusions

  • The interplay of multiple testing in finance necessitates robust methodologies.
  • Bayesian approaches offer promising avenues for managing the complexities of alpha testing while mitigating data-snooping biases.

Asymptotic Theory

  • Key Asymptotic Schemes:
    • Three primary asymptotic schemes are identified in asset pricing:
      1. Fixed N, Large T: Traditional approach emphasizing fixed asset counts and increasing time periods.
      2. Large N, Large T: Both asset counts and periods increase, offering flexibility in model application.
      3. Large N, Fixed T: Focuses on situations where the number of assets grows while the time series remains constant.

Fixed N, Large T

  • Classical Approach:

    • Shanken (1992) develops the central limit theorem for two-pass risk premia estimates in this setting.
    • Because the first-pass betas are estimated, the second pass suffers from an errors-in-variables problem that must be accounted for.
  • Model Implications:

    • The variance of the estimators determines the reliability of risk premium estimates.
    • The Shanken adjustment corrects the standard errors for the estimation error in betas.

Large N, Large T

  • Comparative Benefits:
    • When both the number of assets and the number of periods grow, the need to estimate complex covariance structures is reduced.
    • Simplifies the inference process regarding risk premia.
    • Both OLS and GLS yield similar asymptotic behavior, making model application more flexible.

Large N, Fixed T

  • New Framework:

    • Offers distinct strategies for modeling ex-post risk premia.
    • Important for applying factor models in practical settings with a fixed timeframe.
  • Recommendations:

    • This framework is better suited to time-varying factors and unobserved risk exposures.

Conclusion

  • Model Selection:

    • Choosing the appropriate asymptotic scheme is crucial based on the data structure and research goals.
    • Each scheme provides specific benefits and should be considered according to the empirical context.
  • Future Directions:

    • Ongoing developments in asymptotic theory will continue to refine investment models and predictions in finance.