Machine Learning in Empirical Asset Pricing

Empirical Literature on Stock Return Prediction

Three Basic Strands:
1. Cross-sectional regressions (Fama & French, 2008; Lewellen, 2015):
  - Model differences in expected returns across stocks as a function of stock-level characteristics.
2. Time series regression of portfolio returns on predictors:
  - Standard regressions struggle once the set of predictors becomes high-dimensional (Welch & Goyal, 2007; Koijen & Van Nieuwerburgh, 2011; Rapach & Zhou, 2013).
3. Machine Learning Approaches:
  - Increasingly relevant to empirical asset pricing; they use variable selection and dimension reduction to cope with many predictors.
Limitations of Traditional Methods:
- Inability to handle large numbers of predictors effectively.
- Risk of overfitting and challenges with multiple comparisons.
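
The overfitting limitation can be illustrated with a small simulation. The sketch below is not taken from the lecture: the sample sizes, the number of predictors, the sparse "true" coefficients, and the ridge penalty are all hypothetical choices, and ridge regression is used only as a simple stand-in for the penalized methods discussed later. It compares in-sample and out-of-sample fit when the number of predictors is large relative to the number of observations.

```python
import numpy as np

def r2_zero(y, y_hat):
    """R^2 benchmarked against a zero forecast (no demeaning)."""
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum(y ** 2)

def fit_ridge(X, y, lam):
    """Closed-form ridge coefficients; lam = 0 reduces to OLS."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

rng = np.random.default_rng(0)

# Hypothetical setup: 150 predictors, 200 training observations.
n_train, n_test, p = 200, 200, 150
beta = np.zeros(p)
beta[:3] = 0.1  # assumption: only three predictors carry (weak) signal

X_train = rng.standard_normal((n_train, p))
X_test = rng.standard_normal((n_test, p))
y_train = X_train @ beta + rng.standard_normal(n_train)
y_test = X_test @ beta + rng.standard_normal(n_test)

results = {}
for name, lam in [("ols", 0.0), ("ridge", 200.0)]:
    b = fit_ridge(X_train, y_train, lam)
    in_sample = r2_zero(y_train, X_train @ b)
    out_of_sample = r2_zero(y_test, X_test @ b)
    results[name] = (in_sample, out_of_sample)
    print(f"{name:>5}: in-sample R2 = {in_sample:6.3f}, "
          f"out-of-sample R2 = {out_of_sample:6.3f}")
```

In this setup OLS fits the training sample well but forecasts poorly on new data, while the shrunken estimator gives up in-sample fit in exchange for more stable out-of-sample forecasts. The penalty value is fixed by hand here; in practice it would be chosen on a validation sample.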

Lecture 07

Machine Learning in Empirical Asset Pricing

Outline

Part 1 · Empirical Asset Pricing without ML

What is Empirical Asset Pricing?

The Core Problem: Cross-Section of Returns

Theoretical Foundation: The Stochastic Discount Factor (SDF)

Interpreting the SDF: Economic Intuition

Deriving the SDF from Consumption: CCAPM

CAPM within the SDF Framework

Limitations of CAPM and the Birth of Multifactor Models

Arbitrage Pricing Theory (APT): A Multifactor Extension

The Fama-French 3-Factor Model: Factors and SDF Representation

Carhart Model: Adding Momentum

Empirical Testing: The Challenge of Testing Asset Pricing Models

GMM: A Framework for Estimation and Testing

Time-Varying Risk: Conditional Models

Factor Proliferation: The Zoo of Factors

Current Challenges in Classical Approaches

Setting the Stage for Machine Learning

Summary and Key Takeaways

References

Part 2 · Factor Models, Machine Learning, and Asset Pricing

Introduction

Model Specifications: Static Factor Models

Frameworks for Factor Models

Model Specifications: Conditional Factor Models

Methodologies: Measuring Expected Returns

Empirical Literature on Stock Return Prediction

Methodologies: Estimating Factors and Exposures

Principal Component Analysis (PCA)

Risk Premia PCA

Instrumented PCA

Autoencoder Learning

Methodologies: Estimating Risk Premia

Classical Two-pass Regressions

Methodology

Generalized Least Squares (GLS) Version

Factor Mimicking Portfolios

Three-pass Regressions

Weak Factors

Test Assets

Methodologies: Estimating the SDF and its Loadings

Generalized Method of Moments (GMM)

SDF Loadings Estimation

PCA-based Methods

Penalized Regressions

Double Machine Learning

Core Idea

Principles

Applicable Scenarios

Step-by-Step Procedure

Parametric Portfolios and Deep Learning SDFs

Methodologies: Model Specification Tests and Model Comparison

GRS Test and Extensions

GRS Test Statistic

Model Comparison Tests

Bayesian Approach

Conclusion

Methodologies: Alphas and Multiple Testing

Hypothesis Testing Framework

Null Hypotheses for Alpha Testing

Components of Testing

Multiple Testing Procedures

Enhanced Testing Examples

Bayesian Hierarchical Model

Conclusion

Asymptotic Theory

Fixed N, Large T

Large N, Large T

Large N, Fixed T

Conclusion