Lecture 08

Causal Inference & Machine Learning in Finance

Financial Machine Learning · Lecture 8

Outline


Part 1 · Theoretical Framework of Causal Inference

  • Importance: Integrating machine learning with causal inference to enable credible estimation of treatment effects in high‑dimensional, nonlinear economic and financial data — improving bias reduction, uncovering heterogeneous effects, and supporting robust policy and decision analysis.

  • Focus:

    • Potential outcomes framework and identification conditions (SUTVA, ignorability, overlap).
    • Classical causal methods: regression adjustment, propensity‑score matching, IV/2SLS, DiD, RDD.
    • ML‑augmented causal tools: Lasso (variable selection), post‑selection inference, double‑selection; Double/Debiased Machine Learning (cross‑fitting, orthogonal scores); Causal Trees and Causal Forests for heterogeneous treatment effects; TMLE for targeted estimation.
    • Practical concerns: model selection, sparsity, diagnostic checks (balance, instrument validity, parallel trends), and sensitivity analyses.

Potential Outcomes Model

  • Let:

    • Y_i(0): potential outcome without treatment.
    • Y_i(1): potential outcome with treatment.
  • Observed outcome: Y_i = D_i Y_i(1) + (1 − D_i) Y_i(0), where D_i ∈ {0, 1} is the treatment indicator.


Potential Outcomes Model: Treatment Effects

  • Individual Treatment Effect: Δ_i = Y_i(1) − Y_i(0)

    • Δ_i is unobservable (only one potential outcome is realized per individual), so it cannot be estimated directly.
  • Average Treatment Effect (ATE): τ = E[Y_i(1) − Y_i(0)]

  • Average Treatment Effect on Treated (ATT): τ_ATT = E[Y_i(1) − Y_i(0) | D_i = 1]


Randomized Controlled Trials (RCTs)

  • RCTs: Random assignment to treatment/control groups.
  • Control group serves as a baseline for measuring treatment effects.
  • Aims to facilitate comparison with counterfactual scenarios (outcomes without treatment).

RCTs: Importance of Random Assignment

  • Reduces selection bias, ensuring comparability between groups.
  • Helps in establishing credible causal relationships by controlling external factors.

RCTs: Assumption (SUTVA)

  • The Stable Unit Treatment Value Assumption (SUTVA): the treatment status of one individual does not affect the outcomes of the other individuals.
  • An alternative model would be to assume that Y_i = Y_i(D_1, …, D_n), i.e., an individual's outcome depends on the treatment status of other individuals.

RCTs: Assumption (Random assignment)

The assumption (Y_i(0), Y_i(1)) ⊥ D_i expresses the independence of treatment with respect to potential outcomes, which is credible when treatment is randomly assigned, without reference to the individual's potential outcomes.

By comparing the average outcomes between the treated group and the control group, we have:

E[Y_i | D_i = 1] − E[Y_i | D_i = 0] = E[Y_i(1)] − E[Y_i(0)] = τ,

where τ denotes the (average) treatment effect.


RCTs: The difference-in-means (DM) estimator

Using the central limit theorem, we obtain that the difference-in-means estimator τ̂_DM = Ȳ₁ − Ȳ₀ is asymptotically normal:

√n (τ̂_DM − τ) →d N(0, V_DM),

where

V_DM = Var(Y_i(1)) / P(D_i = 1) + Var(Y_i(0)) / P(D_i = 0),

and we can derive confidence intervals on the ATE.
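As a sketch, the difference-in-means estimate and its normal-approximation confidence interval can be computed on simulated RCT data (the data-generating process, effect size, and sample size below are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
true_effect = 2.0

# Simulated RCT: random assignment, potential outcomes
d = rng.integers(0, 2, size=n)         # treatment indicator D_i
y0 = rng.normal(0.0, 1.0, size=n)      # Y(0)
y1 = y0 + true_effect                  # Y(1) = Y(0) + tau
y = np.where(d == 1, y1, y0)           # observed outcome

# Difference-in-means estimate of the ATE
y_t, y_c = y[d == 1], y[d == 0]
tau_hat = y_t.mean() - y_c.mean()

# Plug-in asymptotic standard error: Var(Y|D=1)/n1 + Var(Y|D=0)/n0
se = np.sqrt(y_t.var(ddof=1) / len(y_t) + y_c.var(ddof=1) / len(y_c))
ci = (tau_hat - 1.96 * se, tau_hat + 1.96 * se)
print(f"ATE estimate: {tau_hat:.3f}, 95% CI: [{ci[0]:.3f}, {ci[1]:.3f}]")
```

Across repeated simulations, an interval built this way covers the true effect roughly 95% of the time, as the CLT result above suggests.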


RCTs: Estimating Treatment Effects

Assumptions:

  1. SUTVA: The treatment effect on one individual does not affect others.
  2. Random Assignment: Treatment assignment is independent of potential outcomes.

Estimation Approach:

  • Compare sample means: the difference-in-means estimator τ̂_DM = Ȳ₁ − Ȳ₀, or equivalently an OLS regression of Y_i on D_i.


RCTs: OLS Estimation of Treatment Effect

  • Linear Model: Y_i = α + τ D_i + ε_i

  • OLS estimation of τ recovers the average treatment effect under SUTVA and random assignment; τ̂_OLS coincides with the difference-in-means estimator.

Conditional Independence & Propensity Score

Key Assumption

  • Conditional Independence (unconfoundedness): (Y_i(0), Y_i(1)) ⊥ D_i | X_i

Propensity Score

  • The propensity score e(x) = P(D_i = 1 | X_i = x) is the probability of treatment given observable characteristics.

  • The propensity score allows us to reduce the dimension of the problem, since it satisfies:

    (Y_i(0), Y_i(1)) ⊥ D_i | e(X_i)

  • That is, conditional on the propensity score e(X_i), treatment assignment is independent of the potential outcomes.

Characterizing the ATE: Using Inverse-Propensity Weighting (IPW)

  • First step: estimate the propensity score e(·) using non-parametric regression.
    • When X_i is discrete and takes values x_1, …, x_K, a natural estimator is ê(x_k) = Σ_i D_i 1{X_i = x_k} / Σ_i 1{X_i = x_k}.
    • Then, we use the IPW estimator:

      τ̂_IPW = (1/n) Σ_i [ D_i Y_i / ê(X_i) − (1 − D_i) Y_i / (1 − ê(X_i)) ]

    • Define the oracle estimator τ̂*, obtained assuming the propensity score is known:

      τ̂* = (1/n) Σ_i [ D_i Y_i / e(X_i) − (1 − D_i) Y_i / (1 − e(X_i)) ]

    • The analysis of the asymptotic properties of τ̂_IPW proceeds by decomposing

      τ̂_IPW − τ = (τ̂_IPW − τ̂*) + (τ̂* − τ) = (A) + (B).


Notice that term (B) vanishes asymptotically since the oracle estimator τ̂* is unbiased: E[τ̂*] = τ.

Assumption (Overlap condition). There exists η > 0 such that, for all x, η ≤ e(x) ≤ 1 − η.

Assuming bounded outcome variables, |Y_i| ≤ M, we obtain a control in probability for the distance to the oracle estimator (A):

|τ̂_IPW − τ̂*| = O_P(‖ê − e‖_∞).

Thus, under the sufficient condition of having a convergent estimator of the propensity score in sup-norm, ‖ê − e‖_∞ →P 0, the estimator τ̂_IPW is consistent.
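The two-step procedure can be sketched on simulated data with a discrete confounder: estimate e(x) by cell frequencies, then plug into the IPW formula (the data-generating process and cell probabilities below are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000

# Confounded assignment: treatment probability depends on a discrete X
x = rng.integers(0, 3, size=n)                # discrete covariate in {0, 1, 2}
e_true = np.array([0.2, 0.5, 0.8])[x]         # true propensity score e(x)
d = (rng.random(n) < e_true).astype(int)
y = 0.5 * x + 1.0 * d + rng.normal(0, 1, n)   # true ATE = 1.0, confounded by x

# First step: estimate e(x) by the empirical treated share in each cell of X
e_hat = np.empty(n)
for k in range(3):
    mask = x == k
    e_hat[mask] = d[mask].mean()

# IPW estimator of the ATE
tau_ipw = np.mean(d * y / e_hat - (1 - d) * y / (1 - e_hat))

# Naive difference in means is biased upward here (treated units have larger x)
tau_naive = y[d == 1].mean() - y[d == 0].mean()
```

The naive comparison mixes the treatment effect with the confounding through x, while weighting by the estimated score recovers the ATE.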


Characterizing the ATE: Regression Function Differences

For each d ∈ {0, 1}, define the regression function:

μ_d(x) = E[Y_i | D_i = d, X_i = x],

where μ_d(x) = E[Y_i(d) | X_i = x] under conditional independence, and thus

τ = E[μ_1(X_i) − μ_0(X_i)].

One way to obtain a consistent estimator of the ATE with this characterization would be to use consistent non-parametric estimators μ̂_0, μ̂_1 and then average over the observations:

τ̂_RA = (1/n) Σ_i [ μ̂_1(X_i) − μ̂_0(X_i) ].


Efficient estimation of treatment effect

The augmented inverse propensity weighting (AIPW) estimator, defined by Robins et al. (1994) and Hahn (1998), is designed to correct the bias in τ̂_IPW due to the estimation of e. It is given by

τ̂_AIPW = (1/n) Σ_i [ μ̂_1(X_i) − μ̂_0(X_i) + D_i (Y_i − μ̂_1(X_i)) / ê(X_i) − (1 − D_i) (Y_i − μ̂_0(X_i)) / (1 − ê(X_i)) ].

This AIPW estimator has two important properties:

  • It achieves the semiparametric efficiency bound (Robins et al., 1994; Hahn, 1998), with asymptotic variance

    V* = E[ (μ_1(X_i) − μ_0(X_i) − τ)² + σ_1²(X_i)/e(X_i) + σ_0²(X_i)/(1 − e(X_i)) ],

    • where σ_d²(x) = Var(Y_i(d) | X_i = x) for d ∈ {0, 1}.
  • It is doubly robust, meaning that it is consistent either if the estimators μ̂_0, μ̂_1 of the regression functions are consistent, or if the propensity-score estimator ê is consistent.
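A minimal sketch of the AIPW estimator on a simulated discrete design, with cell means as the regression estimates μ̂_d and cell frequencies as ê (the data-generating process is an illustrative assumption):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20_000
x = rng.integers(0, 3, size=n)                # discrete covariate
e_true = np.array([0.2, 0.5, 0.8])[x]
d = (rng.random(n) < e_true).astype(int)
y = 0.5 * x + 1.0 * d + rng.normal(0, 1, n)   # true ATE = 1.0

# Nuisance estimates: cell means for mu_d(x), treated shares for e(x)
e_hat = np.empty(n)
mu0 = np.empty(n)
mu1 = np.empty(n)
for k in range(3):
    m = x == k
    e_hat[m] = d[m].mean()
    mu0[m] = y[m & (d == 0)].mean()
    mu1[m] = y[m & (d == 1)].mean()

# AIPW / doubly robust estimator of the ATE
tau_aipw = np.mean(
    mu1 - mu0
    + d * (y - mu1) / e_hat
    - (1 - d) * (y - mu0) / (1 - e_hat)
)
```

The first term is the regression-difference characterization; the weighted residual terms correct its bias, which is what delivers the double-robustness property.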


Instrumental Variables (IV): 1 Endogeneity and instrumental variables

  • Outside randomized experiments (e.g., in natural experiments),
    • the treatment is not random and therefore not independent from the potential outcomes.
    • Consider a linear model allowing to estimate the effect of the treatment D_i (discrete or continuous, and of dimension 1 here) while controlling for variables X_i (which generally include the intercept):

      Y_i = α D_i + X_i'β + ε_i.

  • The exogeneity assumption E[ε_i (D_i, X_i')'] = 0 may not hold.
    • Define the pseudo-true parameters (α*, β*) such that

      E[(D_i, X_i')' (Y_i − α* D_i − X_i'β*)] = 0.

    • Assume E[ε_i X_i] = 0.
    • The ordinary least squares estimator will estimate (α*, β*), which generally differ from the true parameters (α, β).
    • α* D_i + X_i'β* is the best linear predictor of Y_i on (D_i, X_i'), but α* is no longer the causal effect.

  • We are in a context where E[ε_i X_i] = 0 but not E[ε_i D_i] = 0.

  • Strategy: identify instrumental variables Z_i, where the dimension of Z_i is at least the number of endogenous variables (here: D_i is scalar).

  • Instrument definition:

    1. Correlated with the endogenous variable D_i.
    2. Uncorrelated with the residuals ε_i.
  • Define D̂_i as the best linear prediction of D_i using (Z_i', X_i')'.

  • Denote W_i = (D̂_i, X_i')'.

  • Formal Assumptions

    • Assumption (Rank condition). E[W_i (D_i, X_i')'] is non-singular.
    • Assumption (Exogeneity). E[ε_i Z_i] = 0 and E[ε_i X_i] = 0.
  • Identification Result

    • The true parameters take the form:

      (α, β')' = E[W_i (D_i, X_i')']⁻¹ E[W_i Y_i].

  • Relevance (Lower-Level Conditions)

    • Lower-level conditions for Assumption (3.5) (relevance of Z_i for D_i):
      • E[(Z_i', X_i')' (Z_i', X_i')'] is non-singular.
      • The partial correlation between Z_i and D_i, controlling for X_i, is non-zero.

Two-Stage Least Squares (2SLS) — Intuition

  • The estimator obtained from the empirical counterpart of the identification formula is the two-stage least squares (2SLS) estimator.

  • 2SLS procedure:

    1. Regress D_i on the instrument Z_i and controls X_i; obtain the prediction D̂_i.
    2. Regress Y_i on the predicted D̂_i and controls X_i.
  • Special case: with one instrument and no controls, 2SLS reduces to the Wald ratio

    α̂_2SLS = Ĉov(Z_i, Y_i) / Ĉov(Z_i, D_i).
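The two stages, and their equivalence to the Wald ratio in the one-instrument, no-controls case, can be sketched on simulated data (the data-generating process and coefficient values are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50_000
alpha = 1.5                                    # true causal effect of D on Y

z = rng.integers(0, 2, size=n).astype(float)   # binary instrument
u = rng.normal(0, 1, n)                        # unobserved confounder
d = 0.5 * z + 0.8 * u + rng.normal(0, 1, n)    # endogenous treatment
y = alpha * d + u + rng.normal(0, 1, n)        # outcome: OLS on d is biased

# One instrument, no controls: 2SLS reduces to the Wald ratio
alpha_2sls = np.cov(z, y)[0, 1] / np.cov(z, d)[0, 1]

# Equivalently: stage 1 regress d on z, stage 2 regress y on fitted d
d_hat = np.polyval(np.polyfit(z, d, 1), z)
alpha_two_stage = np.polyfit(d_hat, y, 1)[0]

# Biased benchmark: OLS slope of y on d
alpha_ols = np.cov(d, y)[0, 1] / np.var(d, ddof=1)
```

Because the confounder u moves d and y in the same direction, the OLS slope overshoots the causal effect, while both IV computations agree and recover it.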


Multiple Instruments and Transformations

  • Researchers may have multiple instruments or consider transformations of the initial instrument.

  • If Assumption (3.6) holds in the conditional form E[ε_i | Z_i, X_i] = 0, then the conditional moment implies unconditional moments:

    E[f(Z_i, X_i) ε_i] = 0

    for any vector of instruments f(Z_i, X_i) with E[‖f(Z_i, X_i)‖²] < ∞.

  • This leads to the question: how to choose f to minimize the asymptotic variance of the GMM estimator of (α, β)?


Efficiency Considerations

  • The choice of f does not affect identification of the causal effect, but it affects the precision of the 2SLS (or GMM) estimator.

  • We overview classical results on optimal instruments.

  • For simplicity, we restrict attention to conditional homoscedasticity:

    E[ε_i² | Z_i, X_i] = σ².


IV: Optimal Instrumental Variables

  • Setting and Goal

    • Assume only D_i is endogenous and scalar; denote X̃_i = (D_i, X_i')' and Z̃_i = (Z_i', X_i')'.
    • Consider moment conditions for GMM based on instruments f(Z̃_i), with dim(f) ≥ dim((α, β')'):

      E[f(Z̃_i) (Y_i − α D_i − X_i'β)] = 0

    • Define θ = (α, β')'.
  • GMM Estimator

    • The GMM estimator θ̂ solves the empirical counterpart of these moment conditions, using a weighting matrix when the system is over-identified (dim(f) > dim(θ)).
  • Asymptotics
    • Asymptotic normality:

      √n (θ̂ − θ) →d N(0, V(f)),

      where, under conditional homoscedasticity,

      V(f) = σ² E[f(Z̃_i) X̃_i']⁻¹ E[f(Z̃_i) f(Z̃_i)'] E[X̃_i f(Z̃_i)']⁻¹.

    • The optimal f* minimizes the asymptotic variance V(f).

Necessary Condition for an Efficient Instrument

  • Theorem (Necessary condition; Newey & McFadden, 1994):
    If an efficient choice f* exists, it must satisfy

    V(f) ≥ V(f*)   (in the positive semi-definite sense)

    for all f with E[‖f(Z̃_i)‖²] < ∞.

  • Equivalently: no admissible instrument choice f yields a smaller asymptotic variance than f*.

Reformulation via Iterated Expectations

  • Rearranging the necessary condition and applying the law of iterated expectations characterizes f* through conditional moments.

  • Under conditional homoscedasticity,

    E[ε_i² | Z̃_i] = σ²,

    the condition is satisfied by

    f*(Z̃_i) = E[X̃_i | Z̃_i] = (E[D_i | Z_i, X_i], X_i')'.

Optimal Instrument and Efficiency Bound

  • Since multiplying f by a non-singular constant matrix does not change efficiency, f* = E[X̃_i | Z̃_i] minimizes the asymptotic variance.
  • The resulting semi-parametric efficiency bound is:

    V* = σ² E[ E[X̃_i | Z̃_i] E[X̃_i | Z̃_i]' ]⁻¹.

  • Interpretation:
    • The optimal instrument is the regression function E[X̃_i | Z̃_i]; for the endogenous variable, this is E[D_i | Z_i, X_i].
    • The estimator θ̂ based on f* attains the semi-parametric efficiency bound (see Van der Vaart, 1998, Ch. 25).

Practical Remarks

  • The regression function E[D_i | Z_i, X_i] may be high-dimensional and challenging to estimate nonparametrically.
  • With few instruments, Newey & McFadden (1994) suggest series (sieve) estimators for this regression.
  • The optimal instrument improves precision but does not affect identification.

Summary of Key Concepts

  • Key Terms: Causal inference, RCT, treatment effects, average treatment effect (ATE), conditional independence, propensity score, IV methods, local average treatment effect (LATE).
  • References:
    • Angrist & Pischke (2009)
    • Imbens & Rubin (2015)

Part 2 · Classic Causal Inference Models

  • DID (Difference-in-Differences)
  • PSM (Propensity Score Matching)
  • RDD (Regression Discontinuity Design)

Difference-in-Differences (DID): Introduction

  • Definition:

    • Difference-in-Differences (DID) is a statistical technique used to estimate causal effects by comparing the outcomes of treatment and control groups before and after an intervention.
  • Objective:

    • To isolate the effect of a specific treatment or policy by controlling for unobserved factors that could influence the outcome.
  • Applicable Issues:

    • Policy analysis, economic interventions, program evaluations in finance, and social sciences.

DID: Basic Concept

  • Model / Formula:

    • Y_it = β₀ + β₁ Treat_i + β₂ Post_t + δ (Treat_i × Post_t) + ε_it, where δ is the DID estimate of the treatment effect.

  • Assumptions:

    • Parallel Trends Assumption: The treatment and control groups would have followed the same trajectory in the absence of treatment.
  • Causal Inference Analysis:

    • The DID approach helps mitigate bias from confounding factors by ensuring that any changes in outcomes can be attributed to the intervention rather than other influences.

DID: Key Components

  • Treatment Group:

    • The group that receives the intervention.
  • Control Group:

    • The group that does not receive the intervention, used for comparison.
  • Time Periods:

    • Pre-treatment: Measurements taken before the intervention.
    • Post-treatment: Measurements taken after the intervention.

DID: Example

  • Context:

    • Analyzing the effect of a financial deregulation (e.g., the repeal of the Glass–Steagall Act) on bank performance.
  • Data:

    • Treatment Group: Banks affected by deregulation.
    • Control Group: Similar banks that weren’t affected.
  • Observation:

    • Compare key financial metrics (e.g., ROA, stock performance) before and after deregulation to estimate its impact.

DID: Steps to Implement

  1. Identify treatment and control groups.
  2. Collect longitudinal data for both groups before and after the intervention.
  3. Calculate changes in outcomes for both groups.
  4. Apply the DID formula to estimate the causal effect.
  5. Validate the parallel trends assumption using pre-treatment data.
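The steps above can be sketched on simulated two-period panel data (the data-generating process, common trend, and effect size are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5_000                                       # units per group

delta = 0.5                                     # true treatment effect
# Parallel trends: both groups share a common +1.0 trend from pre to post
pre_c  = rng.normal(0.0, 1.0, n)                # control, pre-period
post_c = rng.normal(0.0 + 1.0, 1.0, n)          # control, post-period (trend only)
pre_t  = rng.normal(0.3, 1.0, n)                # treated, pre (level gap allowed)
post_t = rng.normal(0.3 + 1.0 + delta, 1.0, n)  # treated, post: trend + effect

# DID: change for the treated group minus change for the control group
did = (post_t.mean() - pre_t.mean()) - (post_c.mean() - pre_c.mean())
```

The same number is the coefficient δ on the Treat × Post interaction in the regression form of the model; differencing removes both the group-level gap and the common trend.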

DID: Limitations and Considerations

  • Assumption Validity:

    • The parallel trends assumption may not be justified in all cases.
  • Confounding Variables:

    • External factors can skew results; controlling for these factors is crucial.
  • Data Quality:

    • Requires reliable longitudinal data and appropriate sample sizes for meaningful results.

DID: Applications of Machine Learning

  • Latest Literature:
    • Machine learning techniques can be applied to enhance DID models by:
      • Improving Propensity Score Matching: Using algorithms to better pair treatment and control groups.
      • Feature Selection: Identifying key variables that affect treatment outcomes and adjusting for them.
      • Robustness Checks: Applying machine learning methods to assess the validity of findings from traditional DID.

DID: Conclusion and References

  • Summary:

    • DID is a powerful tool for estimating causal effects in the absence of randomization, particularly useful in the fields of economics and finance.
  • References:

    1. Ashenfelter, O. (1978). "Determining participation in income maintenance programs."
    2. Card, D. (1990). "The impact of the Mariel boatlift on the Miami labor market."
    3. Goodman-Bacon, A. (2018). "Difference-in-Differences with Variation in Treatment Timing."
    4. Doudchenko, N., & Imbens, G. W. (2016). "Balancing, Regression, and Randomization Inference."

Propensity Score Matching (PSM) Methodology: Introduction

  • Definition:

    • Propensity Score Matching (PSM) is a statistical technique used to estimate the effect of a treatment or intervention by accounting for the covariates that predict receiving the treatment.
  • Objective:

    • To reduce selection bias by matching treated and control units with similar characteristics based on their propensity scores.
  • Applicable Issues:

    • Policy evaluations, treatment effects in healthcare, finance, and social sciences.

PSM: Basic Concept

  • Model / Formula:

    • Propensity Score: e(X) = P(D = 1 | X),
      • where D indicates treatment and X includes covariates.
  • Assumptions:

    • Ignorability: The potential outcomes are independent of treatment assignment given the observed covariates.
    • Common Support: There is overlap in the distribution of covariates between treated and control groups.
  • Causal Inference Analysis:

    • PSM aims to create a quasi-experimental condition that allows the estimation of causal effects without randomization by balancing covariates.

PSM: Key Components

  • Treatment Group:

    • Individuals receiving the intervention or treatment.
  • Control Group:

    • Individuals not receiving the intervention, used for comparison.
  • Propensity Score:

    • The probability of receiving the treatment given a set of observed covariates.

PSM: Example

  • Context:

    • Evaluating the impact of financial literacy training on savings behavior.
  • Data:

    • Treatment Group: Individuals who completed the training.
    • Control Group: Similar individuals who did not participate.
  • Process:

    • Match participants based on their propensity scores to estimate the training's effect on savings rates.

PSM: Steps to Implement

  1. Estimate the propensity score using a logistic regression model (or other methods) to predict treatment assignment based on covariates.
  2. Match treated and control units based on their propensity scores (e.g., nearest neighbor, caliper matching).
  3. Assess the balance of covariates post-matching to ensure similarity between groups.
  4. Estimate treatment effects using the matched sample.
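Steps 2–4 can be sketched with 1-nearest-neighbour matching on the propensity score (simulated data; for simplicity the true score is used where, in practice, step 1 would estimate it, e.g. by logistic regression):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 10_000
x = rng.normal(0, 1, n)                    # covariate driving selection
e_true = 1 / (1 + np.exp(-x))              # true propensity (logit in x)
d = (rng.random(n) < e_true).astype(int)
y = x + 1.0 * d + rng.normal(0, 1, n)      # true treatment effect = 1.0

# Simplification: match on the true score; in practice it is estimated
score = e_true

# 1-nearest-neighbour matching with replacement (for the ATT)
treated = np.where(d == 1)[0]
control = np.where(d == 0)[0]
ctrl_sorted = control[np.argsort(score[control])]
ctrl_scores = score[ctrl_sorted]

pos = np.searchsorted(ctrl_scores, score[treated])
pos = np.clip(pos, 1, len(ctrl_scores) - 1)
left, right = ctrl_sorted[pos - 1], ctrl_sorted[pos]
nearest = np.where(
    np.abs(score[left] - score[treated]) <= np.abs(score[right] - score[treated]),
    left, right,
)

# ATT estimate from the matched sample, vs. the naive comparison
att = np.mean(y[treated] - y[nearest])
naive = y[d == 1].mean() - y[d == 0].mean()
```

Because selection depends on x, the naive difference in means is badly biased, while the matched comparison recovers the effect; the post-matching balance check (step 3) would compare x across matched pairs.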

PSM: Limitations and Considerations

  • Assumption Validity:

    • The method relies heavily on the ignorability assumption; unmeasured confounders can bias results.
  • Matching Quality:

    • Poor matching can lead to inadequate control of bias; it’s essential to check balance.
  • Data Quality:

    • Requires a rich dataset with sufficient covariates to ensure valid estimates.

PSM: Applications of Machine Learning

  • Latest Literature:
    • Machine learning techniques are being increasingly integrated into PSM to enhance model estimation:
      • Propensity Score Estimation: Use of advanced algorithms (e.g., Random Forest, Gradient Boosting) to improve the accuracy of propensity score estimation.
      • Automation of Matching Process: ML algorithms can automate the matching of treated and control groups more effectively than traditional methods.
      • Robustness and Generalization: Assessing treatment effects across various populations by leveraging machine learning’s predictive capabilities.

PSM: Conclusion and References

  • Summary:

    • PSM is a valuable tool for estimating causal effects in observational studies, particularly useful in finance and economics.
  • References:

    1. Rosenbaum, P. R., & Rubin, D. B. (1983). "The central role of the propensity score in observational studies for causal effects."
    2. Imbens, G. W. (2004). "Nonparametric estimation of average treatment effects under exogeneity: A review."
    3. Lechner, M. (2002). "Some practical issues in the evaluation of labor market policies."
    4. Leuven, E., & Sianesi, B. (2003). "PSMATCH2: Stata Module to Perform Full Mahalanobis and Propensity Score Matching."

Regression Discontinuity Design (RDD) Methodology: Introduction

  • Definition:

    • Regression Discontinuity Design (RDD) is a quasi-experimental pretest-posttest design that capitalizes on a cutoff or threshold to assign treatment, allowing for causal inferences about the effect of an intervention.
  • Objective:

    • To estimate causal effects by exploiting a discontinuity in the assignment of treatment based on an observed variable.
  • Applicable Issues:

    • Policy evaluations, program assessments in education and health, and financial interventions.

RDD: Basic Concept

  • Model / Formula:

    • The basic RDD model can be expressed as:

      Y_i = α + τ D_i + f(X_i − c) + ε_i, with D_i = 1{X_i ≥ c},

      • where Y_i is the outcome variable, D_i is an indicator for whether the treatment is applied, and f(X_i − c) captures the functional form of the running variable around the cutoff c.
  • Assumptions:

    • Continuity Assumption: The potential outcomes are continuous at the cutoff; no abrupt changes other than treatment effect.
    • Local Randomization: Units near the cutoff are considered as if randomly assigned between treatment and control groups.
  • Causal Inference Analysis:

    • RDD allows for the estimation of causal effects by comparing outcomes just above and just below the cutoff, assuming that all other factors remain constant.

RDD: Key Components

  • Running Variable:

    • The continuous variable that determines treatment assignment based on a specified cutoff.
  • Cutoff Point:

    • The threshold at which the treatment changes (e.g., income level, test score).
  • Treatment Group:

    • Units just above the cutoff that receive the treatment.
  • Control Group:

    • Units just below the cutoff that do not receive the treatment.

RDD: Example

  • Context:

    • Evaluating the impact of a minimum wage increase on employment levels.
  • Data:

    • Running Variable: Average wage level of firms.
    • Cutoff Point: Minimum wage legislation threshold.
  • Observation:

    • Compare employment trends for firms just above and just below the minimum wage threshold to estimate the policy's effect.

RDD: Steps to Implement

  1. Identify the running variable and the cutoff point for treatment assignment.
  2. Collect data on the outcome variable and covariates around the cutoff.
  3. Perform graphical analysis to visualize the discontinuity at the cutoff.
  4. Estimate the causal effects using local polynomial regression or parametric methods.
  5. Check the robustness of results through sensitivity analyses and various bandwidths.
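A minimal sharp-RDD sketch of steps 1, 2, and 4 on simulated data, using separate local linear fits on either side of the cutoff (the data-generating process, jump size, and bandwidth are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 20_000
c = 0.0                                        # cutoff point
tau = 0.8                                      # true jump at the cutoff

x = rng.uniform(-1, 1, n)                      # running variable
d = (x >= c).astype(int)                       # sharp RDD assignment rule
y = 0.5 * x + tau * d + rng.normal(0, 1, n)    # smooth trend + discontinuity

# Local linear regression on each side of the cutoff within a bandwidth h
h = 0.2
left = (x >= c - h) & (x < c)
right = (x >= c) & (x <= c + h)

b_left = np.polyfit(x[left], y[left], 1)       # [slope, intercept]
b_right = np.polyfit(x[right], y[right], 1)

# RDD estimate: gap between the two fits evaluated at the cutoff
tau_hat = np.polyval(b_right, c) - np.polyval(b_left, c)
```

Step 5 would repeat this for several bandwidths h and check that the estimate is stable; step 3 would plot binned means of y against x to visualize the jump.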

RDD: Limitations and Considerations

  • Assumption Validity:

    • Results are only valid under the assumption that no other factors change at the cutoff.
  • Limited Generalization:

    • The results may only apply to units near the cutoff; the external validity can be limited.
  • Data Quality:

    • Requires large datasets with precise measures around the cutoff to ensure reliable estimates.

RDD: Applications of Machine Learning

  • Latest Literature:
    • Recent studies have begun integrating machine learning techniques into RDD to enhance causal inference:
      • Flexible Functional Forms: Machine learning algorithms (e.g., Random Forests, Neural Networks) can model nonlinear relationships around the cutoff.
      • Automated Bandwidth Selection: Using ML approaches to determine optimal bandwidth can improve estimation accuracy.
      • Robustness Checks: ML methods can help validate RDD findings by comparing predictions from different models.

RDD: Conclusion and References

  • Summary:

    • RDD is a robust method for estimating causal effects where random assignment is not feasible, particularly relevant in economics and finance.
  • References:

    1. Imbens, G. W., & Lemieux, T. (2008). "Regression Discontinuity Designs: A Guide to Practice."
    2. Lee, D. S., & Lemieux, T. (2010). "Regression Discontinuity Designs in Economics."
    3. Cattaneo, M. D., Idrobo, N., & Titiunik, R. (2019). "A Practical Introduction to Regression Discontinuity Designs."
    4. Gelman, A., & Imbens, G. W. (2019). "Why High-Order Polynomials Should Not Be Used in Regression Discontinuity Designs."

Part 3 · Post-selection Inference

  • Importance: Model selection and sparsity in high-dimensional data.
  • Focus:
    • Post-selection inference
    • Lasso estimator as a variable selection tool
    • Double selection method
  • Related Chapters:
    • Chapter 5: Post-machine learning inference
    • Chapter 6: Instrumental variable models
    • Chapter 7: Further developments

The Post-selection Inference Problem

  • Two-step inference:
    • First, select a model.
    • Then, report results as if true.
  • Challenge: Ignoring the selection step can lead to misleading results.
  • Context: Based on the works of Leeb and Pötscher (2005).

The Model

  • Assumption: Sparse Gaussian linear model

    Y_i = X_i'θ + ε_i,  ε_i ~ N(0, σ²),

    • where E[X_i X_i'] is non-singular.
  • Models:
    • Restricted (R): sparse, with some coefficients constrained to zero, vs. Unrestricted (U): all covariates included.

Consistent Model Selection

  • Decision rule for inclusion of a covariate: retain it when its t-statistic exceeds a threshold growing with the sample size.

  • Asymptotic properties (Lemma 4.1):
    • Consistency of model selection as n → ∞.

Distribution of the Post-selection Estimator

  • Post-selection estimator: the OLS estimator computed on the selected model.

  • Distribution results indicate deviation from the Gaussian limit even under consistent selection: the finite-sample distribution of the post-selection estimator can be highly non-normal.

High Dimension, Sparsity, and the Lasso

  • Lasso estimator defined as:

    θ̂_Lasso = argmin_θ (1/n) ‖Y − Xθ‖₂² + λ ‖θ‖₁

    • Balances fit (MSE) and sparsity (L1 norm).
  • Assumptions: Sparse Gaussian linear model, with a sparsity constraint on θ (few non-zero coefficients).

Theoretical Elements on the Lasso

Theorem: Consistency of Lasso

  • The Lasso estimator is consistent under a suitable choice of the penalty λ.

  • The rate of convergence depends on the penalty strength, the sparsity level, and eigenvalue (restricted-eigenvalue) conditions on the design.

Regularization Bias

  • Regularization Bias: bias from shrinking coefficients and from including/removing variables, leading to erroneous inference on the parameter of interest.
  • Risk of pursuing dual objectives (selecting variables and accurately estimating the treatment effect) with a single Lasso step.

The Double Selection Method

  1. Selection on Treatment: Regress D on X using the Lasso; keep the selected controls.
  2. Selection on Outcome: Regress Y on X using the Lasso; keep the selected controls.
  3. Final Estimation: OLS regression of Y on D and the union of the controls selected in steps 1 and 2.
  • Outcome:
    • Provides valid inference on treatment effects.
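The three steps can be sketched with a hand-rolled coordinate-descent Lasso (in practice one would use a dedicated package such as hdm; the data-generating process, penalty level, and iteration count are illustrative assumptions):

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=50):
    """Plain coordinate-descent Lasso: (1/2)||y - Xb||^2 + lam*n*||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ beta + X[:, j] * beta[j]     # partial residual
            rho = X[:, j] @ r
            beta[j] = np.sign(rho) * max(abs(rho) - lam * n, 0.0) / col_sq[j]
    return beta

rng = np.random.default_rng(7)
n, p = 1_000, 50
X = rng.normal(0, 1, (n, p))
d = X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 1, n)            # treatment eq.
y = 1.0 * d + X[:, 1] + 2.0 * X[:, 2] + rng.normal(0, 1, n)  # outcome eq.

lam = 0.1
sel_d = np.where(np.abs(lasso_cd(X, d, lam)) > 1e-6)[0]  # step 1: D on X
sel_y = np.where(np.abs(lasso_cd(X, y, lam)) > 1e-6)[0]  # step 2: Y on X
union = np.union1d(sel_d, sel_y)

# Step 3: OLS of y on d and the union of selected controls
Z = np.column_stack([np.ones(n), d, X[:, union]])
tau_hat = np.linalg.lstsq(Z, y, rcond=None)[0][1]
```

Taking the union of the two selected sets is what protects the final OLS step against omitting a confounder that is only weakly related to one of the two equations.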

Empirical Application: Education on Wage

  • Data: French Enquête Emploi from 2017-2019, 162,254 observations.
  • Modeling: Log of monthly salary explained by education levels and other controls.
  • Importance:
    • Double selection method improves precision in estimating treatment effects.

Summary

  • Key Concepts:
    • Post-selection inference, Lasso estimator, regularization bias, double selection method.
  • Recommendations:
    • Understanding the nuances of variable selection and estimation under uncertainty.

Additional References

  • Tibshirani (1994) - Lasso regression.
  • Belloni et al. (2014) - Accessible econometric references.
  • R Packages: hdm for practical implementation.

Part 4 · Generalization and Methodology

  • Focus: Importance of generalization in econometric modeling.
  • Goals:
    • Understand methodologies for assessing model performance.
    • Explore overfitting and its implications.
  • Contents:
    • Statistical learning theory
    • Cross-validation techniques
    • Regularization methods

Generalization in Econometrics

  • Definition: Generalization refers to a model's ability to perform well on unseen data.
  • Key Concept: Bias-variance tradeoff impacts prediction accuracy:
    • Bias: Error due to approximating complex truths.
    • Variance: Error due to model sensitivity to fluctuations in training set.

Statistical Learning Theory

  • Framework: Analyzes relationships between generalization abilities of learning algorithms and model complexity.
  • Risk Function:

    R(f) = E[ℓ(Y, f(X))]

    • ℓ: Loss function.
    • R(f): Expected risk (average loss).

Model Complexity

  • Definition: Complexity measures the capacity of a model to fit data.
  • Types:
    • Flexible models: Can fit diverse patterns (e.g., high-degree polynomials).
    • Rigid models: Offer less variance (e.g., linear models).
  • Implication: Increased complexity can lead to overfitting.

Overfitting

  • Definition: Occurs when a model captures noise in training data rather than underlying patterns.
  • Indicators:
    • High accuracy on training set vs. low accuracy on validation/test set.
  • Consequences: Leads to poor generalization and predictive performance.

Cross-Validation Techniques

  • Purpose: Assess model performance and avoid overfitting.
  • Methods:
    • K-fold Cross-Validation: Divide data into K subsets; train on K − 1 of them and validate on the remaining one; rotate and average.
    • Holdout Method: Split dataset into training and test sets.
  • Goal: Ensure the model generalizes well to new data.
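K-fold cross-validation for choosing model complexity can be sketched as follows (the data-generating process and candidate polynomial degrees are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(8)
n = 200
x = rng.uniform(-1, 1, n)
y = 4 * x ** 3 + rng.normal(0, 0.1, n)      # cubic truth + noise

def kfold_mse(degree, k=5):
    """K-fold CV estimate of out-of-sample MSE for a polynomial fit."""
    idx = rng.permutation(n)
    folds = np.array_split(idx, k)
    errs = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        coefs = np.polyfit(x[train], y[train], degree)
        pred = np.polyval(coefs, x[test])
        errs.append(np.mean((y[test] - pred) ** 2))
    return float(np.mean(errs))

# Degree 1 underfits the cubic truth; degree 3 matches it
cv_scores = {deg: kfold_mse(deg) for deg in (1, 3, 10)}
```

The rigid linear model shows a large CV error from bias, while the correctly specified cubic gets close to the irreducible noise level; comparing scores across degrees is the model-selection step.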

Regularization Techniques

  • Purpose: Combat overfitting by adding a penalty for complexity.
  • Common Methods:
    • Lasso Regression:

      min_β ‖Y − Xβ‖₂² + λ ‖β‖₁

    • Ridge Regression:

      min_β ‖Y − Xβ‖₂² + λ ‖β‖₂²
  • Effect: Reduces model complexity and enhances prediction stability.

Model Selection Criteria

  • Common Criteria:
    • AIC (Akaike Information Criterion):

      AIC = 2k − 2 ln(L̂)

    • BIC (Bayesian Information Criterion):

      BIC = k ln(n) − 2 ln(L̂)

      where k is the number of parameters, n the sample size, and L̂ the maximized likelihood.
  • Use: Select models based on the trade-off between goodness of fit and model complexity.

Conclusion

  • Key Takeaways:
    • Emphasis on generalization is crucial for model validity.
    • Understanding the balance between bias and variance helps improve performance.
    • Regularization and cross-validation are powerful tools against overfitting.

Further Reading

  • Flach, P. A. (2015): "Machine Learning: The Art and Science of Algorithms that Make Sense of Data".
  • Hastie, T., Tibshirani, R., & Friedman, J. (2009): "The Elements of Statistical Learning".
  • Bishop, C. M. (2006): "Pattern Recognition and Machine Learning".

Part 5 · High Dimension and Endogeneity

  • Focus: Addressing endogeneity and high-dimensional settings in econometrics.
  • Key Topics:
    • Sources of endogeneity
    • Instrumental variables (IV)
    • Double/debiased machine learning techniques
  • Goal: Develop practical inference methods in complex models.

Introduction to Endogeneity

  • Definition: Occurs when an explanatory variable correlates with the error term in a regression model.
  • Consequences:
    • Biased and inconsistent estimates.
    • Misleading inference results.

Sources of Endogeneity

  1. Omitted Variable Bias:
    • When a relevant variable is left out of the model.
  2. Measurement Error:
    • Errors in the variable values leading to biased estimates.
  3. Simultaneity:
    • Mutual causation between dependent and independent variables.

Instrumental Variables (IV)

  • Purpose: Provide valid estimates when endogeneity is present.
  • Conditions:
    1. Relevance: Instrument must be correlated with the endogenous variable.
    2. Exogeneity: Instrument affects the dependent variable only through the endogenous variable.

The Two-Stage Least Squares (2SLS) Method

  1. First Stage: Regress the endogenous variable on the instrument(s) and other controls:

     D_i = π Z_i + X_i'γ + v_i.

  2. Second Stage: Replace the endogenous variable with its predicted values D̂_i and estimate the outcome equation:

     Y_i = α D̂_i + X_i'β + ε_i.

Limitations of IV

  • Weak Instruments: Instruments that are weakly correlated with the endogenous variable can lead to unreliable estimates.
  • Overidentification: Using more instruments than needed permits specification tests, but many (or weak) instruments can bias finite-sample estimates.

Double/Debiased Machine Learning (DML)

  • Purpose: Address challenges in high-dimensional settings with endogeneity.
  • Key Features:
    • Combines machine learning with econometric techniques.
    • Enhances estimation precision while addressing bias.

DML Framework

  1. Modeling: Uses machine learning algorithms to predict both treatment and outcome.
  2. Debiasing: Adjusts for the bias introduced by selection procedures.
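A minimal sketch of these two steps using a partialling-out (orthogonal) score with 2-fold cross-fitting; degree-5 polynomial fits stand in for the ML learners, and the data-generating process is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(9)
n = 10_000
x = rng.normal(0, 1, n)
d = np.sin(x) + rng.normal(0, 1, n)            # treatment depends on x nonlinearly
y = 1.0 * d + np.cos(x) + rng.normal(0, 1, n)  # true effect theta = 1.0

# Cross-fitting: split into K folds, predict nuisances out-of-fold
K = 2
folds = np.array_split(rng.permutation(n), K)
d_res = np.empty(n)
y_res = np.empty(n)
for i in range(K):
    test = folds[i]
    train = np.concatenate([folds[j] for j in range(K) if j != i])
    # Flexible nuisance fits (degree-5 polynomials as a stand-in for ML)
    m_hat = np.polyfit(x[train], d[train], 5)   # E[D|X]
    g_hat = np.polyfit(x[train], y[train], 5)   # E[Y|X]
    d_res[test] = d[test] - np.polyval(m_hat, x[test])
    y_res[test] = y[test] - np.polyval(g_hat, x[test])

# Orthogonal (partialling-out) estimate of the treatment effect
theta_hat = (d_res @ y_res) / (d_res @ d_res)
```

Regressing the out-of-fold residuals on each other is what makes the estimate insensitive to first-order errors in the nuisance fits; cross-fitting removes the overfitting bias from reusing the same data for prediction and estimation.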

Theoretical Foundation of DML

  • Assumptions:
    • Cross-sectional independence between the error terms.
    • Stable treatment effects across observations.
  • Estimation: Neyman-orthogonal scores combined with cross-fitting (sample splitting) deliver √n-consistent, asymptotically normal estimates of the treatment effect.


Practical Application

  • Example: Estimating causal effects of education on wages.
  • Data: Longitudinal data capturing education, experience, and wages.
  • Analysis Steps:
    1. Identify endogenous variables.
    2. Apply 2SLS or DML as needed.

Conclusion

  • Summary:
    • High-dimensional data requires robust techniques to address endogeneity.
    • DML serves as an effective approach for causal inference.
    • Strong emphasis on model selection and validation.
Financial Machine Learning · Lecture 8

Further Reading

  • Bibliography:
    • Angrist, J. D., & Pischke, J. S. (2009). "Mostly Harmless Econometrics".
    • Belloni, A., Chen, D., et al. (2014). "Inference in High-Dimensional Sparse Econometric Models".
    • Imbens, G. W., & Rubin, D. B. (2015). "Causal Inference in Statistics, Social, and Biomedical Sciences".
Financial Machine Learning · Lecture 8

Part 6 · Going Further

  • Focus: Advanced topics in econometrics and machine learning.
  • Goals:
    • Explore state-of-the-art methods and their applications.
    • Discuss limitations and future research directions.
Financial Machine Learning · Lecture 8

Advanced Econometric Techniques

  • Machine Learning Integration: Harnessing ML methods like random forests, boosting, and neural networks.
  • Model Flexibility: Adoption of flexible modeling approaches to capture complex relationships.
Financial Machine Learning · Lecture 8

Regularization Techniques

  • Purpose: Prevent overfitting and enhance predictive performance.

  • Common Algorithms:

    • Lasso Regression:

      $\hat{\beta}^{\text{lasso}} = \arg\min_{\beta} \sum_{i=1}^{n} \big(y_i - x_i^\top \beta\big)^2 + \lambda \sum_{j=1}^{p} |\beta_j|$

    • Ridge Regression:

      $\hat{\beta}^{\text{ridge}} = \arg\min_{\beta} \sum_{i=1}^{n} \big(y_i - x_i^\top \beta\big)^2 + \lambda \sum_{j=1}^{p} \beta_j^2$
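A quick scikit-learn comparison of the two penalties on a simulated sparse problem (dimensions and penalty levels are illustrative):

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(2)
n, p = 200, 50
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:3] = [2.0, -1.5, 1.0]            # only 3 of 50 coefficients matter
y = X @ beta + rng.normal(size=n)

lasso = Lasso(alpha=0.1).fit(X, y)     # L1 penalty: sets coefficients to zero
ridge = Ridge(alpha=1.0).fit(X, y)     # L2 penalty: shrinks, but keeps all

print("Lasso nonzero coefficients:", np.sum(lasso.coef_ != 0))
print("Ridge nonzero coefficients:", np.sum(ridge.coef_ != 0))
```

The L1 penalty produces a sparse fit that keeps the three true signals, while the L2 penalty shrinks every coefficient without zeroing any out.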

Financial Machine Learning · Lecture 8

Model Evaluation Metrics

  • Root Mean Square Error (RMSE):

    $\text{RMSE} = \sqrt{\dfrac{1}{n} \sum_{i=1}^{n} \big(y_i - \hat{y}_i\big)^2}$

  • Confusion Matrix: Evaluates classification performance.
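Both metrics in a few lines of scikit-learn, on made-up toy values:

```python
import numpy as np
from sklearn.metrics import mean_squared_error, confusion_matrix

# Regression: RMSE penalizes large errors quadratically.
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])
rmse = np.sqrt(mean_squared_error(y_true, y_pred))
print(f"RMSE: {rmse:.3f}")

# Classification: rows = actual class, columns = predicted class.
labels_true = [1, 0, 1, 1, 0, 1]
labels_pred = [1, 0, 0, 1, 0, 1]
print(confusion_matrix(labels_true, labels_pred))
```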
Financial Machine Learning · Lecture 8

Causal Inference Frameworks

  • Counterfactual Framework: Establishing causal relationships using potential outcomes.
  • Graphical Models:
    • Directed Acyclic Graphs (DAGs) to illustrate assumptions and dependencies.
Financial Machine Learning · Lecture 8

Limitations and Challenges

  • Data Limitations: Quality and quantity of data can restrict model performance.
  • Interpretability: Balancing model complexity with the need for explainability.
Financial Machine Learning · Lecture 8

Future Directions

  • Research Focus: Addressing complexities in data and methodologies.
  • Collaboration: Interdisciplinary approaches to enrich econometric methods.
Financial Machine Learning · Lecture 8

Conclusion

  • Summary: This chapter enriches understanding of advanced econometric techniques and their applications.
  • Impact: Provides a foundation for future research and method development.
Financial Machine Learning · Lecture 8

Part 7 · Inference on Heterogeneous Effects

  • Focus: Understanding and estimating heterogeneous treatment effects.
  • Key Topics:
    • Methods for heterogeneous effects
    • Implications for policy and treatment personalization
Financial Machine Learning · Lecture 8

Understanding Heterogeneity

  • Definition: Variation in treatment effects across different individuals or groups.
  • Importance: Acknowledges that a single policy may not be effective for everyone.
Financial Machine Learning · Lecture 8

Methods for Estimating Heterogeneous Effects

  • Subgroup Analysis: Stratifying data to observe effects within specific groups.
  • Quantile Regression: Examining effects across different quantiles of the outcome distribution.
Financial Machine Learning · Lecture 8

Statistical Framework

  • Model Specification:

    $Y_i = \alpha + \tau D_i + X_i^\top \beta + \varepsilon_i$

Where:

  • $Y_i$: Outcome
  • $D_i$: Treatment indicator
  • $X_i$: Covariates
Financial Machine Learning · Lecture 8

Identifying Heterogeneous Effects

  • Interactions: Incorporating interaction terms to capture effect variations:

    $Y_i = \alpha + \tau D_i + X_i^\top \beta + D_i \, X_i^\top \gamma + \varepsilon_i$

  • Random Effects Models: Include random intercepts/slopes to account for individual variations.
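An interaction model of this kind can be estimated with plain OLS; the sketch below uses a single covariate and a simulated effect that grows with it (all parameter values are invented):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4000
x = rng.normal(size=n)                      # single covariate
d = rng.integers(0, 2, size=n)              # randomized treatment
tau = 1.0 + 0.5 * x                         # treatment effect varies with x
y = 2.0 + tau * d + 0.3 * x + rng.normal(size=n)

# OLS with an interaction term: y ~ 1 + d + x + d*x
X = np.column_stack([np.ones(n), d, x, d * x])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
print(f"baseline effect:          {beta[1]:.2f}")
print(f"interaction coefficient:  {beta[3]:.2f}")
```

The coefficient on $d$ recovers the effect at $x = 0$, and the coefficient on the interaction recovers how the effect changes per unit of $x$.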
Financial Machine Learning · Lecture 8

Use of Machine Learning

  • Flexible Models: Employ machine learning models to identify and estimate treatment heterogeneity.
  • Feature Selection: Utilize high-dimensional covariates for improved accuracy in predictions.
Financial Machine Learning · Lecture 8

Implications for Policy

  • Personalized Treatment: Tailoring interventions based on estimated heterogeneous effects.
  • Evaluation: Regular assessments to gauge the effectiveness across different demographics.
Financial Machine Learning · Lecture 8

Conclusion

  • Summary: Understanding heterogeneity is fundamental for effective policy-making.
  • Future Research: Exploring novel methodologies to estimate and interpret heterogeneous effects effectively.
Financial Machine Learning · Lecture 8

Part 8 · Optimal Policy Learning

  • Focus: Principles of optimal policy learning in econometrics.
  • Key Objectives:
    • Develop methods for policy evaluation and optimization.
    • Establish effective treatment assignment strategies.
Financial Machine Learning · Lecture 8

Introduction to Optimal Policy Learning

  • Concept: Learning policies that maximize some objective function based on data.
  • Applications: Personalized treatment assignment, resource allocation.
  • Challenge: Balancing exploration and exploitation in decision-making.
Financial Machine Learning · Lecture 8

Framework for Policy Learning

  1. Assumptions:

    • The environment influences outcomes through policies.
    • Historical data is pivotal in learning efficient policies.
  2. Objective: Maximize expected utility or outcome over a policy class $\Pi$, defined as:

    $\pi^{*} = \arg\max_{\pi \in \Pi} V(\pi), \qquad V(\pi) = \mathbb{E}\big[ Y\big( \pi(X) \big) \big]$

Financial Machine Learning · Lecture 8

Treatment Assignment Strategies

  • Randomized Control Trials (RCTs): Benchmark for evaluating policy effectiveness.
  • Adaptive Treatments: Modification of treatment based on observed data over time.
  • Policy Evaluation: Use of counterfactual estimators to assess the effectiveness of candidate policies before deployment.
Financial Machine Learning · Lecture 8

Estimation of Policy Effects

  • Key Metrics:

    • Average Treatment Effect (ATE):

      $\text{ATE} = \mathbb{E}[Y(1) - Y(0)]$

  • Conditional ATE:

    $\tau(x) = \mathbb{E}[Y(1) - Y(0) \mid X = x]$
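Under random assignment, both quantities reduce to differences in means, overall and within covariate cells. A toy sketch with one binary covariate and a simulated effect (values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 10000
x = rng.integers(0, 2, size=n)              # binary covariate (e.g., a group)
d = rng.integers(0, 2, size=n)              # randomized treatment
y0 = x + rng.normal(size=n)                 # potential outcome without treatment
y1 = y0 + 1.0 + 1.0 * x                     # effect is 1 if x=0, 2 if x=1
y = np.where(d == 1, y1, y0)                # only one outcome is ever observed

# ATE: overall difference in means, unbiased under random assignment.
ate = y[d == 1].mean() - y[d == 0].mean()

# Conditional ATE: difference in means within each covariate cell.
cate = {v: y[(d == 1) & (x == v)].mean() - y[(d == 0) & (x == v)].mean()
        for v in (0, 1)}
print(f"ATE:  {ate:.2f}")
print(f"CATE: x=0 -> {cate[0]:.2f}, x=1 -> {cate[1]:.2f}")
```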

Financial Machine Learning · Lecture 8

Exploration vs. Exploitation

  • Exploration: Gathering more information to improve future policies.
  • Exploitation: Utilizing existing knowledge to optimize immediate outcomes.
  • Balancing Act: Policy learning algorithms must manage exploration and exploitation effectively for improved outcomes.
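A classic way to manage this trade-off is an $\varepsilon$-greedy rule: with probability $\varepsilon$ try a random policy (explore), otherwise use the best one found so far (exploit). The sketch below uses three hypothetical candidate policies with made-up reward means:

```python
import numpy as np

rng = np.random.default_rng(5)
true_means = [0.3, 0.5, 0.7]      # expected reward of three candidate policies
eps, T = 0.1, 5000
counts = np.zeros(3)
values = np.zeros(3)              # running mean reward per policy

for _ in range(T):
    if rng.random() < eps:                 # explore: try a random policy
        a = int(rng.integers(0, 3))
    else:                                  # exploit: best policy so far
        a = int(np.argmax(values))
    r = rng.normal(true_means[a], 0.1)     # noisy observed reward
    counts[a] += 1
    values[a] += (r - values[a]) / counts[a]   # incremental mean update

print("best policy found:", int(np.argmax(values)))
```

With enough exploration, the running means converge and the rule concentrates on the policy with the highest expected reward.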
Financial Machine Learning · Lecture 8

Implications for Policy Design

  • Adaptive policies: Tailored as more information becomes available.
  • Robustness: Policies must be resilient to model specifications.
  • Evaluation: Implement regular evaluations to refine policies based on new data.
Financial Machine Learning · Lecture 8

Conclusion

  • Summary: Optimal policy learning is crucial for effective econometric modeling and decision-making.
  • Future Directions: Continued integration of econometric and machine learning techniques for enhanced policy learning.
Financial Machine Learning · Lecture 8

Further Reading

  • Related Works:
    • "Causal Inference in Statistics, Social, and Biomedical Sciences" by Imbens and Rubin.
    • "Reinforcement Learning: An Introduction" by Sutton and Barto.
Financial Machine Learning · Lecture 8

Part 9 · Literature

Financial Machine Learning · Lecture 8

Literature: Inference on Treatment Effects after Selection among High-Dimensional Controls

Financial Machine Learning · Lecture 8

Inference on Treatment Effects after Selection among High-Dimensional Controls

  • Research Content: This paper presents a novel method for estimating treatment effects in the presence of high-dimensional control variables, particularly when the number of controls exceeds the sample size. The authors propose post-double-selection, a method that conducts two rounds of variable selection followed by treatment effect estimation in a partially linear model:

$y_i = \alpha_0 d_i + x_i^\top \beta_0 + \zeta_i \quad \text{and} \quad d_i = x_i^\top \gamma_0 + v_i$

Financial Machine Learning · Lecture 8
  • Main Ideas and Contributions
    • The proposed method selects controls that predict either the outcome or the treatment, requiring only approximate sparsity and guarding against omitted-variable bias from imperfect model selection.
    • It provides uniformly valid inference, meaning the confidence intervals obtained are robust across a wide class of models.
    • The authors illustrate the effectiveness of this approach with a reanalysis of the impact of abortion on crime rates, showing that their method can yield different conclusions than previous, less rigorous models.
  • Significance for Empirical Finance
    • The methodology enhances empirical finance research by allowing more flexible model specifications without overfitting.
    • It highlights the importance of rigorous variable selection, addressing the potential biases associated with omitted variable errors.
    • The findings compel researchers to use robust, adaptive techniques when estimating treatment effects, thereby contributing to more accurate economic models and policy analyses.
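In the spirit of the paper, post-double-selection can be sketched with scikit-learn (the paper sets Lasso penalty levels theoretically; the cross-validated penalties below are a simplification, and the data are simulated):

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(6)
n, p = 500, 100
X = rng.normal(size=(n, p))
d = X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)      # treatment: confounded
y = 0.5 * d + X[:, 0] + X[:, 2] + rng.normal(size=n)  # true effect = 0.5

# Round 1: Lasso of y on X — controls that predict the outcome.
sel_y = np.flatnonzero(LassoCV(cv=5).fit(X, y).coef_)
# Round 2: Lasso of d on X — controls that predict the treatment.
sel_d = np.flatnonzero(LassoCV(cv=5).fit(X, d).coef_)

# Final step: OLS of y on d plus the UNION of selected controls.
union = np.union1d(sel_y, sel_d)
W = np.column_stack([np.ones(n), d, X[:, union]])
alpha_hat = np.linalg.lstsq(W, y, rcond=None)[0][1]
print(f"post-double-selection estimate: {alpha_hat:.2f}")
```

Taking the union of the two selection rounds is the key step: a confounder that weakly predicts the outcome but strongly predicts the treatment is still kept as a control.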
Financial Machine Learning · Lecture 8

Literature: Propensity score estimation with boosted regression

Financial Machine Learning · Lecture 8

Propensity Score Estimation with Boosted Regression

  • Research Content: This paper addresses the challenges of estimating treatment effects in observational studies where participants differ significantly in their pre-treatment characteristics. Traditional methods like logistic regression may struggle with high-dimensional covariates. The authors propose using Generalized Boosted Models (GBM) for estimating propensity scores, enhancing the capacity to capture complex relationships between treatment assignment and covariates.

  • Main Ideas and Contributions

    • Boosting Technique: The paper demonstrates that the boosting technique can effectively model non-linear relationships and interactions among a large number of covariates, thus providing robust propensity score estimates.
    • Case Study Application: The method is illustrated using data from adolescent substance abuse treatment programs, revealing that GBM can significantly alter perceived treatment effects when compared to traditional logistic methods.
Financial Machine Learning · Lecture 8
    • Bias Reduction: By employing GBM, the authors manage to reduce hidden biases and improve covariate balance between treatment and control groups, contributing to better estimates of average treatment effects.
  • Significance for Empirical Finance

  • Enhanced Estimation: The use of GBM provides empirical finance researchers with a powerful tool to adjust for confounding variables, particularly in studies involving complex datasets where traditional methods fail.

  • Adapting to Non-Linearity: Recognizing and modeling non-linear relationships in financial data can lead to improved causal inferences, ultimately aiding in the evaluation of policy impacts and treatment efficacy in economic research.

  • Implications for Causal Inference: Findings underscore the necessity of utilizing advanced statistical techniques like boosting to mitigate biases in observational studies, setting a precedent for future empirical research methodologies.

Financial Machine Learning · Lecture 8

Literature: Recursive Partitioning for Heterogeneous Causal Effects

Financial Machine Learning · Lecture 8

Recursive Partitioning for Heterogeneous Causal Effects

  • Research Content: This paper introduces a novel method for identifying heterogeneous treatment effects using recursive partitioning techniques. The authors develop a framework that allows researchers to uncover variations in treatment effects across different subpopulations based on observed covariates. This approach is particularly relevant in contexts where treatment effects are expected to vary significantly among different groups.

  • Main Ideas and Contributions

    • Methodology: The study demonstrates how recursive partitioning can be effectively applied to causal inference, yielding insights into how different covariate patterns influence treatment effectiveness. This is formalized through the concept of Causal Trees, which create a tree structure reflecting variations in treatment effects.
    • Implementation and Results: The method is tested on synthetic and real data, showing robust performance relative to conventional approaches such as linear regression with interactions, even when treatment assignment is imbalanced across groups.
Financial Machine Learning · Lecture 8
    • Flexibility in Modeling: By accommodating non-linear interactions and allowing for easy interpretability through tree structures, this methodology enhances the understanding of causal relationships.
  • Significance for Empirical Finance

    • Targeted Policy Interventions: This approach can improve the design of financial policies and programs by identifying which subgroups benefit most from interventions, fostering targeted strategies.
    • Enhanced Predictive Performance: Recursive partitioning serves as a powerful tool for empirical finance, enabling the modeling of complex relationships where traditional methods may falter.
    • Guidance for Future Research: The findings encourage the adoption of machine learning techniques for causal inference, suggesting a shift towards more flexible, data-driven approaches in empirical finance research.
Financial Machine Learning · Lecture 8

Literature: Double/debiased machine learning for treatment and structural parameters

Financial Machine Learning · Lecture 8

Double Debiased Machine Learning for Treatment and Structural Parameters

  • Research Content: This paper introduces a Double Debiased Machine Learning (DML) framework designed to improve the estimation of treatment effects and structural parameters in complex econometric models. It addresses the challenge of bias when using machine learning methods for estimating treatment effects, particularly in high-dimensional settings where traditional inference methods may fall short. The DML framework combines regularization techniques with debiasing strategies, enhancing statistical efficiency without compromising the interpretability of the results.

  • Main Ideas and Contributions

    • DML Framework: The paper outlines the DML procedure, which consists of two stages: first, estimating nuisance parameters using machine learning techniques, and second, employing these estimators in a debiased manner to achieve consistent treatment effect estimates. The theoretical foundations of DML ensure that valid inference can be made even in high-dimensional contexts.
Financial Machine Learning · Lecture 8
    • Empirical Validation: The authors showcase the application of their method through simulations and real data examples, demonstrating significant performance improvements over traditional methods.
    • Broader Applicability: This method is generalizable to various econometric settings, making it applicable to a wide range of empirical research scenarios in finance and beyond.
  • Significance for Empirical Finance

    • Improved Estimation: The DML approach allows researchers in finance to obtain more reliable estimates of treatment effects and structural parameters, especially in complicated datasets characterized by many covariates.
    • Robustness: By addressing bias effectively, DML improves the robustness of causal inferences drawn from observational studies, leading to better policy implications and decision-making.
    • Guidance for Research Design: The findings promote the integration of advanced machine learning methods into causal inference, encouraging empirical finance scholars to rethink traditional methodological frameworks.
Financial Machine Learning · Lecture 8

Literature: Robust Nonparametric Confidence Intervals for Regression-Discontinuity Designs

Financial Machine Learning · Lecture 8

Robust Nonparametric Confidence Intervals for Regression-Discontinuity Designs

  • Research Content: This paper focuses on strengthening statistical inference in regression-discontinuity (RD) designs, a widely used method for causal inference in policy evaluation and economics. The authors develop robust nonparametric confidence intervals that correct for the smoothing bias of local polynomial estimators near the cutoff, allowing researchers to make credible inferences about treatment effects at the point of discontinuity.

  • Main Ideas and Contributions

    • Robust Methodology: The proposed method improves on conventional parametric approaches by utilizing nonparametric techniques that maintain robustness against misspecifications. This is particularly relevant in cases where the treatment effect can vary significantly at and around the cutoff.
    • Confidence Interval Construction: The authors derive new confidence intervals that are valid under weaker conditions while employing a data-driven bandwidth selection approach, ensuring optimal interval calibration based on the underlying data distribution.
Financial Machine Learning · Lecture 8
    • Simulation Studies: Extensive simulations and applications illustrate the advantages of the proposed method over standard methods, demonstrating greater accuracy and reliability in estimating treatment effects in RD designs.

Significance for Empirical Finance

  • Enhanced Causal Inference: The findings empower researchers in finance to more accurately estimate the impact of interventions and policies that exhibit discontinuities, leading to better-informed decisions.
  • Applicability: This robust framework can be applied to various empirical contexts, including evaluations of fiscal policy, educational programs, and health interventions, enhancing the quality of causal inference across disciplines.
  • Methodological Contribution: Encouraging the adoption of robust nonparametric methods in causal inference highlights the importance of flexibility in model specifications, ultimately leading to more reliable empirical findings in finance and economics.
Financial Machine Learning · Lecture 8

Literature: Policy Learning with Observational Data

Financial Machine Learning · Lecture 8

Policy Learning with Observational Data

  • Research Content: This paper investigates the general challenges and methodologies associated with learning optimal policies from observational data. It develops a comprehensive framework that integrates causal inference with machine learning techniques for effective policy evaluation in non-experimental settings. This approach acknowledges the inherent selection biases present in observational data while aiming to derive robust policy recommendations.

  • Main Ideas and Contributions

    • Framework Development: The authors propose a novel algorithm combining policy learning and counterfactual predictions, allowing for the identification of optimal treatment strategies. The framework encompasses a two-step process: estimating the causal effect of treatments and refining the policy recommendation based on those estimates.
    • Addressing Selection Bias: By utilizing techniques in both causal inference and machine learning, the paper demonstrates how to mitigate biases that typically skew the estimation of treatment effects.
Financial Machine Learning · Lecture 8
    • Empirical Applications: Through various empirical illustrations, the authors validate the effectiveness of their proposed methods in real-world scenarios, showing improvements in policy outcomes compared to traditional approaches.

Significance for Empirical Finance

  • Practical Implications: The findings underscore the potential of leveraging observational data for better policy-making in finance, particularly in contexts where randomized controlled trials are infeasible.
  • Advancing Methodologies: This research contributes a methodological advancement by combining machine learning with causal inference, encouraging empirical finance researchers to adopt more nuanced analytic techniques.
  • Improved Decision-Making: The ability to derive actionable insights from observational data enhances financial decision-making processes, allowing for the formulation of policies that are not only effective but also responsive to real-world complexities.
Financial Machine Learning · Lecture 8

Literature: GANITE: Estimation of Individualized Treatment Effects using Generative Adversarial Nets

Financial Machine Learning · Lecture 8

GANITE: Estimation of Individualized Treatment Effects using Generative Adversarial Nets

  • Research Content: This paper introduces GANITE, a novel framework utilizing Generative Adversarial Networks (GANs) to estimate Individualized Treatment Effects (ITE) from observational data. The authors aim to address challenges in traditional ITE estimation methods, particularly in high-dimensional covariate spaces where conventional techniques tend to underperform or overfit.

  • Main Ideas and Contributions

    • Generative Approach: GANITE leverages the power of GANs to generate counterfactual outcomes, enabling more accurate estimation of the treatment effect for each individual. The framework incorporates a generator that models treatment assignment and an adversarial discriminator that assists in assessing the quality of counterfactual predictions.
    • Improved ITE Estimation: By formalizing the relationship between treatment assignment, covariates, and outcomes, GANITE enhances the precision of ITE estimates compared to traditional regression-based methods.
Financial Machine Learning · Lecture 8
    • Empirical Validation: The authors validate their approach through extensive simulations and real-world data applications, demonstrating significant improvements in estimating individualized treatment responses over existing methodologies.
  • Significance for Empirical Finance

    • Personalized Decision Making: GANITE's robust estimation of ITE allows financial analysts and policymakers to tailor interventions to individual characteristics, enhancing the efficacy of financial products and services.
    • Advancements in Methodology: The integration of GANs into causal inference represents a methodological leap, encouraging researchers to adopt advanced machine learning techniques in empirical finance studies.
    • Insightful Policy Formulation: By providing accurate treatment effect estimates, GANITE aids in the formulation of data-driven policies that respond to nuanced economic behaviors and conditions, contributing to more effective financial governance.
Financial Machine Learning · Lecture 8