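Applications of Deep Learning in Quantitative Trading

Introduction

This notebook presents five hands-on applications of deep learning in quantitative finance:

  1. Predicting stock returns with a multilayer perceptron (MLP)
  2. Predicting the volatility of financial time series with a long short-term memory (LSTM) network
  3. Recognizing patterns in financial time series with a convolutional neural network (CNN)
  4. Generating synthetic financial time series with a generative adversarial network (GAN)
  5. Optimizing a portfolio with deep reinforcement learning (DDPG)

Each example includes implementation code, an explanation of the model, and visualizations of the results.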

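Installing the Required Libraries

Before running the code below, make sure all required libraries are installed.

In [1]:
# Install the required libraries
# %pip install numpy pandas matplotlib yfinance scikit-learn tensorflow
# %conda install ta-lib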

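Example 2: Predicting Financial Time Series Volatility with a Long Short-Term Memory (LSTM) Network

Problem Description

Volatility measures how sharply the price of a financial asset moves, and it is central to option pricing, risk management, and trading decisions. This example uses an LSTM network to predict the future volatility of a price series. The code below runs on synthetic data rather than real stock quotes.

Introduction to LSTM Networks

LSTM is a specialized form of recurrent neural network (RNN) designed to handle long-term dependencies in sequence data. Its gating mechanism (input, forget, and output gates) mitigates the vanishing-gradient problem of traditional RNNs, which lets it learn patterns across long sequences.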
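To make the gating mechanism concrete, here is a minimal NumPy sketch of a single LSTM time step. It is an illustration only, not how Keras implements the layer internally: the function name `lstm_step`, the gate ordering inside the stacked weight matrices, and the random weights are all assumptions of this sketch.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """Run one LSTM time step.

    W: (4*units, input_dim) input weights
    U: (4*units, units) recurrent weights
    b: (4*units,) biases
    Assumed gate order in the stacked weights: input, forget, candidate, output.
    """
    units = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    i = sigmoid(z[0 * units:1 * units])   # input gate: how much new information to write
    f = sigmoid(z[1 * units:2 * units])   # forget gate: how much of the old cell state to keep
    g = np.tanh(z[2 * units:3 * units])   # candidate values for the cell state
    o = sigmoid(z[3 * units:4 * units])   # output gate: how much of the cell state to expose
    c_t = f * c_prev + i * g              # additive update of the cell state
    h_t = o * np.tanh(c_t)                # new hidden state
    return h_t, c_t

# Feed a short, made-up volatility sequence through the cell
rng = np.random.default_rng(0)
input_dim, units = 1, 4
W = rng.normal(scale=0.1, size=(4 * units, input_dim))
U = rng.normal(scale=0.1, size=(4 * units, units))
b = np.zeros(4 * units)
h, c = np.zeros(units), np.zeros(units)
for value in [0.15, 0.16, 0.18, 0.17]:
    h, c = lstm_step(np.array([value]), h, c, W, U, b)
print(h.shape, c.shape)
```

The cell state is updated additively (`f * c_prev + i * g`) rather than being pushed through a squashing nonlinearity at every step, which is the main reason gradients decay more slowly than in a plain RNN.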

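Model Architecture

  • Input layer: receives a sequence of historical volatility values (20 time steps, one feature)
  • LSTM layers: two stacked LSTM layers of 50 units each, each followed by Dropout(0.2), capture temporal dependencies in the series
  • Output layer: a single Dense unit that predicts the next volatility value

Implementation Steps

  1. Obtain historical price data (this notebook generates synthetic data instead of downloading real quotes)
  2. Compute historical volatility as the rolling standard deviation of returns
  3. Build sequence data for the LSTM to learn from
  4. Build and train the LSTM model
  5. Evaluate the model and predict future volatility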
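Step 2 can be written out explicitly. With closing prices $P_t$, daily simple returns $r_t$, and a window of $w$ days ($w = 20$ in this notebook), the annualized rolling volatility is the sample standard deviation of the last $w$ returns scaled by $\sqrt{252}$. The symbols are introduced here for illustration; the $w - 1$ denominator matches the default `ddof=1` of the pandas rolling `std()` used in the code.

```latex
r_t = \frac{P_t}{P_{t-1}} - 1, \qquad
\bar{r}_t = \frac{1}{w}\sum_{i=0}^{w-1} r_{t-i}, \qquad
\sigma_t = \sqrt{252}\,\sqrt{\frac{1}{w-1}\sum_{i=0}^{w-1}\left(r_{t-i}-\bar{r}_t\right)^{2}}
```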
In [2]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import yfinance as yf
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error, mean_absolute_error
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout
from tensorflow.keras.optimizers import Adam
import warnings
warnings.filterwarnings('ignore')

# Set random seeds so that results are reproducible
np.random.seed(42)
tf.random.set_seed(42)
In [3]:
# Generate synthetic financial data
def generate_synthetic_data(periods=2000, seed=42):
    """
    Generate synthetic financial data, including prices and volatility.
    
    Parameters:
        periods (int): Number of data points to generate
        seed (int): Seed for the random number generator
        
    Returns:
        pandas.DataFrame: DataFrame with synthetic price and volatility data
    """
    np.random.seed(seed)
    
    # Build the date index; trading days are business days
    end_date = pd.Timestamp.now().normalize()
    start_date = end_date - pd.Timedelta(days=int(periods * 1.5))  # extra days to cover weekends and holidays
    dates = pd.date_range(start=start_date, end=end_date, freq='B')[:periods]
    
    # Initialize the DataFrame
    df = pd.DataFrame(index=dates)
    
    # Initial price
    initial_price = 100.0
    
    # Generate a volatility series with volatility clustering:
    # high volatility tends to follow high volatility, and low follows low
    volatility = np.zeros(periods)
    vol = 0.01  # initial volatility level
    for i in range(periods):
        # A volatility shock occurs with 5% probability
        if np.random.random() < 0.05:
            # The shock may raise volatility (positive) or lower it (negative)
            shock = np.random.choice([-0.005, 0.005, 0.01, 0.015, 0.02])
            vol = max(0.002, vol + shock)  # keep volatility positive
        
        # Volatility is mean-reverting
        vol = vol * 0.98 + 0.01 * 0.02  # slow reversion toward 1%
        
        # Add random noise
        vol = max(0.002, vol * (1 + 0.1 * np.random.randn()))
        
        volatility[i] = vol
    
    # Generate returns from the volatility series
    returns = np.zeros(periods)
    for i in range(periods):
        # Draw the daily return using the current volatility level
        returns[i] = np.random.normal(0.0002, volatility[i])  # mean slightly above 0 to reflect the market's upward drift
    
    # Build the price series from the returns
    price = np.zeros(periods)
    price[0] = initial_price
    for i in range(1, periods):
        price[i] = price[i-1] * (1 + returns[i])
    
    # Create OHLC price data
    df['Open'] = price
    df['High'] = price * (1 + np.abs(np.random.normal(0, 0.002, periods)))  # slightly above the close
    df['Low'] = price * (1 - np.abs(np.random.normal(0, 0.002, periods)))   # slightly below the close
    df['Close'] = price
    df['Volume'] = np.random.randint(100000, 10000000, periods)  # random trading volume
    
    # Compute realized returns
    df['Return'] = df['Close'].pct_change()
    
    # Compute realized volatility (20-day rolling standard deviation, annualized)
    df['Volatility'] = df['Return'].rolling(window=20).std() * np.sqrt(252)
    
    # Drop NaN rows
    df.dropna(inplace=True)
    
    print(f"Generated synthetic financial data: {len(df)} records")
    return df
In [4]:
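# Compute historical volatility
# (helper for real price data; main() below does not call it, because
# generate_synthetic_data already adds the 'Volatility' column)
def calculate_volatility(df, window=20):
    """
    Compute historical volatility. Note that df is modified in place.
    
    Parameters:
        df (pandas.DataFrame): Price data with a 'Close' column
        window (int): Rolling window size used to compute volatility
        
    Returns:
        pandas.DataFrame: The same DataFrame with 'Return' and 'Volatility' columns added
    """
    # Compute daily returns
    df['Return'] = df['Close'].pct_change()
    
    # Compute rolling volatility over `window` days, annualized
    # Multiplying by sqrt(252) annualizes daily volatility; 252 is the usual number of trading days in a year
    df['Volatility'] = df['Return'].rolling(window=window).std() * np.sqrt(252)
    
    # Drop rows containing NaN
    df.dropna(inplace=True)
    
    return df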
In [5]:
# Create sequence data
def create_sequences(df, target_col, sequence_length=20):
    """
    Create sequence data for the LSTM.
    
    Parameters:
        df (pandas.DataFrame): DataFrame containing the target column
        target_col (str): Name of the target column
        sequence_length (int): Length of each input sequence
        
    Returns:
        tuple: (X, y) input sequences and target values
    """
    X, y = [], []
    for i in range(len(df) - sequence_length):
        # Sequence X: values at positions i through i+sequence_length-1
        # Target y: the value at position i+sequence_length
        X.append(df[target_col].values[i:i+sequence_length])
        y.append(df[target_col].values[i+sequence_length])
    
    return np.array(X), np.array(y)
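The sliding-window construction is easier to follow on a tiny series. The sketch below uses a hypothetical helper, `make_windows`, that applies the same indexing as `create_sequences` to a plain list instead of a DataFrame column; the six values are made up.

```python
import numpy as np

def make_windows(values, sequence_length):
    """Return (X, y): each row of X holds sequence_length consecutive values,
    and the matching entry of y is the value that immediately follows them."""
    values = np.asarray(values, dtype=float)
    X = np.array([values[i:i + sequence_length]
                  for i in range(len(values) - sequence_length)])
    y = values[sequence_length:]
    return X, y

series = [0.10, 0.11, 0.12, 0.13, 0.14, 0.15]
X, y = make_windows(series, sequence_length=3)
print(X.shape, y.shape)  # (3, 3) (3,)
print(X[0], y[0])        # first window and the value it should predict
```

A series of length n yields n - sequence_length samples, which is why the 1980 records in this notebook produce 1960 windows before the train/test split.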
In [6]:
# Prepare the data
def prepare_data(df, target_col, sequence_length=20, test_size=0.2):
    """
    Prepare training and test data.
    
    Parameters:
        df (pandas.DataFrame): Source data
        target_col (str): Name of the target column
        sequence_length (int): Length of each input sequence
        test_size (float): Fraction of samples held out for testing
        
    Returns:
        tuple: (X_train, X_test, y_train, y_test)
    """
    # Create the sequences
    X, y = create_sequences(df, target_col, sequence_length)
    
    # Reshape for the LSTM: (samples, time steps, features)
    X = X.reshape(X.shape[0], X.shape[1], 1)
    
    # Split chronologically into training and test sets (no shuffling)
    train_size = int(len(X) * (1 - test_size))
    X_train, X_test = X[:train_size], X[train_size:]
    y_train, y_test = y[:train_size], y[train_size:]
    
    return X_train, X_test, y_train, y_test
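The notebook imports `MinMaxScaler` but trains on unscaled volatility. If scaling were added, the usual precaution with time series is to fit the scaler on the training portion only. The sketch below shows that idea with plain NumPy; `split_and_scale` and the toy data are hypothetical and not part of the notebook, and scikit-learn's `MinMaxScaler` fitted on the training data could play the same role.

```python
import numpy as np

def split_and_scale(X, y, test_size=0.2):
    """Split chronologically, then min-max scale with training-set statistics only."""
    train_size = int(len(X) * (1 - test_size))
    X_train, X_test = X[:train_size], X[train_size:]
    y_train, y_test = y[:train_size], y[train_size:]

    # Fit the scaling range on the training inputs only, so that no
    # information from the test period leaks into preprocessing.
    lo, hi = float(X_train.min()), float(X_train.max())
    span = hi - lo if hi > lo else 1.0

    def scale(a):
        return (a - lo) / span

    def unscale(a):
        return a * span + lo

    return scale(X_train), scale(X_test), scale(y_train), scale(y_test), unscale

# Toy data: 10 windows of length 3 taken from a steadily rising series
series = np.linspace(0.10, 0.22, 13)
X = np.array([series[i:i + 3] for i in range(10)]).reshape(10, 3, 1)
y = series[3:]
X_train, X_test, y_train, y_test, unscale = split_and_scale(X, y)
print(X_train.shape, X_test.shape)
print(X_train.min(), X_train.max())
print(X_test.max() > 1.0)  # test values can fall outside [0, 1]
```

Predictions made on scaled targets would need to go through `unscale` before being compared with the original volatility values.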
In [7]:
# Build the LSTM model
def build_lstm_model(sequence_length):
    """
    Build the LSTM model.
    
    Parameters:
        sequence_length (int): Length of the input sequence
        
    Returns:
        tf.keras.Model: The compiled LSTM model
    """
    model = Sequential([
        # First LSTM layer; returns the full sequence for the second LSTM layer
        LSTM(50, return_sequences=True, input_shape=(sequence_length, 1)),
        Dropout(0.2),  # reduces overfitting
        
        # Second LSTM layer
        LSTM(50),
        Dropout(0.2),
        
        # Output layer: predicts volatility (a regression problem)
        Dense(1)
    ])
    
    # Compile the model
    model.compile(
        optimizer=Adam(learning_rate=0.001),
        loss='mse',           # mean squared error loss
        metrics=['mae']       # also track mean absolute error
    )
    
    return model
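The parameter counts that `model.summary()` reports for this architecture (10,400, 20,200, and 51) can be reproduced by hand. An LSTM layer has four gate blocks, each with an input weight matrix, a recurrent weight matrix, and a bias vector. The helper names below are hypothetical and exist only for this check.

```python
def lstm_param_count(units, input_dim):
    # 4 gate blocks, each with (input_dim + units) weights per unit plus one bias per unit
    return 4 * (units * (input_dim + units) + units)

def dense_param_count(units, input_dim):
    # one weight per input per unit, plus one bias per unit
    return units * input_dim + units

first = lstm_param_count(units=50, input_dim=1)    # first LSTM layer sees 1 feature
second = lstm_param_count(units=50, input_dim=50)  # second LSTM layer sees 50 hidden values
head = dense_param_count(units=1, input_dim=50)    # Dense(1) on the 50-unit output
print(first, second, head, first + second + head)  # 10400 20200 51 30651
```

Dropout layers add no parameters, so the total of 30,651 comes entirely from the two LSTM layers and the Dense head.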
In [8]:
# Main function
def main():
    # Generate synthetic financial data instead of fetching real stock data
    data = generate_synthetic_data(periods=2000, seed=42)
    
    # Plot the volatility data
    plt.figure(figsize=(14, 6))
    plt.plot(data.index, data['Volatility'])
    plt.title('Historical Volatility of Synthetic Data (Annualized)')
    plt.xlabel('Date')
    plt.ylabel('Volatility')
    plt.grid(True, alpha=0.3)
    plt.show()
    
    # Define the target column and the sequence length
    target_col = 'Volatility'
    sequence_length = 20  # use 20 days of history for each prediction
    
    # Prepare the data
    X_train, X_test, y_train, y_test = prepare_data(data, target_col, sequence_length)
    print(f'Training set shape: {X_train.shape}, test set shape: {X_test.shape}')
    
    # Build the model
    model = build_lstm_model(sequence_length)
    model.summary()
    
    # Train the model
    history = model.fit(
        X_train, y_train,
        epochs=50,
        batch_size=32,
        validation_split=0.2,
        callbacks=[tf.keras.callbacks.EarlyStopping(patience=5, restore_best_weights=True)],
        verbose=1
    )
    
    # Evaluate the model
    y_pred = model.predict(X_test).flatten()  # (n, 1) -> (n,) so it lines up with y_test
    mse = mean_squared_error(y_test, y_pred)
    mae = mean_absolute_error(y_test, y_pred)
    rmse = np.sqrt(mse)
    
    print(f'Test MSE: {mse:.6f}')
    print(f'Test RMSE: {rmse:.6f}')
    print(f'Test MAE: {mae:.6f}')
    
    # Plot the training history
    plt.figure(figsize=(12, 5))
    plt.subplot(1, 2, 1)
    plt.plot(history.history['loss'], label='Training Loss')
    plt.plot(history.history['val_loss'], label='Validation Loss')
    plt.title('Loss History')
    plt.xlabel('Epoch')
    plt.ylabel('Loss (MSE)')
    plt.legend()
    
    # Plot the predictions
    plt.subplot(1, 2, 2)
    plt.plot(y_test, label='Actual Volatility')
    plt.plot(y_pred, label='Predicted Volatility')
    plt.title('Volatility Prediction')
    plt.xlabel('Trading Days')
    plt.ylabel('Volatility (Annualized)')
    plt.legend()
    plt.tight_layout()
    plt.show()
    
    # Merge the predictions with the actual data
    # to prepare a table for the analysis below
    test_start_idx = len(data) - len(y_test)
    test_data = data.iloc[test_start_idx:].copy()
    test_data['Predicted_Volatility'] = y_pred
    
    # Compute the prediction errors
    test_data['Prediction_Error'] = test_data['Volatility'] - test_data['Predicted_Volatility']
    test_data['Absolute_Error'] = np.abs(test_data['Prediction_Error'])
    test_data['Percentage_Error'] = test_data['Absolute_Error'] / test_data['Volatility'] * 100
    
    # Print the mean percentage error
    mean_percentage_error = test_data['Percentage_Error'].mean()
    print(f'Mean percentage error: {mean_percentage_error:.2f}%')
    
    # Plot the full results
    plt.figure(figsize=(14, 7))
    plt.plot(data.index, data['Volatility'], label='Historical Volatility', alpha=0.7)
    plt.plot(test_data.index, test_data['Predicted_Volatility'], label='Predicted Volatility', color='red')
    plt.title('Volatility Prediction using LSTM on Synthetic Data')
    plt.xlabel('Date')
    plt.ylabel('Volatility (Annualized)')
    plt.legend()
    plt.grid(True, alpha=0.3)
    plt.tight_layout()
    plt.show()
    
    # Plot the prediction error distribution
    plt.figure(figsize=(12, 5))
    plt.subplot(1, 2, 1)
    plt.hist(test_data['Prediction_Error'], bins=30)
    plt.title('Prediction Error Distribution')
    plt.xlabel('Error')
    plt.ylabel('Frequency')
    
    plt.subplot(1, 2, 2)
    plt.plot(test_data.index, test_data['Prediction_Error'])
    plt.title('Prediction Error Over Time')
    plt.xlabel('Date')
    plt.ylabel('Error')
    plt.axhline(y=0, color='r', linestyle='-')
    plt.tight_layout()
    plt.show()

# Run the main function
if __name__ == "__main__":
    main()
Generated synthetic financial data: 1980 records
[Figure: historical volatility of the synthetic data (annualized)]
Training set shape: (1568, 20, 1), test set shape: (392, 20, 1)
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ lstm (LSTM)                     │ (None, 20, 50)         │        10,400 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout (Dropout)               │ (None, 20, 50)         │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ lstm_1 (LSTM)                   │ (None, 50)             │        20,200 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout_1 (Dropout)             │ (None, 50)             │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense (Dense)                   │ (None, 1)              │            51 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 30,651 (119.73 KB)
 Trainable params: 30,651 (119.73 KB)
 Non-trainable params: 0 (0.00 B)
Epoch 1/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 2s 12ms/step - loss: 0.1821 - mae: 0.2663 - val_loss: 0.0160 - val_mae: 0.0904
Epoch 2/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - loss: 0.0254 - mae: 0.1023 - val_loss: 0.0105 - val_mae: 0.0734
Epoch 3/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 0.0202 - mae: 0.0912 - val_loss: 0.0081 - val_mae: 0.0655
Epoch 4/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - loss: 0.0163 - mae: 0.0789 - val_loss: 0.0077 - val_mae: 0.0633
Epoch 5/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 0.0123 - mae: 0.0703 - val_loss: 0.0078 - val_mae: 0.0655
Epoch 6/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - loss: 0.0103 - mae: 0.0670 - val_loss: 0.0063 - val_mae: 0.0562
Epoch 7/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - loss: 0.0091 - mae: 0.0646 - val_loss: 0.0060 - val_mae: 0.0545
Epoch 8/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - loss: 0.0102 - mae: 0.0658 - val_loss: 0.0125 - val_mae: 0.0903
Epoch 9/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - loss: 0.0167 - mae: 0.0780 - val_loss: 0.0055 - val_mae: 0.0511
Epoch 10/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 0.0099 - mae: 0.0655 - val_loss: 0.0063 - val_mae: 0.0565
Epoch 11/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 0.0098 - mae: 0.0635 - val_loss: 0.0050 - val_mae: 0.0480
Epoch 12/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - loss: 0.0093 - mae: 0.0627 - val_loss: 0.0048 - val_mae: 0.0500
Epoch 13/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 0.0097 - mae: 0.0611 - val_loss: 0.0050 - val_mae: 0.0539
Epoch 14/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 0.0103 - mae: 0.0626 - val_loss: 0.0048 - val_mae: 0.0515
Epoch 15/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - loss: 0.0080 - mae: 0.0588 - val_loss: 0.0047 - val_mae: 0.0513
Epoch 16/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - loss: 0.0078 - mae: 0.0574 - val_loss: 0.0045 - val_mae: 0.0482
Epoch 17/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 0.0090 - mae: 0.0625 - val_loss: 0.0052 - val_mae: 0.0563
Epoch 18/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 0.0079 - mae: 0.0583 - val_loss: 0.0042 - val_mae: 0.0478
Epoch 19/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 0.0087 - mae: 0.0603 - val_loss: 0.0039 - val_mae: 0.0439
Epoch 20/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - loss: 0.0086 - mae: 0.0600 - val_loss: 0.0048 - val_mae: 0.0550
Epoch 21/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 9ms/step - loss: 0.0097 - mae: 0.0633 - val_loss: 0.0049 - val_mae: 0.0546
Epoch 22/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 0.0093 - mae: 0.0574 - val_loss: 0.0037 - val_mae: 0.0419
Epoch 23/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - loss: 0.0075 - mae: 0.0557 - val_loss: 0.0055 - val_mae: 0.0588
Epoch 24/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - loss: 0.0083 - mae: 0.0582 - val_loss: 0.0035 - val_mae: 0.0424
Epoch 25/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 0.0078 - mae: 0.0551 - val_loss: 0.0034 - val_mae: 0.0400
Epoch 26/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - loss: 0.0086 - mae: 0.0559 - val_loss: 0.0043 - val_mae: 0.0487
Epoch 27/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 0.0092 - mae: 0.0597 - val_loss: 0.0032 - val_mae: 0.0402
Epoch 28/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - loss: 0.0084 - mae: 0.0554 - val_loss: 0.0031 - val_mae: 0.0384
Epoch 29/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - loss: 0.0087 - mae: 0.0564 - val_loss: 0.0033 - val_mae: 0.0406
Epoch 30/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 0.0073 - mae: 0.0539 - val_loss: 0.0035 - val_mae: 0.0420
Epoch 31/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 9ms/step - loss: 0.0071 - mae: 0.0534 - val_loss: 0.0030 - val_mae: 0.0386
Epoch 32/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - loss: 0.0068 - mae: 0.0501 - val_loss: 0.0035 - val_mae: 0.0436
Epoch 33/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 0.0064 - mae: 0.0514 - val_loss: 0.0029 - val_mae: 0.0376
Epoch 34/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - loss: 0.0063 - mae: 0.0512 - val_loss: 0.0029 - val_mae: 0.0364
Epoch 35/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 0.0064 - mae: 0.0505 - val_loss: 0.0027 - val_mae: 0.0354
Epoch 36/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - loss: 0.0064 - mae: 0.0499 - val_loss: 0.0027 - val_mae: 0.0350
Epoch 37/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - loss: 0.0066 - mae: 0.0495 - val_loss: 0.0031 - val_mae: 0.0394
Epoch 38/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 0.0075 - mae: 0.0521 - val_loss: 0.0028 - val_mae: 0.0358
Epoch 39/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 0.0081 - mae: 0.0532 - val_loss: 0.0029 - val_mae: 0.0382
Epoch 40/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 0.0080 - mae: 0.0546 - val_loss: 0.0030 - val_mae: 0.0388
Epoch 41/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 0.0065 - mae: 0.0519 - val_loss: 0.0025 - val_mae: 0.0333
Epoch 42/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 0.0065 - mae: 0.0506 - val_loss: 0.0026 - val_mae: 0.0341
Epoch 43/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - loss: 0.0063 - mae: 0.0494 - val_loss: 0.0031 - val_mae: 0.0394
Epoch 44/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 0.0066 - mae: 0.0494 - val_loss: 0.0024 - val_mae: 0.0328
Epoch 45/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - loss: 0.0065 - mae: 0.0480 - val_loss: 0.0025 - val_mae: 0.0333
Epoch 46/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 0.0061 - mae: 0.0471 - val_loss: 0.0025 - val_mae: 0.0341
Epoch 47/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - loss: 0.0053 - mae: 0.0465 - val_loss: 0.0026 - val_mae: 0.0359
Epoch 48/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - loss: 0.0069 - mae: 0.0507 - val_loss: 0.0026 - val_mae: 0.0348
Epoch 49/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 9ms/step - loss: 0.0060 - mae: 0.0491 - val_loss: 0.0022 - val_mae: 0.0318
Epoch 50/50
40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 0.0055 - mae: 0.0460 - val_loss: 0.0023 - val_mae: 0.0318
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step
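Test MSE: 0.002013
Test RMSE: 0.044871
Test MAE: 0.031868
[Figure: training and validation loss history (left); actual vs. predicted volatility on the test set (right)]
Mean percentage error: 6.32%
[Figure: historical volatility with LSTM predictions over the test period]
[Figure: prediction error distribution (left); prediction error over time (right)]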

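Results Analysis and Interpretation

In this example, an LSTM model predicts the 20-day annualized volatility of a synthetic price series. The results can be read as follows:

  1. Model architecture:

    • The two stacked LSTM layers can capture both long-term and short-term patterns in volatility
    • The Dropout layers help prevent overfitting
  2. Predictive performance:

    • On the test set the model reaches an MSE of 0.002013, with an RMSE of 0.044871 and an MAE of 0.031868 in units of annualized volatility
    • The prediction plot shows how closely the predicted curve tracks the actual volatility curve
    • The mean percentage error of 6.32% reflects the relative accuracy of the predictions
  3. Error analysis:

    • The error histogram shows whether the model has a systematic bias
    • The error time series shows how the prediction error changes over time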
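One check the analysis above does not include is a comparison with a naive baseline. Consecutive 20-day windows share 19 of their 20 returns, so simply carrying the previous day's volatility forward is likely to be a strong competitor. The sketch below computes that persistence baseline on an assumed i.i.d. normal return series, not on the notebook's data, so its numbers are not comparable with the metrics reported above. The function names are hypothetical.

```python
import numpy as np

def rolling_volatility(returns, window=20):
    """Annualized rolling sample standard deviation (ddof=1) of daily returns."""
    returns = np.asarray(returns, dtype=float)
    vol = np.array([returns[i - window + 1:i + 1].std(ddof=1)
                    for i in range(window - 1, len(returns))])
    return vol * np.sqrt(252)

def persistence_errors(vol, test_size=0.2):
    """Forecast each test value with the value one step earlier.

    Returns (mean absolute error, mean percentage error in %)."""
    n_test = int(len(vol) * test_size)
    actual = vol[-n_test:]
    forecast = vol[-n_test - 1:-1]  # yesterday's value used as today's forecast
    abs_err = np.abs(actual - forecast)
    return abs_err.mean(), (abs_err / actual).mean() * 100

rng = np.random.default_rng(42)
returns = rng.normal(0.0002, 0.01, size=2000)  # assumption: constant 1% daily volatility
vol = rolling_volatility(returns)
mae, mape = persistence_errors(vol)
print(f"persistence MAE: {mae:.4f}, mean percentage error: {mape:.2f}%")
```

To compare against the LSTM, the same function could be applied to `data['Volatility'].values` from the notebook. A model that does not beat this baseline has learned little beyond the overlap between windows.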

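Applications of Volatility Forecasting

  1. Option pricing: volatility is a key input to the Black-Scholes option pricing model
  2. Risk management: volatility forecasts help assess a portfolio's risk exposure
  3. Trading strategies: volatility forecasts can drive volatility arbitrage or mean-reversion strategies
  4. Asset allocation: adjust the weights of high-volatility assets in a portfolio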
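To illustrate the first application, the sketch below prices a European call with the standard Black-Scholes formula and shows how the price responds to the volatility input. The spot, strike, maturity, and rate are made-up values, and the function names are assumptions of this example.

```python
from math import erf, exp, log, sqrt

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call_price(spot, strike, maturity, rate, sigma):
    """Black-Scholes price of a European call on a non-dividend-paying asset.

    maturity is in years; rate and sigma are annualized.
    """
    d1 = (log(spot / strike) + (rate + 0.5 * sigma ** 2) * maturity) / (sigma * sqrt(maturity))
    d2 = d1 - sigma * sqrt(maturity)
    return spot * norm_cdf(d1) - strike * exp(-rate * maturity) * norm_cdf(d2)

# Same option, three different volatility forecasts
for sigma in (0.15, 0.20, 0.25):
    price = bs_call_price(spot=100, strike=100, maturity=1.0, rate=0.05, sigma=sigma)
    print(f"sigma={sigma:.2f} -> call price {price:.2f}")
```

The call price rises with `sigma`, so an error in the volatility forecast translates directly into a pricing error.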

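Directions for Improvement

  1. Add more features, such as trading volume, market sentiment indicators, and macroeconomic data
  2. Try more complex LSTM architectures, such as bidirectional LSTMs or attention mechanisms
  3. Account for regime dependence in volatility; different market environments may call for different models
  4. Explore combining traditional volatility models such as GARCH with deep learning models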
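As a starting point for the last item, the sketch below runs the GARCH(1,1) conditional-variance recursion over a return series. The parameters `omega`, `alpha`, and `beta` are illustrative values chosen by hand, not estimates; in practice they would be fitted by maximum likelihood. The resulting series could, for example, be supplied to the LSTM as an additional input feature.

```python
import numpy as np

def garch11_variance(returns, omega, alpha, beta):
    """Conditional variance from the GARCH(1,1) recursion:

        sigma2[t] = omega + alpha * returns[t-1]**2 + beta * sigma2[t-1]

    The recursion starts at the unconditional variance omega / (1 - alpha - beta).
    """
    if alpha + beta >= 1.0:
        raise ValueError("alpha + beta must be below 1 for a finite unconditional variance")
    returns = np.asarray(returns, dtype=float)
    sigma2 = np.empty(len(returns))
    sigma2[0] = omega / (1.0 - alpha - beta)
    for t in range(1, len(returns)):
        sigma2[t] = omega + alpha * returns[t - 1] ** 2 + beta * sigma2[t - 1]
    return sigma2

rng = np.random.default_rng(0)
returns = rng.normal(0.0, 0.01, size=500)  # assumed demeaned daily returns
sigma2 = garch11_variance(returns, omega=2e-6, alpha=0.08, beta=0.90)
annualized_vol = np.sqrt(sigma2 * 252)
print(annualized_vol[:3])
```

With these illustrative parameters the unconditional daily variance is 2e-6 / 0.02 = 1e-4, that is, a daily volatility of 1%, which matches the long-run level used by the synthetic data generator.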