Install the Required Libraries¶
Before running the code below, make sure all the required libraries are installed.
In [1]:
# Install the required libraries
# %pip install numpy pandas matplotlib yfinance scikit-learn tensorflow
# %conda install ta-lib
Example 2: Forecasting Financial Time-Series Volatility with a Long Short-Term Memory (LSTM) Network¶
Problem Description¶
Volatility measures how sharply the price of a financial asset fluctuates and is a critical input for option pricing, risk management, and trading decisions. In this example we use a long short-term memory (LSTM) network to forecast a stock's future volatility.
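Throughout the notebook, historical volatility is defined as the rolling standard deviation of daily returns, annualized with the square-root-of-time rule (252 trading days per year); with window length $w = 20$:
$$
r_t = \frac{P_t - P_{t-1}}{P_{t-1}}, \qquad
\sigma_t^{\text{ann}} = \sqrt{252}\;\operatorname{std}\left(r_{t-w+1}, \dots, r_t\right)
$$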
A Brief Introduction to Long Short-Term Memory (LSTM) Networks¶
An LSTM is a special kind of recurrent neural network (RNN) designed to capture long-range dependencies in sequential data. By introducing a gating mechanism (input, forget, and output gates), the LSTM mitigates the vanishing-gradient problem of plain RNNs and can learn patterns across long sequences.
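For reference, the standard LSTM cell can be summarized as follows (one common formulation; $\sigma$ is the logistic sigmoid, $\odot$ element-wise multiplication):
$$
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{forget gate} \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{input gate} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{output gate} \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{candidate cell state} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{cell state update} \\
h_t &= o_t \odot \tanh(c_t) && \text{hidden state / output}
\end{aligned}
$$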
Model Architecture¶
- Input layer: receives the historical volatility sequence
- LSTM layers: capture long-range dependencies in the time series
- Output layer: predicts the future volatility
Implementation Steps¶
- Obtain historical price data (this notebook generates synthetic data rather than downloading real quotes)
- Compute historical volatility (rolling standard deviation of returns)
- Create sequence data for the LSTM to learn from
- Build and train the LSTM model
- Evaluate the model and forecast future volatility
In [2]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import yfinance as yf
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error, mean_absolute_error
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout
from tensorflow.keras.optimizers import Adam
import warnings
warnings.filterwarnings('ignore')
# Set random seeds for reproducibility
np.random.seed(42)
tf.random.set_seed(42)
In [3]:
# Generate synthetic financial data
def generate_synthetic_data(periods=2000, seed=42):
    """
    Generate synthetic financial data, including prices and volatility.

    Parameters:
        periods (int): number of data points to generate
        seed (int): random seed

    Returns:
        pandas.DataFrame: DataFrame with the synthetic price and volatility data
    """
    np.random.seed(seed)

    # Create the date index - trading days are business days
    end_date = pd.Timestamp.now().normalize()
    start_date = end_date - pd.Timedelta(days=int(periods * 1.5))  # extra days to account for weekends and holidays
    dates = pd.date_range(start=start_date, end=end_date, freq='B')[:periods]

    # Initialize the DataFrame
    df = pd.DataFrame(index=dates)

    # Initial price
    initial_price = 100.0

    # Generate a volatility series with volatility clustering:
    # high volatility tends to follow high volatility, low tends to follow low
    volatility = np.zeros(periods)
    vol = 0.01  # initial volatility level
    for i in range(periods):
        # 5% chance of a volatility shock
        if np.random.random() < 0.05:
            # The shock can be positive (volatility rises) or negative (volatility falls)
            shock = np.random.choice([-0.005, 0.005, 0.01, 0.015, 0.02])
            vol = max(0.002, vol + shock)  # keep volatility positive
        # Volatility mean-reverts slowly toward 1%
        vol = vol * 0.98 + 0.01 * 0.02
        # Add random noise
        vol = max(0.002, vol * (1 + 0.1 * np.random.randn()))
        volatility[i] = vol

    # Generate returns from the volatility series
    returns = np.zeros(periods)
    for i in range(periods):
        # Daily return drawn with the current volatility level
        returns[i] = np.random.normal(0.0002, volatility[i])  # slightly positive mean, reflecting the market's upward drift

    # Build the price series from the returns
    price = np.zeros(periods)
    price[0] = initial_price
    for i in range(1, periods):
        price[i] = price[i-1] * (1 + returns[i])

    # Create OHLC price data
    df['Open'] = price
    df['High'] = price * (1 + np.abs(np.random.normal(0, 0.002, periods)))  # slightly above the close
    df['Low'] = price * (1 - np.abs(np.random.normal(0, 0.002, periods)))   # slightly below the close
    df['Close'] = price
    df['Volume'] = np.random.randint(100000, 10000000, periods)  # random trading volume

    # Compute realized returns
    df['Return'] = df['Close'].pct_change()
    # Compute realized volatility (20-day rolling standard deviation, annualized)
    df['Volatility'] = df['Return'].rolling(window=20).std() * np.sqrt(252)

    # Drop NaN rows
    df.dropna(inplace=True)
    print(f"Generated synthetic financial data: {len(df)} records")
    return df
In [4]:
# Compute historical volatility
def calculate_volatility(df, window=20):
    """
    Compute historical volatility.

    Parameters:
        df (pandas.DataFrame): price data
        window (int): rolling window size for the volatility calculation

    Returns:
        pandas.DataFrame: DataFrame with the volatility column added
    """
    # Daily returns
    df['Return'] = df['Close'].pct_change()
    # Rolling volatility, annualized:
    # multiplying by sqrt(252) converts daily volatility to annualized volatility,
    # where 252 is the number of trading days in a year
    df['Volatility'] = df['Return'].rolling(window=window).std() * np.sqrt(252)
    # Drop rows containing NaN
    df.dropna(inplace=True)
    return df
In [5]:
# Create sequence data
def create_sequences(df, target_col, sequence_length=20):
    """
    Create sequence data for the LSTM.

    Parameters:
        df (pandas.DataFrame): DataFrame containing the target column
        target_col (str): name of the target column
        sequence_length (int): sequence length

    Returns:
        tuple: (X, y) input sequences and target values
    """
    X, y = [], []
    for i in range(len(df) - sequence_length):
        # Input X: data from t to t+sequence_length-1
        # Target y: data at t+sequence_length
        X.append(df[target_col].values[i:i+sequence_length])
        y.append(df[target_col].values[i+sequence_length])
    return np.array(X), np.array(y)
In [6]:
# Prepare the data
def prepare_data(df, target_col, sequence_length=20, test_size=0.2):
    """
    Prepare training and test data.

    Parameters:
        df (pandas.DataFrame): source data
        target_col (str): name of the target column
        sequence_length (int): sequence length
        test_size (float): fraction of the data used for testing

    Returns:
        tuple: training and test data
    """
    # Build the sequences
    X, y = create_sequences(df, target_col, sequence_length)
    # Reshape for the LSTM: (samples, time steps, features)
    X = X.reshape(X.shape[0], X.shape[1], 1)
    # Chronological train/test split (no shuffling for time series)
    train_size = int(len(X) * (1 - test_size))
    X_train, X_test = X[:train_size], X[train_size:]
    y_train, y_test = y[:train_size], y[train_size:]
    return X_train, X_test, y_train, y_test
In [7]:
# Build the LSTM model
def build_lstm_model(sequence_length):
    """
    Build the LSTM model.

    Parameters:
        sequence_length (int): length of the input sequences

    Returns:
        tf.keras.Model: the compiled LSTM model
    """
    model = Sequential([
        # First LSTM layer; returns the full sequence for the second LSTM layer
        LSTM(50, return_sequences=True, input_shape=(sequence_length, 1)),
        Dropout(0.2),  # reduce overfitting
        # Second LSTM layer
        LSTM(50),
        Dropout(0.2),
        # Output layer - predict volatility (regression)
        Dense(1)
    ])
    # Compile the model
    model.compile(
        optimizer=Adam(learning_rate=0.001),
        loss='mse',      # mean squared error loss
        metrics=['mae']  # also track the mean absolute error
    )
    return model
In [8]:
# Main function
def main():
    # Generate synthetic financial data instead of downloading real stock data
    data = generate_synthetic_data(periods=2000, seed=42)

    # Plot the volatility series
    plt.figure(figsize=(14, 6))
    plt.plot(data.index, data['Volatility'])
    plt.title('Historical Volatility of the Synthetic Data (Annualized)')
    plt.xlabel('Date')
    plt.ylabel('Volatility')
    plt.grid(True, alpha=0.3)
    plt.show()

    # Target column and sequence length
    target_col = 'Volatility'
    sequence_length = 20  # use 20 days of history for each prediction

    # Prepare the data
    X_train, X_test, y_train, y_test = prepare_data(data, target_col, sequence_length)
    print(f'Training set shape: {X_train.shape}, test set shape: {X_test.shape}')

    # Build the model
    model = build_lstm_model(sequence_length)
    model.summary()

    # Train the model
    history = model.fit(
        X_train, y_train,
        epochs=50,
        batch_size=32,
        validation_split=0.2,
        callbacks=[tf.keras.callbacks.EarlyStopping(patience=5, restore_best_weights=True)],
        verbose=1
    )

    # Evaluate the model
    y_pred = model.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    mae = mean_absolute_error(y_test, y_pred)
    rmse = np.sqrt(mse)
    print(f'Test MSE: {mse:.6f}')
    print(f'Test RMSE: {rmse:.6f}')
    print(f'Test MAE: {mae:.6f}')

    # Plot the training history
    plt.figure(figsize=(12, 5))
    plt.subplot(1, 2, 1)
    plt.plot(history.history['loss'], label='Training Loss')
    plt.plot(history.history['val_loss'], label='Validation Loss')
    plt.title('Loss History')
    plt.xlabel('Epoch')
    plt.ylabel('Loss (MSE)')
    plt.legend()

    # Plot the predictions against the actual values
    plt.subplot(1, 2, 2)
    plt.plot(y_test, label='Actual Volatility')
    plt.plot(y_pred, label='Predicted Volatility')
    plt.title('Volatility Prediction')
    plt.xlabel('Trading Days')
    plt.ylabel('Volatility (Annualized)')
    plt.legend()
    plt.tight_layout()
    plt.show()

    # Merge the predictions with the original data
    # for the table and the subsequent error analysis
    test_start_idx = len(data) - len(y_test)
    test_data = data.iloc[test_start_idx:].copy()
    test_data['Predicted_Volatility'] = y_pred.flatten()  # flatten (n, 1) predictions to 1-D for the column assignment

    # Prediction errors
    test_data['Prediction_Error'] = test_data['Volatility'] - test_data['Predicted_Volatility']
    test_data['Absolute_Error'] = np.abs(test_data['Prediction_Error'])
    test_data['Percentage_Error'] = test_data['Absolute_Error'] / test_data['Volatility'] * 100

    # Report the mean absolute percentage error
    mean_percentage_error = test_data['Percentage_Error'].mean()
    print(f'Mean absolute percentage error: {mean_percentage_error:.2f}%')

    # Plot the full result
    plt.figure(figsize=(14, 7))
    plt.plot(data.index, data['Volatility'], label='Historical Volatility', alpha=0.7)
    plt.plot(test_data.index, test_data['Predicted_Volatility'], label='Predicted Volatility', color='red')
    plt.title('Volatility Prediction using LSTM on Synthetic Data')
    plt.xlabel('Date')
    plt.ylabel('Volatility (Annualized)')
    plt.legend()
    plt.grid(True, alpha=0.3)
    plt.tight_layout()
    plt.show()

    # Plot the prediction error distribution and its evolution over time
    plt.figure(figsize=(12, 5))
    plt.subplot(1, 2, 1)
    plt.hist(test_data['Prediction_Error'], bins=30)
    plt.title('Prediction Error Distribution')
    plt.xlabel('Error')
    plt.ylabel('Frequency')
    plt.subplot(1, 2, 2)
    plt.plot(test_data.index, test_data['Prediction_Error'])
    plt.title('Prediction Error Over Time')
    plt.xlabel('Date')
    plt.ylabel('Error')
    plt.axhline(y=0, color='r', linestyle='-')
    plt.tight_layout()
    plt.show()

# Run the main function
if __name__ == "__main__":
    main()
Generated synthetic financial data: 1980 records
Training set shape: (1568, 20, 1), test set shape: (392, 20, 1)
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━┓
┃ Layer (type)         ┃ Output Shape   ┃  Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━┩
│ lstm (LSTM)          │ (None, 20, 50) │   10,400 │
│ dropout (Dropout)    │ (None, 20, 50) │        0 │
│ lstm_1 (LSTM)        │ (None, 50)     │   20,200 │
│ dropout_1 (Dropout)  │ (None, 50)     │        0 │
│ dense (Dense)        │ (None, 1)      │       51 │
└──────────────────────┴────────────────┴──────────┘
Total params: 30,651 (119.73 KB)
Trainable params: 30,651 (119.73 KB)
Non-trainable params: 0 (0.00 B)
Epoch 1/50  40/40 ━━━━━━━━━━━━━━━━━━━━ 2s 12ms/step - loss: 0.1821 - mae: 0.2663 - val_loss: 0.0160 - val_mae: 0.0904
Epoch 2/50  40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - loss: 0.0254 - mae: 0.1023 - val_loss: 0.0105 - val_mae: 0.0734
...
Epoch 49/50 40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 9ms/step - loss: 0.0060 - mae: 0.0491 - val_loss: 0.0022 - val_mae: 0.0318
Epoch 50/50 40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 0.0055 - mae: 0.0460 - val_loss: 0.0023 - val_mae: 0.0318
13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step
Test MSE: 0.002013
Test RMSE: 0.044871
Test MAE: 0.031868
Mean absolute percentage error: 6.32%
Analysis and Interpretation of the Results¶
In this example we used an LSTM model to forecast the volatility of a (synthetic) stock price series. Key points:
Model architecture:
- The two stacked LSTM layers can capture both long- and short-term patterns in the volatility series
- The Dropout layers help prevent overfitting
Prediction performance:
- The evaluation metrics (MSE, RMSE, MAE; defined below) quantify the model's prediction error
- The predicted curve can be compared visually with the actual volatility curve
- The mean absolute percentage error reflects the relative accuracy of the forecasts
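For reference, with $y_i$ the actual and $\hat{y}_i$ the predicted volatility over the $n$ test points:
$$
\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^2, \qquad
\mathrm{RMSE} = \sqrt{\mathrm{MSE}}, \qquad
\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i-\hat{y}_i\right|
$$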
Error analysis:
- The error histogram shows whether the model has a systematic bias
- The error time-series plot shows how the prediction error evolves over time
Applications of Volatility Forecasting¶
- Option pricing: volatility is a key input to the Black-Scholes option pricing model (see the formula after this list)
- Risk management: volatility forecasts help assess a portfolio's risk exposure
- Trading strategies: volatility forecasts can drive volatility-arbitrage or mean-reversion strategies
- Asset allocation: adjust the weight of high-volatility assets in a portfolio
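For context, the Black-Scholes price of a European call option, in which the volatility $\sigma$ is exactly the quantity forecast above ($S$ spot price, $K$ strike, $r$ risk-free rate, $T$ time to expiry, $\Phi$ the standard normal CDF):
$$
C = S\,\Phi(d_1) - K e^{-rT}\,\Phi(d_2), \qquad
d_1 = \frac{\ln(S/K) + \left(r + \tfrac{\sigma^2}{2}\right)T}{\sigma\sqrt{T}}, \qquad
d_2 = d_1 - \sigma\sqrt{T}
$$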
Directions for Improvement¶
- Add more features, such as trading volume, market-sentiment indicators, and macroeconomic data
- Try more expressive LSTM architectures, such as bidirectional LSTMs or attention mechanisms (a sketch follows this list)
- Account for volatility regimes; different market environments may call for different models
- Explore combining classical volatility models such as GARCH with deep-learning models
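As one possible starting point for the bidirectional-LSTM idea above, here is a minimal, untuned sketch of a variant of build_lstm_model that wraps the recurrent layers in tf.keras.layers.Bidirectional; the layer sizes, dropout rates, and learning rate are illustrative assumptions, not tuned values:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout, Bidirectional
from tensorflow.keras.optimizers import Adam

def build_bidirectional_lstm_model(sequence_length):
    """Sketch: bidirectional variant of build_lstm_model (hyperparameters not tuned)."""
    model = Sequential([
        # Read the volatility sequence in both time directions
        Bidirectional(LSTM(50, return_sequences=True), input_shape=(sequence_length, 1)),
        Dropout(0.2),
        Bidirectional(LSTM(50)),
        Dropout(0.2),
        Dense(1)  # single-value volatility forecast
    ])
    model.compile(optimizer=Adam(learning_rate=0.001), loss='mse', metrics=['mae'])
    return model

# Drop-in usage with the data pipeline above:
# model = build_bidirectional_lstm_model(sequence_length)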