Yield curve prediction

The goal of this case study is to use supervised learning-based models to predict the yield curve. This case study is inspired by the paper “Artificial Neural Networks in Fixed Income Markets for Yield Curve Forecasting” by Nunes, Gerding, McGroarty and Niranj

Content

1. Introduction
2. Getting Started - Load Libraries and Dataset
- 2.1. Load Libraries
- 2.2. Load Dataset
3. Exploratory Data Analysis
- 3.1 Descriptive Statistics
- 3.2. Data Visualisation
4. Data Preparation and analysis
- 4.1.Feature Selection
5.Evaluate Algorithms and Models
- 5.1. Train/Test Split and Evaluation metrics]]
- 5.3. Compare Models and Algorithms
6. Model Tuning and Grid Search
7. Finalize the Model
- 7.1. Results and comparison of Regression and MLP

# 1. Problem Definition

In the supervised regression framework used for this case study, three tenors (i.e. 1M, 5Y and 30Y) of the yield curve are the predicted variable. These tenors represent short term, medium term and long-term tenors of the yield curve.

Features

In order to make predictions, we use the following features:

  1. Previous Changes in the Treasury Curve at the following tenors: 1 Month, 3 Month, 1 Year, 2 Year, 5 Year, 7 Year, 10 Year, 30 Year

  2. Changes in % of Federal Debt held by -

      a. Public,
      b. Foreign Goverments
      c. Federal Reserve

  3. The Coporate Spread on Baa rated Debt Relative to the 10 Year

# 2. Getting Started- Loading the data and python packages

## 2.1. Loading the python packages

Feature Variables

# Load libraries
import numpy as np
import pandas as pd
import pandas_datareader.data as web
from matplotlib import pyplot
from pandas.plotting import scatter_matrix
import seaborn as sns
from sklearn.preprocessing import StandardScaler

from sklearn.model_selection import train_test_split
from sklearn.model_selection import KFold
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import GridSearchCV

from sklearn.linear_model import LinearRegression
from sklearn.linear_model import Lasso
from sklearn.linear_model import ElasticNet
from sklearn.tree import DecisionTreeRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR
from sklearn.ensemble import RandomForestRegressor
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.ensemble import AdaBoostRegressor
from sklearn.neural_network import MLPRegressor

#Libraries for Deep Learning Models
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD
from keras.layers import LSTM
from keras.wrappers.scikit_learn import KerasRegressor

#Libraries for Statistical Models
import statsmodels.api as sm

#Libraries for Saving the Model
from pickle import dump
from pickle import load

# Time series Models
from statsmodels.tsa.arima_model import ARIMA
#from statsmodels.tsa.statespace.sarimax import SARIMAX

# Error Metrics
from sklearn.metrics import mean_squared_error

# Feature Selection
from sklearn.feature_selection import SelectKBest
from sklearn.feature_selection import chi2, f_regression

#Plotting
from pandas.plotting import scatter_matrix
from statsmodels.graphics.tsaplots import plot_acf

   #Diable the warnings
   import warnings
   warnings.filterwarnings('ignore')

## 2.2. Loading the Data

# Get the data by webscapping using pandas datareader
tsy_tickers = ['DGS1MO', 'DGS3MO', 'DGS1', 'DGS2', 'DGS5', 'DGS7', 'DGS10', 'DGS30',
               'TREAST', # -- U.S. Treasury securities held by the Federal Reserve ( Millions of Dollars )
               'FYGFDPUN', # -- Federal Debt Held by the Public ( Millions of Dollars )
               'FDHBFIN', # -- Federal Debt Held by Foreign and International Investors ( Billions of Dollars )
               'GFDEBTN', # -- Federal Debt: Total Public Debt ( Millions of Dollars )
               'BAA10Y', # -- Baa Corporate Bond Yield Relative to Yield on 10-Year
              ]
tsy_data = web.DataReader(tsy_tickers, 'fred').dropna(how='all').ffill()
tsy_data['FDHBFIN'] = tsy_data['FDHBFIN'] * 1000
tsy_data['GOV_PCT'] = tsy_data['TREAST'] / tsy_data['GFDEBTN']
tsy_data['HOM_PCT'] = tsy_data['FYGFDPUN'] / tsy_data['GFDEBTN']
tsy_data['FOR_PCT'] = tsy_data['FDHBFIN'] / tsy_data['GFDEBTN']

return_period = 5
#Y = tsy_data.loc[:, ['DGS1MO', 'DGS5', 'DGS30']].diff(return_period).shift(-return_period)
#return_period = 5
Y = tsy_data.loc[:, ['DGS1MO', 'DGS5', 'DGS30']].shift(-return_period)
Y.columns = [col+'_pred' for col in Y.columns]

#X = tsy_data.loc[:, ['DGS1MO', 'DGS3MO', 'DGS1', 'DGS2', 'DGS5', 'DGS7', 'DGS10', 'DGS30', 'GOV_PCT', 'HOM_PCT', 'FOR_PCT', 'BAA10Y']].diff(return_period)
X = tsy_data.loc[:, ['DGS1MO', 'DGS3MO', 'DGS1', 'DGS2', 'DGS5', 'DGS7', 'DGS10', 'DGS30', 'GOV_PCT', 'HOM_PCT', 'FOR_PCT', 'BAA10Y']]

dataset = pd.concat([Y, X], axis=1).dropna().iloc[::return_period, :]
Y = dataset.loc[:, Y.columns]
X = dataset.loc[:, X.columns]

dataset.head()

	DGS1MO_pred	DGS5_pred	DGS30_pred	DGS1MO	DGS3MO	DGS1	DGS2	DGS5	DGS7	DGS10	DGS30	GOV_PCT	HOM_PCT	FOR_PCT	BAA10Y
DATE
2010-01-06	0.02	2.55	4.71	0.03	0.06	0.40	1.01	2.60	3.33	3.85	4.70	0.061	0.649	0.304	2.49
2010-01-13	0.02	2.38	4.50	0.02	0.06	0.37	0.97	2.55	3.28	3.80	4.71	0.061	0.649	0.304	2.50
2010-01-21	0.01	2.41	4.57	0.02	0.06	0.31	0.87	2.38	3.09	3.62	4.50	0.061	0.649	0.304	2.51
2010-01-28	0.04	2.29	4.53	0.01	0.08	0.31	0.87	2.41	3.15	3.68	4.57	0.061	0.649	0.304	2.57
2010-02-04	0.05	2.39	4.69	0.04	0.09	0.32	0.80	2.29	3.06	3.62	4.53	0.061	0.649	0.304	2.62

dataset

	DGS1MO_pred	DGS5_pred	DGS30_pred	DGS1MO	DGS3MO	DGS1	DGS2	DGS5	DGS7	DGS10	DGS30	GOV_PCT	HOM_PCT	FOR_PCT	BAA10Y
DATE
2010-01-06	0.02	2.55	4.71	0.03	0.06	0.40	1.01	2.60	3.33	3.85	4.70	0.061	0.649	0.304	2.49
2010-01-13	0.02	2.38	4.50	0.02	0.06	0.37	0.97	2.55	3.28	3.80	4.71	0.061	0.649	0.304	2.50
2010-01-21	0.01	2.41	4.57	0.02	0.06	0.31	0.87	2.38	3.09	3.62	4.50	0.061	0.649	0.304	2.51
2010-01-28	0.04	2.29	4.53	0.01	0.08	0.31	0.87	2.41	3.15	3.68	4.57	0.061	0.649	0.304	2.57
2010-02-04	0.05	2.39	4.69	0.04	0.09	0.32	0.80	2.29	3.06	3.62	4.53	0.061	0.649	0.304	2.62
2010-02-11	0.06	2.48	4.71	0.05	0.11	0.38	0.91	2.39	3.15	3.73	4.69	0.061	0.649	0.304	2.70
2010-02-19	0.09	2.30	4.55	0.06	0.11	0.39	0.95	2.48	3.24	3.78	4.71	0.061	0.649	0.304	2.65
2010-02-26	0.11	2.35	4.64	0.09	0.13	0.32	0.81	2.30	3.05	3.61	4.55	0.061	0.649	0.304	2.62
2010-03-05	0.10	2.42	4.62	0.11	0.15	0.38	0.91	2.35	3.10	3.69	4.64	0.061	0.649	0.304	2.61
2010-03-12	0.13	2.48	4.58	0.10	0.15	0.41	0.97	2.42	3.15	3.71	4.62	0.061	0.649	0.304	2.55
2010-03-19	0.11	2.59	4.75	0.13	0.16	0.42	1.02	2.48	3.16	3.70	4.58	0.061	0.649	0.304	2.49
2010-03-26	0.15	2.67	4.81	0.11	0.14	0.43	1.04	2.59	3.31	3.86	4.75	0.061	0.649	0.304	2.49
2010-04-02	0.16	2.65	4.74	0.15	0.16	0.46	1.11	2.67	3.40	3.96	4.81	0.059	0.654	0.308	2.44
2010-04-09	0.15	2.49	4.67	0.16	0.16	0.46	1.08	2.65	3.34	3.90	4.74	0.059	0.654	0.308	2.43
2010-04-16	0.14	2.61	4.67	0.15	0.16	0.41	0.98	2.49	3.20	3.79	4.67	0.059	0.654	0.308	2.43
2010-04-23	0.14	2.43	4.53	0.14	0.16	0.46	1.10	2.61	3.30	3.84	4.67	0.059	0.654	0.308	2.37
2010-04-30	0.08	2.17	4.28	0.14	0.16	0.41	0.97	2.43	3.12	3.69	4.53	0.059	0.654	0.308	2.38
2010-05-07	0.15	2.16	4.32	0.08	0.13	0.38	0.83	2.17	2.87	3.45	4.28	0.059	0.654	0.308	2.55
2010-05-14	0.17	2.02	4.07	0.15	0.16	0.34	0.79	2.16	2.87	3.44	4.32	0.059	0.654	0.308	2.59
2010-05-21	0.15	2.10	4.22	0.17	0.17	0.35	0.76	2.02	2.67	3.20	4.07	0.059	0.654	0.308	2.76
2010-05-28	0.10	1.95	4.11	0.15	0.16	0.34	0.76	2.10	2.75	3.31	4.22	0.059	0.654	0.308	2.89
2010-06-07	0.02	2.07	4.20	0.10	0.12	0.35	0.74	1.95	2.62	3.17	4.11	0.059	0.654	0.308	3.00
2010-06-14	0.05	2.05	4.17	0.02	0.07	0.31	0.77	2.07	2.73	3.28	4.20	0.059	0.654	0.308	3.06
2010-06-21	0.07	1.83	4.01	0.05	0.12	0.29	0.74	2.05	2.72	3.26	4.17	0.059	0.654	0.308	3.02
2010-06-28	0.17	1.76	3.89	0.07	0.17	0.30	0.62	1.83	2.49	3.05	4.01	0.059	0.654	0.308	3.10
2010-07-06	0.16	1.90	4.10	0.17	0.17	0.32	0.62	1.76	2.40	2.95	3.89	0.057	0.666	0.319	3.05
2010-07-13	0.16	1.71	3.99	0.16	0.15	0.30	0.67	1.90	2.57	3.15	4.10	0.057	0.666	0.319	3.03
2010-07-20	0.16	1.82	4.08	0.16	0.16	0.27	0.61	1.71	2.39	2.98	3.99	0.057	0.666	0.319	2.99
2010-07-27	0.15	1.55	4.04	0.16	0.15	0.30	0.65	1.82	2.51	3.08	4.08	0.057	0.666	0.319	2.90
2010-08-03	0.15	1.46	4.00	0.15	0.16	0.27	0.53	1.55	2.27	2.94	4.04	0.057	0.666	0.319	2.92
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
2019-05-31	2.30	1.85	2.57	2.35	2.35	2.21	1.95	1.93	2.03	2.14	2.58	0.096	0.736	0.302	2.37
2019-06-07	2.23	1.85	2.59	2.30	2.28	1.97	1.85	1.85	1.97	2.09	2.57	0.096	0.736	0.302	2.42
2019-06-14	2.16	1.80	2.59	2.23	2.20	2.00	1.84	1.85	1.96	2.09	2.59	0.096	0.736	0.302	2.42
2019-06-21	2.18	1.76	2.52	2.16	2.11	1.95	1.77	1.80	1.93	2.07	2.59	0.096	0.736	0.302	2.35
2019-06-28	2.23	1.86	2.53	2.18	2.12	1.92	1.75	1.76	1.87	2.00	2.52	0.096	0.736	0.302	2.31
2019-07-08	2.17	1.84	2.61	2.23	2.26	1.99	1.88	1.86	1.94	2.05	2.53	0.092	0.741	0.298	2.23
2019-07-15	2.13	1.80	2.58	2.17	2.16	1.95	1.83	1.84	1.96	2.09	2.61	0.092	0.741	0.298	2.25
2019-07-22	2.12	1.84	2.59	2.13	2.09	1.95	1.80	1.80	1.92	2.05	2.58	0.092	0.741	0.298	2.24
2019-07-29	2.07	1.55	2.30	2.12	2.12	1.98	1.85	1.84	1.93	2.06	2.59	0.092	0.741	0.298	2.13
2019-08-05	2.09	1.49	2.14	2.07	2.05	1.78	1.59	1.55	1.63	1.75	2.30	0.092	0.741	0.298	2.24
2019-08-12	2.06	1.47	2.08	2.09	2.00	1.75	1.58	1.49	1.56	1.65	2.14	0.092	0.741	0.298	2.23
2019-08-19	2.09	1.43	2.04	2.06	1.94	1.75	1.53	1.47	1.54	1.60	2.08	0.092	0.741	0.298	2.29
2019-08-26	2.06	1.35	1.95	2.09	2.01	1.75	1.54	1.43	1.49	1.54	2.04	0.092	0.741	0.298	2.27
2019-09-03	2.04	1.58	2.19	2.06	1.98	1.72	1.47	1.35	1.42	1.47	1.95	0.092	0.741	0.298	2.28
2019-09-10	2.10	1.66	2.27	2.04	1.95	1.81	1.67	1.58	1.66	1.72	2.19	0.092	0.741	0.298	2.22
2019-09-17	1.90	1.52	2.09	2.10	1.99	1.87	1.72	1.66	1.75	1.81	2.27	0.092	0.741	0.298	2.21
2019-09-24	1.79	1.51	2.11	1.90	1.92	1.78	1.60	1.52	1.58	1.64	2.09	0.093	0.741	0.298	2.20
2019-10-01	1.69	1.36	2.04	1.79	1.82	1.73	1.56	1.51	1.59	1.65	2.11	0.093	0.741	0.298	2.22
2019-10-08	1.71	1.57	2.23	1.69	1.72	1.62	1.42	1.36	1.45	1.54	2.04	0.093	0.741	0.298	2.30
2019-10-16	1.74	1.58	2.25	1.71	1.66	1.59	1.58	1.57	1.65	1.75	2.23	0.093	0.741	0.298	2.22
2019-10-23	1.61	1.61	2.26	1.74	1.65	1.58	1.58	1.58	1.67	1.77	2.25	0.095	0.741	0.298	2.18
2019-10-30	1.55	1.63	2.30	1.61	1.62	1.59	1.61	1.61	1.69	1.78	2.26	0.096	0.741	0.298	2.17
2019-11-06	1.59	1.63	2.31	1.55	1.56	1.58	1.61	1.63	1.73	1.81	2.30	0.097	0.741	0.298	2.14
2019-11-14	1.57	1.62	2.24	1.59	1.57	1.55	1.58	1.63	1.73	1.82	2.31	0.097	0.741	0.298	2.13
2019-11-21	1.62	1.62	2.21	1.57	1.58	1.55	1.60	1.62	1.71	1.77	2.24	0.098	0.741	0.298	2.15
2019-11-29	1.52	1.67	2.29	1.62	1.59	1.60	1.61	1.62	1.73	1.78	2.21	0.099	0.741	0.298	2.08
2019-12-06	1.55	1.66	2.26	1.52	1.53	1.57	1.61	1.67	1.78	1.84	2.29	0.099	0.741	0.298	2.07
2019-12-13	1.57	1.73	2.34	1.55	1.57	1.54	1.61	1.66	1.76	1.82	2.26	0.100	0.741	0.298	2.02
2019-12-20	1.56	1.68	2.32	1.57	1.58	1.52	1.63	1.73	1.84	1.92	2.34	0.101	0.741	0.298	1.96
2019-12-27	1.52	1.59	2.26	1.56	1.57	1.51	1.59	1.68	1.80	1.88	2.32	0.103	0.741	0.298	1.96

505 rows × 15 columns

# 3. Exploratory Data Analysis ## 3.1. Descriptive Statistics

dataset.shape

(505, 15)

pd.set_option('precision', 3)
dataset.describe()

	DGS1MO_pred	DGS5_pred	DGS30_pred	DGS1MO	DGS3MO	DGS1	DGS2	DGS5	DGS7	DGS10	DGS30	GOV_PCT	HOM_PCT	FOR_PCT	BAA10Y
count	505.000	505.000	505.000	505.000	505.000	505.000	505.000	505.000	505.000	505.000	505.000	505.000	505.000	505.000	505.000
mean	0.544	1.644	3.180	0.541	0.578	0.746	0.961	1.646	2.041	2.400	3.185	0.110	0.711	0.320	2.588
std	0.780	0.593	0.612	0.779	0.797	0.810	0.750	0.595	0.558	0.553	0.614	0.022	0.023	0.018	0.451
min	0.000	0.570	1.950	0.000	0.000	0.090	0.180	0.570	0.930	1.400	1.950	0.057	0.649	0.285	1.580
25%	0.030	1.250	2.800	0.030	0.050	0.160	0.370	1.250	1.600	1.960	2.810	0.101	0.702	0.304	2.240
50%	0.120	1.620	3.040	0.110	0.130	0.300	0.680	1.620	2.050	2.340	3.050	0.113	0.720	0.324	2.600
75%	0.950	1.960	3.450	0.850	1.000	1.220	1.330	1.970	2.370	2.760	3.450	0.126	0.724	0.338	2.900
max	2.450	3.070	4.810	2.450	2.480	2.740	2.960	3.070	3.400	3.960	4.810	0.137	0.741	0.341	3.600

## 3.2. Data Visualization

Y.plot()

<matplotlib.axes._subplots.AxesSubplot at 0x25e4c803b70>

# histograms
dataset.hist(sharex=False, sharey=False, xlabelsize=1, ylabelsize=1, figsize=(12,12))
pyplot.show()

# density
dataset.plot(kind='density', subplots=True, layout=(5,4), sharex=False, legend=True, fontsize=1, figsize=(15,15))
pyplot.show()

#Box and Whisker Plots
dataset.plot(kind='box', subplots=True, layout=(5,4), sharex=False, sharey=False, figsize=(15,15))
pyplot.show()

Next We look at the interaction between these variables.

# correlation
correlation = dataset.corr()
pyplot.figure(figsize=(15,15))
pyplot.title('Correlation Matrix')
sns.heatmap(correlation, vmax=1, square=True,annot=True,cmap='cubehelix')

<matplotlib.axes._subplots.AxesSubplot at 0x25e3971e358>

Form the correlation plot, we see that the 1 month and the 30 year yield data points are negatively autocorrelated. The 5 year yield also seems toe be negativly correlated with the changes in foreign goverment purchases.

# Scatterplot Matrix
pyplot.figure(figsize=(15,15))
scatter_matrix(dataset,figsize=(15,16))
pyplot.show()

<Figure size 1080x1080 with 0 Axes>

1 Month

temp_Y = dataset['DGS1MO_pred']
res = sm.tsa.seasonal_decompose(temp_Y,freq=52)
fig = res.plot()
fig.set_figheight(8)
fig.set_figwidth(15)
pyplot.show()

5 Year

temp_Y = dataset['DGS5_pred']
res = sm.tsa.seasonal_decompose(temp_Y,freq=52)
fig = res.plot()
fig.set_figheight(8)
fig.set_figwidth(15)
pyplot.show()

30 Year

temp_Y = dataset['DGS30_pred']
res = sm.tsa.seasonal_decompose(temp_Y,freq=52)
fig = res.plot()
fig.set_figheight(8)
fig.set_figwidth(15)
pyplot.show()

Around Q1 2018, we observe a trend decrease in the 1 Month, 5 year and 30 year. However, the trend is most pronounced in the 1 month series.

## 4. Data Preparation and analysis

## 4.1. Univariate Feature Selection

bestfeatures = SelectKBest(k=5, score_func=f_regression)
for col in Y.columns:
    temp_Y = dataset[col]
    temp_X = dataset.loc[:, X.columns]
    fit = bestfeatures.fit(temp_X,temp_Y)
    dfscores = pd.DataFrame(fit.scores_)
    dfcolumns = pd.DataFrame(X.columns)
    #concat two dataframes for better visualization
    featureScores = pd.concat([dfcolumns,dfscores],axis=1)
    featureScores.columns = ['Specs','Score']  #naming the dataframe columns
    print(col)
    print(featureScores.nlargest(10,'Score'))  #print 10 best features
    print('--------------')

DGS1MO_pred
      Specs       Score
  DGS1MO  152945.490
  DGS3MO  100807.006
    DGS1   11168.249
    DGS2    3510.465
FOR_PCT    1010.264
 BAA10Y     361.423
    DGS5     342.936
 HOM_PCT     243.490
    DGS7      85.197
   DGS30      59.793
--------------
DGS5_pred
      Specs      Score
    DGS5  16564.639
    DGS7   2720.883
    DGS2    970.840
FOR_PCT    613.935
 BAA10Y    586.571
   DGS10    517.453
    DGS1    505.191
  DGS3MO    363.006
  DGS1MO    326.866
   DGS30     39.443
--------------
DGS30_pred
     Specs      Score
  DGS30  17108.682
  DGS10   1235.241
HOM_PCT    629.480
   DGS7    252.017
GOV_PCT    183.786
 DGS1MO     62.107
 DGS3MO     60.412
   DGS1     55.316
   DGS5     40.429
   DGS2     25.822
--------------

As expected, based on the univariate feature selection, all the time series are most dependent on the previous changes.

# 5. Evaluate Algorithms and Models

## 5.1. Train Test Split and evaluation metrics

# split out validation dataset for the end

validation_size = 0.2

#In case the data is not dependent on the time series, then train and test split randomly
seed = 7
X_train, X_validation, Y_train, Y_validation = train_test_split(X, Y, test_size=validation_size, random_state=seed)

We use the prebuilt scikit models to run a K fold analysis on our training data. We then train the model on the full training data and use it for prediction of the test data. The parameters for the K fold analysis are defined as -

   # test options for regression
   num_folds = 10
   scoring = 'neg_mean_squared_error'

## 5.2. Compare Models and Algorithms

# spot check the algorithms
models = []
models.append(('LR', LinearRegression()))
models.append(('LASSO', Lasso()))
models.append(('EN', ElasticNet()))
models.append(('KNN', KNeighborsRegressor()))
models.append(('CART', DecisionTreeRegressor()))
#Neural Network
models.append(('MLP', MLPRegressor()))

kfold_results = []
names = []
validation_results = []
train_results = []
for name, model in models:
    kfold = KFold(n_splits=num_folds, random_state=seed)
    #converted mean square error to positive. The lower the beter
    cv_results = -1* cross_val_score(model, X_train, Y_train, cv=kfold, scoring=scoring)
    kfold_results.append(cv_results)
    names.append(name)

    # Finally we Train on the full period and test against validation
    res = model.fit(X_train, Y_train)
    validation_result = np.mean(np.square(res.predict(X_validation) - Y_validation))
    validation_results.append(validation_result)
    train_result = np.mean(np.square(res.predict(X_train) - Y_train))
    train_results.append(train_result)

    msg = "%s: \nAverage CV error: %s \nStd CV Error: (%s) \nTraining Error:\n%s \nTest Error:\n%s" % \
    (name, str(cv_results.mean()), str(cv_results.std()), str(train_result), str(validation_result))
    print(msg)
    print('----------')

LR:
Average CV error: 0.007942891864184351
Std CV Error: (0.002627539557566172)
Training Error:
DGS1MO_pred    0.002
DGS5_pred      0.010
DGS30_pred     0.010
dtype: float64
Test Error:
DGS1MO_pred    0.001
DGS5_pred      0.009
DGS30_pred     0.010
dtype: float64
----------
LASSO:
Average CV error: 0.44035581436209686
Std CV Error: (0.05366688435468398)
Training Error:
DGS1MO_pred    0.574
DGS5_pred      0.352
DGS30_pred     0.388
dtype: float64
Test Error:
DGS1MO_pred    0.743
DGS5_pred      0.350
DGS30_pred     0.318
dtype: float64
----------
EN:
Average CV error: 0.4019832745740823
Std CV Error: (0.050762635401215755)
Training Error:
DGS1MO_pred    0.455
DGS5_pred      0.352
DGS30_pred     0.388
dtype: float64
Test Error:
DGS1MO_pred    0.592
DGS5_pred      0.350
DGS30_pred     0.318
dtype: float64
----------
KNN:
Average CV error: 0.009184607723577237
Std CV Error: (0.003118097916218511)
Training Error:
DGS1MO_pred    0.002
DGS5_pred      0.008
DGS30_pred     0.007
dtype: float64
Test Error:
DGS1MO_pred    0.002
DGS5_pred      0.011
DGS30_pred     0.011
dtype: float64
----------
CART:
Average CV error: 0.017137121951219508
Std CV Error: (0.003920838626872242)
Training Error:
DGS1MO_pred    0.0
DGS5_pred      0.0
DGS30_pred     0.0
dtype: float64
Test Error:
DGS1MO_pred    0.003
DGS5_pred      0.019
DGS30_pred     0.017
dtype: float64
----------
MLP:
Average CV error: 0.015686185139476356
Std CV Error: (0.004980051035156162)
Training Error:
DGS1MO_pred    0.003
DGS5_pred      0.011
DGS30_pred     0.025
dtype: float64
Test Error:
DGS1MO_pred    0.002
DGS5_pred      0.009
DGS30_pred     0.026
dtype: float64
----------

# compare algorithms
fig = pyplot.figure()
fig.suptitle('Algorithm Comparison')
ax = fig.add_subplot(111)
pyplot.boxplot(kfold_results)
ax.set_xticklabels(names)
fig.set_size_inches(15,8)
pyplot.show()

# compare algorithms
fig = pyplot.figure()

ind = np.arange(len(names))  # the x locations for the groups
width = 0.35  # the width of the bars

fig.suptitle('Algorithm Comparison')
ax = fig.add_subplot(111)
pyplot.bar(ind - width/2, [x.mean() for x in train_results],  width=width, label='Train Error')
pyplot.bar(ind + width/2, [x.mean() for x in validation_results], width=width, label='Validation Error')
fig.set_size_inches(15,8)
pyplot.legend()
ax.set_xticks(ind)
ax.set_xticklabels(names)
pyplot.show()

# 7. Grid search : MLPRegressor
'''
hidden_layer_sizes : tuple, length = n_layers - 2, default (100,)
    The ith element represents the number of neurons in the ith
    hidden layer.
'''
param_grid={'hidden_layer_sizes': [(20,), (50,), (20,20), (20, 30, 20)]}
model = MLPRegressor()
kfold = KFold(n_splits=num_folds, random_state=seed)
grid = GridSearchCV(estimator=model, param_grid=param_grid, scoring=scoring, cv=kfold)
grid_result = grid.fit(X_train, Y_train)
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))

   Best: -0.018006 using {'hidden_layer_sizes': (20, 30, 20)}
   -0.036433 (0.019326) with: {'hidden_layer_sizes': (20,)}
   -0.020793 (0.007075) with: {'hidden_layer_sizes': (50,)}
   -0.026638 (0.010154) with: {'hidden_layer_sizes': (20, 20)}
   -0.018006 (0.005637) with: {'hidden_layer_sizes': (20, 30, 20)}


# 7. Finalise the Model

# prepare model
model = MLPRegressor(hidden_layer_sizes= (20, 30, 20))
model.fit(X_train, Y_train)

   MLPRegressor(activation='relu', alpha=0.0001, batch_size='auto', beta_1=0.9,
                beta_2=0.999, early_stopping=False, epsilon=1e-08,
                hidden_layer_sizes=(20, 30, 20), learning_rate='constant',
                learning_rate_init=0.001, max_iter=200, momentum=0.9,
                n_iter_no_change=10, nesterovs_momentum=True, power_t=0.5,
                random_state=None, shuffle=True, solver='adam', tol=0.0001,
                validation_fraction=0.1, verbose=False, warm_start=False)



## 7.1. Results and comparison of Regression and MLP

# estimate accuracy on validation set
from sklearn.metrics import mean_squared_error
from sklearn.metrics import r2_score
predictions = model.predict(X_validation)
mse_MLP = mean_squared_error(Y_validation, predictions)
r2_MLP = r2_score(Y_validation, predictions)

# prepare model
model_2 = LinearRegression()
model_2.fit(X_train, Y_train)
predictions_2 = model_2.predict(X_validation)

mse_OLS = mean_squared_error(Y_validation, predictions_2)
r2_OLS = r2_score(Y_validation, predictions_2)
print("MSE Regression = %f, MSE MLP = %f" % (mse_OLS, mse_MLP ))
print("R2 Regression = %f, R2 MLP = %f" % (r2_OLS, r2_MLP ))

MSE Regression = 0.006727, MSE MLP = 0.015661
R2 Regression = 0.979949, R2 MLP = 0.954716

The statistics of MLP and Linear regression are comparable. Let us check the prediction shape on the validation set.

Predictions - 5 Year - MLP

train_size = int(len(X) * (1-validation_size))
X_train, X_validation = X[0:train_size], X[train_size:len(X)]
Y_train, Y_validation = Y[0:train_size], Y[train_size:len(X)]

modelMLP = MLPRegressor(hidden_layer_sizes= (50,))
modelOLS = LinearRegression()
model_MLP = modelMLP.fit(X_train, Y_train)
model_OLS = modelOLS.fit(X_train, Y_train)

Y_predMLP = pd.DataFrame(model_MLP.predict(X_validation), index=Y_validation.index,
                      columns=Y_validation.columns)

Y_predOLS = pd.DataFrame(model_OLS.predict(X_validation), index=Y_validation.index,
                      columns=Y_validation.columns)

pd.DataFrame({'Actual : 1m': Y_validation.loc[:, 'DGS1MO_pred'],
              'Prediction MLP 1m': Y_predMLP.loc[:, 'DGS1MO_pred'],
              'Prediction OLS 1m': Y_predOLS.loc[:, 'DGS1MO_pred']}).plot(figsize=(10,5))

pd.DataFrame({'Actual : 5yr': Y_validation.loc[:, 'DGS5_pred'],
              'Prediction MLP 5yr': Y_predMLP.loc[:, 'DGS5_pred'],
              'Prediction OLS 5yr': Y_predOLS.loc[:, 'DGS5_pred']}).plot(figsize=(10,5))

pd.DataFrame({'Actual : 30yr': Y_validation.loc[:, 'DGS30_pred'],
              'Prediction MLP 30yr': Y_predMLP.loc[:, 'DGS30_pred'],
              'Prediction OLS 30yr': Y_predOLS.loc[:, 'DGS30_pred']}).plot(figsize=(10,5))

<matplotlib.axes._subplots.AxesSubplot at 0x25e44811860>

Overall, the regression and MLP are comparable, however, for 1m tenor, the fitting with MLP is slighly poor as compared to the regression. However,the multitask learning with neural network is more intuitive for modeling many time series simultaneousl

Summary

The linear regression model, despite its simplicity, is a tough benchmark to beat for such one step ahead forecasting, given the dominant characteristic of the last available value of the variable to predict. The ANN results in this case study are comparable to the linear regression models.

The good thing about ANN is that it is more flexible to changing market conditions. Also, ANN models can be enhanced by performing grid search on several other hyperparameters and using recurrent neural network such as LSTM.