
[PBHPD1] - Regression with a Dense Network (DNN)¶

A simple regression with a Dense Neural Network (DNN) using PyTorch - BHPD dataset

Objectives :¶

  • Predict housing prices from a set of house features.
  • Understand the principle and the architecture of a regression with a dense neural network

The Boston Housing Dataset consists of the prices of houses in various places in Boston.
Alongside the price, the dataset also provides the following information:

  • CRIM: This is the per capita crime rate by town
  • ZN: This is the proportion of residential land zoned for lots larger than 25,000 sq.ft
  • INDUS: This is the proportion of non-retail business acres per town
  • CHAS: This is the Charles River dummy variable (this is equal to 1 if tract bounds river; 0 otherwise)
  • NOX: This is the nitric oxides concentration (parts per 10 million)
  • RM: This is the average number of rooms per dwelling
  • AGE: This is the proportion of owner-occupied units built prior to 1940
  • DIS: This is the weighted distances to five Boston employment centers
  • RAD: This is the index of accessibility to radial highways
  • TAX: This is the full-value property-tax rate per 10,000 dollars
  • PTRATIO: This is the pupil-teacher ratio by town
  • B: This is calculated as 1000(Bk - 0.63)^2, where Bk is the proportion of people of African American descent by town
  • LSTAT: This is the percentage lower status of the population
  • MEDV: This is the median value of owner-occupied homes in 1000 dollars

What we're going to do:¶

  • Retrieve the data
  • Prepare the data
  • Build a model
  • Train the model
  • Evaluate the result

Step 1 - Import and init¶

In [1]:
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable   # note: Variable is deprecated since PyTorch 0.4; plain tensors work the same way


import numpy as np
import matplotlib.pyplot as plt
import sys,os

import pandas as pd

from modules.fidle_pwk_additional import convergence_history_MSELoss

import fidle

# Init Fidle environment
run_id, run_dir, datasets_dir = fidle.init('PBHPD1')


FIDLE - Environment initialization

Version              : 2.3.2
Run id               : PBHPD1
Run dir              : ./run/PBHPD1
Datasets dir         : /lustre/fswork/projects/rech/mlh/uja62cb/fidle-project/datasets-fidle
Start time           : 22/12/24 21:20:56
Hostname             : r3i6n3 (Linux)
Tensorflow log level : Info + Warning + Error  (=0)
Update keras cache   : False
Update torch cache   : False
Save figs            : ./run/PBHPD1/figs (True)
numpy                : 2.1.2
sklearn              : 1.5.2
yaml                 : 6.0.2
matplotlib           : 3.9.2
pandas               : 2.2.3
torch                : 2.5.0

Step 2 - Retrieve data¶

Boston housing is a famous historic dataset, which can be found here: Boston housing datasets

In [2]:
data = pd.read_csv('./BostonHousing.csv', header=0)

display(data.head(5).style.format("{0:.2f}").set_caption("Few lines of the dataset :"))
print('Missing Data : ',data.isna().sum().sum(), '  Shape is : ', data.shape)
Few lines of the dataset :
  crim zn indus chas nox rm age dis rad tax ptratio b lstat medv
0 0.01 18.00 2.31 0.00 0.54 6.58 65.20 4.09 1.00 296.00 15.30 396.90 4.98 24.00
1 0.03 0.00 7.07 0.00 0.47 6.42 78.90 4.97 2.00 242.00 17.80 396.90 9.14 21.60
2 0.03 0.00 7.07 0.00 0.47 7.18 61.10 4.97 2.00 242.00 17.80 392.83 4.03 34.70
3 0.03 0.00 2.18 0.00 0.46 7.00 45.80 6.06 3.00 222.00 18.70 394.63 2.94 33.40
4 0.07 0.00 2.18 0.00 0.46 7.15 54.20 6.06 3.00 222.00 18.70 396.90 5.33 36.20
Missing Data :  0   Shape is :  (506, 14)

Step 3 - Prepare the data¶

3.1 - Split data¶

We will use 70% of the data for training and 30% for validation.
The dataset is shuffled and split between training and testing.
x will be the input data and y the expected output (the price, medv). A scikit-learn equivalent of this split is sketched after the cell below.

In [3]:
# ---- Shuffle and Split => train, test
#
data_train = data.sample(frac=0.7, axis=0)
data_test  = data.drop(data_train.index)

# ---- Split => x,y (medv is price)
#
x_train = data_train.drop('medv',  axis=1)
y_train = data_train['medv']
x_test  = data_test.drop('medv',   axis=1)
y_test  = data_test['medv']

print('Original data shape was : ',data.shape)
print('x_train : ',x_train.shape, 'y_train : ',y_train.shape)
print('x_test  : ',x_test.shape,  'y_test  : ',y_test.shape)
Original data shape was :  (506, 14)
x_train :  (354, 13) y_train :  (354,)
x_test  :  (152, 13) y_test  :  (152,)
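
For reference, the same shuffle-and-split can be done with scikit-learn; a minimal sketch (equivalent in spirit, not the notebook's code):

from sklearn.model_selection import train_test_split

# 70/30 split with shuffling; X are the 13 features, y the target (medv)
X = data.drop('medv', axis=1)
y = data['medv']
x_train, x_test, y_train, y_test = train_test_split(X, y, train_size=0.7, shuffle=True)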

3.2 - Data normalization¶

Note:

  • All input data must be normalized, both train and test.
  • To do this we subtract the mean and divide by the standard deviation.
  • But the test data must not be used in any way, even for normalization.
  • The mean and the standard deviation are therefore computed on the train data only (a scikit-learn equivalent is sketched below).
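
For reference, the same train-only normalization can be written with scikit-learn's StandardScaler; a minimal sketch (note that StandardScaler uses the population standard deviation, ddof=0, while the pandas .std() used below uses ddof=1, so the values differ very slightly):

from sklearn.preprocessing import StandardScaler

scaler    = StandardScaler().fit(x_train)   # statistics computed on the train set only
x_train_n = scaler.transform(x_train)
x_test_n  = scaler.transform(x_test)        # test set normalized with the train statistics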
In [4]:
display(x_train.describe().style.format("{0:.2f}").set_caption("Before normalization :"))

mean = x_train.mean()
std  = x_train.std()
x_train = (x_train - mean) / std
x_test  = (x_test  - mean) / std

display(x_train.describe().style.format("{0:.2f}").set_caption("After normalization :"))
display(x_train.head(5).style.format("{0:.2f}").set_caption("Few lines of the dataset :"))

x_train, y_train = np.array(x_train), np.array(y_train)
x_test,  y_test  = np.array(x_test),  np.array(y_test)
Before normalization :
  crim zn indus chas nox rm age dis rad tax ptratio b lstat
count 354.00 354.00 354.00 354.00 354.00 354.00 354.00 354.00 354.00 354.00 354.00 354.00 354.00
mean 3.63 11.54 11.42 0.06 0.56 6.27 69.66 3.76 9.86 415.85 18.56 352.76 12.84
std 7.97 24.08 6.89 0.24 0.11 0.69 28.43 2.15 8.94 170.50 2.13 96.17 7.12
min 0.01 0.00 0.46 0.00 0.39 3.56 2.90 1.13 1.00 187.00 13.00 0.32 1.73
25% 0.08 0.00 5.22 0.00 0.45 5.89 47.20 2.06 4.00 284.00 17.40 373.30 7.18
50% 0.33 0.00 9.69 0.00 0.54 6.21 80.35 3.09 5.00 348.00 19.10 391.48 12.04
75% 3.99 12.50 18.10 0.00 0.63 6.56 94.55 5.12 24.00 666.00 20.20 396.23 17.16
max 73.53 100.00 27.74 1.00 0.87 8.78 100.00 12.13 24.00 711.00 22.00 396.90 36.98
After normalization :
  crim zn indus chas nox rm age dis rad tax ptratio b lstat
count 354.00 354.00 354.00 354.00 354.00 354.00 354.00 354.00 354.00 354.00 354.00 354.00 354.00
mean 0.00 -0.00 -0.00 -0.00 -0.00 0.00 0.00 -0.00 0.00 0.00 0.00 -0.00 -0.00
std 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
min -0.45 -0.48 -1.59 -0.26 -1.52 -3.91 -2.35 -1.22 -0.99 -1.34 -2.61 -3.66 -1.56
25% -0.44 -0.48 -0.90 -0.26 -0.92 -0.55 -0.79 -0.79 -0.66 -0.77 -0.55 0.21 -0.79
50% -0.41 -0.48 -0.25 -0.26 -0.17 -0.09 0.38 -0.31 -0.54 -0.40 0.25 0.40 -0.11
75% 0.05 0.04 0.97 -0.26 0.64 0.42 0.88 0.63 1.58 1.47 0.77 0.45 0.61
max 8.77 3.67 2.37 3.88 2.75 3.62 1.07 3.90 1.58 1.73 1.62 0.46 3.39
Few lines of the dataset :
  crim zn indus chas nox rm age dis rad tax ptratio b lstat
453 0.58 -0.48 0.97 -0.26 1.36 1.62 1.04 -0.61 1.58 1.47 0.77 0.24 0.55
391 0.21 -0.48 0.97 -0.26 1.25 -0.32 0.45 -0.74 1.58 1.47 0.77 0.27 0.83
161 -0.27 -0.48 1.19 -0.26 0.42 1.76 0.74 -0.83 -0.54 -0.08 -1.82 0.23 -1.56
462 0.38 -0.48 0.97 -0.26 1.36 0.07 0.47 -0.48 1.58 1.47 0.77 0.46 0.16
157 -0.30 -0.48 1.19 -0.26 0.42 0.97 0.98 -0.88 -0.54 -0.08 -1.82 0.11 -1.16

Step 4 - Build a model¶

For more information about:

  • Optimizer
  • Basic neural-network blocks
  • Loss
In [5]:
class model_v1(nn.Module):
    """
    Basic fully connected neural-network for tabular data
    """
    def __init__(self,num_vars):
        super(model_v1, self).__init__()
        self.num_vars=num_vars
        self.hidden1 = nn.Linear(self.num_vars, 64)
        self.hidden2 = nn.Linear(64, 64)
        self.hidden3 = nn.Linear(64, 1)

    def forward(self, x):
        x = x.view(-1,self.num_vars)   #flatten the observation before using fully-connected layers
        x = self.hidden1(x)
        x = F.relu(x)
        x = self.hidden2(x)
        x = F.relu(x)
        x = self.hidden3(x)
        return x
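
For reference, the same architecture can be written more compactly with nn.Sequential, assuming the input already has shape (batch, 13); a minimal equivalent sketch:

model_seq = nn.Sequential(
    nn.Linear(13, 64),   # 13 input features -> 64 hidden units
    nn.ReLU(),
    nn.Linear(64, 64),
    nn.ReLU(),
    nn.Linear(64, 1),    # single output: the predicted price
)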

Step 5 - Train the model¶

5.1 - Stochastic gradient descent strategy to fit the model¶

In [6]:
def fit(model,X_train,Y_train,X_test,Y_test, EPOCHS = 5, BATCH_SIZE = 32):
    
    loss = nn.MSELoss()
    optimizer = torch.optim.Adam(model.parameters(),lr=1e-3) #lr is the learning rate
    model.train()
    
    history=convergence_history_MSELoss()
    
    history.update(model,X_train,Y_train,X_test,Y_test)
    
    n=X_train.shape[0] #number of observations in the training data
    
    #stochastic gradient descent
    for epoch in range(EPOCHS):
        
        batch_start=0
        epoch_shuffler=np.arange(n) 
        np.random.shuffle(epoch_shuffler) # note: torch.utils.data.DataLoader could be used instead (see the sketch below)
        
        while batch_start+BATCH_SIZE <= n:   # process all complete mini-batches; the last incomplete one is dropped
            #get mini-batch observation
            mini_batch_observations = epoch_shuffler[batch_start:batch_start+BATCH_SIZE]
            var_X_batch = Variable(X_train[mini_batch_observations,:]).float()
            var_Y_batch = Variable(Y_train[mini_batch_observations]).float()
            
            #gradient descent step
            optimizer.zero_grad()               #set the parameters gradients to 0
            Y_pred_batch = model(var_X_batch)   #predict y with the current NN parameters
            
            curr_loss = loss(Y_pred_batch.view(-1), var_Y_batch.view(-1))  #compute the current loss
            curr_loss.backward()                         #compute the loss gradient w.r.t. all NN parameters
            optimizer.step()                             #update the NN parameters
            
            #prepare the next mini-batch of the epoch
            batch_start+=BATCH_SIZE
            
        history.update(model,X_train,Y_train,X_test,Y_test)
    
    return history
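
As noted in the shuffling comment above, the manual mini-batch loop can be replaced by torch.utils.data; a minimal sketch of one epoch, reusing the same ingredients as fit (model, loss, optimizer, X_train, Y_train, BATCH_SIZE):

from torch.utils.data import TensorDataset, DataLoader

dataset = TensorDataset(X_train.float(), Y_train.float())
loader  = DataLoader(dataset, batch_size=BATCH_SIZE, shuffle=True, drop_last=True)

for x_batch, y_batch in loader:   # one pass over loader = one epoch
    optimizer.zero_grad()
    curr_loss = loss(model(x_batch).view(-1), y_batch.view(-1))
    curr_loss.backward()
    optimizer.step()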

5.2 - Get the model¶

In [7]:
   
model=model_v1( x_train[0,:].shape[0] )

print(model)
model_v1(
  (hidden1): Linear(in_features=13, out_features=64, bias=True)
  (hidden2): Linear(in_features=64, out_features=64, bias=True)
  (hidden3): Linear(in_features=64, out_features=1, bias=True)
)

5.3 - Train the model¶

In [8]:
torch_x_train=torch.from_numpy(x_train)
torch_y_train=torch.from_numpy(y_train)
torch_x_test=torch.from_numpy(x_test)
torch_y_test=torch.from_numpy(y_test)

batch_size  = 10
epochs      = 100


history=fit(model,torch_x_train,torch_y_train,torch_x_test,torch_y_test,EPOCHS=epochs,BATCH_SIZE = batch_size)

Step 6 - Evaluate¶

6.1 - Model evaluation¶

MAE = Mean Absolute Error (between the labels and the predictions)
An MAE equal to 3 represents an average prediction error of $3k (prices are expressed in thousands of dollars).
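
A minimal sketch of what this metric computes, assuming y_true and y_pred are NumPy arrays of labels and predictions (hypothetical names):

mae = np.mean(np.abs(y_true - y_pred))   # average absolute error, here in k$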

In [9]:
var_x_test = Variable(torch_x_test).float()
var_y_test = Variable(torch_y_test).float()
y_pred = model(var_x_test)

nn_loss = nn.MSELoss()
nn_MAE_loss = nn.L1Loss()

print('x_test / loss      : {:5.4f}'.format(nn_loss(y_pred.view(-1), var_y_test.view(-1)).item()))
print('x_test / mae       : {:5.4f}'.format(nn_MAE_loss(y_pred.view(-1), var_y_test.view(-1)).item()))
x_test / loss      : 12.4627
x_test / mae       : 2.5673
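
Note that Variable is deprecated in modern PyTorch; the same evaluation is usually written with plain tensors, in eval mode and without gradient tracking. A minimal equivalent sketch:

model.eval()                # switch the model to evaluation mode
with torch.no_grad():       # no gradients needed for evaluation
    y_pred = model(torch_x_test.float())
print('mse : {:5.4f}'.format(nn.MSELoss()(y_pred.view(-1), torch_y_test.float().view(-1)).item()))
print('mae : {:5.4f}'.format(nn.L1Loss()(y_pred.view(-1), torch_y_test.float().view(-1)).item()))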

6.2 - Training history¶

What was the best result during our training?

In [10]:
df=pd.DataFrame(data=history.history)
df.describe()
Out[10]:
loss mae val_loss val_mae
count 101.000000 101.000000 101.000000 101.000000
mean 22.222863 2.635554 27.133244 3.249773
std 75.132948 2.861175 83.131706 2.973938
min 4.191096 1.577489 11.814480 2.523123
25% 6.052614 1.805854 12.124697 2.599839
50% 7.975610 2.019151 12.359540 2.654237
75% 12.401917 2.378084 12.907097 2.718711
max 582.181335 21.925776 625.663635 23.338125
In [11]:
print("min( val_mae ) : {:.4f}".format( min(history.history["val_mae"]) ) )
min( val_mae ) : 2.5231
In [12]:
fidle.scrawler.history(history, plot={'MAE' :['mae', 'val_mae'],
                                'LOSS':['loss','val_loss']})
Saved: ./run/PBHPD1/figs/fig_PBHPD1_00
[Figure: MAE training history (mae / val_mae)]
Saved: ./run/PBHPD1/figs/fig_PBHPD1_01
[Figure: LOSS training history (loss / val_loss)]

Step 7 - Make a prediction¶

The data must be normalized with the parameters (mean, std) previously used.
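
The my_data values below are already normalized. Starting from raw feature values, one would apply the train statistics computed in step 3; a minimal sketch, reusing the untouched data_test rows and the mean/std from above:

raw_features = data_test.drop('medv', axis=1)   # raw, unnormalized features (data_test was never modified)
normalized   = (raw_features - mean) / std      # same train-set statistics as in step 3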

In [13]:
my_data = [ 1.26425925, -0.48522739,  1.0436489 , -0.23112788,  1.37120745,
       -2.14308942,  1.13489104, -1.06802005,  1.71189006,  1.57042287,
        0.77859951,  0.14769795,  2.7585581 ]
real_price = 10.4

my_data=np.array(my_data).reshape(1,13)
In [14]:
torch_my_data=torch.from_numpy(my_data)
var_my_data = Variable(torch_my_data).float()

predictions = model( var_my_data )
print("Prediction : {:.2f} K$".format(predictions[0][0]))
print("Reality    : {:.2f} K$".format(real_price))
Prediction : 9.01 K$
Reality    : 10.40 K$
