Bayesian Linear Regression
Bayesian Linear Regression is a probabilistic approach that combines Bayes' theorem with linear regression. Instead of producing fixed point estimates of the model parameters (the regression coefficients), it represents the parameters as random variables with probability distributions, so every estimate comes with a quantified uncertainty.
Mathematical Formulation
Consider the linear regression model in which the target variable $y_i$ is predicted from a feature vector $\mathbf{x}_i \in \mathbb{R}^d$ (where $d$ is the number of features):

$$y_i = \mathbf{x}_i^\top \boldsymbol{\beta} + \epsilon_i$$
where:
- $y_i$ is the target value for the $i$-th observation,
- $\mathbf{x}_i$ is the feature vector for the $i$-th observation,
- $\boldsymbol{\beta}$ is the vector of unknown regression coefficients (parameters),
- $\epsilon_i$ is the error term (or residual), which is assumed to be normally distributed: $\epsilon_i \sim \mathcal{N}(0, \sigma^2)$, i.e., errors are independent and identically distributed with mean $0$ and variance $\sigma^2$.
Thus, for each observation $i$, the conditional probability of $y_i$ given the feature vector $\mathbf{x}_i$ and the parameters $\boldsymbol{\beta}$ is:

$$p(y_i \mid \mathbf{x}_i, \boldsymbol{\beta}) = \mathcal{N}(y_i \mid \mathbf{x}_i^\top \boldsymbol{\beta}, \sigma^2)$$
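To make the generative model concrete, here is a minimal simulation sketch; the coefficient vector, noise level, and sample size are illustrative assumptions, not values from the text:

```python
import numpy as np

# A minimal sketch of the generative model: y_i = x_i^T beta + eps_i.
# beta_true, sigma, n, and d are illustrative assumptions.
rng = np.random.default_rng(42)

n, d = 100, 2
beta_true = np.array([1.0, -0.5])  # assumed "true" coefficients
sigma = 0.5                        # assumed noise standard deviation

X = rng.normal(size=(n, d))
y = X @ beta_true + rng.normal(0.0, sigma, size=n)
```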
Prior Distribution
In Bayesian Linear Regression, we assume a prior distribution for the parameters $\boldsymbol{\beta}$. A common choice is a zero-mean Gaussian prior:

$$p(\boldsymbol{\beta}) = \mathcal{N}(\boldsymbol{\beta} \mid \mathbf{0}, \tau^2 \mathbf{I})$$
where $\tau^2$ is the prior variance and $\mathbf{I}$ is the identity matrix. This prior expresses the belief that the coefficients are likely to be close to zero, but with some uncertainty.
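As a quick illustration (with $d$ and $\tau$ chosen arbitrarily), one can draw coefficient vectors from this prior to see what the model considers plausible before observing any data:

```python
import numpy as np

# Draw a few coefficient vectors from the prior N(0, tau^2 I).
# d and tau are arbitrary illustrative choices.
rng = np.random.default_rng(0)
d, tau = 2, 1.0

beta_samples = rng.multivariate_normal(np.zeros(d), tau**2 * np.eye(d), size=5)
print(beta_samples)  # each row is one a-priori plausible coefficient vector
```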
Likelihood Function
Given the assumption of normally distributed errors, the likelihood of the observed targets $\mathbf{y} = (y_1, \dots, y_n)^\top$ given the feature matrix $\mathbf{X} \in \mathbb{R}^{n \times d}$ and parameters $\boldsymbol{\beta}$ is:

$$p(\mathbf{y} \mid \mathbf{X}, \boldsymbol{\beta}) = \prod_{i=1}^{n} \mathcal{N}(y_i \mid \mathbf{x}_i^\top \boldsymbol{\beta}, \sigma^2) = \mathcal{N}(\mathbf{y} \mid \mathbf{X}\boldsymbol{\beta}, \sigma^2 \mathbf{I})$$

This is the probability of observing the target values $\mathbf{y}$ given the feature vectors and parameters $\boldsymbol{\beta}$, with noise variance $\sigma^2$.
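In code, the log-likelihood is a direct transcription of this formula; in the sketch below, `beta`, `X`, `y`, and `sigma` stand in for user-supplied values:

```python
import numpy as np

def log_likelihood(beta, X, y, sigma):
    """Gaussian log-likelihood log p(y | X, beta) under i.i.d. noise."""
    n = len(y)
    residuals = y - X @ beta
    return -0.5 * n * np.log(2 * np.pi * sigma**2) - residuals @ residuals / (2 * sigma**2)
```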
Posterior Distribution
By Bayes' theorem, the posterior distribution of $\boldsymbol{\beta}$ given the data is proportional to the product of the likelihood and the prior:

$$p(\boldsymbol{\beta} \mid \mathbf{X}, \mathbf{y}) \propto p(\mathbf{y} \mid \mathbf{X}, \boldsymbol{\beta}) \, p(\boldsymbol{\beta})$$

Substituting the Gaussian expressions for the likelihood and the prior:

$$p(\boldsymbol{\beta} \mid \mathbf{X}, \mathbf{y}) \propto \exp\!\left(-\frac{1}{2\sigma^2}\,\|\mathbf{y} - \mathbf{X}\boldsymbol{\beta}\|^2\right)\exp\!\left(-\frac{1}{2\tau^2}\,\|\boldsymbol{\beta}\|^2\right)$$
Posterior Mean and Covariance
The posterior distribution of $\boldsymbol{\beta}$ is again Gaussian. Completing the square in the exponent gives

$$p(\boldsymbol{\beta} \mid \mathbf{X}, \mathbf{y}) = \mathcal{N}(\boldsymbol{\beta} \mid \boldsymbol{\beta}_{\text{post}}, \boldsymbol{\Sigma}_{\text{post}}),$$

where the posterior mean is

$$\boldsymbol{\beta}_{\text{post}} = \frac{1}{\sigma^2}\,\boldsymbol{\Sigma}_{\text{post}}\,\mathbf{X}^\top \mathbf{y}$$

and the posterior covariance is

$$\boldsymbol{\Sigma}_{\text{post}} = \left(\frac{1}{\tau^2}\,\mathbf{I} + \frac{1}{\sigma^2}\,\mathbf{X}^\top \mathbf{X}\right)^{-1}.$$
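It is worth noting that the posterior mean can be rewritten as

$$\boldsymbol{\beta}_{\text{post}} = \left(\mathbf{X}^\top \mathbf{X} + \frac{\sigma^2}{\tau^2}\,\mathbf{I}\right)^{-1}\mathbf{X}^\top \mathbf{y},$$

which is exactly the ridge regression estimate with regularization strength $\lambda = \sigma^2/\tau^2$: the zero-mean Gaussian prior plays the role of an $\ell_2$ penalty on the coefficients.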
Prediction
For a new observation $\mathbf{x}_*$, the predictive distribution of the target $y_*$ is obtained by integrating out the parameters:

$$p(y_* \mid \mathbf{x}_*, \mathbf{X}, \mathbf{y}) = \int p(y_* \mid \mathbf{x}_*, \boldsymbol{\beta})\, p(\boldsymbol{\beta} \mid \mathbf{X}, \mathbf{y})\, d\boldsymbol{\beta}$$

Because both factors are Gaussian, this integral has a closed form, giving the Gaussian predictive distribution

$$p(y_* \mid \mathbf{x}_*, \mathbf{X}, \mathbf{y}) = \mathcal{N}\!\left(y_* \mid \mathbf{x}_*^\top \boldsymbol{\beta}_{\text{post}},\; \sigma^2 + \mathbf{x}_*^\top \boldsymbol{\Sigma}_{\text{post}}\, \mathbf{x}_*\right)$$

This provides a probabilistic prediction: both the predicted value $\mathbf{x}_*^\top \boldsymbol{\beta}_{\text{post}}$ and the uncertainty around it, represented by the variance $\sigma^2 + \mathbf{x}_*^\top \boldsymbol{\Sigma}_{\text{post}}\, \mathbf{x}_*$ (observation noise plus parameter uncertainty).
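One way to sanity-check the closed form is a small Monte Carlo sketch: sample $\boldsymbol{\beta}$ from the posterior, add observation noise, and compare the empirical mean and variance against the formulas above (the toy data here matches the implementation example below):

```python
import numpy as np

# Monte Carlo check of the predictive distribution: sample beta from the
# posterior, add observation noise, and compare against the closed form.
# Toy data as in the implementation example; tau = sigma = 1.
rng = np.random.default_rng(0)

X = np.array([[1., 2.], [2., 3.], [3., 4.], [4., 5.], [5., 6.], [7., 8.]])
y = np.array([3., 5., 7., 9., 11., 15.])
tau2 = sigma2 = 1.0

Sigma_post = np.linalg.inv(np.eye(2) / tau2 + X.T @ X / sigma2)
beta_post = Sigma_post @ X.T @ y / sigma2

x_new = np.array([0., 1.])
betas = rng.multivariate_normal(beta_post, Sigma_post, size=100_000)
y_samples = betas @ x_new + rng.normal(0.0, np.sqrt(sigma2), size=100_000)

print(y_samples.mean(), y_samples.var())                       # Monte Carlo
print(x_new @ beta_post, sigma2 + x_new @ Sigma_post @ x_new)  # closed form
```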
Implementation
A NumPy implementation of the formulas above:

```python
import numpy as np


class BayesLinearRegression:
    def __init__(self, tau=1.0, sigma=1.0):
        self.tau = tau          # prior standard deviation of the coefficients
        self.sigma = sigma      # noise standard deviation
        self.beta_post = None   # posterior mean of beta
        self.sigma_post = None  # posterior covariance of beta

    def fit(self, X, y):
        # Prior precision: (1 / tau^2) I
        prior_precision = np.eye(X.shape[1]) / self.tau**2
        # Posterior covariance: ((1 / tau^2) I + (1 / sigma^2) X^T X)^{-1}
        self.sigma_post = np.linalg.inv(prior_precision + (X.T @ X) / self.sigma**2)
        # Posterior mean: (1 / sigma^2) Sigma_post X^T y
        self.beta_post = self.sigma_post @ X.T @ y / self.sigma**2
        return self

    def predict(self, X_new):
        # Predictive mean: X_new beta_post
        y_pred_mean = X_new @ self.beta_post
        # Predictive variance per point: sigma^2 + x^T Sigma_post x
        y_pred_var = self.sigma**2 + np.sum(X_new @ self.sigma_post * X_new, axis=1)
        return y_pred_mean, y_pred_var


X = np.array([[1, 2], [2, 3], [3, 4], [4, 5], [5, 6], [7, 8]])
y = np.array([3, 5, 7, 9, 11, 15])

b_lr = BayesLinearRegression(tau=1.0, sigma=1.0)
b_lr.fit(X, y)

y_pred_mean, y_pred_var = b_lr.predict(np.array([[0, 1]]))
print(y_pred_mean, y_pred_var)
```
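With this toy data, where $y_i = x_{i1} + x_{i2}$ exactly, the posterior mean works out to roughly $(0.927, 1.053)$ (the mild deviation from $(1, 1)$ is the pull of the zero-mean prior), so the script should print a predictive mean of about $1.053$ and a predictive variance of about $1.263$ for the query point $[0, 1]$.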