A linear model predicts the target as a weighted sum of the input features.
The model's additive structure, in which each feature contributes independently of the others, makes it transparent. The weights communicate the global (with respect to the entire model) influence and importance of each feature.
\[ f(\mathbf{x}) = -1.81 \;\; + \;\; 0.54 \times x_1 \;\; + \;\; 0.34 \times x_2 \]
\[\omega_0 = -1.81 \;\;\;\;\;\;\;\; \omega_1 = 0.54 \;\;\;\;\;\;\;\; \omega_2 = 0.34\]
Property | Linear Models |
---|---|
relation | ante-hoc |
compatibility | linear models |
modelling | regression (crisp classification) |
scope | global and local |
target | model and prediction |
data | tabular |
features | numerical and (one hot-encoded) categorical |
explanation | model visualisation, feature influence & importance |
caveats | feature correlation, target nonlinearity |
\[\omega_0 = -1.81 \;\;\;\;\;\;\;\; \omega_1 = 0.54 \;\;\;\;\;\;\;\; \omega_2 = 0.34\]
\[x_1 = 1.30 \;\;\;\;\;\;\;\; x_2 = 0.20\]
\[ f(\mathbf{x}) = -1.81 \;\; + \;\; \underbrace{0.54 \times 1.30}_{x_1} \;\; + \;\; \underbrace{0.34 \times 0.20}_{x_2} \]
\[ f(\mathbf{x}) = -1.81 \;\; + \;\; \underbrace{0.70}_{x_1} \;\; + \;\; \underbrace{0.07}_{x_2} \;\; = \;\; -1.04 \]
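The substitution above can be reproduced in a few lines of Python. This is only a minimal sketch: the weights and feature values come from the example, while the names `contributions` and `predict` are illustrative helpers, not part of any library.

```python
# Weights and instance taken from the worked example above.
intercept = -1.81          # omega_0
weights = [0.54, 0.34]     # omega_1, omega_2
x = [1.30, 0.20]           # x_1, x_2


def contributions(weights, x):
    """Per-feature contribution omega_i * x_i to the prediction."""
    return [w * xi for w, xi in zip(weights, x)]


def predict(intercept, weights, x):
    """Intercept plus the weighted sum of the feature values."""
    return intercept + sum(contributions(weights, x))


parts = contributions(weights, x)
prediction = predict(intercept, weights, x)

# Round for display only; the unrounded values are kept in `parts` and `prediction`.
print([round(p, 2) for p in parts])
print(round(prediction, 2))
```

Because the model is additive, the per-feature contributions sum (together with the intercept) to the prediction, so they double as a local explanation of this particular instance.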
Increasing petal length by 1 cm increases the prediction by 0.54, ceteris paribus.
Increasing petal width by 1 cm increases the prediction by 0.34, ceteris paribus.
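The ceteris paribus reading can be checked numerically: change one feature by one unit, hold the other fixed, and compare the predictions. The sketch below assumes the example model from above; the dictionary keys and the `predict` helper are illustrative names.

```python
intercept = -1.81
weights = {"petal_length": 0.54, "petal_width": 0.34}


def predict(instance):
    """Intercept plus the weighted sum over the named features."""
    return intercept + sum(weights[name] * value for name, value in instance.items())


baseline = {"petal_length": 1.30, "petal_width": 0.20}

# Increase petal length by 1 cm while keeping petal width fixed.
longer = {**baseline, "petal_length": baseline["petal_length"] + 1.0}
# Increase petal width by 1 cm while keeping petal length fixed.
wider = {**baseline, "petal_width": baseline["petal_width"] + 1.0}

delta_length = predict(longer) - predict(baseline)
delta_width = predict(wider) - predict(baseline)
print(round(delta_length, 2), round(delta_width, 2))
```

The differences equal the respective weights regardless of the chosen baseline, which is what makes the weights a global explanation.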
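A plain linear model assumes that each feature affects the prediction independently of the others. Manually introducing feature interaction terms allows linear models to account for effects that depend on several features jointly.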
\[ f(\mathbf{x}) = \omega_0 + \omega_1 x_1 + \cdots + \omega_n x_n + \underbrace{\omega_{n+1} x_4 x_6}_{\textit{interaction}} \]
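To illustrate the formula, the sketch below adds a single product term for \(x_4\) and \(x_6\) to a six-feature linear model. All numeric values are hypothetical and chosen only for readability; `predict_with_interaction` and `effect_of_x4` are illustrative names.

```python
def predict_with_interaction(x, intercept, weights, pair, interaction_weight):
    """Linear model plus one product term for the feature pair (i, j), 0-indexed."""
    i, j = pair
    linear = intercept + sum(w * xi for w, xi in zip(weights, x))
    return linear + interaction_weight * x[i] * x[j]


# Hypothetical values, for illustration only.
intercept = 0.5
weights = [0.1, -0.2, 0.3, 0.4, 0.0, -0.5]   # omega_1 ... omega_6
interaction_weight = 2.0                      # omega_{n+1}
pair = (3, 5)                                 # x_4 and x_6 in 0-indexed form


def effect_of_x4(x6):
    """Change in the prediction when x_4 goes from 0 to 1 at a given x_6."""
    base = [1.0, 1.0, 1.0, 0.0, 1.0, x6]
    bumped = [1.0, 1.0, 1.0, 1.0, 1.0, x6]
    args = (intercept, weights, pair, interaction_weight)
    return predict_with_interaction(bumped, *args) - predict_with_interaction(base, *args)


print(effect_of_x4(0.0), effect_of_x4(1.0))
```

With the interaction term in place, the effect of \(x_4\) is no longer a single number: it equals \(\omega_4 + \omega_{n+1} x_6\), so the weight \(\omega_4\) alone cannot be read as the feature's influence.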
Generalized Linear Models (GLMs) allow modelling prediction targets whose distribution is not Gaussian; a link function \(g\) connects the expected target value to the weighted sum.
\[ g(\mathbb{E}_Y(y|\mathbf{x})) = \omega_0 + \omega_1 x_1 + \cdots + \omega_n x_n \]
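A small sketch may help to show the role of the link function \(g\): the weighted sum stays the same, and the inverse link maps it to the expected target value. The three links below are common textbook choices; reusing the weights of the earlier example with them is purely illustrative, and the helper names are assumptions.

```python
import math


def linear_predictor(x, intercept, weights):
    """omega_0 + omega_1 * x_1 + ... + omega_n * x_n."""
    return intercept + sum(w * xi for w, xi in zip(weights, x))


# Inverse link functions g^{-1}: linear predictor -> E[y | x].
INVERSE_LINKS = {
    "identity": lambda eta: eta,                         # Gaussian target
    "log": lambda eta: math.exp(eta),                    # e.g. Poisson counts
    "logit": lambda eta: 1.0 / (1.0 + math.exp(-eta)),   # e.g. Bernoulli target
}


def glm_mean(x, intercept, weights, link="identity"):
    """Expected target value under the chosen link."""
    return INVERSE_LINKS[link](linear_predictor(x, intercept, weights))


intercept, weights, x = -1.81, [0.54, 0.34], [1.30, 0.20]
for link in INVERSE_LINKS:
    print(link, glm_mean(x, intercept, weights, link))
```

With a non-identity link, a weight describes the change in \(g(\mathbb{E}_Y(y|\mathbf{x}))\) rather than in the expected target itself, so the interpretation shifts to the link scale (for example log-odds for the logit link).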
Generalized Additive Models (GAMs) allow modelling nonlinear relationships: the weighted sum is replaced by a sum of arbitrary functions, one per feature.
\[ g(\mathbb{E}_Y(y|\mathbf{x})) = \omega_0 + f_1(x_1) + \cdots + f_n(x_n) \]
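The additive structure of a GAM can be sketched as follows. The shape functions here are hand-picked, hypothetical stand-ins; in practice they are learnt from data, typically as smooth functions such as splines.

```python
import math

# Hypothetical shape functions f_1 and f_2, for illustration only.
shape_functions = [
    lambda x1: math.sin(x1),   # f_1: a nonlinear effect of x_1
    lambda x2: x2 ** 2,        # f_2: a quadratic effect of x_2
]
intercept = 0.5                # omega_0 (hypothetical)


def gam_contributions(x, shape_functions):
    """Per-feature contribution f_i(x_i)."""
    return [f(xi) for f, xi in zip(shape_functions, x)]


def gam_predictor(x, intercept, shape_functions):
    """omega_0 + f_1(x_1) + ... + f_n(x_n), i.e. the value on the link scale."""
    return intercept + sum(gam_contributions(x, shape_functions))


x = [0.0, 2.0]
print(gam_contributions(x, shape_functions))
print(gam_predictor(x, intercept, shape_functions))
```

Each feature still contributes through its own term, so the model remains inspectable feature by feature, usually by plotting every \(f_i\) instead of reading a single weight.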
This list is far from exhaustive.
\[ \begin{aligned} f(\mathbf{x}) = {} & 0.2 \;\; + \;\; 0.25 \times x_1 \;\; - \;\; 0.47 \times x_2 \;\; + \;\; 0.01 \times x_3 \;\; + \;\; 0.70 \times x_4 \\ & - \;\; 0.20 \times x_5 \;\; - \;\; 0.33 \times x_6 \;\; - \;\; 0.90 \times x_7 \end{aligned} \]
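With seven features, one simple way to read such a model is to order the features by the magnitude of their weights. The sketch below does this for the equation above; it assumes, beyond what the equation states, that all features are measured on a comparable scale (for example standardised), since otherwise the magnitudes are not directly comparable. The feature names are placeholders.

```python
intercept = 0.2
weights = {
    "x1": 0.25, "x2": -0.47, "x3": 0.01, "x4": 0.70,
    "x5": -0.20, "x6": -0.33, "x7": -0.90,
}

# Rank features by absolute weight; the sign gives the direction of influence.
ranking = sorted(weights, key=lambda name: abs(weights[name]), reverse=True)

for name in ranking:
    direction = "increases" if weights[name] > 0 else "decreases"
    print(f"{name}: weight {weights[name]:+.2f} ({direction} the prediction)")
```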
Python | R |
---|---|
scikit-learn | built in |
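As a rough illustration of the Python route, the sketch below fits scikit-learn's `LinearRegression` to a handful of synthetic, noise-free points generated from the example model \(f(\mathbf{x}) = -1.81 + 0.54 x_1 + 0.34 x_2\). The data are made up for this purpose, so the fit simply recovers the weights it was generated from.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic, noise-free data generated from the example model (not real measurements).
X = np.array([
    [0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
    [1.0, 1.0], [1.3, 0.2], [2.0, 3.0],
])
y = -1.81 + 0.54 * X[:, 0] + 0.34 * X[:, 1]

model = LinearRegression().fit(X, y)

# The fitted parameters are the explanation: intercept_ is omega_0, coef_ holds omega_1, omega_2.
intercept = float(model.intercept_)
omega_1, omega_2 = (float(w) for w in model.coef_)
prediction = float(model.predict(np.array([[1.30, 0.20]]))[0])

print(round(intercept, 2), round(omega_1, 2), round(omega_2, 2), round(prediction, 2))
```

On real data the recovered weights would of course depend on noise, feature scaling and feature correlation, which is where the caveats listed earlier come into play.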