$$ f(x)=w^T x+b $$
Equivalently, the bias can be absorbed into the weight vector by appending a constant feature: $x=[x_1, x_2, \dots, x_d, 1]^T$, $w= [w_1, w_2, \dots, w_d, b]^T$, so that
$$ f(x)=w^T x $$
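A minimal NumPy sketch (toy sizes and random data, all names are illustrative) checking that absorbing the bias this way leaves $f(x)$ unchanged:

```python
import numpy as np

rng = np.random.default_rng(0)

d, n = 3, 5                      # feature dimension and number of samples (toy sizes)
X_raw = rng.normal(size=(d, n))  # each column is one sample x_i
w = rng.normal(size=d)
b = 0.5

# Original form: f(x) = w^T x + b
f_with_bias = w @ X_raw + b

# Augmented form: append a constant 1 to every x and absorb b into w
X_aug = np.vstack([X_raw, np.ones((1, n))])   # now (d+1) x n
w_aug = np.concatenate([w, [b]])              # now length d+1

f_augmented = w_aug @ X_aug                   # f(x) = w^T x, no explicit bias
print(np.allclose(f_with_bias, f_augmented))  # True
```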
Polynomial Curve Fitting → Linear Model
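Polynomial curve fitting is a linear model because it is linear in the weights once each scalar input is mapped to the basis $\phi(x)=[1, x, x^2, \dots, x^m]^T$. A sketch of this idea in NumPy (degree, data, and noise level are hypothetical), reusing the samples-as-columns convention used below:

```python
import numpy as np

rng = np.random.default_rng(1)

n, m = 20, 3                                       # number of points, polynomial degree
x = rng.uniform(-1, 1, size=n)
y = np.sin(np.pi * x) + 0.1 * rng.normal(size=n)   # hypothetical noisy target

# Map each x_i to phi(x_i) = [1, x_i, x_i^2, ..., x_i^m]^T
Phi = np.vstack([x**j for j in range(m + 1)])      # (m+1) x n, samples as columns

# Same least-squares solution as derived below: w = (Phi Phi^T)^{-1} Phi y
w = np.linalg.solve(Phi @ Phi.T, Phi @ y)
y_hat = Phi.T @ w                                  # fitted values at the training inputs
print(w.shape, y_hat.shape)                        # (4,) (20,)
```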
Minimize the sum of squared errors (the differences between $w^T x_i$ and $y_i$):
$$ J_n(w)=\sum_{i=1}^{n}(y_i-w^Tx_i)^2=(y-X^Tw)^T(y-X^Tw) $$

where $X=[x_1, x_2, \dots, x_n]$ is the $d\times n$ matrix whose columns are the training inputs and $y=[y_1, y_2, \dots, y_n]^T$.
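A quick NumPy check (random toy data) that the summation form and the matrix form of $J_n(w)$ agree:

```python
import numpy as np

rng = np.random.default_rng(2)
d, n = 4, 10
X = rng.normal(size=(d, n))   # samples as columns
y = rng.normal(size=n)
w = rng.normal(size=d)

# Summation form: sum_i (y_i - w^T x_i)^2
J_sum = np.sum((y - X.T @ w) ** 2)

# Matrix form: (y - X^T w)^T (y - X^T w)
r = y - X.T @ w
J_mat = r @ r

print(np.isclose(J_sum, J_mat))   # True
```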
To optimize it, compute the gradient and set it to zero:
$$ \nabla_w J_n(w) = -2X(y-X^Tw) = 0 $$
Solving for $w$ gives the least-squares solution:
$$ w=(XX^T)^{-1}Xy $$
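A minimal NumPy sketch of this closed-form solution on synthetic data (names like `w_true` and the noise level are illustrative); it solves the normal equations $XX^Tw = Xy$ with `np.linalg.solve` rather than forming the inverse explicitly:

```python
import numpy as np

rng = np.random.default_rng(3)
d, n = 4, 50
X = rng.normal(size=(d, n))                  # samples as columns
w_true = rng.normal(size=d)                  # hypothetical ground-truth weights
y = X.T @ w_true + 0.01 * rng.normal(size=n)

# Closed-form least squares: w = (X X^T)^{-1} X y
w_hat = np.linalg.solve(X @ X.T, X @ y)

print(np.allclose(w_hat, w_true, atol=0.1))  # recovers the true weights up to noise
```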
The fitted values at the training inputs are:
$$ \hat{y}=X^Tw=X^T(XX^T)^{-1}Xy $$
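A self-contained NumPy check (random toy data) that these fitted values match what `np.linalg.lstsq` returns; note that `lstsq` expects samples as rows, hence the transpose:

```python
import numpy as np

rng = np.random.default_rng(4)
d, n = 3, 8
X = rng.normal(size=(d, n))    # samples as columns
y = rng.normal(size=n)

# Fitted values: y_hat = X^T (X X^T)^{-1} X y
y_hat = X.T @ np.linalg.solve(X @ X.T, X @ y)

# Cross-check against NumPy's least-squares routine
w_lstsq, *_ = np.linalg.lstsq(X.T, y, rcond=None)
print(np.allclose(y_hat, X.T @ w_lstsq))   # True
```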