
Gradient Descent

To understand gradient descent, consider a simpler linear unit, whose output is


\begin{displaymath}o = w_{0} + w_{1}x_1 + \cdots + w_n x_n \end{displaymath}
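For instance, a unit with weights $w_{0}=1$, $w_{1}=2$ given input $x_{1}=3$ outputs $o = 1 + 2 \cdot 3 = 7$.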

Let's learn weights $w_{i}$ that minimize the squared error


\begin{displaymath}E[\vec{w}] \equiv \frac{1}{2}\sum_{d \in D}(t_{d} - o_{d})^{2} \end{displaymath}

where $D$ is the set of training examples, $t_{d}$ is the target output for training example $d$, and $o_{d}$ is the output of the linear unit for that example.
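
Differentiating $E$ with respect to each weight (with the convention $x_{0,d} = 1$) gives the gradient components


\begin{displaymath}\frac{\partial E}{\partial w_{i}} = \sum_{d \in D}(t_{d} - o_{d})(-x_{i,d}) \end{displaymath}

so gradient descent repeatedly updates $w_{i} \leftarrow w_{i} + \eta \sum_{d \in D}(t_{d} - o_{d})x_{i,d}$ for some small learning rate $\eta$. As a concrete illustration, not part of the original notes, here is a minimal batch gradient descent sketch in Python/NumPy; the learning rate and epoch count are illustrative choices.

\begin{verbatim}
import numpy as np

def gradient_descent(X, t, eta=0.01, epochs=500):
    """Batch gradient descent for the linear unit
    o = w0 + w1*x1 + ... + wn*xn.

    X: (m, n) array, one training example per row; t: (m,) targets.
    eta and epochs are illustrative hyperparameters.
    """
    m, n = X.shape
    Xb = np.hstack([np.ones((m, 1)), X])  # prepend x0 = 1 so w0 is the bias
    w = np.zeros(n + 1)                   # start from zero weights
    for _ in range(epochs):
        o = Xb @ w                        # unit outputs o_d for every example d
        w += eta * Xb.T @ (t - o)         # w_i += eta * sum_d (t_d - o_d) x_{i,d}
    return w

# Usage: recover weights (w0, w1, w2) = (1, 2, -3) from noiseless data.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
t = 1 + 2 * X[:, 0] - 3 * X[:, 1]
print(gradient_descent(X, t))             # approximately [ 1.  2. -3.]
\end{verbatim}

Because $E$ is quadratic in the weights, the error surface has a single global minimum, so batch gradient descent with a sufficiently small $\eta$ converges to it.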


