
Incremental (Stochastic) Gradient Descent


Batch mode Gradient Descent:

Do until satisfied:

1. Compute the gradient $\nabla E_{D}[\vec{w}]$
2. $\vec{w} \leftarrow \vec{w} - \eta \nabla E_{D}[\vec{w}]$
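As a concrete illustration, here is a minimal Python sketch of the batch procedure, assuming a linear unit $o_{d} = \vec{w} \cdot \vec{x}_{d}$; the fixed epoch count standing in for "do until satisfied" and the default learning rate are illustrative choices, not part of the slide:

import numpy as np

def batch_gradient_descent(X, t, eta=0.01, epochs=100):
    # One weight update per pass, using the gradient of E_D
    # summed over the whole training set D.
    w = np.zeros(X.shape[1])
    for _ in range(epochs):            # stands in for "do until satisfied"
        o = X @ w                      # outputs o_d for every example
        grad = -(t - o) @ X            # gradient of E_D w.r.t. w
        w = w - eta * grad             # w <- w - eta * grad(E_D)
    return w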




Incremental mode Gradient Descent:

Do until satisfied:

For each training example $d$ in $D$:

1. Compute the gradient $\nabla E_{d}[\vec{w}]$
2. $\vec{w} \leftarrow \vec{w} - \eta \nabla E_{d}[\vec{w}]$
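The incremental counterpart, under the same linear-unit assumptions: the only change is that $\vec{w}$ is updated after each example $d$, using $\nabla E_{d}$ rather than $\nabla E_{D}$:

import numpy as np

def incremental_gradient_descent(X, t, eta=0.01, epochs=100):
    # One weight update per training example, using the gradient
    # of the single-example error E_d.
    w = np.zeros(X.shape[1])
    for _ in range(epochs):            # stands in for "do until satisfied"
        for x_d, t_d in zip(X, t):     # for each training example d in D
            o_d = w @ x_d              # output for this example
            grad = -(t_d - o_d) * x_d  # gradient of E_d w.r.t. w
            w = w - eta * grad         # w <- w - eta * grad(E_d)
    return w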



\begin{displaymath}E_{D}[\vec{w}] \equiv \frac{1}{2}\sum_{d \in D}(t_{d} - o_{d})^{2} \end{displaymath}


\begin{displaymath}E_{d}[\vec{w}] \equiv \frac{1}{2}(t_{d} - o_{d})^{2} \end{displaymath}
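For the linear-unit case assumed in the sketches above, with $t_{d}$ the target output and $o_{d} = \vec{w} \cdot \vec{x}_{d}$ the actual output for example $d$, differentiating these definitions gives the gradients used in the two procedures:

\begin{displaymath}\nabla E_{D}[\vec{w}] = -\sum_{d \in D}(t_{d} - o_{d})\vec{x}_{d} \qquad \qquad \nabla E_{d}[\vec{w}] = -(t_{d} - o_{d})\vec{x}_{d} \end{displaymath}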

Incremental Gradient Descent can approximate Batch Gradient Descent arbitrarily closely if $\eta$ is made small enough.
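This claim can be checked empirically with the two sketch functions above. On synthetic data (all values hypothetical), the relative gap between the weights the two procedures reach after the same number of passes should shrink roughly in proportion to $\eta$:

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))              # 20 synthetic examples, 3 inputs
t = X @ np.array([1.0, -2.0, 0.5])        # targets from a known linear unit

for eta in (0.02, 0.005, 0.001):
    w_b = batch_gradient_descent(X, t, eta=eta, epochs=5)
    w_i = incremental_gradient_descent(X, t, eta=eta, epochs=5)
    print(eta, np.linalg.norm(w_b - w_i) / np.linalg.norm(w_b))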


