Lifelong learning

Also known as continual learning
Evaluation

R_{i, j} → accuracy on task j after training on task i

Accuracy = \dfrac{1}{T} \sum\limits_{i=1}^{T} R_{T, i}

Backward\ transfer = \dfrac{1}{T-1} \sum\limits_{i=1}^{T-1} (R_{T, i} - R_{i, i})

Forward\ transfer = \dfrac{1}{T-1} \sum\limits_{i=2}^{T} (R_{i-1, i} - R_{0, i})
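
A minimal sketch of computing these three metrics; the function name and array layout are assumptions, not from the notes. R is taken to be a (T+1) × T array where row k holds the accuracies after training on the first k tasks (row 0 = the untrained model), so R_{i,j} corresponds to R[i, j-1].

```python
import numpy as np

def continual_metrics(R: np.ndarray):
    T = R.shape[1]
    # Accuracy: mean accuracy over all tasks after training on the last task
    accuracy = R[T].mean()
    # Backward transfer: accuracy on task i after all T tasks, minus the
    # accuracy right after learning task i (negative values = forgetting)
    backward = np.mean([R[T, i - 1] - R[i, i - 1] for i in range(1, T)])
    # Forward transfer: accuracy on task i before ever training on it,
    # relative to the untrained model (positive = earlier tasks helped)
    forward = np.mean([R[i - 1, i - 1] - R[0, i - 1] for i in range(2, T + 1)])
    return accuracy, backward, forward

# Example with T = 3 tasks; rows = after 0, 1, 2, 3 tasks
R = np.array([[0.10, 0.11, 0.09],
              [0.90, 0.15, 0.10],
              [0.80, 0.92, 0.20],
              [0.75, 0.85, 0.95]])
print(continual_metrics(R))  # (0.85, -0.11, 0.075)
```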

Selective synaptic plasticity
Only some parameters (synapses) are free to change; parameters important to previous tasks are protected

Catastrophic forgetting

\theta^b: the parameters learned from the previous tasks

Guard b_i:

Each parameter has its own b_i

How important it is for \theta_i to stay close to \theta_i^b (the constraint strength in the i-th direction)

If b_i = 0 → no constraint on \theta_i (catastrophic forgetting); if b_i = \infty → \theta_i is forced to stay equal to \theta_i^b (nothing new is learned)

L^\prime(\theta) = L(\theta) + \lambda \sum\limits_i b_i (\theta_i - \theta_i^b)^2
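
A hedged PyTorch sketch of this regularized objective. `prev_params` (the frozen \theta^b) and `guards` (the b_i) are assumed to be dictionaries keyed by parameter name; how b_i is computed depends on the method (e.g. EWC estimates it from the Fisher information).

```python
import torch

def regularized_loss(task_loss, model, prev_params, guards, lam=0.5):
    """L'(theta) = L(theta) + lambda * sum_i b_i * (theta_i - theta_i^b)^2."""
    penalty = 0.0
    for name, theta in model.named_parameters():
        theta_b = prev_params[name]   # theta^b: frozen snapshot from previous tasks
        b = guards[name]              # b_i: per-parameter importance (guard)
        penalty = penalty + (b * (theta - theta_b) ** 2).sum()
    return task_loss + lam * penalty

# After finishing a task, snapshot theta^b before moving to the next one:
# prev_params = {n: p.detach().clone() for n, p in model.named_parameters()}
```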

Gradient episodic memory (GEM)

Store a few examples from previous tasks; constrain the gradient update so the loss on those stored examples does not increase (see the sketch below)
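
A minimal sketch of the core constraint. Full GEM solves a quadratic program with one constraint per previous task; the single-constraint projection below is the simplification used by the A-GEM variant. `g` and `g_ref` are assumed to be gradients flattened into 1-D vectors.

```python
import torch

def project_gradient(g: torch.Tensor, g_ref: torch.Tensor) -> torch.Tensor:
    """g: gradient on the current batch; g_ref: gradient on a batch
    sampled from the episodic memory of previous tasks."""
    dot = torch.dot(g, g_ref)
    if dot >= 0:
        # The update already does not increase the loss on the memory
        return g
    # Project g onto the half-space where <g, g_ref> >= 0
    return g - (dot / torch.dot(g_ref, g_ref)) * g_ref
```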

Additional neural resource allocation
  • Progressive neural network (see the sketch after this list)
  • PackNet
  • Compacting, picking, and growing (CPG)
    Progressive + PackNet
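
A minimal sketch, assuming a two-column fully connected progressive network: the task-1 column is trained and then frozen, and the task-2 column gets its own weights plus a lateral connection from the task-1 hidden layer, so old parameters are never overwritten. All layer sizes and names are illustrative.

```python
import torch
import torch.nn as nn

class ProgressiveNet(nn.Module):
    def __init__(self, d_in=784, d_h=256, n_cls=10):
        super().__init__()
        self.col1_h = nn.Linear(d_in, d_h)    # task-1 column (frozen later)
        self.col1_out = nn.Linear(d_h, n_cls)
        self.col2_h = nn.Linear(d_in, d_h)    # new column added for task 2
        self.lateral = nn.Linear(d_h, d_h)    # lateral link from column 1
        self.col2_out = nn.Linear(d_h, n_cls)

    def freeze_column1(self):
        # Called after task 1: its knowledge stays intact
        for p in [*self.col1_h.parameters(), *self.col1_out.parameters()]:
            p.requires_grad = False

    def forward_task2(self, x):
        h1 = torch.relu(self.col1_h(x))              # frozen task-1 features
        h2 = torch.relu(self.col2_h(x) + self.lateral(h1))
        return self.col2_out(h2)
```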

Memory replay
  • Generating data
    Train a generator on task 1's data; when learning task 2, mix the generated task-1 data into task 2's training data (see the sketch after this list)
  • Adding new classes
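
A minimal sketch of the generative-replay idea, with `generator` and `old_model` as hypothetical stand-ins for a generator trained on task 1 and the model after task 1: generated pseudo-examples are labeled by the old model and concatenated into each task-2 batch, so task 1 is rehearsed without storing its real data.

```python
import torch

def replay_batch(generator, old_model, task2_x, task2_y,
                 z_dim=64, n_replay=32):
    with torch.no_grad():
        # Generate pseudo task-1 inputs from random noise
        fake_x = generator(torch.randn(n_replay, z_dim))
        # Pseudo-label them with the model trained on task 1
        fake_y = old_model(fake_x).argmax(dim=1)
    # Train the new model on the mixed batch
    return torch.cat([task2_x, fake_x]), torch.cat([task2_y, fake_y])
```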