Network compression

A smaller model with fewer parameters.
Network pruning
Reduce the number of parameters

Evaluate the importance of each weight / neuron and remove the unimportant ones

Fine-tune the pruned network to recover accuracy

  • Weight pruning

    Removes individual weights; the resulting sparse, irregular architecture is hard to implement and to speed up

  • Neuron pruning

    Removes whole neurons, so the layers simply become smaller; easy to implement and to speed up (see the sketch after this list)
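
A minimal pruning sketch using PyTorch's torch.nn.utils.prune, assuming a toy fully connected model; the layer sizes and the 30% pruning ratio are illustrative choices:

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

# Hypothetical toy model just for illustration.
model = nn.Sequential(
    nn.Linear(784, 256), nn.ReLU(),
    nn.Linear(256, 10),
)

for module in model.modules():
    if isinstance(module, nn.Linear):
        # Weight pruning: zero out the 30% smallest-magnitude weights
        # (irregular sparsity, hard to turn into an actual speed-up).
        prune.l1_unstructured(module, name="weight", amount=0.3)
        # Neuron pruning alternative: remove whole output rows by L2 norm
        # (structured sparsity, easy to realize as a smaller layer):
        # prune.ln_structured(module, name="weight", amount=0.3, n=2, dim=0)

        # Fold the mask into the weight; the model is then fine-tuned.
        prune.remove(module, "weight")
```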

Lottery ticket hypothesis: a large network contains sub-networks ("winning tickets") that, when trained from their original initialization, can match the accuracy of the full network

Knowledge distillation
A small student net learns to mimic the outputs of a large teacher net

Ensemble:

Average the outputs of multiple models

The teacher net can be an ensemble

Temperature:

Smooths the softmax distribution

$y_i' = \dfrac{e^{y_i}}{\sum\limits_j e^{y_j}} \;\Rightarrow\; y_i' = \dfrac{e^{y_i / T}}{\sum\limits_j e^{y_j / T}}$

A softer (smoother) target distribution is easier for the student to learn from (see the loss sketch below)
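
A minimal sketch of a distillation loss with temperature, assuming PyTorch; the function name, temperature T=4.0 and weighting alpha=0.5 are illustrative choices, not a fixed recipe:

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    # Soft targets: both sets of logits are divided by the temperature T,
    # which smooths the distributions; the T**2 factor rescales the gradient
    # so the soft term stays comparable in size across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: the usual cross-entropy against the true labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```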

Parameter quantization
Use fewer bits to represent each value

Weight clustering

Binary weights

Each weight is constrained to +1 or -1

The constraint can also act as a regularizer that helps prevent overfitting

Huffman encoding: frequent values get shorter codes
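
A minimal sketch of weight clustering with k-means, assuming NumPy and scikit-learn; the function name and the choice of 16 clusters (4-bit indices) are illustrative:

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_weights(weights: np.ndarray, n_clusters: int = 16):
    # Fit a 1-D k-means over all weight values.
    flat = weights.reshape(-1, 1)
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(flat)
    codebook = km.cluster_centers_.ravel()   # n_clusters shared values
    indices = km.predict(flat)               # 4-bit index per weight (16 clusters)
    # Only the codebook and the indices need to be stored; the indices
    # could additionally be Huffman-coded.
    quantized = codebook[indices].reshape(weights.shape)
    return quantized, codebook, indices
```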

Depthwise separable convolution
  • Depthwise
    Each filter is applied to exactly one input channel

    Number of filters = number of input channels

  • Pointwise
    Filter size = 1x1; mixes information across channels (see the sketch after this list)
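
A minimal sketch of a depthwise separable convolution block in PyTorch; the class name, kernel size, and channel counts are illustrative:

```python
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    # Hypothetical block: depthwise conv followed by a 1x1 pointwise conv.
    def __init__(self, in_ch, out_ch, kernel_size=3, padding=1):
        super().__init__()
        # Depthwise: groups=in_ch means each filter sees exactly one input
        # channel, and the number of filters equals the number of channels.
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size,
                                   padding=padding, groups=in_ch)
        # Pointwise: 1x1 convolution that mixes information across channels.
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

# Parameter count (ignoring biases):
#   standard conv:        in_ch * out_ch * k * k
#   depthwise separable:  in_ch * k * k + in_ch * out_ch
```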

Low rank approximation
Approximate a large weight matrix W (M x N) by the product of two smaller matrices U (M x K) and V (K x N) with K << min(M, N), reducing the parameter count from M*N to K*(M + N)
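
A minimal sketch of replacing a trained fully connected layer by its rank-K approximation via truncated SVD, assuming PyTorch; the function name and the bias handling are illustrative:

```python
import torch
import torch.nn as nn

def low_rank_linear(linear: nn.Linear, rank: int) -> nn.Sequential:
    # Approximate W (out x in) as U_r @ V_r, turning one big layer
    # into two thin ones.
    W = linear.weight.data
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    U_r = U[:, :rank] * S[:rank]   # absorb the singular values
    V_r = Vh[:rank, :]
    first = nn.Linear(linear.in_features, rank, bias=False)
    second = nn.Linear(rank, linear.out_features,
                       bias=linear.bias is not None)
    first.weight.data = V_r
    second.weight.data = U_r
    if linear.bias is not None:
        second.bias.data = linear.bias.data
    return nn.Sequential(first, second)
```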

Dynamic computing
The network adjusts how much computation it performs at inference time
  • Dynamic depth
  • Dynamic width

The amount of computation can depend on sample difficulty, e.g. easy samples can exit early (see the sketch below)
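
A minimal sketch of dynamic depth via early exits, assuming PyTorch; the class name, number of blocks, and confidence threshold are illustrative, and a real implementation would also supervise the intermediate exits during training:

```python
import torch
import torch.nn as nn

class EarlyExitNet(nn.Module):
    # Hypothetical dynamic-depth network with a classifier after each block.
    def __init__(self, dim=256, n_classes=10, threshold=0.9):
        super().__init__()
        self.blocks = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, dim), nn.ReLU()) for _ in range(4)]
        )
        self.exits = nn.ModuleList([nn.Linear(dim, n_classes) for _ in range(4)])
        self.threshold = threshold

    def forward(self, x):
        # At inference time, stop at the first exit that is confident enough;
        # during training all exits would normally be trained jointly.
        for block, exit_head in zip(self.blocks, self.exits):
            x = block(x)
            logits = exit_head(x)
            if not self.training:
                conf = torch.softmax(logits, dim=-1).max(dim=-1).values
                if bool((conf > self.threshold).all()):
                    return logits
        return logits
```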