1. Introduction
Basic structure:
- input layer
    - the first layer
- convolutional layer
    - aims to learn feature representations of the inputs
    - the activation function introduces nonlinearities into the CNN
- pooling layer
    - aims to achieve shift-invariance by reducing the resolution of the feature maps
    - typically placed between convolutional layers
- fully-connected layer
    - aims to perform high-level reasoning
    - connects every neuron in the previous layer to every neuron in the current layer to generate global semantic information
- output layer
    - the last layer
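To make the layer roles concrete, here is a minimal NumPy sketch of one forward pass through this pipeline. The input size, filter values, and 10-class output are illustrative assumptions, not from the source:

```python
import numpy as np

def conv2d(x, w):
    """Valid 2-D convolution (cross-correlation, as in most CNN libraries)."""
    H, W = x.shape
    kH, kW = w.shape
    out = np.empty((H - kH + 1, W - kW + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kH, j:j + kW] * w)
    return out

def max_pool(x, k=2):
    """Non-overlapping k x k max pooling: reduces resolution for shift-invariance."""
    H, W = x.shape
    return x[:H - H % k, :W - W % k].reshape(H // k, k, W // k, k).max(axis=(1, 3))

rng = np.random.default_rng(0)
img = rng.standard_normal((8, 8))          # input layer: one 8x8 channel
kernel = rng.standard_normal((3, 3))       # convolutional layer: a learned filter
feat = np.maximum(conv2d(img, kernel), 0)  # activation function: ReLU nonlinearity
pooled = max_pool(feat)                    # pooling layer
flat = pooled.ravel()                      # flatten for the fully-connected layer
W_fc = rng.standard_normal((10, flat.size))
logits = W_fc @ flat                       # fully-connected layer: global reasoning
probs = np.exp(logits - logits.max())      # output layer: softmax over 10 classes
probs /= probs.sum()
print(probs)
```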
2. Improvements
1. Convolutional Layer
- Tiled Convolution
- Transposed Convolution
- Dilated Convolution (see the sketch after this list)
- Network in Network
- Inception Module
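Of these, dilated convolution is the easiest to show directly: the kernel taps are spaced `d` pixels apart, enlarging the receptive field without adding parameters. A minimal NumPy sketch; the loop-based implementation and array sizes are illustrative assumptions:

```python
import numpy as np

def dilated_conv2d(x, w, d=2):
    """Valid 2-D convolution with dilation rate d: kernel taps are spaced
    d pixels apart, enlarging the receptive field without extra parameters."""
    H, W = x.shape
    kH, kW = w.shape
    eH, eW = (kH - 1) * d + 1, (kW - 1) * d + 1  # effective kernel extent
    out = np.empty((H - eH + 1, W - eW + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = x[i:i + eH:d, j:j + eW:d]    # sample every d-th pixel
            out[i, j] = np.sum(patch * w)
    return out

x = np.arange(49, dtype=float).reshape(7, 7)
w = np.ones((3, 3))
print(dilated_conv2d(x, w, d=1).shape)  # (5, 5) -- ordinary convolution
print(dilated_conv2d(x, w, d=2).shape)  # (3, 3) -- 5x5 effective receptive field
```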
2. Pooling Layer
- $L_p$ Pooling
    - a generalized form of average pooling and max pooling
- Mixed Pooling
    - a weighted combination of average pooling and max pooling
- Stochastic Pooling
    - fits a multinomial distribution over the activations in a region and samples the pooled value from it
- Spectral Pooling
- Spatial Pyramid Pooling
- Multi-scale Orderless Pooling
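A small NumPy sketch of the first three variants over a single pooling region, following the definitions above: $L_p$ pooling recovers average pooling at $p=1$ and approaches max pooling as $p \to \infty$. Function names and the region size are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
region = rng.random(4)  # activations in one pooling region

def lp_pool(a, p):
    """L_p pooling: (mean of |a|^p)^(1/p). p=1 gives average pooling;
    p -> infinity approaches max pooling."""
    return (np.mean(np.abs(a) ** p)) ** (1.0 / p)

def mixed_pool(a, lam):
    """Mixed pooling: a weighted combination of max and average pooling."""
    return lam * a.max() + (1 - lam) * a.mean()

def stochastic_pool(a, rng):
    """Stochastic pooling: sample one activation with probability
    proportional to its magnitude (a multinomial over the region)."""
    p = a / a.sum()
    return rng.choice(a, p=p)

print(lp_pool(region, 1), region.mean())   # p=1 equals average pooling
print(lp_pool(region, 100), region.max())  # large p approximates max pooling
print(mixed_pool(region, 0.5))
print(stochastic_pool(region, rng))
```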
3. Activation Function
- ReLU (Rectified Linear Unit)
- Leaky ReLU
- Parametric ReLU
- Randomized ReLU
- ELU (Exponential Linear Unit)
- Maxout
- Probout
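A quick NumPy sketch of these activations. Parametric ReLU and Randomized ReLU reuse the Leaky ReLU formula with a learned or randomly sampled slope, so only the fixed-slope version is shown; Probout, a probabilistic variant of Maxout, is omitted:

```python
import numpy as np

def relu(x):
    """ReLU: zero out negative inputs."""
    return np.maximum(x, 0)

def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: a small fixed slope alpha for negative inputs.
    Parametric ReLU learns alpha; Randomized ReLU samples it uniformly."""
    return np.where(x > 0, x, alpha * x)

def elu(x, alpha=1.0):
    """ELU: smooth exponential saturation for negative inputs."""
    return np.where(x > 0, x, alpha * (np.exp(x) - 1))

def maxout(z):
    """Maxout: the max over k linear pieces (rows of z) per unit."""
    return z.max(axis=0)

x = np.linspace(-2, 2, 5)
print(relu(x), leaky_relu(x), elu(x), sep="\n")
print(maxout(np.stack([x, -x, 0.5 * x])))
```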
4. Loss Function
- Hinge Loss
- Softmax Loss/Large-Margin Softmax
- Contrastive Loss
- Triplet Loss
    - an anchor/positive/negative triplet (see the sketch after this list)
- KL Divergence/JSD
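A minimal NumPy sketch of the triplet loss with squared Euclidean distances: it pulls the anchor toward the positive embedding and pushes it at least a margin farther from the negative one. The margin value and embedding dimension are illustrative assumptions:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Triplet loss: hinge on the gap between the anchor-positive and
    anchor-negative squared distances, with a margin."""
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(0.0, d_pos - d_neg + margin)

rng = np.random.default_rng(0)
a, p, n = rng.standard_normal((3, 8))  # three 8-dim embeddings
print(triplet_loss(a, p, n))
```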
5. Regularization
- $l_p$-norm
- Dropout
    - randomly sets neuron outputs to 0
- DropConnect
    - randomly sets learned weights to 0
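A NumPy sketch contrasting the two at training time. It uses inverted scaling by the keep probability so no rescaling is needed at test time; this convention, and all the sizes, are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(4)       # layer input
W = rng.standard_normal((3, 4))  # learned weights
keep = 0.5

# Dropout: zero each *output* unit with probability 1 - keep.
h = W @ x
mask_out = rng.random(h.shape) < keep
dropout_h = h * mask_out / keep

# DropConnect: zero each *weight* independently instead.
mask_w = rng.random(W.shape) < keep
dropconnect_h = (W * mask_w / keep) @ x

print(dropout_h, dropconnect_h, sep="\n")
```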
6. Optimization
- Data Augmentation
- Weight Initialization
- Stochastic Gradient Descent
- Batch Normalization (see the sketch after this list)
- Shortcut Connections
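Of these, batch normalization is the most mechanical to show: each feature is normalized over the mini-batch, then rescaled and shifted with learned parameters (at inference, running statistics replace the batch statistics). A NumPy sketch of the training-time forward pass; batch size, feature count, and the eps value are illustrative assumptions:

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Batch normalization (training mode): normalize each feature over
    the mini-batch, then rescale with gamma and shift with beta."""
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
batch = rng.standard_normal((32, 4)) * 5 + 3  # 32 samples, 4 features
out = batch_norm(batch, gamma=np.ones(4), beta=np.zeros(4))
print(out.mean(axis=0).round(6))  # ~0 per feature
print(out.std(axis=0).round(3))   # ~1 per feature
```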
3. Fast Processing of CNNs
- FFT (see the sketch after this list)
- Structured Transforms
- Low Precision
- Weight Compression
- Sparse Convolution
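The FFT speedup rests on the convolution theorem: convolution in the spatial domain becomes pointwise multiplication in the frequency domain. A NumPy sketch checking a full 2-D FFT convolution against a direct loop; the array sizes are illustrative assumptions:

```python
import numpy as np

def fft_conv2d(x, w):
    """Full 2-D convolution via the convolution theorem: pointwise
    multiplication in the frequency domain replaces the spatial loop."""
    s = (x.shape[0] + w.shape[0] - 1, x.shape[1] + w.shape[1] - 1)
    return np.real(np.fft.ifft2(np.fft.fft2(x, s) * np.fft.fft2(w, s)))

rng = np.random.default_rng(0)
x = rng.standard_normal((32, 32))
w = rng.standard_normal((5, 5))

# Direct full convolution for comparison (flip the kernel: true convolution).
direct = np.zeros((36, 36))
xp = np.pad(x, 4)
wf = w[::-1, ::-1]
for i in range(36):
    for j in range(36):
        direct[i, j] = np.sum(xp[i:i + 5, j:j + 5] * wf)

print(np.allclose(fft_conv2d(x, w), direct))  # True
```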