Layer Normalization

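Same behaviour at train and test time! Mean and variance are computed per example over the feature dimension, so no batch statistics or running averages are needed.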
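
A minimal sketch of why train and test coincide, in plain Python. The names `layer_norm`, `layer_norm_batch`, `gamma`, `beta` and `eps` are illustrative assumptions, not taken from any particular library. Each example is normalized using only its own features, so the output for one example does not depend on what else is in the batch.

```python
import math

def layer_norm(x, gamma, beta, eps=1e-5):
    """Normalize a single example's feature vector x (list of floats)."""
    d = len(x)
    mean = sum(x) / d
    var = sum((v - mean) ** 2 for v in x) / d  # biased variance over features
    inv_std = 1.0 / math.sqrt(var + eps)
    # Learned per-feature scale (gamma) and shift (beta); assumed names.
    return [g * (v - mean) * inv_std + b for v, g, b in zip(x, gamma, beta)]

def layer_norm_batch(batch, gamma, beta, eps=1e-5):
    """Apply layer_norm to each example independently."""
    return [layer_norm(x, gamma, beta, eps) for x in batch]

gamma = [1.0, 1.0, 1.0, 1.0]
beta = [0.0, 0.0, 0.0, 0.0]
example = [1.0, 2.0, 3.0, 4.0]

# Same example, once alone and once alongside another example.
alone = layer_norm_batch([example], gamma, beta)[0]
in_batch = layer_norm_batch([example, [10.0, -5.0, 0.0, 7.0]], gamma, beta)[0]
```

Here `alone` and `in_batch` are identical, which is the property that lets the same code path serve training and inference.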

Used in RNNs and Transformers