Deep residual networks (ResNets) with batch normalization (batch norm) are widely used for computer vision tasks, since batch norm stabilizes and accelerates training. However, batch norm can introduce subtle failure modes in practice because it breaks the independence between training examples in a minibatch, leading to a mismatch between training and inference behavior when models are deployed. Recent research suggests that ResNets can be trained to competitive accuracies without normalization layers. In this context, Brock et al. propose Adaptive Gradient Clipping (AGC), which lets model developers train Normalizer-Free Networks with larger batch sizes and stronger data augmentations. Building on AGC, they introduce NFNets, a family of Normalizer-Free ResNets that achieve state-of-the-art ImageNet validation accuracy while training faster than comparably accurate models, and that reach even higher accuracy when fine-tuned after large-scale pre-training.
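The core of AGC is a unit-wise clipping rule: the gradient of each parameter unit (e.g., a row of a weight matrix or a convolutional filter) is rescaled whenever the ratio of its norm to the corresponding parameter norm exceeds a threshold lambda. Below is a minimal PyTorch sketch of that rule under stated assumptions; the function name `adaptive_gradient_clip_`, the default `clip_factor`, and the `eps` guard are illustrative choices, not the authors' reference implementation.

```python
import torch

def adaptive_gradient_clip_(parameters, clip_factor=0.01, eps=1e-3):
    """Unit-wise adaptive gradient clipping (sketch of the AGC rule).

    For each parameter unit, the gradient is rescaled in place whenever
    its norm exceeds clip_factor * max(param_norm, eps). Defaults here
    are illustrative, not tuned values from the paper.
    """
    for p in parameters:
        if p.grad is None:
            continue
        g = p.grad.detach()
        w = p.detach()
        if p.ndim > 1:
            # Treat each output unit (row / filter) independently by
            # reducing over all dimensions except the first.
            dims = tuple(range(1, p.ndim))
            w_norm = w.norm(2, dim=dims, keepdim=True)
            g_norm = g.norm(2, dim=dims, keepdim=True)
        else:
            # Scalars and biases: clip the whole tensor as one unit.
            w_norm = w.norm(2)
            g_norm = g.norm(2)
        # eps keeps freshly initialized (near-zero) weights trainable.
        max_norm = clip_factor * w_norm.clamp(min=eps)
        # Scale down gradients whose norm exceeds the allowed maximum;
        # leave smaller gradients untouched (scale capped at 1).
        scale = (max_norm / g_norm.clamp(min=1e-6)).clamp(max=1.0)
        g.mul_(scale)

# Usage: call between backward() and the optimizer step, e.g.
#   loss.backward()
#   adaptive_gradient_clip_(model.parameters())
#   optimizer.step()
```

One design note from the paper worth preserving in any real implementation: the authors found it best not to apply AGC to the final linear (classifier) layer, so a practical version would exclude those parameters from the loop above.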