Pruning is a popular method for finding sparse neural networks (NNs) that can match or surpass dense NNs in generalization and efficiency (at inference time). Unfortunately, naively training sparse NNs from scratch degrades generalization. Although Lottery Tickets (LT) and Dynamic Sparse Training (DST) show promise for addressing this shortcoming, the mechanisms by which these approaches find the right sparse architectures are not well understood. In this paper, Evci et al. evaluate the role of gradient flow in finding and training sparse NNs. They find that existing methods for initializing sparse NNs exhibit poor gradient flow; however, these methods can be improved by setting the initialization variance of each neuron separately, based on its own sparse connectivity rather than the dense layer's fan-in. They also observe that Lottery Tickets succeed because they re-learn the pruning solution from which they are derived.
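To make the per-neuron variance idea concrete, the sketch below shows a sparse-aware initialization in which each output neuron's weights are scaled by its own nonzero fan-in (the number of connections kept by the sparsity mask) instead of the dense fan-in. This is a minimal illustration, not the paper's exact implementation; the function name `sparse_aware_init` and the He-style gain are assumptions made here for the example.

```python
import numpy as np

def sparse_aware_init(mask, gain=np.sqrt(2.0), rng=None):
    """Initialize a sparse weight matrix so that each output neuron's
    weights are scaled by its *own* nonzero fan-in, rather than the
    dense layer's fan-in. (Illustrative sketch, not the paper's code.)

    mask : (out_features, in_features) binary array, 1 = connection kept.
    """
    rng = np.random.default_rng() if rng is None else rng
    out_features, in_features = mask.shape
    weights = np.zeros(mask.shape, dtype=np.float64)
    for i in range(out_features):
        fan_in = int(mask[i].sum())      # nonzero incoming connections
        if fan_in == 0:
            continue                     # disconnected neuron: leave at zero
        std = gain / np.sqrt(fan_in)     # He-style scaling per neuron
        weights[i] = rng.normal(0.0, std, size=in_features) * mask[i]
    return weights

# Example: a ~90% sparse layer with 4 output neurons and 10 inputs.
rng = np.random.default_rng(0)
mask = (rng.random((4, 10)) < 0.1).astype(float)
W = sparse_aware_init(mask, rng=rng)
print(W)
```

A dense He initialization would use `std = gain / sqrt(in_features)` for every row; with high sparsity that under-scales the few surviving weights, shrinking activations and gradients layer by layer, which is the poor gradient flow the authors point to.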