Convolutional neural networks contain layers that project input features into higher-level representations without degrading their resolution. To infer high-level, low-resolution information by integrating over low-level, high-resolution measurements, models also include downsampling operators such as pooling layers and strided convolutions that reduce the resolution of the intermediate representation. Although strided convolutions help ensure that the most relevant features are extracted and improve shift invariance, finding the optimal stride configuration is challenging. Most researchers treat strides as hyperparameters rather than trainable parameters, since searching for the best combination of strides becomes prohibitively expensive as the number of downsampling layers grows. In contrast, Riad et al. introduce DiffStride, a downsampling layer that learns its strides jointly with the rest of the network. DiffStride casts downsampling as cropping in the Fourier domain: it learns the size of the cropping mask by backpropagation and can improve model performance when used as a drop-in replacement for strided convolutions.
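
The sketch below illustrates the idea in PyTorch; it is a simplified illustration under stated assumptions, not the authors' implementation. The class name `DiffStride2d`, the `smoothness` parameter, and the normalization details are choices made for this example: a learnable stride per spatial axis shapes a smooth low-pass mask in the Fourier domain, the masked spectrum is cropped to shrink the feature map, and gradients reach the strides through the soft mask edges.

```python
import torch
import torch.nn as nn


class DiffStride2d(nn.Module):
    """Minimal sketch of a DiffStride-style downsampling layer (illustrative
    names, not the reference implementation). The layer moves the feature map
    to the Fourier domain, multiplies it by a smooth box mask whose size
    depends on learnable strides, crops the low-frequency region, and
    transforms back. Gradients w.r.t. the strides flow through the mask."""

    def __init__(self, init_stride: float = 2.0, smoothness: float = 4.0):
        super().__init__()
        # One learnable stride per spatial axis.
        self.strides = nn.Parameter(torch.tensor([init_stride, init_stride]))
        self.smoothness = smoothness  # assumed knob controlling mask sharpness

    def _mask_1d(self, length: int, stride: torch.Tensor) -> torch.Tensor:
        # Smooth box window centered on the zero frequency (after fftshift):
        # ~1 inside |f| < length / (2 * stride), decaying smoothly outside,
        # so the effective crop size is differentiable w.r.t. the stride.
        half_width = length / (2.0 * stride.clamp(min=1.0))
        coords = torch.arange(length, dtype=torch.float32,
                              device=stride.device) - length // 2
        return torch.sigmoid(self.smoothness * (half_width - coords.abs()))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, height, width), real-valued.
        h, w = x.shape[-2:]
        spectrum = torch.fft.fftshift(torch.fft.fft2(x), dim=(-2, -1))

        mask_h = self._mask_1d(h, self.strides[0]).view(1, 1, h, 1)
        mask_w = self._mask_1d(w, self.strides[1]).view(1, 1, 1, w)
        spectrum = spectrum * mask_h * mask_w  # differentiable in the strides

        # Hard crop to the (rounded) low-frequency box; the crop size itself
        # is not differentiable, but the mask above carries stride gradients.
        new_h = max(int(h / float(self.strides[0].detach().clamp(min=1.0))), 1)
        new_w = max(int(w / float(self.strides[1].detach().clamp(min=1.0))), 1)
        top, left = (h - new_h) // 2, (w - new_w) // 2
        cropped = spectrum[..., top:top + new_h, left:left + new_w]

        out = torch.fft.ifft2(torch.fft.ifftshift(cropped, dim=(-2, -1)))
        # Rescale so amplitudes stay comparable to the input
        # (ifft2 normalizes by the smaller output size).
        return out.real * (new_h * new_w) / (h * w)


if __name__ == "__main__":
    layer = DiffStride2d(init_stride=2.0)
    features = torch.randn(8, 16, 32, 32)
    downsampled = layer(features)        # roughly (8, 16, 16, 16)
    downsampled.mean().backward()
    print(downsampled.shape, layer.strides.grad)  # strides receive gradients
```

Because the mask edges are smooth rather than hard cutoffs, the loss remains differentiable with respect to the strides, which is what allows them to be trained jointly with the convolutional weights instead of being fixed as hyperparameters.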