Transformer architectures, which continue to captivate the ML research and engineering communities, eliminate recurrence and convolutions and rely on self-attention to capture dependencies between inputs and outputs. However, multilayer perceptrons (MLPs) are regaining popularity as several research groups demonstrate competitive results across domains. In this paper, Liu et al. question the necessity of Transformers’ self-attention layers. They present gMLP, an attention-free, MLP-based alternative to Transformers that pairs spatial projections with multiplicative gating. After evaluating this architecture on various vision and language modeling tasks, they conclude that, given enough data and compute, simple spatial interaction mechanisms can match the performance of Transformers, possibly obviating the need for self-attention.
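To make the "spatial projections with multiplicative gating" idea concrete, here is a minimal sketch of a gMLP-style block in PyTorch. The dimensions, module names, and the near-identity initialization are illustrative assumptions for this sketch, not a verbatim reproduction of the authors' implementation.

```python
# Sketch of a gMLP-style block: a channel projection followed by a
# spatial gating unit that mixes tokens with a learned spatial projection
# and combines the result via element-wise (multiplicative) gating.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SpatialGatingUnit(nn.Module):
    def __init__(self, d_ffn: int, seq_len: int):
        super().__init__()
        self.norm = nn.LayerNorm(d_ffn // 2)
        # Spatial projection acts across the token (sequence) dimension.
        self.spatial_proj = nn.Linear(seq_len, seq_len)
        # Assumed near-identity init: weights ~ 0, bias = 1, so the block
        # starts out close to a plain MLP and learns spatial mixing gradually.
        nn.init.zeros_(self.spatial_proj.weight)
        nn.init.ones_(self.spatial_proj.bias)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # z: (batch, seq_len, d_ffn) -> split channels into two halves.
        u, v = z.chunk(2, dim=-1)
        v = self.norm(v)
        # Mix information across tokens, then gate multiplicatively.
        v = self.spatial_proj(v.transpose(1, 2)).transpose(1, 2)
        return u * v


class GMLPBlock(nn.Module):
    def __init__(self, d_model: int, d_ffn: int, seq_len: int):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.proj_in = nn.Linear(d_model, d_ffn)
        self.sgu = SpatialGatingUnit(d_ffn, seq_len)
        self.proj_out = nn.Linear(d_ffn // 2, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        shortcut = x
        x = F.gelu(self.proj_in(self.norm(x)))
        x = self.sgu(x)
        return self.proj_out(x) + shortcut  # residual connection


# Example: 2 sequences of 128 tokens with model width 256.
x = torch.randn(2, 128, 256)
block = GMLPBlock(d_model=256, d_ffn=512, seq_len=128)
print(block(x).shape)  # torch.Size([2, 128, 256])
```

Note that the spatial projection is a fixed-size linear map over token positions, so all cross-token interaction is learned per position rather than computed dynamically as in self-attention.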