Many applications of reinforcement learning (RL) can be framed as sequential decision-making problems, i.e., as sequences of states, actions, and rewards. Since Transformers excel at modeling long sequences, Chen et al. propose replacing conventional RL algorithms with a Transformer trained to predict the next token in a trajectory of returns-to-go (the sum of future rewards from each timestep), states, and actions. They demonstrate that the resulting Decision Transformer matches or outperforms model-free offline RL algorithms on several benchmarks, without using dynamic programming.
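
To make this sequence-modeling view concrete, below is a minimal PyTorch sketch in the spirit of the Decision Transformer: returns-to-go, states, and actions are each embedded, interleaved into a single causal sequence, and the action at each timestep is predicted from that timestep's state token. The class name, layer choices, and hyperparameters are illustrative assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn as nn


class DecisionTransformerSketch(nn.Module):
    """Illustrative Decision-Transformer-style model (not the paper's exact code).

    Embeds (return-to-go, state, action) triples, interleaves them into one
    sequence, and applies a causally masked Transformer to predict actions.
    """

    def __init__(self, state_dim, act_dim, embed_dim=128,
                 n_layers=3, n_heads=4, max_len=1024):
        super().__init__()
        # One embedding per modality, plus a learned timestep embedding
        # shared across the (return, state, action) triple at each step.
        self.embed_rtg = nn.Linear(1, embed_dim)
        self.embed_state = nn.Linear(state_dim, embed_dim)
        self.embed_action = nn.Linear(act_dim, embed_dim)
        self.embed_timestep = nn.Embedding(max_len, embed_dim)

        layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=n_heads, batch_first=True
        )
        self.transformer = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.predict_action = nn.Linear(embed_dim, act_dim)

    def forward(self, returns_to_go, states, actions, timesteps):
        # returns_to_go: (B, T, 1), states: (B, T, state_dim),
        # actions: (B, T, act_dim), timesteps: (B, T) of dtype long.
        B, T = states.shape[:2]
        t_emb = self.embed_timestep(timesteps)
        r = self.embed_rtg(returns_to_go) + t_emb
        s = self.embed_state(states) + t_emb
        a = self.embed_action(actions) + t_emb

        # Interleave to (R_1, s_1, a_1, R_2, s_2, a_2, ...): shape (B, 3T, D).
        seq = torch.stack((r, s, a), dim=2).reshape(B, 3 * T, -1)

        # Causal mask so each token attends only to earlier tokens.
        mask = nn.Transformer.generate_square_subsequent_mask(
            3 * T, device=seq.device
        )
        h = self.transformer(seq, mask=mask)

        # Predict each action from the state token (index 1 of every triple).
        state_tokens = h.reshape(B, T, 3, -1)[:, :, 1]
        return self.predict_action(state_tokens)


if __name__ == "__main__":
    # Smoke test with arbitrary dimensions.
    model = DecisionTransformerSketch(state_dim=17, act_dim=6)
    B, T = 4, 20
    preds = model(
        torch.randn(B, T, 1),              # returns-to-go
        torch.randn(B, T, 17),             # states
        torch.randn(B, T, 6),              # actions
        torch.arange(T).repeat(B, 1),      # timesteps
    )
    print(preds.shape)  # torch.Size([4, 20, 6])
```

Training such a model reduces to supervised regression of actions on offline trajectories; at evaluation time, conditioning on a high target return-to-go steers the model toward high-reward behavior, which is what lets it sidestep dynamic programming.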