Many database queries join relational tables that are partitioned across several compute nodes. In this paper, Parchas et al. explore how to distribute data across nodes so as to reduce network cost and improve query performance. Specifically, they seek to determine which attribute should be used to hash-distribute each table so that inter-server communication is minimized. They formalize the distribution key recommendation (DKR) problem as a graph-theoretic problem, introducing the Join Multi-Graph to represent the join characteristics of a cluster. After showing that DKR is NP-complete, they propose BaW, a hybrid of heuristic and exact algorithms for recommending distribution schemes, and validate the approach on real-world data from Redshift clusters.
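
To make the graph formulation concrete, the following is a minimal sketch of one plausible reading of DKR, not the authors' BaW algorithm. It assumes that each table is a node, that each join is a weighted edge between two (table, attribute) pairs, and that a join avoids network traffic only when both tables are distributed on the joined attributes. The table names, attribute names, and weights are hypothetical, and the exhaustive search is exponential in the number of tables, so it is only usable on toy inputs.

```python
from itertools import product

# Hypothetical toy join multigraph. Each edge is
# (table_a, attr_a, table_b, attr_b, weight), where the weight stands in
# for the network cost the join incurs when it is NOT co-located.
JOINS = [
    ("orders", "customer_id", "customers", "id", 100),
    ("orders", "product_id", "products", "id", 40),
    ("orders", "customer_id", "reviews", "customer_id", 30),
    ("reviews", "product_id", "products", "id", 50),
]


def candidate_keys(joins):
    """Collect, per table, every attribute that appears in some join."""
    cands = {}
    for ta, aa, tb, ab, _ in joins:
        cands.setdefault(ta, set()).add(aa)
        cands.setdefault(tb, set()).add(ab)
    return {table: sorted(attrs) for table, attrs in cands.items()}


def saved_cost(joins, assignment):
    """Sum the weights of joins whose two sides are both distributed
    on the joined attributes (assumed here to be co-located)."""
    return sum(
        w
        for ta, aa, tb, ab, w in joins
        if assignment[ta] == aa and assignment[tb] == ab
    )


def brute_force_dkr(joins):
    """Try every combination of one distribution key per table and
    return the assignment that saves the most network cost."""
    cands = candidate_keys(joins)
    tables = sorted(cands)
    best, best_saved = None, -1
    for combo in product(*(cands[t] for t in tables)):
        assignment = dict(zip(tables, combo))
        s = saved_cost(joins, assignment)
        if s > best_saved:
            best, best_saved = assignment, s
    return best, best_saved


if __name__ == "__main__":
    best, saved = brute_force_dkr(JOINS)
    print(best, saved)
```

On this toy input the search distributes `orders` on `customer_id` and `reviews` on `product_id`, co-locating joins worth 150 of the 220 total weight. Because a table can have only one distribution key, the two joins that need a different key on `orders` or `reviews` cannot be co-located at the same time. This conflict between edges that share a table is what makes the general problem hard.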