Companies are increasingly adopting data stacks in which data flows from operational systems into a data lake that provides cheap, flexible storage (including for unstructured data), and from there into a data warehouse optimized for analytics and BI use cases. However, this two-tier architecture can compromise reliability (keeping the lake and warehouse consistent is challenging), timeliness (data must land in the lake before it reaches the warehouse), and flexibility (many data warehouses do not easily support popular ML frameworks). In this paper, researchers from Databricks and Stanford discuss the Lakehouse, a data management architecture that extends the data lake concept with fast, direct I/O and warehouse-style management features. The Lakehouse offers reliable data management through transactional views of the data lake; supports ML and data science use cases through declarative DataFrame APIs; and achieves SQL performance by maintaining auxiliary data about Parquet/ORC datasets and optimizing data layout within these formats.
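To make the "transactional views of the data lake" idea concrete, here is a minimal toy sketch (not the paper's or Delta Lake's actual implementation) of a metadata layer over immutable data files: each commit is a numbered JSON file of add/remove actions, and readers replay the log in order to reconstruct a consistent snapshot of which files belong to the table. All class and file names here are illustrative assumptions.

```python
import json
import os
import tempfile

class ToyTransactionLog:
    """Toy metadata layer over immutable data files, loosely inspired by
    the transaction-log approach the paper describes. Each commit is a
    numbered JSON file; readers replay the log to get a consistent
    snapshot of the table's current file set."""

    def __init__(self, table_dir):
        self.log_dir = os.path.join(table_dir, "_log")
        os.makedirs(self.log_dir, exist_ok=True)

    def _next_version(self):
        # One JSON file per committed version.
        return len(os.listdir(self.log_dir))

    def commit(self, add=(), remove=()):
        # Write all actions for this commit under a single new version
        # number, so a reader either sees the whole commit or none of it.
        version = self._next_version()
        actions = [{"add": f} for f in add] + [{"remove": f} for f in remove]
        path = os.path.join(self.log_dir, f"{version:020d}.json")
        with open(path, "w") as fh:
            json.dump(actions, fh)
        return version

    def snapshot(self):
        # Replay commits in version order; the surviving file set is a
        # transactionally consistent view of the table.
        files = set()
        for name in sorted(os.listdir(self.log_dir)):
            with open(os.path.join(self.log_dir, name)) as fh:
                for action in json.load(fh):
                    if "add" in action:
                        files.add(action["add"])
                    else:
                        files.discard(action["remove"])
        return files

table = tempfile.mkdtemp()
log = ToyTransactionLog(table)
log.commit(add=["part-000.parquet"])
log.commit(add=["part-001.parquet"])
# A compaction-style commit: one new file replaces an old one atomically.
log.commit(add=["part-002.parquet"], remove=["part-000.parquet"])
print(sorted(log.snapshot()))  # → ['part-001.parquet', 'part-002.parquet']
```

Because data files are immutable and only the tiny log is mutated, readers never observe a half-written table state; a real system additionally needs atomic log writes and concurrency control, which this sketch omits.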