Definition: data lake


A large storage repository that holds data in its original format until it is parsed and analyzed. The term is often associated with Hadoop, which was designed to hold huge amounts of data. For example, a data lake may hold all the data in an organization.

Unlike a "data warehouse," which contains structured data that has been examined and cleansed (for example, deduped) and is available for analytics, a data lake contains both structured and unstructured data. Storing data in a lake is faster than in a warehouse because the data is written as-is, without first being examined and cleansed. See Hadoop, data warehouse and deduplication.
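
The difference between the two write paths can be illustrated with a minimal Python sketch. It is not how any particular product works: the sample records, the two-field schema and the function names (`store_in_lake`, `store_in_warehouse`, `amounts_from_lake`) are all hypothetical, and plain lists stand in for real storage.

```python
import json

# Hypothetical raw events, as they might arrive from different sources.
RAW_EVENTS = [
    '{"customer": "ann", "amount": "12.50"}',
    '{"customer": "ann", "amount": "12.50"}',              # exact duplicate
    '{"customer": "bob", "note": "free-form, no amount"}',  # does not fit the schema
]


def store_in_lake(lake, raw):
    """Lake write: keep the record in its original format, with no parsing."""
    lake.append(raw)


def store_in_warehouse(warehouse, raw):
    """Warehouse write: parse, validate and dedupe before storing."""
    record = json.loads(raw)
    if "customer" not in record or "amount" not in record:
        return False  # rejected: missing a required field
    row = (record["customer"], float(record["amount"]))
    if row in warehouse:
        return False  # rejected: duplicate row
    warehouse.append(row)
    return True


def amounts_from_lake(lake):
    """Lake read: parsing is deferred until the data is actually analyzed."""
    amounts = []
    for raw in lake:
        record = json.loads(raw)
        if "amount" in record:
            amounts.append(float(record["amount"]))
    return amounts


lake, warehouse = [], []
for event in RAW_EVENTS:
    store_in_lake(lake, event)
    store_in_warehouse(warehouse, event)
```

In this sketch the lake keeps all three records untouched, while the warehouse keeps only the one row that passes validation and deduplication. The work the warehouse does at write time is the work the lake postpones until read time.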

Lakehouse = Lake + Warehouse
A data lakehouse combines unstructured data from a data lake and structured data from a data warehouse along with analytical warehouse tools. For example, the high-speed SQL queries used in warehouses are also available in lakehouses. The lakehouse term was coined as cloud providers began to include warehouse functions in data lakes.
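
As a rough illustration of the lakehouse idea, the sketch below runs a SQL query over records that are stored in raw form. It makes several assumptions: the sample records and the `query_lake` helper are hypothetical, and Python's built-in `sqlite3` module is only a stand-in for a lakehouse query engine. A real lakehouse would typically query the stored files in place rather than copy them into a separate database.

```python
import json
import sqlite3

# Hypothetical raw records, kept in the lake in their original JSON form.
RAW_LAKE_RECORDS = [
    '{"customer": "ann", "amount": 12.5}',
    '{"customer": "bob", "amount": 7.0}',
    '{"customer": "ann", "amount": 3.5}',
]


def query_lake(raw_records, sql):
    """Expose raw lake records as a SQL table named `events` and run a query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE events (customer TEXT, amount REAL)")
        rows = []
        for raw in raw_records:
            record = json.loads(raw)  # structure is applied at query time
            rows.append((record["customer"], record["amount"]))
        conn.executemany("INSERT INTO events VALUES (?, ?)", rows)
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


totals = query_lake(
    RAW_LAKE_RECORDS,
    "SELECT customer, SUM(amount) FROM events GROUP BY customer ORDER BY customer",
)
```

Here `totals` holds one row per customer with the summed amount, which is the kind of warehouse-style analytical query the lakehouse definition refers to.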