From Data Warehouses to Lakehouses (1980 - Present)

Data Warehouses (1980 - 2000):

Pros

  • High-quality data.
  • Standard modeling technique (Kimball-style star schema).
  • Reliability through ACID transactions.
  • Very good fit for business intelligence (BI).
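
To make the star schema point concrete, here is a minimal sketch in Python. It assumes an in-memory SQLite database as a stand-in for a real warehouse; the table names (fact_sales, dim_product, dim_date) and the sample rows are hypothetical and chosen only for illustration.

```python
import sqlite3


def build_star_schema(conn):
    """Create one fact table surrounded by two dimension tables."""
    conn.executescript(
        """
        CREATE TABLE dim_product (
            product_id INTEGER PRIMARY KEY,
            name       TEXT NOT NULL
        );
        CREATE TABLE dim_date (
            date_id INTEGER PRIMARY KEY,
            year    INTEGER NOT NULL
        );
        CREATE TABLE fact_sales (
            product_id INTEGER NOT NULL REFERENCES dim_product(product_id),
            date_id    INTEGER NOT NULL REFERENCES dim_date(date_id),
            amount     REAL NOT NULL
        );
        """
    )


def load_sample(conn):
    """Load hypothetical sample rows in a single transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.executemany(
            "INSERT INTO dim_product VALUES (?, ?)",
            [(1, "widget"), (2, "gadget")],
        )
        conn.executemany(
            "INSERT INTO dim_date VALUES (?, ?)",
            [(1, 2023), (2, 2024)],
        )
        conn.executemany(
            "INSERT INTO fact_sales VALUES (?, ?, ?)",
            [(1, 1, 10.0), (1, 2, 15.0), (2, 2, 7.5)],
        )


def sales_by_year(conn):
    """Typical BI query: join the fact table to a dimension and aggregate."""
    return conn.execute(
        """
        SELECT d.year, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_date AS d ON f.date_id = d.date_id
        GROUP BY d.year
        ORDER BY d.year
        """
    ).fetchall()
```

Calling build_star_schema and load_sample on a connection from sqlite3.connect(":memory:") and then sales_by_year returns one total per year. The shape of the query (fact table joined to dimensions, then grouped) is what makes this model a good fit for BI tools.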

Cons

  • Closed, proprietary formats.
  • Supports only SQL.
  • No support for machine learning.
  • No streaming support.
  • Limited scalability.

Data Lakes (2010 - 2020):

Pros

  • Support for open formats.
  • Supports all data types (structured, semi-structured, and unstructured) and their use cases.
  • Scalability through underlying cloud storage.
  • Support for machine learning and AI.

Cons

  • Weak schema support.
  • No ACID transaction support.
  • Low data quality.
  • Can degrade into "data swamps".
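
The weak schema problem can be sketched as follows. This is an illustration only: the raw JSON lines, the field names, and the EXPECTED_SCHEMA mapping are hypothetical. The point is that a lake accepts any file on write, so schema drift is only discovered when the data is read.

```python
import json

# Hypothetical raw events as they might land in a data lake over time.
RAW_EVENTS = [
    '{"user_id": 1, "amount": 9.99}',
    '{"user_id": "2", "amount": "12.50"}',  # types drifted to strings
    '{"uid": 3, "amt": 4.0}',               # field names changed
]

# Assumed schema that downstream consumers expect.
EXPECTED_SCHEMA = {"user_id": int, "amount": float}


def conforms(record, schema):
    """Return True if every expected field is present with the expected type."""
    return all(isinstance(record.get(name), typ) for name, typ in schema.items())


def split_by_schema(lines, schema):
    """Parse JSON lines and separate conforming records from drifted ones."""
    good, bad = [], []
    for line in lines:
        record = json.loads(line)
        (good if conforms(record, schema) else bad).append(record)
    return good, bad
```

With the sample above, split_by_schema(RAW_EVENTS, EXPECTED_SCHEMA) keeps only the first record; the other two were stored without complaint but are unusable as-is, which is how low data quality accumulates.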

Lakehouses (2020 and beyond):

Pros

  • Support for both BI and ML/AI workloads.
  • Standard storage format.
  • Reliability through ACID transactions.
  • Scalability through underlying cloud storage.
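
What "reliability through ACID transactions" buys can be shown with a small all-or-nothing write. The sketch below uses Python's built-in sqlite3 module purely as a stand-in; it is not a lakehouse engine, and the events table and append_batch helper are hypothetical.

```python
import sqlite3


def create_events_table(conn):
    """Create a hypothetical events table with a uniqueness constraint."""
    conn.execute(
        "CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT NOT NULL)"
    )


def append_batch(conn, rows):
    """Insert every row in the batch, or none of them if any row fails."""
    try:
        with conn:  # commit on success, roll back the whole batch on error
            conn.executemany(
                "INSERT INTO events (id, payload) VALUES (?, ?)", rows
            )
        return True
    except sqlite3.IntegrityError:
        return False


def row_count(conn):
    """Number of rows currently visible in the events table."""
    return conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
```

If a batch contains a row that violates the primary key, append_batch returns False and the valid rows from that same batch are rolled back too, so readers never observe a half-written batch. A plain data lake, by contrast, can leave partial files behind when a write job fails midway.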

Cons

  • Cost considerations.
  • Data governance and security challenges.
  • Performance overhead (due to ACID transactions).