![](/uploads/posts/2021-04/thumbs/1618974621_delta.png)
Authors: Denny Lee, Tathagata Das & Vini Jaiswal
Publisher: O’Reilly Media
Published: 2021-04-20
Format: EPUB
Size: 10 MB
Language: English
Analysis and machine learning models are only as good as the data they’re built with. Querying processed data and gaining insights from it requires a robust data pipeline, along with an effective storage solution that ensures data quality, data integrity, and performance.
This guide introduces you to Delta Lake, an open-source format that enables you to build a lakehouse architecture on top of existing storage systems such as S3, ADLS, GCS, and HDFS. Delta Lake extends Apache Spark with support for data integrity, data quality, and performance, and makes it easier to store and manage massive amounts of complex data.