Authors: Andrew Madson, Toby Mao, Iaroslav Zeigerman
Publisher: O’Reilly Media, Inc.
Date: 2026-04-08
Language: English
Format: pdf, epub
Size: 10.1 MB
Data Transformation: The Definitive Guide provides a rigorous and practical roadmap for designing scalable, efficient, and maintainable data pipelines. Written by leaders in the field, this book introduces foundational principles and modern practices that treat data transformation with the same discipline as software development, in equal parts theory and hands-on implementation.
Version Control for Code and Models: If the transformation code (SQL, Python scripts, SQLMesh, etc.) isn’t rigorously version-controlled, it’s difficult to recreate the exact logic used in a prior run. Teams that modify pipeline code without tracking versions can’t roll back to the code that was in use at a given time, so earlier states of the pipeline, and the results they produced, may be irreproducible. The same goes for AI and machine learning pipelines: if models and parameters aren’t versioned, you can’t later rerun the pipeline with the same model to get the same outcome.
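One way to make the point above concrete is to store, next to every pipeline run, a small manifest that pins the code version and the parameters used. The following is a minimal sketch and not a method taken from the book: the function names `fingerprint` and `build_manifest`, the manifest fields, and the example values are all hypothetical, and `code_version` is assumed to be something like a Git commit hash supplied by the caller.

```python
import hashlib
import json


def fingerprint(transform_sql: str, params: dict) -> str:
    """Return a stable SHA-256 digest of the transformation logic and its parameters.

    Keys are sorted before hashing so that the same parameters always
    produce the same digest, regardless of dictionary ordering.
    """
    payload = json.dumps(
        {"sql": transform_sql, "params": params},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_manifest(run_id: str, transform_sql: str, params: dict, code_version: str) -> dict:
    """Assemble a record describing exactly what a run executed (hypothetical schema)."""
    return {
        "run_id": run_id,
        "code_version": code_version,  # assumed: e.g. a Git commit hash
        "params": params,
        "fingerprint": fingerprint(transform_sql, params),
    }


if __name__ == "__main__":
    # Illustrative values only; none of these come from the source text.
    manifest = build_manifest(
        run_id="example-run",
        transform_sql="SELECT id, amount * 1.2 AS gross FROM orders",
        params={"tax_rate": 0.2, "model_version": "v3"},
        code_version="<commit hash goes here>",
    )
    print(json.dumps(manifest, indent=2))
```

If a later run with nominally the same settings yields a different fingerprint, either the SQL or a parameter has changed, which is the kind of drift that untracked edits otherwise hide.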
Download Data Transformation: The Definitive Guide (Early Release)