Title: Automating Data Quality Monitoring at Scale: Scaling Beyond Rules with Machine Learning (Final)
Authors: Jeremy Stanley, Paige Schwartz
Publisher: O’Reilly Media, Inc.
Year: 2024
Pages: 220
Language: English
Format: True PDF, True/Retail EPUB
Size: 21.4 MB, 10.1 MB
The world's businesses ingest a combined 2.5 quintillion bytes of data every day. But how much of this vast amount of data, used to build products, power AI systems, and drive business decisions, is poor quality or just plain bad? This practical book shows you how to ensure that the data your organization relies on contains only high-quality records. Most data engineers, data analysts, and data scientists genuinely care about data quality, but they often don't have the time, resources, or understanding to create a data quality monitoring solution that succeeds at scale. In this book, Jeremy Stanley and Paige Schwartz from Anomalo explain how you can use automated data quality monitoring to cover all your tables efficiently, proactively alert on every category of issue, and resolve problems immediately.

We’ve written this book with three main audiences in mind. The first is the chief data and analytics officer (CDAO) or VP of data. The second audience for this book is the head of data governance. Our third audience is the data practitioner. Whether you’re a data scientist, analyst, or data engineer, your job depends on data quality, and the monitoring tools you use will have a significant impact on your day-to-day.