Author: Rukmani Gopalan
Publisher: O’Reilly Media, Inc.
Year: 2023
Pages: 244
Language: English
Format: epub (true)
Size: 10.2 MB
More organizations than ever understand the importance of data lake architectures for deriving value from their data. Building a robust, scalable, and performant data lake remains a complex proposition, however, with a buffet of tools and options that need to work together to provide a seamless end-to-end pipeline from data to insights.
This book provides a concise yet comprehensive overview of the setup, management, and governance of a cloud data lake. Author Rukmani Gopalan, a product management leader and data enthusiast, guides data architects and engineers through the major aspects of working with a cloud data lake, from design considerations and best practices to data format optimizations, performance optimization, cost management, and governance.
• Learn the benefits of a cloud-based big data strategy for your organization
• Get guidance and best practices for designing performant and scalable data lakes
• Examine architecture and design choices, and data governance principles and strategies
• Build a data strategy that scales as your organizational and business needs increase
• Implement a scalable data lake in the cloud
• Use cloud-based advanced analytics to gain more value from your data
Who Should Read This Book?
This book is primarily aimed at data architects, data developers, and data ops professionals who want a broad understanding of the various aspects of setting up and operating a cloud data lake. By the end of this book, you will have an understanding of the following:
• The benefits of a cloud-based big data strategy for your organization
• Architecture and design choices, including the modern data warehouse, data lakehouse, and data mesh
• Guidance and best practices for designing performant and scalable data lakes
• Data governance principles, strategies, and design choices
Whether you are taking your first steps or looking to modernize your data lake on the cloud, my hope is that you will be prepared to have an informed design conversation with your cloud provider and your engineering teams, and that you will be able to plan and budget your engineering investments in terms of time, effort, and money. Big data analytics is an area where developments, technologies, and paradigm shifts happen in the blink of an eye. To me, this illustrates the abundant opportunities now available. I will keep the considerations independent of any specific technology, so that when a new technology emerges, we will be able to apply these fundamentals in the context of all the available technology choices.
Download The Cloud Data Lake (Final Release)