Название: Practical Implementation of a Data Lake: Translating Customer Expectations into Tangible Technical Goals
Автор: Nayanjyoti Paul
Издательство: Apress
Год: 2023
Страниц: 219
Язык: английский
Формат: pdf, epub
Размер: 10.1 MB
This book explains how to implement a data lake strategy, covering the technical and business challenges architects commonly face. It also illustrates how and why client requirements should drive architectural decisions. Drawing upon a specific case from his own experience, author Nayanjyoti Paul begins with the consideration from which all subsequent decisions should flow: what does your customer need? He also describes the importance of identifying key stakeholders and the key points to focus on when starting a new project. Next, he takes you through the business and technical requirement-gathering process, and how to translate customer expectations into tangible technical goals. From there, you’ll gain insight into the security model that will allow you to establish security and legal guardrails, as well as different aspects of security from the end user’s perspective. You’ll learn which organizational roles need to be onboarded into the data lake, their responsibilities, the services they need access to, and how the hierarchy of escalations should work. Subsequent chapters explore how to divide your data lakes into zones, organize data for security and access, manage data sensitivity, and techniques used for data obfuscation. Audit and logging capabilities in the data lake are also covered before a deep dive into designing data lakes to handle multiple kinds and file formats and access patterns. For Data Scientists and Architects, Machine Learning Engineers, and Software Engineers.