Author: Bartosz Konieczny
Publisher: Leanpub
Year: 2022-12-19
Pages: 303
Language: English
Format: pdf (true), epub
Size: 28.3 MB
Doing data engineering without the cloud is hard to imagine nowadays. Does that mean each cloud provider is different? Well, yes, each of them has its own specificities, but that doesn't mean they're completely different. They all support several common data engineering patterns, which the book classifies into 7 categories: data processing, data storage, data security, data warehouse, data management, data orchestration, and data transfer. This organization and the explanation of each pattern will help you understand a new cloud provider more easily, whether you come from an AWS, Azure, or GCP environment.
I started my cloud data engineering journey with a data ingestion project on AWS. It turned out to be my first contact with data engineering overall, so I felt very excited about this completely new domain. After discovering all these new "data" things, I knew I wanted to settle my career in this field for a while. So I started to look for a new data challenge and found a project on GCP.
Although the scope was limited to the batch processing services, I made an effort to explore the others on my own. That’s when I had several aha moments. I realized that it was possible to solve the same problem with a similar approach on both clouds. From that moment, I started to consider cloud services in terms of similarities.
That’s also how the idea of writing this book was born. I chose the "data engineering patterns" keyword because the described solutions are present in at least 2 cloud providers. Hence, they’re something more general that you can use to solve a problem while moving from one cloud project to another, a little bit like a design pattern that you can apply in different programming languages to solve a given problem.
Who is this book for?
If you’re a data engineer already working with a cloud provider, the book will help extend your cloud knowledge. In your situation, you can learn by analogy between the concepts you know and the ones you want to discover. If you are a data engineer with an on-premise background and feel comfortable with general data concepts like partitioning or horizontal scaling, you will also find something interesting in the book. For you, the book should be a shortcut for mapping your on-premise knowledge to the flexible world of the cloud.
If you are an aspiring data engineer, the book’s content can help you structure your knowledge and extend it to the most up-to-date cloud technologies. If you are a data scientist or data analyst, the content of the book can help you build better data products. Although it targets common data engineering problems, you can leverage this knowledge to innovate and improve your out-of-the-box thinking in your analytic or ML projects. Finally, if you’re a backend software engineer already working on the cloud, you should find something interesting about the services usually shared between data and software engineers, such as object stores, streaming brokers, or NoSQL databases. Maybe you’ll discover alternative ways to write your programs?
Download Data Engineering patterns on the cloud: How to solve common data engineering problems with cloud services?