Author: Wei-Meng Lee
Publisher: O’Reilly Media, Inc.
Year: 2025
Pages: 381
Language: English
Format: epub
Size: 22.5 MB
DuckDB, an open source in-process database created for OLAP workloads, provides key advantages over more mainstream OLAP solutions: It's embeddable and optimized for analytics. It also integrates well with Python and is compatible with SQL, giving you the performance and flexibility of SQL right within your Python environment. This handy guide shows you how to get started with this versatile and powerful tool.
Author Wei-Meng Lee takes developers and data professionals through DuckDB's primary features and functions, best practices, and practical examples of how you can use DuckDB for a variety of data analytics tasks. You'll also dive into specific topics, including how to import data into DuckDB, work with tables, perform exploratory data analysis, visualize data, perform spatial analysis, and use DuckDB with JSON files, Polars, and JupySQL.
Understand the purpose of DuckDB and its main functions
Conduct data analytics tasks using DuckDB
Integrate DuckDB with pandas, Polars, and JupySQL
Use DuckDB to query your data
Perform spatial analytics using DuckDB's spatial extension
Work with a diverse range of data including Parquet, CSV, and JSON
Chapter 1, “Getting Started with DuckDB”, begins with an introduction to DuckDB, exploring its unique features and advantages over other database solutions. We will dive into why DuckDB is a preferred choice for high-performance analytical queries and will showcase its ability to integrate seamlessly with multiple programming languages and environments. This chapter sets the stage for a deeper exploration of the functionalities that make DuckDB a compelling option for data analysis.
Chapter 2, “Importing Data into DuckDB”, delves into the practical aspects of importing data into DuckDB. You will learn how to create databases, load data from various sources such as CSV, Parquet, and Excel files, and utilize different methods for loading data, including SQL queries and registration methods. This foundational knowledge is crucial for efficiently working with data in DuckDB.
In Chapter 3, “A Primer on SQL”, we’ll provide a primer on SQL tailored specifically for DuckDB users. Understanding SQL is essential for any data analyst or engineer, and this chapter will cover everything from basic commands to complex joins and aggregations. The hands-on examples will help you become proficient in querying and manipulating data and will make you more comfortable with DuckDB’s SQL syntax.
...
Chapter 9, “Using DuckDB in the Cloud with MotherDuck”, concludes the book with a deep dive into using DuckDB in the cloud through MotherDuck. You will learn how to sign up for MotherDuck, create and manage databases, and perform hybrid queries that combine local and cloud datasets. This chapter highlights the future of data analytics in a cloud-centric world, providing you with the tools to adapt to emerging trends in data management.
This book is designed for a diverse audience, including data analysts, data scientists, software developers, and decision-makers who are looking for efficient solutions to their data challenges. Whether you are new to DuckDB or have some experience with it, you will find valuable insights, practical examples, and best practices that will enhance your understanding and application of this powerful database system.
Download DuckDB: Up and Running: Fast Data Analytics and Reporting