Authors: Hadley Wickham, Garrett Grolemund, Mine Cetinkaya-Rundel
Publisher: O’Reilly Media, Inc.
Year: 2023-03-27
Pages: 743
Language: English
Format: pdf (true), epub
Size: 22.9 MB
Learn how to use R to turn data into insight, knowledge, and understanding. Ideal for current and aspiring data scientists, this book introduces you to doing data science with R and RStudio, as well as the tidyverse—a collection of R packages designed to work together to make data science fast, fluent, and fun. Even if you have no programming experience, this updated edition will have you doing data science quickly.
You'll learn how to import, transform, and visualize your data and communicate the results. And you'll get a complete, big-picture understanding of the data science cycle and the basic tools you need to manage the details. Each section in this edition includes exercises to help you practice what you've learned along the way.
Updated for the latest tidyverse best practices, new chapters dive deeper into visualization and data wrangling, show you how to get data from spreadsheets, databases, and websites, and help you make the most of new programming tools.
Our goal in this part of the book is to give you a rapid overview of the main tools of data science: importing, tidying, transforming, and visualizing data. We want to show you the “whole game” of data science, giving you just enough of all the major pieces so that you can tackle real, if simple, data sets. The later parts of the book will cover each of these topics in more depth, increasing the range of data science challenges that you can tackle.
R has several systems for making graphs, but ggplot2 is one of the most elegant and most versatile. ggplot2 implements the grammar of graphics, a coherent system for describing and building graphs. With ggplot2, you can do more, and do it faster, by learning one system and applying it in many places. Chapter 2 will teach you how to visualize your data using ggplot2. We will start by creating a simple scatterplot and use that to introduce aesthetic mappings and geometric objects, the fundamental building blocks of ggplot2. We will then walk you through visualizing distributions of single variables as well as relationships between two or more variables. We’ll finish with saving your plots and troubleshooting tips.
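As a minimal sketch of the ideas in this paragraph (a scatterplot built from aesthetic mappings and a geometric object, a single-variable distribution, and saving a plot), the following may help. The `mpg` data set, which ships with ggplot2, and the output file name are assumptions chosen for illustration; they are not taken from the book's text.

```r
library(ggplot2)

# Assumption: mpg (bundled with ggplot2) stands in for "your data".
# aes() declares the aesthetic mappings; geom_point() is the geometric object.
scatter <- ggplot(mpg, aes(x = displ, y = hwy, color = class)) +
  geom_point() +
  labs(x = "Engine displacement (L)", y = "Highway miles per gallon")

# Distribution of a single variable, drawn with a different geom.
hwy_hist <- ggplot(mpg, aes(x = hwy)) +
  geom_histogram(binwidth = 2)

# Save the scatterplot; the file name and location are hypothetical.
out_file <- file.path(tempdir(), "scatterplot.png")
ggsave(out_file, plot = scatter, width = 6, height = 4)
```

Swapping `geom_point()` for another geom changes how the same mappings are drawn, which is the sense in which one system applies in many places.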
Four chapters focus on the tools of data science:
• Visualization is a great place to start with R programming, because the payoff is so clear: you get to make elegant and informative plots that help you understand data. In Chapter 1, you’ll dive into visualization, learning the basic structure of a ggplot2 plot and powerful techniques for turning data into plots.
• Visualization alone is typically not enough, so in Chapter 3, you’ll learn the key verbs that allow you to select important variables, filter out key observations, create new variables, and compute summaries.
• In Chapter 5, you’ll learn about tidy data, a consistent way of storing your data that makes transformation, visualization, and modeling easier. You’ll learn the underlying principles and how to get your data into a tidy form.
• Before you can transform and visualize your data, you first need to get your data into R. In Chapter 7, you’ll learn the basics of getting .csv files into R.
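The import, tidy, and transform steps listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the book's own example: the quiz-score data, the temporary .csv file, and every variable name are hypothetical, and the functions come from the readr, tidyr, and dplyr packages of the tidyverse.

```r
library(readr)
library(tidyr)
library(dplyr)

# Hypothetical data: write a tiny .csv to a temporary file so the
# example is self-contained.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "student,quiz1,quiz2",
  "Ana,8,9",
  "Ben,6,7",
  "Cy,10,4"
), csv_path)

# Import: read the .csv file into a tibble.
scores_wide <- read_csv(csv_path, show_col_types = FALSE)

# Tidy: reshape so each row is one student-quiz observation.
scores <- scores_wide |>
  pivot_longer(cols = c(quiz1, quiz2), names_to = "quiz", values_to = "score")

# Transform: filter observations, create a new variable, compute summaries.
summary_tbl <- scores |>
  filter(score >= 5) |>
  mutate(percent = score * 10) |>
  group_by(student) |>
  summarize(mean_percent = mean(percent), n_quizzes = n())
```

Reshaping to one observation per row first is what lets the later verbs work on a single `score` column instead of one column per quiz.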
Nestled among these chapters are five other chapters that focus on your R workflow. In Chapter 2, Chapter 4, Chapter 6, and Chapter 8, you’ll learn good workflow practices for writing and organizing your R code. These will set you up for success in the long run, as they’ll give you the tools to stay organized when you tackle real projects. Finally, Chapter 9 will teach you how to get help and keep learning.
You'll learn how to:
• Visualize: create plots for data exploration and communication of results
• Transform: discover types of variables and the tools you can use to work with them
• Import: get data into R in a form convenient for analysis
• Program: learn R tools for solving data problems with greater clarity and ease
Download: R for Data Science, 2nd Edition (Third Early Release)