Authors: Tiffany Timbers, Trevor Campbell, Melissa Lee
Publisher: CRC Press
Series: Data Science Series
Year: 2022
Pages: 443
Language: English
Format: PDF (true)
Size: 30.1 MB
Data Science: A First Introduction focuses on using the R programming language in Jupyter notebooks to perform data manipulation and cleaning, create effective visualizations, and extract insights from data using classification, regression, clustering, and inference.
The text emphasizes workflows that are clear, reproducible, and shareable, and includes coverage of the basics of version control. All source code is available online, demonstrating the use of good reproducible project workflows.
Based on educational research and active learning principles, the book uses a modern approach to R and includes accompanying autograded Jupyter worksheets for interactive, self-directed learning. The book will leave readers well prepared for data science projects. The use of Jupyter notebooks for exercises immediately places the student in an environment that encourages auditability and reproducibility of analyses. The integration of Git and GitHub into the course is a key tool for teaching collaboration and community, concepts that are critical to data science.
You will spend the first four chapters learning how to use R to load, clean, wrangle (i.e., restructure the data into a usable format), and visualize data while answering descriptive and exploratory data analysis questions. In the next six chapters, you will learn how to answer predictive, exploratory, and inferential data analysis questions with common methods in data science, including classification, regression, clustering, and estimation. In the final chapters (11–13), you will learn how to combine R code, formatted text, and images in a single coherent document with Jupyter, use version control for collaboration, and install and configure the software needed for data science on your own computer. If you are reading this book as part of a course, the instructor may have already set up all of these tools for you; in this case, you can read the chapters in order. But if you are reading this independently, you may want to jump to these last three chapters early, to make sure your computer is set up so that you can try out the example code that we include throughout the book.
The book is designed for learners from all disciplines with minimal prior knowledge of mathematics and programming. The authors have honed the material through years of experience teaching thousands of undergraduates in the University of British Columbia’s DSCI100: Introduction to Data Science course.