Author: Maurizio Schiavi
Publisher: Amazon.com Services LLC
Year: 2020
Language: English
Format: pdf, azw3, epub
Size: 10.1 MB
All you need to know about EDA in R, for beginner to intermediate users.
Machine Learning (ML) is a branch of Artificial Intelligence (AI), perhaps the most important of all. It consists of a series of techniques that allow machines, that is, our computers, to learn from data, correct their forecast errors and steadily improve their performance. And they do it simply: by applying a sequence of operations iteratively until they reach a good result.
Take, for example, one of my favorite Machine Learning algorithms: random forests. Without going too far into the topic, a random forest is a set of decision trees; each tree takes the data and chooses, at each step, the variable that best divides the dataset into two groups that differ as much as possible from one another.
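As a rough illustration of that splitting step, here is a minimal sketch in Python (not the book's R code, and not taken from the book). It assumes Gini impurity as the measure of how well a split separates the groups, which is one common choice among several; the function names `gini` and `best_split` and the toy data are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a perfectly pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Try every variable and threshold; return (variable_index, threshold, impurity)
    for the binary split with the lowest weighted Gini impurity."""
    best = None
    n = len(rows)
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            if not left or not right:
                continue  # a split must produce two non-empty groups
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (j, t, score)
    return best


# Toy dataset (assumed for illustration): two numeric variables, two classes.
rows = [(1.0, 5.0), (2.0, 1.0), (3.0, 4.0), (4.0, 2.0)]
labels = ["a", "a", "b", "b"]

split = best_split(rows, labels)
print(split)
```

On this toy data the first variable separates the two classes cleanly, so it is the one selected. A single decision tree repeats this search on each resulting group; a random forest builds many such trees on resampled data and random subsets of variables, which this sketch does not attempt to show.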
Each algorithm has its own peculiarities, advantages and limitations. Because of this, we need a solid theoretical grounding in the whole data analysis process, starting from cleaning and exploring the data. And this is what we deal with in this ebook. We will see the preliminary steps of a data science project, how to deal with missing information and how to represent variables effectively.
The goal of this ebook is to address the first phase of a data science project, known as Exploratory Data Analysis (EDA). It starts with the preparation of the dataset, which can then be used for more advanced analyses, and continues with a first exploration of the data through statistics and graphs. The software we will use is R, which is free to download. I also recommend using RStudio, free software that makes the data science experience simpler and more intuitive.
Contents:
Introduction
Before starting…
Case study
Packages installation
Chapter 01. How to import data
Chapter 02. Dealing with missing values
Chapter 03. Data Statistics
Chapter 04. Data visualization
Chapter 05. Hypothesis testing
Conclusions
About The Author
Download R Machine Learning: Exploratory Data Analysis