Author: Kyle Walker
Publisher: CRC Press
Series: The R Series
Year: 2023
Pages: 378
Language: English
Format: PDF (true)
Size: 147.7 MB
Census data are widely used by practitioners to understand demographic change, allocate resources, address inequalities, and make sound business decisions. Until recently, projects using US Census data have required proficiency with multiple web interfaces and software platforms to prepare, map, and present data products. This book introduces readers to tools in the R programming language for accessing and analyzing Census data, helping analysts manage these types of projects in a single computing environment.
R is one of the most popular programming languages and software environments for statistical computing, and it is the software focus of this book. This section introduces some basics of working with R and covers terminology that will help readers work through the rest of the book. If you are an experienced R user, you can safely skip this section; readers new to R, however, will find this information helpful before getting started with the applied examples. Once R is installed, I strongly recommend that you install RStudio, the premier integrated development environment (IDE) for R. While you can run R without RStudio, RStudio offers a wide variety of utilities that make analysts’ work with R easier and more streamlined.
R is free and open source software (FOSS), which means that R is free to download and install, and its source code is open for anyone to view. This brings the substantial benefit of encouraging innovation from the user community, as anyone can create new packages and either submit them for publication to the official CRAN repository or host them on their personal GitHub page. In turn, new methodological innovations are often quickly accessible to the R user community. However, this can make R feel fragmented, especially for users coming from commercial software designed to have a consistent interface. Package syntax will sometimes represent idiosyncratic choices of the developer, which can make R confusing to beginners.
The tidyverse ecosystem, developed by RStudio, is one of the most popular frameworks for data analysis in R and attempts to respond to the problems introduced by package fragmentation. The tidyverse consists of a series of R packages designed to address common data analysis tasks (data wrangling, data reshaping, and data visualization, among many others) using a consistent syntax. Many R packages are now developed with tidyverse integration in mind. A good example is the sf package, which integrates spatial data analysis with the tidyverse. This book is largely written with the tidyverse and sf ecosystems in mind; the tidyverse is covered in greater depth in Chapter 3, and sf is introduced in Chapter 5.
Chapters in this book cover the following key topics:
- Rapidly acquiring data from the decennial US Census and American Community Survey using R, then analyzing these datasets using tidyverse tools;
- Visualizing US Census data with a wide range of methods including charts in ggplot2 as well as both static and interactive maps;
- Using R as a geographic information system (GIS) to manage, analyze, and model spatial demographic data from the US Census;
- Working with and modeling individual-level microdata from the American Community Survey’s PUMS datasets;
- Applying these tools and workflows to the analysis of historical Census data, other US government datasets, and international Census data from countries like Canada, Brazil, Kenya, and Mexico.
Download: Analyzing US Census Data: Methods, Maps, and Models in R