Author: Daniel Zelterman
Publisher: Springer
Year: 2022
Pages: 469
Language: English
Format: PDF (true)
Size: 10.2 MB
This book brings the power of multivariate statistics to graduate-level practitioners, making these analytical methods accessible without lengthy mathematical derivations. Using the free, open-source program R, Professor Zelterman demonstrates the process and outcomes for a wide array of multivariate statistical applications. Chapters cover graphical displays, linear algebra, univariate, bivariate, and multivariate normal distributions, factor methods, linear regression, discrimination and classification, clustering, time series models, and additional methods. Zelterman uses practical examples from diverse disciplines to welcome readers from a variety of academic specialties. Those with backgrounds in statistics will learn new methods while they review more familiar topics. Chapters include exercises, real data sets, and R implementations. The data sets cover interesting, real-world topics, particularly from health and biology-related contexts. As an example of the approach, the text examines a sample from the Behavioral Risk Factor Surveillance System, discussing both the shortcomings of the data and the useful analyses it supports. The text avoids theoretical derivations beyond those needed to fully appreciate the methods. Prior experience with R is not necessary.
Multivariate statistics is a mature field with many different methods. Many of these are mathematical. Fortunately, these methods have been programmed, so you should be able to run them on your computer without much difficulty. This book is targeted at a graduate-level practitioner who may need to use these methods but does not necessarily know the mathematical derivations. Readers should have taken at least one course in statistics previously and have some familiarity with such topics as the t-test, degrees of freedom (df), p-values, statistical significance, and the chi-squared test of independence in a 2×2 table. They should also know the basic rules of probability, such as independence and conditional probability. The reader should have some basic computing skills, including data editing. It is not necessary to have experience with R or with programming languages, although these are good skills to develop.
The field of statistics has developed many useful methods for analyzing data, and many of these methods are already programmed for you and readily available in R. What’s more, R is free, widely available, open source, flexible, and the current fashion in statistical computing. Authors of new statistical methods regularly contribute to the many libraries in R, so many new results are included as well.
Finally, a word about the choice of R for the present book. There are a number of high-quality software packages available to the data analyst today. As with any type of tool, some are better suited for the task at hand than others. Understanding their strengths and limitations will help determine which is appropriate for your needs. It is better to decide this early, rather than invest a lot of time in a major project, only to be disappointed later. Let us begin with a side-by-side comparison of SAS and R, two popular languages regularly in use by the statistical community today. The most glaring differences between these packages are the capability to handle huge databases and the capability to provide a certification for the validity of the results. SAS is the standard package for many applications, such as in pharmaceuticals and finance, because it can handle massive datasets and provide third-party certification. In contrast, R is more suited for quick and nimble analyses of smaller datasets.
The learning curve for R is not terribly steep. Most users are up and running quickly, performing many useful actions. R provides a nice graphical interface encouraging visual displays of information as well as mathematical calculation. Once you get comfortable with R, you will probably want to learn more. It is highly recommended that users of R work in RStudio, an interface providing both assistance for novices and productivity tools for experienced users. RStudio opens four windows: one for editing code, one for the console to execute R code, one to keep track of the variables defined in the workspace, and a fourth to display graphical images.
Download Applied Multivariate Statistics with R, 2nd Edition