Authors: Gabor J. Szekely, Maria L. Rizzo
Publisher: CRC Press
Series: Monographs on Statistics and Applied Probability
Year: 2023
Pages: 467
Language: English
Format: pdf (true)
Size: 10.1 MB
Energy distance is a statistical distance between the distributions of random vectors that characterizes equality of distributions: it is zero if and only if the distributions are identical. The name "energy" derives from Newton's gravitational potential energy, and there is an elegant relation to the notion of potential energy between statistical observations. Energy statistics (E-statistics) are functions of distances between statistical observations in metric spaces. The authors hope this book will spark the interest of statisticians who have not yet explored E-statistics and would like to apply these new methods using R. The Energy of Data and Distance Correlation is intended for teachers and students looking for dedicated material on energy statistics, but it can also serve as a supplement to a wide range of courses and areas, such as Monte Carlo methods, U-statistics or V-statistics, measures of multivariate dependence, goodness-of-fit tests, nonparametric methods, and distance-based methods.
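As an illustration of the definition above, here is a minimal sketch of a sample energy distance in plain Python. It assumes the commonly cited form 2E|X - Y| - E|X - X'| - E|Y - Y'| with Euclidean distance, estimated by averaging over all pairs. The function names (`euclid`, `mean_pairwise`, `energy_distance`) are hypothetical and are not taken from the book or from the energy package for R.

```python
import math


def euclid(a, b):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def mean_pairwise(xs, ys, dist):
    """Average of dist(x, y) over every pair (x, y) with x in xs and y in ys."""
    return sum(dist(x, y) for x in xs for y in ys) / (len(xs) * len(ys))


def energy_distance(xs, ys, dist=euclid):
    """Plug-in (V-statistic style) estimate: 2*E|X-Y| - E|X-X'| - E|Y-Y'|.

    Assumption: within-sample averages include the zero self-distances.
    """
    return (2 * mean_pairwise(xs, ys, dist)
            - mean_pairwise(xs, xs, dist)
            - mean_pairwise(ys, ys, dist))


same = energy_distance([(0.0,), (1.0,)], [(0.0,), (1.0,)])
apart = energy_distance([(0.0, 0.0)], [(3.0, 4.0)])
```

Identical samples give a value of zero, and the value grows as the two samples move apart. For real analyses, the energy package for R mentioned in the description is the natural choice.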
With numbers we could do accounting, “statistics,” “statistical inference,” “data science,” and “exact science.” Data gradually became more general than numbers; today data might mean vectors, functions, graphs, networks, and many other abstract notions. Data science is a continuation of data analysis fields such as statistics, machine learning, data mining, and predictive analytics. It was John Tukey who started to transform academic statistics in the direction of data science in his classic work “The Future of Data Analysis”. See also “50 Years of Data Science”. The energy of data is a concept introduced in the 1980s to help us work with real numbers even when the data are complex objects such as matrices, graphs, or functions. Instead of working with these objects themselves, we can work with their real-valued distances. All we need is that distances between the data are defined; mathematically, this means that the data are in a metric space. In this way we can go back to the Paradise of real numbers.
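To make the idea of working only with distances concrete, the following sketch computes a two-sample energy statistic from nothing but a matrix of pairwise distances. The data are short strings and the metric is Hamming distance, both chosen purely for illustration. All names (`hamming`, `distance_matrix`, `energy_from_distances`) are hypothetical. Note that not every metric yields a statistic that characterizes equality of distributions, and this sketch does not check that condition.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(ca != cb for ca, cb in zip(a, b))


def distance_matrix(items, dist):
    """Full matrix of pairwise distances; the objects are not needed afterwards."""
    return [[dist(a, b) for b in items] for a in items]


def energy_from_distances(D, n):
    """Two-sample energy statistic from a pooled distance matrix D.

    Rows/columns 0..n-1 belong to the first sample, the rest to the second.
    Uses 2*mean(between) - mean(within first) - mean(within second).
    """
    first, second = range(n), range(n, len(D))

    def block_mean(rows, cols):
        return sum(D[i][j] for i in rows for j in cols) / (len(rows) * len(cols))

    return (2 * block_mean(first, second)
            - block_mean(first, first)
            - block_mean(second, second))


group_a = ["AAAA", "AAAT"]
group_b = ["TTTT", "TTTA"]
D = distance_matrix(group_a + group_b, hamming)
stat = energy_from_distances(D, len(group_a))
```

Once `D` is built, the statistic uses only real numbers, so the same function would apply to graphs, functions, or any other objects for which a suitable distance is defined.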
- E-statistics provide powerful methods for problems in multivariate inference and analysis.
- Methods are implemented in R, and readers can immediately apply them using the freely available energy package for R.
- The book provides an overview of the state of the art in the development of energy statistics and of its applications.
- The background and literature review are valuable for anyone considering further research or application in energy statistics.