Author: Haiping Huang
Publisher: Springer, Higher Education Press
Year: 2022
Pages: 302
Language: English
Format: PDF (true), EPUB
Size: 39.3 MB
This book provides a comprehensive introduction to the fundamental statistical mechanics underlying the inner workings of neural networks. It discusses in detail important concepts and techniques, including the cavity method, mean-field theory, replica techniques, the Nishimori condition, variational methods, dynamical mean-field theory, unsupervised learning, associative memory models, perceptron models, the chaos theory of recurrent neural networks, and the eigenspectra of neural networks, walking new learners through the theory and the essential skills needed to understand and use neural networks. The book focuses on quantitative frameworks of neural network models in which the underlying mechanisms can be precisely isolated by mathematically elegant physics and theoretical predictions. It is a useful reference for students, researchers, and practitioners in the area of neural networks.
Neural networks have become a powerful tool across many domains of scientific research and industrial application. However, the inner workings of this tool remain poorly understood, which prevents a deep understanding and the principled design of more powerful network architectures and optimization algorithms. Cracking this black box requires integrating several disciplines, including physics, statistics, information theory, and non-convex optimization, which may also bridge the gap between artificial neural networks and the brain. Yet in this highly interdisciplinary field there are few monographs that provide a systematic introduction to the theoretical-physics foundations for understanding neural networks, especially ones covering recent cutting-edge topics.
In this book, we provide a physics perspective on the theory of neural networks, and even on neural computation in models of the brain. The book covers the basics of statistical mechanics, statistical inference, and neural networks, and in particular classic and recent mean-field analyses of neural networks of various types. These mathematically beautiful examples of the statistical mechanics analysis of neural networks are expected to inspire further techniques for building an analytic theory of more complex networks. Important future directions in scientific machine learning and in theoretical models of brain computation are also reviewed.
We remark that this book is not a complete review of either artificial neural networks or the mean-field theory of neural networks; rather, it presents a deliberately biased statistical-physics viewpoint on the black box of deep learning, aimed especially at beginner-level students and researchers who are interested in the mean-field theory of learning in neural networks.
This book stemmed from a series of lectures on the interplay between statistical mechanics and neural networks. It is organized into two parts: the basics of statistical mechanics related to the theory of neural networks, and theoretical studies of neural networks, including cortical models.
The first part is divided into nine chapters. Chapter 1 gives a brief history of neural network studies. Chapter 2 introduces multi-spin interaction models and the cavity method for computing the partition function of disordered systems. Chapter 3 introduces variational mean-field methods, including the Bethe approximation and belief propagation algorithms. Chapter 4 introduces Monte Carlo simulation methods used to acquire low-energy configurations of a statistical mechanical system. Chapter 5 introduces high-temperature expansion techniques. Chapter 6 introduces the spin glass model in which the Nishimori line was discovered. Chapter 7 introduces the random energy model, the infinite-body interaction limit of multi-spin disordered systems. Chapter 8 introduces a statistical mechanical theory of the Hopfield model, which was designed for the associative memory of random patterns based on the Hebbian local learning rule. Chapter 9 introduces the concepts of replica symmetry and replica symmetry breaking in the spin glass theory of disordered systems.
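To give a concrete feel for the associative memory mechanism treated in Chapter 8, here is a minimal, illustrative sketch (not taken from the book; the network size, pattern load, and synchronous zero-temperature dynamics are assumptions made for the demo) of Hebbian storage and recall in a Hopfield network:

```python
import numpy as np

rng = np.random.default_rng(0)
N, P = 200, 10                     # neurons, stored random patterns (illustrative sizes)

# Random binary (+/-1) patterns and the Hebbian couplings J = (1/N) sum_mu xi^mu (xi^mu)^T
xi = rng.choice([-1, 1], size=(P, N))
J = (xi.T @ xi) / N
np.fill_diagonal(J, 0.0)           # no self-coupling

# Start from a corrupted version of pattern 0: flip 20% of the spins
s = xi[0].copy()
flip = rng.choice(N, size=N // 5, replace=False)
s[flip] *= -1

# Zero-temperature synchronous dynamics: s_i <- sign(sum_j J_ij s_j)
for _ in range(20):
    s = np.where(J @ s >= 0, 1, -1)

overlap = (s @ xi[0]) / N          # overlap m = 1 means perfect retrieval
print(f"overlap with stored pattern: {overlap:.3f}")
```

With a load P/N well below the classic capacity α_c ≈ 0.138 predicted by the mean-field theory, the dynamics flow back to the stored pattern and the printed overlap is close to 1.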
The second part is divided into nine chapters. Chapter 10 introduces Boltzmann machine learning (also called the inverse Ising problem in physics, or the maximum entropy method in statistics) and the statistical mechanics of learning in the restricted Boltzmann machine (RBM). In this chapter, a variational mean-field theory for learning a generic RBM with discrete synapses is also introduced in depth. Chapter 11 introduces the simplest model of unsupervised learning. Chapter 12 introduces the nature of unsupervised learning with an RBM (with only two hidden neurons), showing that the unsupervised learning process can be understood as a series of continuous phase transitions, including both weight-reversal symmetry breaking and hidden-neuron permutation symmetry breaking. Chapter 13 introduces the single-layer discrete perceptron and its mean-field theory. Chapter 14 introduces the mean-field model of the multi-layered perceptron and its analysis via the cavity method. In this chapter, a mean-field training algorithm for multi-layered perceptrons with discrete synapses is introduced, together with mean-field training from an ensemble perspective. Chapter 15 introduces the mean-field theory of dimension reduction in deep random neural networks. Chapter 16 introduces the chaos theory of random recurrent neural networks. In this chapter, the excitatory-inhibitory balance theory of cortical circuits is also introduced, together with backpropagation through time for training a generic RNN. Chapter 17 introduces how statistical mechanics techniques can be applied to compute the asymptotic spectral density of Hermitian and non-Hermitian random matrices. Finally, perspectives on a statistical mechanical theory of deep learning, and even on other interesting aspects of intelligence, are provided, in the hope of inspiring future developments across physics, machine learning, theoretical neuroscience, and other related disciplines.
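As a companion to the transition to chaos discussed in Chapter 16, the following minimal sketch (again illustrative rather than the book's own code; the network size, gain values, and Euler integration step are assumptions) simulates the classic random rate network dx/dt = -x + g J tanh(x) with Gaussian couplings J_ij ~ N(0, 1/N) and probes the sensitivity of trajectories to a tiny perturbation:

```python
import numpy as np

def divergence(g, N=500, T=50.0, dt=0.05, eps=1e-6, seed=1):
    """Integrate dx/dt = -x + g*J*tanh(x) with J_ij ~ N(0, 1/N) and return
    the final distance between two trajectories started eps apart
    (a crude probe of chaotic divergence)."""
    rng = np.random.default_rng(seed)
    J = rng.normal(0.0, 1.0 / np.sqrt(N), size=(N, N))
    x = rng.normal(0.0, 1.0, size=N)
    y = x + eps * rng.normal(0.0, 1.0, size=N)   # perturbed copy
    for _ in range(int(T / dt)):
        x = x + dt * (-x + g * J @ np.tanh(x))   # Euler step
        y = y + dt * (-y + g * J @ np.tanh(y))
    return np.linalg.norm(x - y)

for g in (0.5, 1.5):
    print(f"g = {g}: final separation = {divergence(g):.2e}")
```

Dynamical mean-field theory predicts a transition at g = 1: for g < 1 the trivial fixed point is stable and the perturbation decays, while for g > 1 nearby trajectories separate to an order-one distance per neuron, the hallmark of chaotic dynamics.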