Title: Neural Text-to-Speech Synthesis
Author: Xu Tan
Publisher: Springer
Series: Artificial Intelligence: Foundations, Theory, and Algorithms
Year: 2023
Pages: 214
Language: English
Format: PDF (true)
Size: 10.04 MB
Text-to-speech (TTS) synthesis is an Artificial Intelligence (AI) technique that renders speech, preferably natural-sounding, from arbitrary text. It is a key technological component in many important applications, including virtual assistants, AI-generated audiobooks, speech-to-speech translation, AI news reporters, audible driving guidance, and digital humans. TTS has made significant progress over the past decade. These developments are mainly attributed to Deep Learning techniques and are usually referred to as neural TTS. Many neural TTS systems have achieved human quality for the tasks they are designed for. This book first introduces the history of TTS technologies and gives an overview of neural TTS, then provides preliminary knowledge on language and speech processing, neural networks and Deep Learning, and deep generative models.