Authors: Lloyd Wai Yee Low, Martti Tapani Tammi
Publisher: World Scientific Publishing
Year: 2023
Pages: 268
Language: English
Format: PDF (true)
Size: 25.3 MB
Next-Generation Sequencing (NGS) is increasingly common and has applications in various fields such as clinical diagnosis, animal and plant breeding, and conservation of species. This tool has become cost-effective, but it generates a deluge of sequence data that requires efficient analysis. The highly sought-after skills in computational and statistical analysis, including Machine Learning, are essential for successful research within a wide range of specializations, such as identifying causes of cancer, vaccine design, new antibiotics, drug development, personalized medicine, and increased crop yields in agriculture. This book provides step-by-step guides to complex topics that make it easy for readers to perform specific analyses, from processing raw sequence data to answering important biological questions using Machine Learning methods. It serves as hands-on material for lecturers who conduct courses in bioinformatics and as reference material for professionals. The chapters are standalone recipes, making them suitable for readers who wish to self-learn selected topics. Readers gain the essential skills necessary to work on sequence data from NGS platforms, making themselves more attractive to employers who need skilled bioinformaticians.
Many developers of NGS tools prefer to use Linux as the operating system for their work. To use these tools (e.g. BWA, Bowtie, and SAMTOOLS), users need a good level of proficiency in Linux. However, to our knowledge, most biologists who need to work with NGS are unfamiliar with the operating system and require at least a gentle introduction to it in order to better understand commonly used Linux commands. Otherwise, they have to juggle two difficulties while learning NGS tools: (i) the general Linux features and (ii) the new tools that they need to master. The aim of this chapter is to remove the first difficulty, familiarizing oneself with the Linux system, so that users can concentrate on understanding NGS tools. It is not possible to cover all aspects of Linux, but the intention here is for users to be able to navigate the rest of the chapters with ease.
Until now, all sequence reads have been kept in just one file, but sometimes these sequences need to be separated into individual files. We now make use of shell scripting to split each sequence into its own file. In a typical UNIX-like system (including Linux), the shell is the bridge between the user and the computer. The shell is a command interpreter that passes user instructions to the kernel for execution. There are many types of shell in Linux, such as the Bourne Shell (SH), C Shell (CSH), Korn Shell (KSH), TC Shell (TCSH) and Bourne Again Shell (BASH). The last of these, BASH, is the most popular shell because it incorporates useful features from KSH and CSH. A shell is not only an excellent command line interpreter, but also has scripting features that allow automation of tasks that would otherwise require many manual steps.
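The splitting task described above can be sketched in BASH as follows. This is a minimal illustration rather than the book's own script: it assumes the reads are stored in FASTA format (header lines starting with `>`), and the function name `split_fasta` and all file names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: split a multi-sequence FASTA file into one file per sequence.
# Assumption: input is FASTA, i.e. each record starts with a '>' header line.
# The function name and all file names are hypothetical.

split_fasta() {
    local input="$1"    # multi-sequence FASTA file
    local outdir="$2"   # directory that receives one file per sequence
    local line name out=""

    mkdir -p "$outdir"

    # '|| [ -n "$line" ]' also handles a last line without a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        if [[ "$line" == ">"* ]]; then
            # Header line: use the first word after '>' as the sequence ID.
            name="${line#>}"
            name="${name%%[[:space:]]*}"
            # Replace characters that are unsafe in file names.
            name="${name//[^A-Za-z0-9._-]/_}"
            out="$outdir/$name.fasta"
            printf '%s\n' "$line" > "$out"
        elif [[ -n "$out" ]]; then
            # Sequence line: append it to the file of the current record.
            printf '%s\n' "$line" >> "$out"
        fi
    done < "$input"
}
```

Calling `split_fasta reads.fasta split_reads` would write one `.fasta` file per record into the `split_reads` directory. Records that share the same ID would overwrite one another in this sketch, so it suits inputs with unique sequence IDs.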
Readership: The book serves as hands-on material for teachers and lecturers who conduct courses in bioinformatics and as reference material for professionals. The chapters are written as standalone recipes, making them suitable for students who wish to self-learn selected topics such as how to apply Machine Learning to study genomic features. It is a useful companion for undergraduates, graduate students, researchers and anyone interested in the rapidly growing field of bioinformatics.
Download Practical Bioinformatics for Beginners: From Raw Sequence Analysis to Machine Learning Applications