UPDATE: I now use this book in my teaching on the Data Science for Internet of Things practitioner’s course.
Strictly speaking, this post is not a book review, since Fundamentals of Deep Learning: Designing Next-Generation Artificial Intelligence Algorithms by Nikhil Buduma is being published as an O’Reilly early release (raw and unedited) book.
However, I have been a fan of Nikhil Buduma’s blog and writing, so I bought the early release and have enjoyed reading it. I also want to include it as a recommended book for the course I teach at Oxford University (Data Science for the Internet of Things).
There are very few accessible books on Deep Learning, and it is a complex and evolving topic, as I discussed in a recent blog – The evolution of Deep Learning Models. If you follow the detailed but readable posts on Nikhil’s blog, such as A Deep dive into Recurrent neural networks, you will enjoy the book.
The first three chapters in the table of contents below have been released:
Chapter 1 : The Neural Network
Chapter 2 : Training Feed Forward Neural Networks
Chapter 3 : Implementing Neural Networks in Theano
Chapter 4 : Beyond Gradient Descent
Chapter 5 : Convolutional Neural Networks
Chapter 6 : Hopfield Networks and Restricted Boltzmann Machines
Chapter 7 : Deep Belief Networks
Chapter 8 : Recurrent Neural Networks
Chapter 9 : Autoencoders
Chapter 10 : Supplementary: Universality Theorems
(The table of contents is still evolving.)
I spoke to Nikhil about the creation and evolution of the book. Here are some comments from our discussion:
How did the book idea come about?
I first started writing about deep learning on my blog around January. I’d been hacking on it for a while and figured I might share the lessons I had learned applying these models to problems I’m passionate about (healthcare and language processing) with my peers within the MIT community. My blog got some pretty good reception, and ended up piquing the interest of Ben Lorica and Mike Loukides from O’Reilly. We talked about the possibility of writing a book, and I figured it would be a great way to make the field more accessible to a larger audience.
Writing an accessible book on Deep Learning
There are definitely materials online for people interested in deep learning – a hodgepodge of papers, tutorials, and some books. Most of these materials are geared towards a highly academic audience, and it’s not particularly simple to navigate these resources. My goal was to synthesize the progress in the field so that anybody with some mathematical sophistication (basic calculus and familiarity with matrix manipulation) and Python programming under their belt would be able to tackle deep learning head on.
Explanation of Deep Learning models
As with classical machine learning, deep learning models can also be classified into three major areas – supervised, unsupervised, and reinforcement learning. My approach to the book is to develop an intuition for the major types of models. But in addition to being able to build their own, I’d like readers to come away with an understanding of why each model is designed the way it is. I think it’s this understanding that will enable readers to successfully leverage deep learning to tackle their own data challenges. I’m also interested in exploring some of the more exotic networks (augmented networks, long-term recurrent convolutional networks, spatial transformer networks, etc.) towards the end of my book to provide insights into the cutting edge of the field. Again, the focus here will not only be on how the models are structured, but also on why they’re structured the way they are.
Any final comments?
I’ve had the opportunity to work with luminaries in the machine learning space while writing this book, including Mike and Ben from O’Reilly, Jeff Dean of Google, and Jeff Hammerbacher of Mt. Sinai and Cloudera. I’m excited to see what readers think of the early release as it comes out, so I can tailor the content to what they’re looking for.
The book link is – Fundamentals of Deep Learning: Designing Next-Generation Artificial Intelligence Algorithms by Nikhil Buduma. I very much look forward to reading it as it develops and using it for my course.