I am currently taking part in the Deep Learning Nanodegree at Udacity, where I am learning everything about deep neural networks and their real-world applications.
I just passed the second project (out of five): building a dog breed classifier that uses image recognition with a Convolutional Neural Network (CNN).
Before starting to learn about CNNs, I went through some of the optional content available after the first project.
Andrew W. Trask, author of Grokking Deep Learning, taught a lesson on Sentiment Analysis with a handcrafted neural network, explaining the roadblocks you might hit along the way and some optimisations to keep in mind. This part really clarified for me how back-propagation works and what it costs. We used reviews from IMDB and classified them as positive or negative.
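To give an idea of what "handcrafted" means here, this is a minimal sketch of that kind of network in plain NumPy: a bag-of-words input, one hidden layer, and a manual backward pass. The vocabulary, data, layer sizes and learning rate are all illustrative, not the actual values from the lesson.

```python
import numpy as np

# Toy vocabulary and two "reviews" as bag-of-words vectors (illustrative data,
# not the real IMDB preprocessing from the lesson).
vocab = ["great", "fun", "boring", "awful"]
X = np.array([[1, 1, 0, 0],    # "great fun"    -> positive
              [0, 0, 1, 1]],   # "boring awful" -> negative
             dtype=float)
y = np.array([[1.0], [0.0]])

rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.1, size=(4, 3))  # input  -> hidden
W2 = rng.normal(scale=0.1, size=(3, 1))  # hidden -> output

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 1.0
for _ in range(500):
    # Forward pass: linear hidden layer, sigmoid output
    hidden = X @ W1
    output = sigmoid(hidden @ W2)
    # Backward pass: propagate the error back through each layer
    output_delta = output - y            # gradient at the output layer
    hidden_delta = output_delta @ W2.T   # error pushed back to the hidden layer
    W2 -= lr * hidden.T @ output_delta
    W1 -= lr * X.T @ hidden_delta

print(output.round(2).ravel())  # close to [1. 0.]
```

Writing the backward pass by hand like this is exactly what made the cost of back-propagation click for me: every layer you add is another matrix multiplication on the way back.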
Another interesting section was the introduction to Keras. We tackled the same sentiment-analysis problem on the IMDB dataset. Keras simplifies the task of creating a neural network, but it was still good to have learned how to build one by hand in vanilla Python (with NumPy).
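For comparison, the same kind of classifier takes only a few lines in Keras. This is a sketch of the approach rather than the exact notebook code — the input size, layer widths and dropout rate are my guesses:

```python
from tensorflow import keras

# A small fully-connected sentiment classifier: one-hot encoded reviews in,
# a positive/negative probability out. Hyper-parameters are illustrative.
model = keras.Sequential([
    keras.Input(shape=(1000,)),                   # 1000-word vocabulary, one-hot encoded
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(1, activation="sigmoid"),  # probability of a positive review
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
# model.fit(x_train, y_train, epochs=10, batch_size=32)  # with the encoded reviews
```

Everything the NumPy version did by hand — forward pass, gradients, weight updates — is hidden behind `compile` and `fit`.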
So far, I have realised that building a neural network is not difficult, but choosing the number of layers, their sizes, tuning the hyper-parameters, and so on is still complicated for me.
The second project belongs to part three of the course, where we learn about Convolutional Neural Networks. This type of neural network is widely used in image recognition and has an advantage over the networks we learned previously (MLPs, or Multi-Layer Perceptrons): it understands data in two-dimensional space rather than one dimension.
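The core idea can be shown without any framework. Here is a hand-rolled 2D convolution (single channel, "valid" padding) — the filter slides over the image grid, so neighbouring pixels are processed together instead of being flattened into one long vector as an MLP would require. The image and filter are made up for illustration:

```python
import numpy as np

def conv2d(image, kernel):
    """Naive 2D convolution (valid padding): slide the kernel over the image."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A toy image: dark on the left, bright on the right
image = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1]], dtype=float)

edge_filter = np.array([[1.0, -1.0]])  # responds to horizontal intensity changes

print(conv2d(image, edge_filter))  # nonzero only where the image changes left-to-right
```

In a CNN the values of these filters are not hand-picked like `edge_filter` here — they are learned during training.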
For this part of the course, we got access to an EC2 instance on AWS with everything ready to work, including GPU hardware acceleration. The course provided us with $100 of credit to use on AWS.
However, since I have a PC with a GTX 1060, I decided to do the setup at home, which turned out to be even faster.
We have been learning TensorFlow from the basics: how variables and constants are declared, how to do arithmetic operations and, of course, how to build and train a model. Even if Keras will do that for us, there will be moments when we need to dig deeper and work with TensorFlow directly.
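Those basics look roughly like this (written in TensorFlow 2 eager style; the course material at the time used sessions, but the operations are the same):

```python
import tensorflow as tf

a = tf.constant(2.0)   # constants hold fixed values
b = tf.constant(3.0)
w = tf.Variable(5.0)   # variables hold state that training can update

total = a + b                    # arithmetic on tensors
product = tf.multiply(total, w)  # the operator form total * w is equivalent

print(total.numpy())    # 5.0
print(product.numpy())  # 25.0
```

The distinction matters later: variables are what the optimizer adjusts, while constants and inputs stay fixed.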
One thing that blew my mind was learning that neural networks can be reused: you can load the weights and biases from a trained neural network, add more layers, and train only those new layers. For example, if you have to build a face recognition neural network, you can load a pre-trained model that recognises human faces and add another layer that can be trained to recognise specific people.
This is the principle of the dog breed classifier project. We will use a pre-trained network to detect different subjects and then use the output to perform the actual breed classification.
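In Keras, that transfer-learning setup is only a few lines. This is a sketch, not my actual project notebook: the head layers are illustrative, and I pass `weights=None` here just to skip the download — in practice you would use `weights="imagenet"` to get the pre-trained weights.

```python
from tensorflow import keras

# Pre-trained feature extractor without its ImageNet classification head
base = keras.applications.InceptionV3(weights=None,  # use "imagenet" in practice
                                      include_top=False,
                                      input_shape=(299, 299, 3))
base.trainable = False  # freeze the pre-trained layers; only the new head trains

model = keras.Sequential([
    base,
    keras.layers.GlobalAveragePooling2D(),
    keras.layers.Dense(133, activation="softmax"),  # the project dataset has 133 breeds
])
model.compile(optimizer="rmsprop",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```

Freezing the base is what makes this cheap: instead of training millions of parameters, you only train the small classification head on top of features InceptionV3 already knows how to extract.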
Project: Dog Breed Classifier
This project was much longer than the first one and looked really scary at the beginning, but it was fun to complete and I didn’t face any confusing roadblocks.
In the project, we both use existing neural networks and create our own from scratch to perform the task.
The neural network that I created by myself only reached 7% accuracy. Considering that pure random classification would score less than 1%, it’s better than nothing, but it is still really bad.
However, using an existing model like InceptionV3, I was able to obtain an accuracy of 81% when classifying dog breeds.
Finally, the project asks us to implement a method that first guesses whether the picture contains a human or a dog, and then which breed they look like.
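The logic of that final method is a simple branch. In the sketch below, `face_detector`, `dog_detector` and `predict_breed` stand in for the detectors and the breed classifier built earlier in the project; the stub bodies are placeholders just to make the example runnable.

```python
def face_detector(img_path):
    """Stub for the human face detector (e.g. an OpenCV face check)."""
    return "person" in img_path

def dog_detector(img_path):
    """Stub for the dog detector (e.g. an ImageNet-based check)."""
    return "dog" in img_path

def predict_breed(img_path):
    """Stub for the transfer-learned breed classifier."""
    return "Bearded Collie"

def classify(img_path):
    # Decide what is in the picture first, then ask for the closest breed.
    if dog_detector(img_path):
        return f"This dog looks like a {predict_breed(img_path)}."
    if face_detector(img_path):
        return f"This human looks like a {predict_breed(img_path)}!"
    return "Neither a dog nor a human was detected."

print(classify("photos/person.jpg"))  # This human looks like a Bearded Collie!
```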
I look like a Bearded Collie!
My tips to pass this project are:
- Keep examples at hand, especially the one used in the CIFAR10 exercise.
- If a network is giving you problems, try a different one. I went first with Resnet50 for transfer learning but I had an issue I could not resolve, so I switched to InceptionV3.
- Add good explanations to your “answers”; mine were not complete enough, so I had to resubmit the project. Not a big issue, but I was a bit disappointed that I did not pass the project the first time!
The next part of the course will be about recurrent neural networks. Looking forward to it!