This concludes the lab on multiclass classification.
This was your first Computer Vision project, where you used a machine learning model to classify a collection of images. Computer vision is conceptually really simple, all you’re doing is feeding every pixel of an image into a machine learning model as a number.
Your pipeline used a classification algorithm derived from logistic regression. But if you really want to get superhuman results, you’ll need a much better algorithm: the convolutional deep neural network (CNN).
A CNN slides a small pattern-matching engine called a convolution matrix across an image to detect simple patterns like edges, shapes, and textures. Deeper layers inside the neural network are able to pick up more complex patterns like fur, clouds, a sausage, and so on.
For the last 10 years or so, fully trained CNNs have been able to achieve an accuracy of 99.9% on standardized image recognition tasks, which is far above the average human performance level. This is a domain where AI has achieved superhuman performance.
This is why research into self-driving cars and mobile autonomous robots has increased significantly in the last decade. Computer vision is now a solved problem in AI, so we’re now very close to having robots that can autonomously navigate their environment.