> AlexNet_
Error rate 15.3% — the deep learning revolution began.
> DEEP DIVE_
On September 30, 2012, a deep convolutional neural network called AlexNet was entered into the ImageNet Large Scale Visual Recognition Challenge, and the field of artificial intelligence was never the same. Built by Alex Krizhevsky, Ilya Sutskever, and their supervisor Geoffrey Hinton at the University of Toronto, AlexNet achieved a top-5 error rate of 15.3%, obliterating the second-place entry's 26.2% by a margin so large that many in the audience at the ECCV 2012 workshop in Florence, Italy, where the results were presented, initially suspected a bug. It was not a bug. It was the beginning of the deep learning revolution.
AlexNet's architecture was ambitious for its time: five convolutional layers followed by three fully connected layers, containing roughly 60 million trainable parameters. The network introduced several innovations that became standard practice almost overnight. It used Rectified Linear Units (ReLU) instead of the traditional sigmoid or tanh activation functions, dramatically accelerating training. It employed dropout regularization, randomly deactivating neurons during training to prevent overfitting. And it used data augmentation techniques to artificially expand the training set. None of these ideas were entirely new, but AlexNet combined them with unprecedented scale and rigor.
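For a sense of scale, here is a minimal sketch of that layer stack in modern PyTorch (an anachronism of convenience; the original was hand-written CUDA). The local response normalization layers and the two-GPU connectivity pattern of the original are omitted for clarity, which is why the parameter count lands slightly above the paper's roughly 60 million.

```python
# A minimal sketch of the AlexNet layer stack: five convolutional layers,
# three fully connected layers, ReLU activations, and dropout on the
# fully connected layers. Illustration only, not the original code.
import torch
import torch.nn as nn

class AlexNetSketch(nn.Module):
    def __init__(self, num_classes: int = 1000):
        super().__init__()
        # Five convolutional layers with ReLU after each one.
        self.features = nn.Sequential(
            nn.Conv2d(3, 96, kernel_size=11, stride=4), nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(96, 256, kernel_size=5, padding=2), nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(256, 384, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(384, 384, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
        )
        # Three fully connected layers; dropout (p=0.5) randomly zeroes
        # half the activations during training to fight overfitting.
        self.classifier = nn.Sequential(
            nn.Dropout(p=0.5),
            nn.Linear(256 * 6 * 6, 4096), nn.ReLU(inplace=True),
            nn.Dropout(p=0.5),
            nn.Linear(4096, 4096), nn.ReLU(inplace=True),
            nn.Linear(4096, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        x = torch.flatten(x, start_dim=1)
        return self.classifier(x)

model = AlexNetSketch()
# Sanity check with a 227x227 input (the effective input size of AlexNet).
logits = model(torch.randn(1, 3, 227, 227))
print(logits.shape)  # torch.Size([1, 1000])
# ~62 million parameters here; the paper's ~60 million reflects its
# split-GPU design, which prunes some cross-connections.
print(sum(p.numel() for p in model.parameters()))
```

Notice where the parameters live: the first fully connected layer alone holds about 38 million of them, which is why dropout on those layers mattered so much.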
The hardware story is equally remarkable. Training a 60-million-parameter network on 1.2 million ImageNet images would have been impractical on CPUs of the era. Krizhevsky wrote custom CUDA kernels to split the computation across two NVIDIA GTX 580 graphics cards, each with just 3 gigabytes of memory. The model trained for about five to six days. This was one of the first demonstrations that GPUs, originally designed for rendering video game graphics, could be repurposed as powerful engines for neural network training, a realization that would eventually make NVIDIA one of the most valuable companies on Earth.
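To make "split the computation across two graphics cards" concrete, here is a hedged sketch of the idea in modern PyTorch, assuming two CUDA devices are visible. AlexNet's actual scheme was subtler: each GPU held half of each layer's kernels, and the halves communicated only at certain layers. The simpler pipeline-style split below just illustrates placing different parts of a network on different devices so neither card has to hold everything.

```python
# A sketch of splitting a model across two GPUs, in the spirit of
# AlexNet's two-card training. Requires two CUDA devices; this mirrors
# the idea, not Krizhevsky's custom kernels.
import torch
import torch.nn as nn

class TwoGPUSplit(nn.Module):
    def __init__(self, num_classes: int = 1000):
        super().__init__()
        # Lower layers live on GPU 0, upper layers on GPU 1, so the
        # parameters and activations are divided between the cards.
        self.lower = nn.Sequential(
            nn.Conv2d(3, 96, kernel_size=11, stride=4), nn.ReLU(),
            nn.MaxPool2d(kernel_size=3, stride=2),
        ).to("cuda:0")
        self.upper = nn.Sequential(
            nn.Flatten(),
            nn.Linear(96 * 27 * 27, num_classes),
        ).to("cuda:1")

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.lower(x.to("cuda:0"))
        # Activations cross between devices here; keeping this transfer
        # cheap was exactly the kind of work the custom kernels did.
        return self.upper(x.to("cuda:1"))

model = TwoGPUSplit()
logits = model(torch.randn(8, 3, 227, 227))
```

What took custom CUDA engineering in 2012 is a pair of `.to()` calls today, which is a measure of how thoroughly the ecosystem reorganized itself around this result.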
The aftershock of AlexNet's victory was seismic. Within months, virtually every top computer vision lab in the world had pivoted to deep learning. Google took notice and in 2013 acquired DNNresearch, the startup that Hinton, Krizhevsky, and Sutskever had formed, reportedly for $44 million. Sutskever would go on to co-found OpenAI and serve as its chief scientist. Hinton would win the Nobel Prize in Physics in 2024 for his foundational work on neural networks. The AlexNet moment proved a principle that the deep learning community had long argued: given enough data and enough compute, neural networks could achieve performance that no amount of hand-engineered features could match.