It all started at a small academic get-together in Whistler, British Columbia. The topic was speech recognition, and whether a new and unproven approach to machine intelligence—something called deep learning—could help computers more effectively identify the spoken word. Microsoft funded the mini-conference, held just before Christmas 2009, and two of its researchers invited the world’s preeminent deep learning expert, the University of Toronto’s Geoff Hinton, to give a speech. Hinton’s idea was that machine learning models could work a lot like neurons in the human brain. He wanted to build “neural networks” that could gradually assemble an understanding of spoken words as more and more of them arrived. Neural networks were hot in the 1980s, but by 2009, they hadn’t lived up to their potential.
