Today, American theoretical physicist and Nobel Prize winner Frank Wilczek launches the first of a series of monthly columns exclusively published in the South China Morning Post. Here he reflects on the significance of this year’s Nobel Prize in Physics and its implications for future research in artificial neural networks.
The 2024 Nobel Prize in Physics was awarded to John Hopfield and Geoffrey Hinton for their “foundational discoveries and inventions that enable machine learning with artificial neural networks”. Because neural networks fall outside the traditional boundaries of physics, the award prompted some grumbling within the physics community, both online and offline. But I think it was an appropriate and, in fact, inspired choice. The historical roots of artificial neural networks date back to the early 1940s, not long after the modern conception of the brain was firmly established: the brains of humans and other animals are built from individual cells, neurons, which communicate with one another through electrical pulses.
Biological neurons come in many varieties and can be frighteningly complex, but Warren McCulloch and Walter Pitts stripped away most of that detail, defining a mathematically idealized caricature of a neuron with useful properties. Their model “neuron” responds to an input consisting of one or more “pulses” (which may arrive from several sources) by emitting a pulse of its own if the sum of the inputs is large enough. Such neurons can be wired together into functional networks, which convert a stream of input pulses into a stream of output pulses, with the processing carried out by intermediate neurons.
McCulloch and Pitts showed that artificial neural networks can perform all the basic logical operations required for universal computation.
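To make that idea concrete, here is a minimal sketch in Python (an illustration for this column, not drawn from McCulloch and Pitts’ paper): a threshold “neuron” that fires when the weighted sum of its incoming pulses is large enough, configured to behave as the basic logic gates AND, OR and NOT. The particular weights and thresholds are illustrative choices.

```python
# Illustrative sketch: a McCulloch-Pitts style threshold "neuron" that
# fires (outputs 1) when the weighted sum of its binary inputs reaches
# a threshold, and logic gates built from single such neurons.

def fires(inputs, weights, threshold):
    """Return 1 if the weighted sum of input pulses reaches the threshold, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

def AND(a, b):
    return fires([a, b], weights=[1, 1], threshold=2)

def OR(a, b):
    return fires([a, b], weights=[1, 1], threshold=1)

def NOT(a):
    # A negative ("inhibitory") weight with a threshold of 0 inverts the pulse.
    return fires([a], weights=[-1], threshold=0)

if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, "AND:", AND(a, b), "OR:", OR(a, b))
    print("NOT 0:", NOT(0), "NOT 1:", NOT(1))
```

Because AND, OR and NOT together suffice for any logical function, networks of such neurons can, in principle, compute anything a conventional computer can.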
McCulloch and Pitts’ work attracted the attention and admiration of some of the great pioneers of modern computing, including Alan Turing, Claude Shannon, and John von Neumann. But mainstream practical computing went in a different direction, in which basic logical operations are carried out directly by simple transistor circuits and orchestrated by explicit instructions, or programs. That approach has, of course, been enormously successful. It gave us the brave new cyber world we live in today.
However, artificial neural networks were never completely forgotten. Although such networks are an unnecessarily complex and clumsy way to perform logic, they have one significant potential advantage over standard transistor circuits: they can change gracefully. Specifically, a neuron’s input-output rule can be altered by adjusting the relative importance, technically the “weights”, assigned to inputs arriving on different channels.
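Here is another minimal sketch in the same spirit (again an illustration; the particular settings are chosen only for demonstration): the same two-input threshold neuron acts like OR, acts like AND, or ignores one of its inputs, depending solely on the weights and threshold it is given, with no rewiring at all.

```python
# Illustrative sketch: the same neuron "circuit" realizes different
# input-output rules purely by adjusting its weights and threshold.

def fires(inputs, weights, threshold):
    """Return 1 if the weighted sum of input pulses reaches the threshold, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

# The same two-input neuron, with three different weight/threshold settings.
settings = [
    ("acts like OR",             [1, 1], 1),
    ("acts like AND",            [1, 1], 2),
    ("ignores the second input", [1, 0], 1),
]

for name, weights, threshold in settings:
    # Truth table over the four input pairs (0,0), (0,1), (1,0), (1,1).
    table = [fires([a, b], weights, threshold) for a in (0, 1) for b in (0, 1)]
    print(f"weights={weights}, threshold={threshold} -> {table}  ({name})")
```

It is exactly this kind of gradual, continuous adjustment of weights that opens the door to learning from experience rather than explicit programming.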