Term of the Moment

hodling


Look Up Another Term


Definition: neural network


A major AI architecture employed for many pattern recognition applications; however, the neural network's most popular use is the creation of language models used by ChatGPT, Gemini and other chatbots. Loosely based on the human nervous system, a computer-based neural network is technically an "artificial" neural network (ANN).

The neural network is used in image, language and speech recognition, text-to-speech conversion, robotics, diagnosing, forecasting and generative AI. Unlike regular applications that are programmed for precise results (if-then-else), neural network models are "trained" and fine-tuned using millions, billions, even trillions of examples of text and images. See AI secret sauce, AI programming, AI training, AI model and generative AI.

Layers and Nodes
A neural network is a "math machine" that learns from examples. It comprises multiple layers of computational units called "nodes" or "neurons" that are mathematically connected to each other. Neurons are broken down into parallel operations that are mapped onto the physical GPUs.

Datacenters can have thousands of servers, each containing thousands of neurons. The more servers, the larger the neural network and the more comprehensive the training. For smaller AI applications, a single desktop machine can contain a neural network; for example, see DGX Spark.

GPUs and Servers
AI servers contain four to 16 GPUs, each with their own processor and memory; for example, NVIDIA's H100 contains eight GPUs (see H100). In a large datacenter, there can be tens of thousands of servers, and it can take weeks and months to train huge language models. To reduce the training time of ever-larger neural networks to days instead of months, it is estimated that a million or more GPUs may be required. See GPU and AI training vs. inference.







A Single Node/Neuron
Nothing like "if-then-else" business logic, a neuron is a computational unit. Each layer contains a number of nodes/neurons and their mathematical values are computed with all the neurons in the next layer. See AI node and AI weights and biases.




Tracing a Neuron
The neural network is a pattern detection system. Sentences are turned into tokens that are parts of words, and then vectors, which are one-dimensional arrays. The vector arrays become a matrix that contains the neural network weights.

Millions and billions of neurons process each token to predict the next one, and training language models means constantly predicting what comes next (see AI training passes).

The World's Information = One Giant Matrix
Essentially, AI companies have taken all the information ever published online and turned it into the world's largest mathematical matrix. See AI secret sauce.

There Are Many Network Designs
The following diagrams from the Asimov Institute in the Netherlands reveal the variety of neural network architectures that have been created. For a neural network example that recognizes the letters of the alphabet, which is easier to grasp than other network architectures, see convolutional neural network.

















Neural Network Architectures
AI networks are one of the most researched areas of computing in the 21st century. The examples above from the Asimov Institute in the Netherlands reveal the variety of network architectures that have been created. (Images courtesy of Fjodor van Veen and Stefan Leijnen (2019). The Neural Network Zoo.)