AI training is accomplished by passing input data forward and backward through the neural network. The forward pass repeatedly predicts the next word, or part of a word, in a sentence, while the backward pass adjusts the numerical weights in the neurons to correct the prediction errors made by the forward pass. See
AI weights and biases.
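To make the two passes concrete, the following is a minimal PyTorch sketch of one training step on a toy next-word prediction task. The vocabulary size, model, and random data are hypothetical stand-ins, not the architecture of any real LLM.

```python
import torch
import torch.nn as nn

# Hypothetical toy setup: a tiny vocabulary and embedding size for illustration.
vocab_size, embed_dim = 1000, 64

model = nn.Sequential(
    nn.Embedding(vocab_size, embed_dim),
    nn.Linear(embed_dim, vocab_size),   # scores a prediction for every token
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

inputs = torch.randint(0, vocab_size, (8,))    # current tokens (random stand-ins)
targets = torch.randint(0, vocab_size, (8,))   # the "next" tokens to predict

logits = model(inputs)            # forward pass: predict the next token
loss = loss_fn(logits, targets)   # measure the prediction error
loss.backward()                   # backward pass: compute weight gradients
optimizer.step()                  # adjust the weights to reduce the error
optimizer.zero_grad()
```

Training repeats this step over the entire dataset until the prediction errors are acceptably small.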
With large language models (LLMs), it can take weeks or months of forward and backward passes to fully train the model, because trillions of calculations must be repeated millions of times.
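A common back-of-the-envelope rule (roughly six floating-point operations per parameter per training token) illustrates the scale. All figures below are hypothetical assumptions, not measurements of any particular model or cluster.

```python
# Back-of-the-envelope training-time estimate (all numbers hypothetical).
params = 70e9        # a 70-billion-parameter model
tokens = 2e12        # 2 trillion training tokens
flops = 6 * params * tokens              # ~6 FLOPs per parameter per token

gpus = 1_000
flops_per_gpu = 4e14                     # ~400 TFLOPS per GPU
utilization = 0.4                        # real-world efficiency is well below peak

seconds = flops / (gpus * flops_per_gpu * utilization)
print(f"{flops:.1e} FLOPs -> roughly {seconds / 86_400:.0f} days on {gpus} GPUs")
# ~8.4e+23 FLOPs -> roughly 61 days under these assumptions
```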
Thousands of GPUs
The neural network is processed by modules of four to eight GPUs connected to memory and to each other within a server rack, and a datacenter can contain thousands of racks linked by an optical network. xAI's Colossus datacenter is expected to eventually house more than a hundred thousand GPUs. Communication between the GPUs, and between the GPUs and memory, takes the most time.
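The sketch below shows the kind of inter-GPU communication involved: an all-reduce that averages gradients across every GPU in data-parallel training, using PyTorch's distributed API. The setup is abbreviated and assumes a launcher such as torchrun has set the rank and world-size environment variables.

```python
import torch
import torch.distributed as dist

# Minimal sketch of gradient synchronization in data-parallel training.
# Assumes launch via a tool such as torchrun, which sets RANK/WORLD_SIZE.
dist.init_process_group(backend="nccl")
device = torch.device("cuda", dist.get_rank() % torch.cuda.device_count())

grad = torch.randn(1_000_000, device=device)  # stand-in for a gradient tensor

# Every GPU contributes its gradient and receives the sum. This collective
# traverses the GPU interconnect fabric and is a major communication cost.
dist.all_reduce(grad, op=dist.ReduceOp.SUM)
grad /= dist.get_world_size()                 # average across the GPUs

dist.destroy_process_group()
```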
Forward Passes in Inference
When executing the AI application for the user (inference processing), there may be a dozen to hundreds of forward passes, depending on the number of text elements (tokens) generated at the output side of the network. There are no backward passes in inference. See
neural network and
AI training vs. inference.
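To make the token-by-token nature of inference concrete, the sketch below shows a greedy decoding loop in PyTorch; each new output token requires one more forward pass. The model is a placeholder assumed to map a token sequence to per-position vocabulary scores, and eos_id is a hypothetical end-of-sequence token.

```python
import torch

def generate(model, prompt_ids, max_new_tokens=100, eos_id=0):
    """Greedy decoding sketch: one forward pass per generated token."""
    ids = prompt_ids
    for _ in range(max_new_tokens):
        logits = model(ids)                    # forward pass only
        next_id = logits[-1].argmax().view(1)  # most likely next token
        ids = torch.cat([ids, next_id])        # feed it back as input
        if next_id.item() == eos_id:           # stop at end of sequence
            break
    return ids
```

Because the loop runs once per output token, a longer answer means proportionally more forward passes, which is why output length drives inference cost.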