A deep learning model used in a wide range of AI applications, including language processing, content generation and question answering. The transformer is a major advance over the recurrent neural network (RNN). First presented in the 2017 paper "Attention Is All You Need" by eight researchers at Google, the transformer captures relationships between words that are far apart in a sentence far more efficiently than an RNN, because it compares all the words in a sequence at once rather than one at a time. Rather than requiring labeled data, transformers learn the statistical patterns in raw text through self-supervised training.
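As a rough illustration of the idea, the following sketch computes scaled dot-product self-attention over a toy sequence in Python with NumPy. The random embeddings and projection matrices stand in for the learned parameters of a real transformer; the point is only that every token is scored against every other token in a single step, no matter how far apart the tokens sit in the sequence.

```python
# Minimal sketch (not a full transformer): scaled dot-product self-attention
# over a toy sequence. Random values stand in for learned parameters.
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_model = 6, 8                      # 6 tokens, 8-dimensional vectors
x = rng.normal(size=(seq_len, d_model))      # stand-in token embeddings

# In a trained model these projections are learned; random here for illustration.
W_q, W_k, W_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))
Q, K, V = x @ W_q, x @ W_k, x @ W_v

# Every token scores its relationship to every other token in one step,
# so distant positions are compared just as directly as adjacent ones.
scores = Q @ K.T / np.sqrt(d_model)
weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)  # softmax
output = weights @ V                         # attention-weighted mix of values

print(weights.shape)                         # (6, 6): a weight for every token pair
```

In a real transformer this operation is repeated across multiple attention heads and stacked layers, but the all-pairs comparison shown above is what lets the model relate distant words without stepping through the sequence one word at a time.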
From Words to Tokens to Vectors
The text is first cleaned of extraneous punctuation and symbols and split into "tokens," and the tokens are converted into vectors (mathematical representations). "Attention mechanisms" then weigh how strongly each token relates to every other token, identifying the words that matter most in context. To generate results, the output tokens are decoded and formatted back into readable text with the proper punctuation, as in the sketch below.
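To make the round trip concrete, here is a toy Python sketch of the text-to-tokens-to-vectors-and-back flow. The vocabulary, whitespace tokenizer and random embedding table are invented for illustration; production systems use learned subword tokenizers and trained embedding matrices, and generation runs the full model rather than a simple table lookup.

```python
# Toy sketch of the text -> tokens -> vectors -> text round trip.
# Vocabulary, tokenizer and embeddings are made up for illustration.
import numpy as np

vocab = {"the": 0, "cat": 1, "sat": 2, "on": 3, "mat": 4, ".": 5}
inv_vocab = {i: w for w, i in vocab.items()}

def tokenize(text: str) -> list[int]:
    # Simplistic whitespace tokenizer; real tokenizers split into subwords.
    words = text.lower().replace(".", " .").split()
    return [vocab[w] for w in words]

def detokenize(ids: list[int]) -> str:
    # Reassemble readable text, re-attaching the punctuation.
    return " ".join(inv_vocab[i] for i in ids).replace(" .", ".")

# Token IDs are looked up in an embedding table to become vectors.
d_model = 4
embedding_table = np.random.default_rng(1).normal(size=(len(vocab), d_model))

ids = tokenize("The cat sat on the mat.")
vectors = embedding_table[ids]         # shape: (number of tokens, d_model)

print(ids)                             # [0, 1, 2, 3, 0, 4, 5]
print(vectors.shape)                   # (7, 4)
print(detokenize(ids))                 # "the cat sat on the mat."
```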
See GPT and recurrent neural network.