Language Models

A language model is a machine learning model designed to represent the language domain. It can serve as the basis for many language-based tasks, for instance:

  • Question answering
  • Semantic search
  • Summarization

and plenty of other tasks that operate on natural language.

In a domain like weather forecasting, it’s easy to see how past data helps a model to predict a future state. But how do you apply that to language? 

How language modeling works
Language models estimate word probabilities by analyzing text data. During training, the text is fed through an algorithm that learns the statistical rules of how words appear in context in natural language. The model then applies those rules to predict or produce new sentences. In effect, it learns the features and characteristics of a language and uses them to interpret phrases it has never seen, as the sketch below illustrates.
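
As a concrete illustration, here is a minimal count-based sketch in Python (the tiny corpus and the helper next_word_probs are invented for this example, not taken from any library): it estimates P(word | previous word) from bigram counts, which is the simplest form of the probability estimation described above.

    from collections import Counter, defaultdict

    corpus = [
        "the cat sat on the mat",
        "the cat ate the fish",
        "the dog sat on the rug",
    ]

    # Count bigrams: how often each word follows each preceding word.
    bigram_counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for prev, word in zip(words, words[1:]):
            bigram_counts[prev][word] += 1

    def next_word_probs(prev):
        """P(word | prev), estimated as a relative bigram frequency."""
        counts = bigram_counts[prev]
        total = sum(counts.values())
        return {word: count / total for word, count in counts.items()}

    print(next_word_probs("the"))
    # {'cat': 0.33..., 'mat': 0.16..., 'fish': 0.16..., 'dog': 0.16..., 'rug': 0.16...}

Real language models work on the same principle, only with far longer contexts and learned rather than counted statistics.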

Language model types

  • N-gram: predicts a word from the previous n − 1 words, using counts from a corpus.
  • Unigram: the simplest n-gram model, which treats each word as independent of its context.
  • Exponential: a log-linear (maximum entropy) model that scores words by combining weighted features of the context.
  • Neural network: learns dense word representations and predicts words with a trained network; a minimal sketch follows this list.
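
A minimal sketch of the neural-network type in Python with PyTorch (assuming PyTorch is installed; the vocabulary, TinyNeuralLM class, and sizes are toy illustrations, not from any specific paper): instead of counting word sequences as an n-gram does, the model learns an embedding vector for each word and a trainable layer that scores every vocabulary word as a candidate next word.

    import torch
    import torch.nn as nn

    vocab = ["<pad>", "the", "cat", "sat", "on", "mat"]
    stoi = {w: i for i, w in enumerate(vocab)}

    class TinyNeuralLM(nn.Module):  # hypothetical toy model
        def __init__(self, vocab_size, embed_dim=16):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim)  # word id -> vector
            self.out = nn.Linear(embed_dim, vocab_size)       # vector -> word scores

        def forward(self, context_ids):
            # Average the context word vectors, then score every vocab word.
            h = self.embed(context_ids).mean(dim=1)
            return self.out(h)

    model = TinyNeuralLM(len(vocab))
    context = torch.tensor([[stoi["the"], stoi["cat"]]])  # context: "the cat"
    probs = torch.softmax(model(context), dim=-1)
    print(probs.shape)  # torch.Size([1, 6]): one probability per vocabulary word

Untrained, these probabilities are near uniform; training with a cross-entropy loss on real text is what teaches the model which words actually follow which contexts.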

Notable language models 
  • Pathways Language Model (PaLM): a 540-billion-parameter model from Google Research.
  • Generalist Language Model (GLaM): a 1.2-trillion-parameter mixture-of-experts model from Google Research.
  • Language Models for Dialog Applications (LaMDA): a 137-billion-parameter model from Google Research.
  • Megatron-Turing NLG: a 530-billion-parameter model from Microsoft and Nvidia.
  • DreamFusion: text-to-3D generation from Google Research, built on the Imagen text-to-image model.
  • GET3D: a generative model of textured 3D shapes, from Nvidia.
  • MineCLIP: a video-language model for Minecraft agents, from Nvidia.
  • BLOOM: the BigScience Large Open-science Open-access Multilingual language model, with 176 billion parameters.
  • GPT: the original Generative Pre-trained Transformer, from OpenAI.
  • GPT-2: Generative Pre-trained Transformer 2, with 1.5 billion parameters.
  • GPT-3: Generative Pre-trained Transformer 3, with a 2,048-token context window and 175 billion parameters (requiring 800 GB of storage).
  • GPT-3.5 / ChatGPT / InstructGPT: instruction-tuned successors of GPT-3, from OpenAI.
  • GPT-NeoX-20B: an open-source autoregressive language model with 20 billion parameters, from EleutherAI.
  • BERT: Bidirectional Encoder Representations from Transformers, from Google.
  • OPT-175B: another 175-billion-parameter language model, from Meta AI, made available to the broader AI research community.
  • Point-E: a text-to-3D point cloud generator from OpenAI.
  • RT-1: a transformer model for controlling real-world robots, from Google.
  • ERNIE-Code: a 560-million-parameter multilingual code model from Baidu.
  • VALL-E: a text-to-speech model from Microsoft that synthesizes speech in a voice from a 3-second sample; pre-trained on 60,000 hours of English speech from 7,000 unique speakers (the LibriLight dataset).
