Context window
A context window is the amount of text, code, or multimodal input a model can consider in one request.
Why it matters: Context limits determine whether a model can handle large contracts, research packets, or codebases without splitting the work into pieces.
In practice: If your prompt and source material exceed the model's context window, quality drops or the request fails outright.
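The fit check above can be sketched in a few lines. This is a rough estimate only: it uses the common "about 4 characters per token" rule of thumb rather than a real tokenizer, and the 128,000-token limit and output reserve are assumed values for illustration.

```python
def estimate_tokens(text: str) -> int:
    """Crude token estimate: roughly 4 characters per token for English text."""
    return max(1, len(text) // 4)

def fits_in_context(prompt: str, source: str, context_limit: int,
                    reserve_for_output: int = 1024) -> bool:
    """Check whether prompt + source still leave room for the model's reply."""
    used = estimate_tokens(prompt) + estimate_tokens(source)
    return used + reserve_for_output <= context_limit

# A short document fits; a very long one must be split or summarized first.
print(fits_in_context("Summarize this contract:", "x" * 4_000, 128_000))    # True
print(fits_in_context("Summarize this contract:", "x" * 600_000, 128_000))  # False
```

In production you would replace `estimate_tokens` with the actual tokenizer for your model, since character-based estimates can be off by a wide margin for code or non-English text.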
Token
A token is a small unit of text that models use for reading and generating content. Pricing and context limits are usually measured in tokens.
Why it matters: Token counts directly affect API cost, latency, and whether a workload fits in a model's context window at all.
In practice: A long prompt with a large knowledge-base chunk can become expensive even before the model writes any output.
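This input-side cost is easy to see in a back-of-the-envelope calculation. The per-token prices below are placeholder assumptions, not any vendor's actual rates; real pricing varies widely by model.

```python
# Hypothetical pricing (assumed for illustration; check your provider's rates).
INPUT_PRICE_PER_1K = 0.003   # dollars per 1,000 input tokens
OUTPUT_PRICE_PER_1K = 0.015  # dollars per 1,000 output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the dollar cost of a single API request."""
    return (input_tokens / 1000 * INPUT_PRICE_PER_1K
            + output_tokens / 1000 * OUTPUT_PRICE_PER_1K)

# A 50,000-token prompt already costs money with zero output generated:
print(round(request_cost(50_000, 0), 4))    # 0.15
print(round(request_cost(50_000, 500), 4))  # 0.1575
```

Note that the input side dominates here even though output tokens are priced higher, which is why trimming retrieved context often saves more than shortening replies.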
LLM
LLM stands for large language model, a system trained on massive text or code corpora to predict and generate language.
Why it matters: Most AI buying decisions start with model choice, because model quality, cost, speed, and tool access shape the rest of the stack.
In practice: Teams often compare frontier models for reasoning and cheaper models for background automation.
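That split often ends up encoded as a simple routing rule. The model names and task categories below are placeholders invented for this sketch, not real products.

```python
# Hypothetical model identifiers (placeholders, not real model names).
FRONTIER_MODEL = "frontier-model"      # strong reasoning, higher cost
BACKGROUND_MODEL = "cheap-fast-model"  # lower cost, fine for routine jobs

def pick_model(task_type: str) -> str:
    """Route demanding tasks to the frontier model, everything else to the cheap one."""
    demanding = {"legal-review", "architecture-design", "research-synthesis"}
    return FRONTIER_MODEL if task_type in demanding else BACKGROUND_MODEL

print(pick_model("legal-review"))    # frontier-model
print(pick_model("ticket-tagging"))  # cheap-fast-model
```

Real routers are usually more nuanced (confidence thresholds, fallbacks, latency budgets), but most start from a lookup like this.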
Multimodal model
A multimodal model can work across more than one input or output type, such as text, images, audio, or video.
Why it matters: Multimodal capability lets one system interpret screenshots, documents, voice, or visual creative assets without separate pipelines for each input type.
In practice: A multimodal workflow might read a PDF, inspect charts, and answer questions in plain language.
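A multimodal request typically bundles several content parts of different types into one message. The payload shape below is illustrative only; it is not any specific vendor's API schema, and the model name is a placeholder.

```python
import base64

def build_multimodal_request(question: str, pdf_bytes: bytes) -> dict:
    """Bundle a text question with an encoded document in one request payload."""
    return {
        "model": "multimodal-model",  # placeholder, not a real model name
        "messages": [{
            "role": "user",
            "content": [
                # Binary inputs are commonly base64-encoded for JSON transport.
                {"type": "document",
                 "data": base64.b64encode(pdf_bytes).decode("ascii")},
                {"type": "text", "text": question},
            ],
        }],
    }

req = build_multimodal_request("What trend does the chart on page 3 show?",
                               b"%PDF-1.7 ...")
print([part["type"] for part in req["messages"][0]["content"]])
```

The key idea is that the document and the question travel together, so the model can ground its plain-language answer in the visual content.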