Top-p (Nucleus Sampling)
Top-p is a text generation parameter that limits the model's token selection to the smallest set of tokens whose cumulative probability exceeds a threshold p.
Top-p, also known as nucleus sampling, is an alternative to temperature for controlling the randomness of AI model outputs. Instead of reshaping the entire probability distribution the way temperature does, top-p dynamically selects a subset of tokens to sample from. It sorts tokens by probability, then includes tokens from highest to lowest probability until the cumulative probability reaches the threshold p. Only tokens within this nucleus are considered for selection, and their probabilities are renormalized before sampling.
With top-p set to 0.9, the model considers only the tokens that make up the top 90% of the probability mass. If the model is very confident about the next word, this might include only 1-2 tokens. If the model is uncertain, it might include dozens of tokens. This dynamic behavior is what makes top-p different from top-k sampling (which always considers exactly k tokens) and gives it an advantage in adapting to different contexts within the same generation.
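To make the mechanism concrete, here is a minimal sketch of nucleus sampling over a toy five-token distribution, using NumPy. The function name and the example probabilities are illustrative only and are not taken from any particular library.

```python
import numpy as np

def nucleus_sample(probs, p=0.9, rng=None):
    """Sample one token index from the smallest set of tokens whose
    cumulative probability reaches the threshold p."""
    if rng is None:
        rng = np.random.default_rng()
    order = np.argsort(probs)[::-1]              # token indices, most to least likely
    sorted_probs = probs[order]
    cumulative = np.cumsum(sorted_probs)
    cutoff = np.searchsorted(cumulative, p) + 1  # first position where the mass reaches p
    nucleus = order[:cutoff]                     # the "nucleus" of candidate tokens
    nucleus_probs = sorted_probs[:cutoff] / sorted_probs[:cutoff].sum()  # renormalize
    return rng.choice(nucleus, p=nucleus_probs), cutoff

# A confident distribution yields a tiny nucleus; a flat one keeps many tokens.
confident = np.array([0.85, 0.10, 0.03, 0.01, 0.01])
uncertain = np.array([0.25, 0.22, 0.20, 0.18, 0.15])
print(nucleus_sample(confident)[1])  # nucleus size 2
print(nucleus_sample(uncertain)[1])  # nucleus size 5 (all tokens)
```

Notice that the same p = 0.9 keeps only 2 tokens for the confident distribution but all 5 for the flat one, which is exactly the adaptive behavior described above.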
In practice, top-p and temperature are sometimes used together, though this can be confusing and many practitioners recommend adjusting only one at a time. Lower top-p values (0.1-0.5) produce more focused outputs, while higher values (0.9-1.0) allow more variety. Most API defaults set top-p to 1.0 (consider all tokens) and let temperature alone control randomness. When building applications, it is best to start with reasonable defaults and adjust based on the specific quality characteristics you need.
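As an illustration of setting the parameter on a hosted model, here is a minimal sketch using the OpenAI Python SDK; the model name and prompt are placeholders, and the Anthropic API accepts a similar top_p argument.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": "Summarize nucleus sampling in one sentence."}],
    top_p=0.9,            # sample only from the top 90% of probability mass
    temperature=1.0,      # leave temperature at its default; adjust one knob at a time
)
print(response.choices[0].message.content)
```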
Real-World Examples
- Setting top-p to 0.1 for highly deterministic factual answers in a knowledge base chatbot
- Using top-p 0.9 for general conversation to allow natural variation in responses
- Combining temperature 0.5 with top-p 0.95 for balanced creative writing output (these three presets are sketched after this list)
- OpenAI and Anthropic APIs exposing top-p as a configurable parameter for fine-grained control over generation
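The first three scenarios can be captured as a small table of presets. This is a hypothetical helper, not a recommendation from any provider; the task names and exact values are illustrative.

```python
# Map illustrative task types to sampling settings (values from the examples above).
SAMPLING_PRESETS = {
    "knowledge_base_qa": {"top_p": 0.1},                        # highly deterministic answers
    "general_chat":      {"top_p": 0.9},                        # natural variation
    "creative_writing":  {"top_p": 0.95, "temperature": 0.5},   # balanced creativity
}

def sampling_params(task: str) -> dict:
    """Return keyword arguments (e.g. for an API call) for a given task."""
    return SAMPLING_PRESETS.get(task, {"top_p": 1.0})  # default: consider all tokens

print(sampling_params("knowledge_base_qa"))  # {'top_p': 0.1}
```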