What is Inference?
The process of running a trained AI model to generate output from a given input.
Definition
Inference is what happens when a trained AI model processes input to produce output: a prediction, a generated response, a classification. It is distinct from training (adjusting model weights over a dataset) and fine-tuning (further training of an already-trained model on narrower data). When you send a prompt to Claude or GPT-4 and get a response, that's inference. Inference cost and latency are key operational concerns for AI agent systems that make many model calls.
Example
Every time an AI agent calls an LLM to reason about a tool result — 'given this GitHub PR list, which ones need review?' — that's one inference call. A complex agent might make 5–20 inference calls to complete a single task.
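The idea of "one inference call per LLM invocation" can be made concrete with a short sketch. The `call_llm` function below is a hypothetical stand-in for a real model API; it is stubbed out so the example runs offline, and the names `review_prs` and `call_llm` are illustrative, not from any specific library.

```python
# Sketch of an agent loop where each LLM invocation is one inference call.
# `call_llm` is a hypothetical stand-in for a hosted model endpoint,
# stubbed here so the example is self-contained.

def call_llm(prompt: str) -> str:
    """Hypothetical inference call: a trained model maps input text to output text."""
    # A real implementation would send `prompt` to a model API and
    # return the generated response.
    return f"response to: {prompt[:40]}"

def review_prs(pr_titles: list[str]) -> tuple[list[str], int]:
    """Ask the model about each PR; return decisions and the inference-call count."""
    calls = 0
    decisions = []
    for title in pr_titles:
        decisions.append(call_llm(f"Does this PR need review? {title}"))
        calls += 1  # each LLM invocation is one inference call
    return decisions, calls

decisions, n_calls = review_prs(["Fix login bug", "Update docs", "Refactor auth"])
print(n_calls)  # one inference call per PR, so 3 here
```

A real agent would also spend inference calls on planning, tool selection, and summarizing results, which is how a single task can add up to 5–20 calls.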
Inference vs training: What's the difference?
Training adjusts the model's weights over a large dataset; it is slow, expensive, and happens before deployment. Inference runs the trained model to generate output; it is comparatively fast and happens on demand, every time the model is used.
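The distinction above can be illustrated with a minimal sketch: training repeatedly updates a weight to fit data, while inference is a forward pass over the frozen weight. The one-parameter linear model and learning rate here are illustrative choices, not a real training setup.

```python
# Minimal sketch contrasting training (weight updates) with inference
# (a forward pass over frozen weights), using a one-parameter linear model.

def forward(w: float, x: float) -> float:
    """Inference: run the model on an input without changing w."""
    return w * x

# --- Training: repeatedly adjust the weight to fit data (slow, offline) ---
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # targets follow y = 2x
w = 0.0
lr = 0.05
for _ in range(200):
    for x, y in data:
        pred = forward(w, x)
        grad = 2 * (pred - y) * x  # d/dw of squared error
        w -= lr * grad             # weight update: this is what training does

# --- Inference: apply the trained model on demand (fast, no updates) ---
print(forward(w, 5.0))  # converges to w = 2, so this prints 10.0
```

The same asymmetry holds at scale: training a large model takes weeks on clusters of accelerators, while a single inference call completes in milliseconds to seconds.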