Question 1

What is Knowledge Distillation in simple terms?

Accepted Answer

In simple terms, distillation is teaching a small model by having it learn from a big one, like a seasoned expert coaching a quick apprentice. The apprentice ends up nearly as good but far cheaper to run.

Question 2

What is the difference between distillation and quantization?

Accepted Answer

Both make a model cheaper to run, but in different ways. Distillation trains a brand-new, smaller model (the student) to copy the behavior of a large one (the teacher), so you end up with a different, more compact model. Quantization keeps the same model but stores its internal numbers at lower precision (for example, 8-bit integers instead of 32-bit floats), shrinking it without training a new network. One produces a new, smaller model by imitation; the other compresses an existing one by rounding. The two are complementary and are frequently combined, along with pruning, to make a model as small and fast as possible.
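To make the "compresses an existing one by rounding" part concrete, here is a minimal sketch of one common scheme, symmetric 8-bit quantization, in plain Python. The function names, the sample weights, and the choice of a single per-list scale are assumptions made for illustration; real toolkits use more elaborate variants.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale (illustrative)."""
    max_abs = max(abs(w) for w in weights)
    # The scale is the real-valued step size that one integer step stands for.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the stored integers."""
    return [q * scale for q in quantized]


# Hypothetical weights, not taken from any real model.
weights = [0.82, -1.27, 0.003, 0.41]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

# Rounding error per weight is bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(quantized, scale, max_error)
```

Note that the model's architecture is untouched: only the stored numbers change, and small values such as 0.003 are rounded away. Distillation, by contrast, involves training a separate network.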

Question 3

How does distillation work?

Accepted Answer

A large "teacher" model is run over many examples, and a small "student" model is trained to reproduce the teacher's outputs. Crucially, the student learns from more than the teacher's final answers: it also learns how confident the teacher was across all the options, which encodes subtle relationships (that two categories look alike, say) the student would never get from plain labels. Absorbing that richer signal is why a distilled student often outperforms a same-sized model trained directly on raw data. The student ends up compact but skilled.
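The idea of learning from the teacher's confidence across all options can be sketched as a loss function. The version below follows a widely used formulation (temperature-softened probabilities plus a KL-divergence term, mixed with the ordinary hard-label loss); the answer above does not specify a formula, so the temperature, the `alpha` mixing weight, the function names, and the sample logits are all assumptions for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, true_label,
                      temperature=2.0, alpha=0.5):
    """Blend 'match the teacher' (soft) with 'match the label' (hard)."""
    teacher_soft = softmax(teacher_logits, temperature)
    student_soft = softmax(student_logits, temperature)
    # KL(teacher || student): zero only when the two distributions agree.
    soft_loss = sum(t * math.log(t / s)
                    for t, s in zip(teacher_soft, student_soft) if t > 0)
    # Ordinary cross-entropy against the ground-truth class.
    hard_loss = -math.log(softmax(student_logits)[true_label])
    # The T^2 factor is a common convention to keep gradient scales comparable.
    return alpha * temperature ** 2 * soft_loss + (1 - alpha) * hard_loss


# Hypothetical logits for the classes [cat, dog, truck].
teacher_logits = [5.0, 3.5, -2.0]   # teacher: "cat, though dog is plausible"
student_logits = [2.0, 0.5, 0.0]
loss = distillation_loss(student_logits, teacher_logits, true_label=0)
print(loss)
```

In a real training loop this value would be minimized by gradient descent over the student's parameters. A plain label would only say "cat"; the teacher's soft distribution additionally tells the student that "dog" is a far more reasonable mistake than "truck".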

Question 4

What is distillation used for?

Accepted Answer

It is used to make powerful AI practical to deploy: shrinking a large, costly model into a lighter one that runs fast on phones, in browsers, or at lower cost on servers, while keeping most of its ability. It is widely used to produce the smaller, cheaper versions of large language models that companies serve at scale. Anywhere a big model is too slow or expensive for the job but you don't want to lose its skill, distillation is a leading option — often paired with quantization and pruning.

Knowledge Distillation (Distillation)

What is Knowledge Distillation in simple terms?

Knowledge Distillation explained

Real-world example of Knowledge Distillation

Frequently asked questions about Knowledge Distillation

What is the difference between distillation and quantization?

How does distillation work?

What is distillation used for?

Courses focused on Knowledge Distillation

The Art of Compressing LLMs: Pruning, Distillation, and Quantization
