DeepLearning.AI
In Quantization in Depth, you will build model quantization methods to shrink model weights to a quarter of their original size while maintaining performance. This course will help you make your models more accessible and faster at inference time.
In this course, you will implement and customize linear quantization from scratch, studying the tradeoff between space and performance. You will build a general-purpose quantizer in PyTorch to quantize any open source model, compressing model weights from 32 bits to 8 bits and even 2 bits. This course provides the foundation to explore more advanced quantization methods.
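To give a sense of what "linear quantization from scratch" looks like, here is a minimal sketch, assuming per-tensor asymmetric quantization of float32 weights to int8 in PyTorch. The function names (linear_quantize, linear_dequantize) are illustrative, not the course's actual API.

```python
import torch

def linear_quantize(weights: torch.Tensor, dtype=torch.int8):
    """Per-tensor asymmetric linear quantization: w is approximated by scale * (q - zero_point)."""
    q_min, q_max = torch.iinfo(dtype).min, torch.iinfo(dtype).max
    w_min, w_max = weights.min().item(), weights.max().item()

    # Map the observed float range [w_min, w_max] onto the integer range [q_min, q_max].
    scale = (w_max - w_min) / (q_max - q_min)
    zero_point = int(round(q_min - w_min / scale))

    q = torch.clamp(torch.round(weights / scale) + zero_point, q_min, q_max)
    return q.to(dtype), scale, zero_point

def linear_dequantize(q: torch.Tensor, scale: float, zero_point: int) -> torch.Tensor:
    """Recover an approximation of the original float32 weights."""
    return scale * (q.to(torch.float32) - zero_point)

weights = torch.randn(4, 8)              # stand-in for a layer's float32 weights
q, scale, zp = linear_quantize(weights)  # int8 storage is 1/4 the size of float32
w_hat = linear_dequantize(q, scale, zp)  # approximate reconstruction
```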
Machine Learning Engineers
Professionals looking to deepen their understanding of linear quantization methods.
Data Scientists
Individuals interested in making their models more accessible and faster at inference time.
AI Enthusiasts
Learners who have completed the Quantization Fundamentals course and want to explore advanced quantization techniques.
Join this course to build and customize linear quantization functions, measure quantization errors, and compress model weights. Ideal for machine learning engineers, data scientists, and AI enthusiasts looking to advance their skills.
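As an illustration of measuring quantization error, the sketch below round-trips a weight tensor through PyTorch's built-in per-tensor 8-bit quantizer and reports the reconstruction error. The tensor shape and error metrics are illustrative choices, not taken from the course.

```python
import torch

weights = torch.randn(256, 256)  # stand-in for float32 layer weights

# Choose a scale and zero-point that map the observed float range onto uint8 [0, 255].
scale = (weights.max() - weights.min()).item() / 255
zero_point = int(round(-weights.min().item() / scale))

# Round-trip through PyTorch's built-in per-tensor linear quantizer.
q = torch.quantize_per_tensor(weights, scale, zero_point, dtype=torch.quint8)
w_hat = q.dequantize()

# Quantization error: how far the reconstructed weights drift from the originals.
mse = torch.mean((weights - w_hat) ** 2).item()
max_abs_err = (weights - w_hat).abs().max().item()
print(f"MSE: {mse:.3e}  max abs error: {max_abs_err:.3e}")
```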
Basic understanding of machine learning concepts
Familiarity with PyTorch
Completion of Quantization Fundamentals course (recommended)
Marc Sun
ML Engineer, Hugging Face
Marc Sun is a Machine Learning Engineer at Hugging Face, working on the open source team. He is passionate about democratizing machine learning.
Younes Belkada
ML Engineer, Hugging Face
Younes Belkada is a Machine Learning Engineer at Hugging Face and an instructor of this course at DeepLearning.AI.
Cost: Free