Artificial Intelligence

DeepLearning.AI

Quantization in Depth

  • up to 1 hour
  • Intermediate

In Quantization in Depth, you will build model quantization methods that shrink model weights to ¼ of their original size while maintaining performance. The course will help you make your models more accessible and faster at inference time.

  • Linear quantization
  • Model compression
  • Quantization error measurement
  • Weights packing
  • PyTorch

Overview

In this course, you will implement and customize linear quantization from scratch, studying the tradeoff between space and performance. You will build a general-purpose quantizer in PyTorch to quantize any open source model, compressing model weights from 32 bits to 8 bits and even 2 bits. This course provides the foundation to explore more advanced quantization methods.
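
To make the 32-bit to 8-bit idea concrete, here is a minimal sketch of asymmetric linear quantization. It is not the course's own code: it uses plain Python lists instead of PyTorch tensors so the arithmetic stays visible, and the function names (linear_quantize, linear_dequantize) and the sample weights are hypothetical. The scale and zero point are derived from the minimum and maximum of the values, which is one common choice among several.

```python
def linear_quantize(values, q_min=-128, q_max=127):
    """Map floats to integers in [q_min, q_max] using a scale and a zero point."""
    r_min, r_max = min(values), max(values)
    # Guard against a zero range (all values identical); assumption: fall back to scale 1.0.
    scale = (r_max - r_min) / (q_max - q_min) if r_max != r_min else 1.0
    # The zero point is the integer that represents the real value 0.0.
    zero_point = int(round(q_min - r_min / scale))
    zero_point = max(q_min, min(q_max, zero_point))
    quantized = [
        max(q_min, min(q_max, int(round(v / scale + zero_point))))
        for v in values
    ]
    return quantized, scale, zero_point


def linear_dequantize(quantized, scale, zero_point):
    """Recover approximate floats from the integers."""
    return [scale * (q - zero_point) for q in quantized]


# Hypothetical sample weights, for illustration only.
weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
q, scale, zero_point = linear_quantize(weights)
restored = linear_dequantize(q, scale, zero_point)

# A simple quantization error measure: the largest absolute difference.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, zero_point, max_error)
```

Each weight is now stored as one 8-bit integer plus a shared scale and zero point, and the maximum error stays within about one quantization step (the scale).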

  • Online
    course location
  • English
    course language
  • Self-paced
    course format
  • Live classes
    delivered online

Who is this course for?

Machine Learning Engineers

Professionals looking to deepen their understanding of linear quantization methods.

Data Scientists

Individuals interested in making their models more accessible and faster at inference time.

AI Enthusiasts

Learners who have completed the Quantization Fundamentals course and want to explore advanced quantization techniques.

Why should you take this course?

Join this course to build and customize linear quantization functions, measure quantization errors, and compress model weights. Ideal for machine learning engineers, data scientists, and AI enthusiasts looking to advance their skills.

Pre-Requisites

  • Basic understanding of machine learning concepts

  • Familiarity with PyTorch

  • Completion of Quantization Fundamentals course (recommended)

What will you learn?

Introduction to Quantization
Overview of quantization and its importance in model compression.
Linear Quantization Methods
Detailed exploration of linear quantization, including symmetric and asymmetric modes.
Granularities in Quantization
Understanding different granularities like per-tensor, per-channel, and per-group quantization.
Building a Quantizer in PyTorch
Step-by-step guide to building a general-purpose quantizer in PyTorch.
Quantization Error Measurement
Techniques to measure and balance the tradeoff between performance and space.
Advanced Quantization Techniques
Implementing weights packing to pack four 2-bit weights into a single 8-bit integer.
Case Studies and Applications
Real-world applications and case studies of quantization in machine learning models.
Future Directions in Quantization
Exploring advanced quantization methods and future research directions.
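
As a rough illustration of the weights packing item in the list above, the sketch below stores four 2-bit values in one 8-bit integer using shifts and masks. It is an assumption-laden example rather than the course's implementation: the function names (pack_2bit, unpack_2bit) are hypothetical, the values are treated as unsigned integers in the range 0 to 3, and the first value is placed in the lowest two bits, which is only one possible bit order.

```python
def pack_2bit(values):
    """Pack unsigned 2-bit values (0..3) into bytes, four per byte, lowest bits first."""
    if len(values) % 4 != 0:
        raise ValueError("expected a multiple of four values")
    packed = []
    for i in range(0, len(values), 4):
        byte = 0
        for j, v in enumerate(values[i:i + 4]):
            if not 0 <= v <= 3:
                raise ValueError(f"value {v} does not fit in 2 bits")
            # Shift the j-th value into bit positions 2*j and 2*j + 1.
            byte |= v << (2 * j)
        packed.append(byte)
    return packed


def unpack_2bit(packed):
    """Recover the original 2-bit values from the packed bytes."""
    values = []
    for byte in packed:
        for j in range(4):
            # Mask out two bits at a time, starting from the lowest.
            values.append((byte >> (2 * j)) & 0b11)
    return values


# Hypothetical 2-bit weights, for illustration only.
weights_2bit = [1, 0, 3, 2, 3, 3, 0, 1]
packed = pack_2bit(weights_2bit)
print(packed)               # [177, 79]
print(unpack_2bit(packed))  # [1, 0, 3, 2, 3, 3, 0, 1]
```

Eight 2-bit weights fit in two bytes here, where storing each one in its own 8-bit integer would take eight bytes.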

Meet your instructors

  • Marc Sun

    ML Engineer, Hugging Face

    Marc Sun is a Machine Learning Engineer at Hugging Face, working on the open source team. He is passionate about democratizing machine learning.

  • Younes Belkada

    Instructor, DeepLearning.AI

    Younes Belkada is a Machine Learning Engineer at Hugging Face and an instructor at DeepLearning.AI.

Upcoming cohorts

  • Dates

    start now

Free