
DeepLearning.AI

Quantization Fundamentals with Hugging Face

  • up to 1 hour
  • Beginner

Learn how to compress models with the Hugging Face Transformers library and the Quanto library. This course covers linear quantization and downcasting techniques to make generative AI models more efficient and accessible.

  • Model quantization
  • Linear quantization
  • Downcasting
  • Generative AI models
  • Hugging Face Transformers library

Overview

In this course, you will learn how to compress models using linear quantization and downcasting techniques. You will gain a foundation in quantization methods, enabling you to optimize generative AI models for better performance on various devices. By the end of the course, you will be able to apply these techniques to your own models, making them more efficient and accessible.
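As a preview of the linear quantization technique the course covers, here is a minimal, dependency-free sketch of 8-bit asymmetric linear quantization. The function names are illustrative, not part of the Quanto API, which handles this for you on real model tensors:

```python
def linear_quantize(values, bits=8):
    """Map floats to unsigned ints using a scale and zero-point (asymmetric)."""
    qmin, qmax = 0, 2 ** bits - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # guard against zero scale for constant inputs
    zero_point = round(qmin - lo / scale)
    quantized = [max(qmin, min(qmax, round(v / scale + zero_point))) for v in values]
    return quantized, scale, zero_point

def linear_dequantize(quantized, scale, zero_point):
    """Recover approximate floats; round-trip error stays within one quantization step."""
    return [scale * (q - zero_point) for q in quantized]
```

For example, the weights `[-1.0, 0.0, 1.0]` round-trip through 8-bit quantization with an error no larger than the scale, which is what lets quantized models stay close to full-precision accuracy while storing each value in a single byte.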

  • Online (course location)
  • English (course language)
  • Self-paced (course format)
  • Live classes (delivered online)

Who is this course for?

Machine Learning Enthusiasts

Individuals with a basic understanding of machine learning concepts who want to learn about model quantization.

AI Developers

Developers interested in optimizing generative AI models for better performance and efficiency.

Data Scientists

Data scientists looking to make AI models more accessible and efficient for various devices.

This course will teach you essential quantization techniques to optimize generative AI models, making them more efficient and accessible. Ideal for beginners and professionals looking to enhance their AI skills and improve model performance.

Prerequisites


  • Basic understanding of machine learning concepts

  • Some experience with PyTorch

What will you learn?

Introduction to Quantization
Learn the basics of model quantization and its importance in AI model optimization.
Linear Quantization
Understand linear quantization and how it can be applied to compress models using the Quanto library.
Downcasting with Transformers
Learn about downcasting and how to use the Transformers library to load models in the BFloat16 data type.
Practical Applications
Practice quantizing open source multimodal and language models to make them more efficient.
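The downcasting module above loads models in the BFloat16 data type (in the Transformers library this is done by passing `torch_dtype=torch.bfloat16` to `from_pretrained`). Conceptually, BFloat16 keeps float32's 8 exponent bits but only 7 mantissa bits, so it preserves range while giving up precision. This stdlib-only sketch simulates the conversion by truncating the low mantissa bits; the helper name is illustrative, and real conversions typically round to nearest even rather than truncate:

```python
import struct

def downcast_to_bfloat16(x):
    """Simulate float32 -> bfloat16 by zeroing the low 16 bits of the bit pattern."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]  # float32 bit pattern as uint32
    return struct.unpack(">f", struct.pack(">I", bits & 0xFFFF0000))[0]
```

Values like `1.0` survive exactly, while `3.14159265` comes back as roughly `3.14` with the remaining digits lost, which is the precision/memory trade-off downcasting makes: half the memory per parameter, with only the least significant mantissa bits discarded.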

Meet your instructors

  • Younes Belkada

    Instructor, DeepLearning.AI

    Younes Belkada is an instructor at DeepLearning.AI, focusing on Machine Learning and Data Science topics.

  • Marc Sun

    Machine Learning Engineer, Hugging Face

Marc Sun is a Machine Learning Engineer on the open-source team at Hugging Face, which is dedicated to democratizing machine learning. He is also an instructor at DeepLearning.AI.

Upcoming cohorts

  • Dates

Start now

Free