In this workshop, Lewis Tunstall and Edward Beeching from Hugging Face will discuss a powerful alignment technique called Direct Preference Optimisation (DPO), which was used to train Zephyr ([ Link ]) and is rapidly becoming the de facto method for boosting the performance of open chat models.
By the end of this workshop, attendees will:
Understand the steps involved in fine-tuning LLMs for chat applications.
Learn the theory behind Direct Preference Optimisation and how to apply it in practice with the Hugging Face TRL library (see the sketch after this list).
Know what metrics to consider when evaluating chat models.
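To give a concrete sense of what the DPO step looks like in practice, here is a minimal, hypothetical sketch of preference fine-tuning with TRL's DPOTrainer. The model and dataset names are the public Zephyr artefacts on the Hugging Face Hub; exact argument names (for example, processing_class versus tokenizer, and where beta is set) vary between TRL versions, so treat this as illustrative rather than copy-paste ready.

from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

# The supervised fine-tuned (SFT) checkpoint that Zephyr's DPO stage started from
model_name = "HuggingFaceH4/mistral-7b-sft-beta"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Preference data: each row has "prompt", "chosen", and "rejected" fields
dataset = load_dataset("HuggingFaceH4/ultrafeedback_binarized", split="train_prefs")

training_args = DPOConfig(
    output_dir="zephyr-dpo-demo",   # hypothetical output directory
    beta=0.1,                       # strength of the KL penalty toward the frozen reference model
    per_device_train_batch_size=2,
    num_train_epochs=1,
)

trainer = DPOTrainer(
    model=model,                    # the reference model defaults to a frozen copy of `model`
    args=training_args,
    train_dataset=dataset,
    processing_class=tokenizer,     # older TRL releases take tokenizer=tokenizer instead
)
trainer.train()

Under the hood, the trainer maximises the margin between the policy's and the reference model's log-probability ratios on chosen versus rejected responses, which steers the model toward preferred outputs without training a separate reward model.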
Take a moment to register for our community forum:
[ Link ]
Take a moment to register for our short courses here:
[ Link ]
Workshop Notebooks:
Notebook #1:
[ Link ]
Notebook #2:
[ Link ]
Slides:
[ Link ]
About DeepLearning.AI
DeepLearning.AI is an education technology company that is empowering the global workforce to build an AI-powered future through world-class education, hands-on training, and a collaborative community. Take your generative AI skills to the next level with short courses that help you learn new skills, tools, and concepts efficiently.
About Hugging Face
Hugging Face is an AI company specializing in natural language processing (NLP) and machine learning, and is known for its open-source contributions and collaborative approach to AI research and development. The company is famous for developing the Transformers library, which offers a wide range of pretrained models and tools for a variety of NLP tasks, making it easier for researchers and developers to implement state-of-the-art AI solutions. Hugging Face also fosters a vibrant community for AI enthusiasts and professionals, providing a platform for sharing models, datasets, and research, which significantly contributes to the advancement of AI technology.
Speakers:
Lewis Tunstall, Machine Learning Engineer, Hugging Face
[ Link ]
Edward Beeching, Research Scientist, Hugging Face
[ Link ]