The largest model, Llama 3.1 405B, has arrived! Remember the $5k MacBook Pro? We’re about to push it to its limits and see if it can handle the heat from the newly released Llama models. Can Apple Silicon take on these AI behemoths? Plus, we’ll give you a sneak peek at Apple’s OpenELM. Buckle up for a fun and fascinating tech showdown!
Don’t forget to like, subscribe, and hit the bell icon for more awesome content!
Hardware Specs: 16-inch MacBook Pro, Apple M3 Max chip, 128GB unified memory
Benchmark: Generation speed for Meta-Llama-3.1-70B-Instruct-4bit is roughly 8.3–9.5 tokens per second on MLX.
Instructions:
pip install mlx-lm
mlx_lm.server --model mlx-community/Meta-Llama-3.1-70B-Instruct-4bit
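Once the server above is running, it exposes an OpenAI-compatible HTTP API you can call from any client. Here is a minimal Python sketch; the `localhost:8080` address is an assumption (check the server's startup output for the actual host and port):

```python
# Minimal client for mlx_lm.server's OpenAI-compatible chat endpoint.
# The host/port below is an assumption -- verify it against the server log.
import json
from urllib import request

API_URL = "http://localhost:8080/v1/chat/completions"  # assumed default address

def build_payload(prompt: str, max_tokens: int = 256) -> dict:
    """Build an OpenAI-style chat-completions request body."""
    return {
        "model": "mlx-community/Meta-Llama-3.1-70B-Instruct-4bit",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def ask(prompt: str) -> str:
    """POST the prompt to the local server and return the reply text."""
    req = request.Request(
        API_URL,
        data=json.dumps(build_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Example (requires the server from the step above to be running):
# print(ask("Explain unified memory in one sentence."))
```

Any OpenAI-compatible chat UI can point at the same endpoint, which is how the open-source UI shown in the video connects to the model.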
The Open Source UI used in the video: [ Link ]
#llama #llama3 #AppleSilicon #MacBookPro #OpenELM #AI #llm
Chapters
0:00 - Intro
0:26 - What is mlx-lm?
1:55 - Get the Llama 3.1 Model!!!
4:53 - Let's Test Llama 3.1
5:55 - Llama 3.1 vs OpenELM
11:55 - Llama 3.1 8B vs Llama 3.1 70B
17:30 - Llama 3.1 405B on a MacBook?
21:28 - Ending