Mistral AI just released their first model, Mistral 7B, and they report it to be the best 7B model to date!
Mistral AI first got attention through the size of its seed round. Although it seems like a very young company, keep in mind that it was founded by ex-Meta and ex-DeepMind employees.
The model is released under the Apache 2.0 license, making it completely open source for any use, including educational and commercial purposes.
It comes as a base model and a fine-tuned chat model. They report the model to be optimized for low latency, text summarization and classification, text completion, and code completion.
In their release blog, Mistral claims that the 7.3B-parameter model outperforms other models of its size and even larger ones, reportedly beating Llama 2 7B and 13B on a wide range of benchmarks and approaching CodeLlama's performance.
Mistral AI attributes its inference speed to grouped-query attention (GQA) and its lower cost on longer sequences to sliding window attention (SWA), as mentioned in the release blog.
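To get an intuition for sliding window attention, here is a minimal, illustrative sketch (not Mistral's actual implementation) of the attention mask it induces: each token can only attend to the last `window` positions up to and including itself, instead of the full causal prefix.

```python
def sliding_window_mask(seq_len, window):
    """Build a boolean attention mask for sliding window attention.

    Entry [i][j] is True when query position i may attend to key
    position j: j must not be in the future (j <= i) and must lie
    within the last `window` positions (i - j < window).
    Illustrative sketch only, not Mistral's implementation.
    """
    return [
        [j <= i and i - j < window for j in range(seq_len)]
        for i in range(seq_len)
    ]

mask = sliding_window_mask(6, 3)
# Each row has at most 3 True entries, so attention cost per token
# stays constant as the sequence grows, instead of growing linearly.
```

Because each row has at most `window` allowed positions, the per-token attention cost is bounded by the window size rather than the full sequence length, which is where the savings on long sequences come from; information from tokens outside the window still propagates indirectly through the stacked layers.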
This is an exciting development, both because of the performance improvement in a relatively small model and because this high-quality model is made open and accessible for everyone to use.
Colab notebook: [ Link ]
▬▬▬▬▬▬▬▬▬▬▬▬ CONNECT ▬▬▬▬▬▬▬▬▬▬▬▬
🖥️ Website: [ Link ]
🐦 Twitter: [ Link ]
🦾 Discord: [ Link ]
▶️ Subscribe: [ Link ]
🔥 We're hiring! Check our open roles: [ Link ]
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
#MachineLearning #deeplearning
00:00 What is Mistral 7b?
02:44 How can you start using it?