Mixtral 8x7B Instruct (MoE)
Mistral · ~47B total parameters · ~13B active per token · Last reviewed 2026-03-20
Why we use it
Mixture-of-experts quality at a fraction of the active parameters per token: the router sends each token through 2 of 8 experts, so per-token compute is close to that of a ~13B dense model while quality approaches much larger dense models. This makes it a good fit for enterprise workloads that need strong quality at moderate serving cost.
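As an illustration, here is a minimal sketch of top-2 expert routing, the mechanism behind the active/total parameter split. Dimensions are toy values, not Mixtral's actual configuration, and the module is our own, not Mistral's implementation.

```python
# Minimal sketch of top-2 mixture-of-experts routing (toy dimensions).
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top2MoE(nn.Module):
    def __init__(self, d_model: int, d_ff: int, n_experts: int = 8):
        super().__init__()
        self.gate = nn.Linear(d_model, n_experts, bias=False)  # router
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model). The router scores every expert per token,
        # but only the top-2 experts actually run for each token.
        logits = self.gate(x)
        weights, idx = torch.topk(logits, k=2, dim=-1)   # top-2 experts per token
        weights = F.softmax(weights, dim=-1)             # normalize over the chosen 2
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            token_ids, slot = (idx == e).nonzero(as_tuple=True)
            if token_ids.numel():
                out[token_ids] += weights[token_ids, slot, None] * expert(x[token_ids])
        return out

moe = Top2MoE(d_model=64, d_ff=256)
print(moe(torch.randn(10, 64)).shape)  # torch.Size([10, 64])
```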
License summary
Apache 2.0 for many Mixtral releases; confirm the license on the exact artifact you deploy. Note that MoE serving carries its own memory/active-parameter trade-off: every expert's weights must be resident even though only ~13B parameters are active per token.
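A back-of-envelope sketch of that trade-off, assuming fp16 weights and the rounded parameter counts above (illustrative figures, not measured footprints):

```python
# All expert weights stay resident, even though only ~13B parameters
# participate in any single token's forward pass.
TOTAL_PARAMS = 47e9    # ~47B across all 8 experts + shared layers (rounded)
ACTIVE_PARAMS = 13e9   # ~13B touched per token (2 of 8 experts)
BYTES_FP16 = 2

resident_gb = TOTAL_PARAMS * BYTES_FP16 / 1e9
print(f"Resident weights (fp16): ~{resident_gb:.0f} GB")                 # ~94 GB
print(f"Active fraction per token: {ACTIVE_PARAMS / TOTAL_PARAMS:.0%}")  # ~28%
```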
Typical deployment profiles
- VPC, multi-GPU (see the serving sketch after this list)
- Datacenter / large clusters
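A minimal serving sketch for the multi-GPU profile, assuming vLLM. The Hugging Face model ID is the public Mixtral instruct repo; the parallelism degree and sampling settings are placeholders to adapt to your cluster.

```python
# Hedged sketch: shard Mixtral's weights across 4 GPUs with vLLM's offline API.
from vllm import LLM, SamplingParams

llm = LLM(
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",  # public HF repo for the instruct MoE
    tensor_parallel_size=4,   # assumption: 4 GPUs; splits the ~94 GB of fp16 weights
    dtype="float16",
)
params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Summarize the trade-offs of MoE serving."], params)
print(outputs[0].outputs[0].text)
```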
Focus tags
- General
- Reasoning
Typical use cases
- High-quality teachers
- Hybrid routing targets (see the routing sketch after this list)
- Multi-tenant VPC
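As a hybrid routing target, Mixtral typically sits behind a cheap policy that sends easy requests to a small dense model and harder ones to the MoE. The heuristic and model names below are hypothetical placeholders, not a production policy.

```python
# Hedged sketch: route easy requests to a small dense model, reasoning-heavy
# ones to Mixtral. Heuristic and model names are hypothetical placeholders.
from dataclasses import dataclass

@dataclass
class Route:
    model: str
    reason: str

HARD_MARKERS = ("prove", "derive", "multi-step", "analyze", "contract")

def choose_route(prompt: str, max_cheap_len: int = 400) -> Route:
    """Toy policy: long or reasoning-flavored prompts go to the MoE."""
    if len(prompt) > max_cheap_len or any(m in prompt.lower() for m in HARD_MARKERS):
        return Route("mixtral-8x7b-instruct", "length/reasoning threshold hit")
    return Route("small-dense-7b", "simple request; save active-parameter budget")

print(choose_route("What's the capital of France?"))
print(choose_route("Derive the closed form and analyze its stability."))
```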