Llama 3.2 (1B Instruct)
Llama · ~1B parameters · Under 4B · Last reviewed 2026-03-20
Why we use it
The smallest widely documented tier of the Llama 3.2 family; suited to extreme latency budgets and on-device experiments before scaling up to the 3B model.
License summary
Released under the Llama 3.2 Community License, which carries acceptable-use and attribution requirements; verify the current terms on Meta’s site before redistribution.
Typical deployment profiles
- Edge / low footprint
- VPC, single GPU class
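For sizing the edge / low-footprint profile, a quick back-of-the-envelope weights-only memory estimate can help decide whether the model fits a device. This is a sketch: the ~1.24B parameter count is an approximation of the model's actual size, and it ignores KV cache, activations, and runtime overhead.

```python
# Rough weights-only memory estimate for a ~1B-parameter model.
# Ignores KV cache, activation memory, and runtime overhead.

def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Memory in GiB needed to hold the weights at a given precision."""
    return n_params * bytes_per_param / 1024**3

params = 1.24e9  # approximate parameter count for Llama 3.2 1B

for dtype, nbytes in [("fp16", 2.0), ("int8", 1.0), ("int4", 0.5)]:
    print(f"{dtype}: ~{weight_memory_gb(params, nbytes):.1f} GB")
```

At fp16 the weights alone land around 2.3 GB, which is why quantized variants are the usual choice for genuinely constrained edge targets.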
Focus tags
- General
Typical use cases
- Ultra-low-latency demos
- Embedding-sized pilots
- Edge smoke tests