TinyLlama (1.1B)
Llama · ~1.1B parameters · Under 4B · Last reviewed 2026-03-20
Why we use it
Ultra-small Llama-architecture student target for latency-critical demos and classroom-style compression exercises.
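The ~1.1B figure can be sanity-checked from TinyLlama's published configuration (hidden size 2048, 22 layers, grouped-query attention with 32 query heads and 4 KV heads of dimension 64, intermediate size 5632, 32k vocabulary). A rough count, not an exact accounting:

```python
# Back-of-envelope parameter count for TinyLlama-1.1B from its public config.
hidden, layers, inter, vocab = 2048, 22, 5632, 32000
head_dim, kv_heads = 64, 4  # GQA: 32 query heads, but only 4 key/value heads

attn = hidden * hidden * 2                 # q_proj + o_proj
attn += hidden * head_dim * kv_heads * 2   # k_proj + v_proj (shrunk by GQA)
mlp = hidden * inter * 3                   # gate, up, and down projections
norms = hidden * 2                         # two RMSNorm weight vectors per layer
per_layer = attn + mlp + norms

# Add token embeddings, untied lm_head, and the final norm.
total = layers * per_layer + 2 * vocab * hidden + hidden
print(f"{total / 1e9:.2f}B")  # → 1.10B
```

The GQA terms are why the attention block is cheaper than a same-width dense-attention Llama; most of the budget sits in the MLP projections.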
License summary
Apache 2.0 for the TinyLlama open release; confirm that the checkpoint card you deploy matches that lineage.
Typical deployment profiles
- Edge / low footprint
- VPC, single-GPU class
Focus tags
- General
Typical use cases
- Edge prototypes
- Distillation students
- CI smoke tests
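As a distillation student, TinyLlama would typically be trained against a larger teacher's softened logits. A minimal sketch of the standard Hinton-style temperature-scaled loss; the function name, shapes, and temperature here are illustrative assumptions, not from this card:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def distill_loss(student_logits, teacher_logits, T=2.0):
    """Temperature-scaled KL(teacher || student), averaged over the batch.

    Hypothetical helper: scaled by T**2 so gradient magnitudes stay
    comparable as the temperature changes.
    """
    p = softmax(teacher_logits / T)              # soft teacher targets
    log_q = np.log(softmax(student_logits / T))  # student log-probabilities
    kl = (p * (np.log(p) - log_q)).sum(axis=-1)
    return T * T * kl.mean()

t = np.array([[2.0, 0.5, -1.0]])
print(distill_loss(t, t))                      # identical logits → 0.0
print(distill_loss(np.zeros_like(t), t) > 0)   # diverging logits → True
```

In practice this soft-target term is usually mixed with the ordinary cross-entropy on ground-truth labels; the mixing weight is a tuning choice.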