Latency

Also known as: response time, time-to-first-token

The end-to-end delay between a user's request and the moment results become visible. SLMs often have lower latency than large cloud APIs on repetitive, structured workloads.
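As a rough illustration, latency for a streamed response can be split into the time until the first token arrives and the total time until the response is complete. The sketch below is a minimal Python example under stated assumptions: `measure_latency` and `fake_stream` are hypothetical names introduced here, and the simulated stream stands in for a real model or API call, which this glossary does not specify.

```python
import time
from typing import Iterable, Iterator, Tuple


def measure_latency(stream: Iterable[str]) -> Tuple[float, float]:
    """Return (time_to_first_token, total_response_time) in seconds.

    `stream` is any iterable that yields tokens as they arrive.
    """
    start = time.perf_counter()
    first_token_at = None
    for _ in stream:
        if first_token_at is None:
            # Record the delay until the first token is received.
            first_token_at = time.perf_counter() - start
    total = time.perf_counter() - start
    if first_token_at is None:
        # Empty stream: no token ever arrived, so report the total time.
        first_token_at = total
    return first_token_at, total


def fake_stream(n_tokens: int = 3, delay_s: float = 0.01) -> Iterator[str]:
    """Hypothetical stand-in for a model: yields tokens with a fixed delay."""
    for i in range(n_tokens):
        time.sleep(delay_s)
        yield f"tok{i}"


if __name__ == "__main__":
    ttft, total = measure_latency(fake_stream())
    print(f"time to first token: {ttft * 1000:.1f} ms")
    print(f"total response time: {total * 1000:.1f} ms")
```

Measuring both numbers on the same workload is one way to compare two backends, since a system can start responding quickly yet still take longer to finish.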
