Latency
Also known as: response time, time-to-first-token
The end-to-end delay between a user's request and the visible result. Small language models (SLMs) often have lower latency than large cloud-hosted APIs on repetitive, structured workloads.
Contact us if you need a term added for a security or procurement review.