Batch inference
Also known as: offline inference
Processing large queues of inputs where per-item latency matters less than total cost or throughput; common for document extraction or nightly scoring jobs.
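A minimal sketch of the pattern: drain a queue of documents in fixed-size batches, making one model call per batch instead of one per item. The `toy_model` here is a hypothetical stand-in for a real scoring model, and all names are illustrative.

```python
from typing import Callable, Iterator, List

def batched(items: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield fixed-size chunks from a queued list of inputs."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

def run_batch_inference(queue: List[str],
                        model: Callable[[List[str]], List[int]],
                        batch_size: int = 32) -> List[int]:
    """Score every queued item; per-item latency is irrelevant,
    only total cost/throughput matters."""
    results: List[int] = []
    for batch in batched(queue, batch_size):
        results.extend(model(batch))  # one model call per batch
    return results

# Hypothetical model: scores each document by its length.
toy_model = lambda batch: [len(doc) for doc in batch]

queue = [f"doc-{i}" for i in range(100)]
scores = run_batch_inference(queue, toy_model, batch_size=32)
```

Larger batch sizes amortize per-call overhead, which is why batch (offline) inference is usually cheaper per item than serving the same requests interactively.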
Contact us if you need a term added for a security or procurement review.