Quantization
Also known as: INT8, INT4, weight quantization, post-training quantization
Reducing the numeric precision of model weights (and sometimes activations) to shrink memory footprint and speed up inference. Common schemes include 8-bit (INT8) and 4-bit (INT4) formats, which trade some accuracy for smaller size and faster execution.
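A minimal sketch of symmetric, per-tensor post-training INT8 quantization: a single scale maps floats into the int8 range, and dequantization multiplies back. The function names and the per-tensor scheme here are illustrative, not from any particular library.

```python
import numpy as np

def quantize_int8(w):
    # Symmetric per-tensor quantization: one scale for the whole tensor,
    # chosen so the largest-magnitude weight maps to 127.
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover approximate float weights from int8 values.
    return q.astype(np.float32) * scale

weights = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_int8(weights)
recovered = dequantize(q, scale)
# Rounding bounds the per-weight reconstruction error by scale / 2.
max_err = np.abs(weights - recovered).max()
```

Real deployments refine this basic idea with per-channel or per-group scales, zero points for asymmetric ranges, and calibration data for activation quantization.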