AI Model Compression

Reducing model size and inference time with quantization, pruning, and knowledge distillation.
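
As a rough illustration of two of these techniques, the sketch below applies symmetric int8 quantization and magnitude-based pruning to a plain list of weights. The function names (quantize_int8, dequantize, prune_by_magnitude) and the per-tensor scaling scheme are assumptions made for this example, not the API of any particular library. Knowledge distillation is not shown because it requires a training loop.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integer codes in [-127, 127]."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # One scale for the whole tensor; fall back to 1.0 if all weights are zero.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float weights from integer codes."""
    return [c * scale for c in codes]


def prune_by_magnitude(weights, sparsity):
    """Zero out the given fraction of weights with the smallest absolute value."""
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


weights = [0.5, -1.27, 0.03, 0.9]  # hypothetical example weights

codes, scale = quantize_int8(weights)
print(codes)                               # [50, -127, 3, 90]
print(dequantize(codes, scale))            # close to the original weights

print(prune_by_magnitude(weights, 0.5))    # [0.0, -1.27, 0.0, 0.9]
```

Each weight is stored as one small integer instead of a 32-bit float, at the cost of a rounding error of at most half the scale per weight. Pruning trades accuracy for sparsity in a similar way. Production toolkits typically add calibration, per-channel scales, and fine-tuning after pruning.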