Optimize AI inference for production