Deploying Large Language Models in Production