Over the past year, Large Language Models (LLMs) have taken the world by storm. While it may not be complicated to cook up a demo using a commercial API such as OpenAI’s, it is a different story to engineer an LLM-based system that keeps costs down, achieves a sufficient degree of generalization, and, most importantly, does not send any user data externally. In this talk, I will walk through our journey to deploy an enterprise GenAI feature, named Flow Generation, that produces workflows from natural language requirements, all the way from our initial experiments to ensuring that we could support enough concurrent users. My team built this feature on top of the StarCoder family of Code LLMs via supervised fine-tuning, and used Retrieval-Augmented Generation (RAG) to significantly reduce hallucination and enable customization across the thousands of users of the ServiceNow platform.