Shipping Enterprise GenAI: From Research to Production

Abstract

Over the past year, Large Language Models (LLMs) have taken the world by storm. While it may not be complicated to cook up a demo using a commercial API such as OpenAI's, it is a different story to engineer an LLM-based system that keeps costs down, achieves a sufficient degree of generalization, and, most importantly, does not send any user data externally. In this talk, I will walk through our journey to deploy an enterprise GenAI feature, named Flow Generation, that produces workflows from natural-language requirements: from our initial experiments all the way to ensuring that we could support enough concurrent users. My team built this feature on top of the StarCoder family of code LLMs via supervised fine-tuning, and used Retrieval-Augmented Generation (RAG) to significantly reduce hallucination and enable customization across the thousands of users of the ServiceNow platform.
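The abstract only names RAG as the grounding technique; as a rough, self-contained sketch of the retrieval step (a toy bag-of-words similarity and made-up workflow action descriptions, illustrative only and not ServiceNow's actual implementation), retrieving relevant snippets and placing them in the prompt is what keeps generations anchored to real platform actions:

```python
# Minimal RAG sketch: retrieve the most relevant action descriptions
# for a natural-language requirement, then ground the prompt on them.
# Toy scoring for illustration; production systems use learned embeddings.
from collections import Counter
import math

def embed(text: str) -> Counter:
    # Bag-of-words "embedding" stand-in.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    # Rank candidate documents by similarity to the query, keep top-k.
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    # Grounding the model on retrieved snippets reduces hallucination:
    # the model composes workflows from actions that actually exist.
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Context:\n{context}\n\nRequirement: {query}\nWorkflow:"

# Hypothetical action catalog, standing in for per-customer platform metadata.
docs = [
    "Create record action: inserts a row into a table.",
    "Send email action: notifies a user or group.",
    "Look up records action: queries a table with conditions.",
]
print(build_prompt("send an email when a record is created", docs))
```

Because the catalog is supplied at retrieval time rather than baked into the model weights, each customer's own actions can be surfaced without retraining, which is the customization angle the abstract refers to.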

Date
Apr 26, 2024 2:00 PM — 2:30 PM
Location
Polytechnique Montreal
2500 Chem. de Polytechnique, Montréal, QC H3T 1J4
Orlando E Marquez
NLP Staff Applied Research Scientist - ServiceNow

Orlando is a Lead Applied Research Scientist with a strong background in software engineering. One of his passions is shipping state-of-the-art NLP to end users through rigorous and careful experimentation. He currently leads the development of a generative AI text-to-structure feature called Flow Generation, which seeks to reduce the time needed to build enterprise workflows. In the past, he has conducted applied research in semi-supervised learning for NLP, explainability, and error analysis. He holds a Bachelor of Software Engineering from the University of Waterloo and a Master's degree in Computer Science from the Université de Montréal (MILA).