Limitations of Large Language Models

Abstract

Large language models (LLMs) are increasingly used in downstream applications, not only in natural language processing but also in other domains such as computer vision, reinforcement learning, and scientific discovery. This talk will focus on the limitations of using LLMs as task solvers. What are the effects of using LLMs as task solvers? What kind of knowledge can an LLM encode, and what can it not encode? Can LLMs efficiently use all of their encoded knowledge while learning a downstream task? Are LLMs susceptible to the usual catastrophic forgetting when learning many tasks? How do we identify the biases that LLMs encode, and how do we eliminate them? Can we trust the explanations provided by LLMs? In this talk, I will present an overview of several research projects in my lab that attempt to answer these questions. The talk will bring to light some of the current limitations of LLMs and how to move forward to build more intelligent systems.

Date
Apr 26, 2024 10:30 AM — 11:00 AM
Location
Polytechnique Montreal
2500 Chem. de Polytechnique, Montréal, QC H3T 1J4
Sarath Chandar
Assistant Professor - Polytechnique Montréal

Sarath Chandar is an Assistant Professor at Polytechnique Montréal, where he leads the Chandar Research Lab. He is also a core faculty member at Mila, the Quebec AI Institute. Sarath holds a Canada CIFAR AI Chair and the Canada Research Chair in Lifelong Machine Learning. His research interests include lifelong learning, deep learning, optimization, reinforcement learning, and natural language processing. To promote research in lifelong learning, Sarath created the Conference on Lifelong Learning Agents (CoLLAs) in 2022 and served as program chair in 2022 and 2023. He received his PhD from the University of Montreal and his MS by research from the Indian Institute of Technology Madras.