Language models under the hood

Join us at the department's research seminar on 27 February for the talk Language models under the hood, given by our Associate Professor Andrey Kutuzov from the Language Technology Group.


In the last few years, a radical increase in the scale of deep neural
language models (both in the size of the training data and the size of
the models themselves) has led to impressive achievements in various
natural language processing tasks. "Celebrity" models, like the
recently announced ChatGPT, are already sometimes described as
"approaching artificial intelligence", although the reality can differ
from the over-hyped media coverage.

In this talk, I will describe the foundations of the technology behind
large-scale language models. The two most important components of their
success are 1) state-of-the-art deep learning architectures (in
particular, the Transformer) and 2) the availability of tremendous
amounts of textual data used to train such models. The interaction of
these two poses intricate theoretical and practical questions, also
linked to issues with the unequal distribution of computing resources
and biases in the training data. Can we actually reach AI simply by
training ever larger language models?
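At the core of the Transformer architecture mentioned in the abstract is scaled dot-product attention, in which each output is a weighted average of value vectors, with weights derived from query-key similarity. The following is a minimal illustrative sketch in plain Python (one attention head, no batching, no learned projection matrices); the function names and list-of-lists representation are choices made here for clarity, not part of any particular library.

```python
import math

def softmax(scores):
    # Numerically stable softmax: subtract the max before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention for lists of vectors.

    A single-head sketch: for each query, compute similarity scores
    against all keys, normalize them with softmax, and return the
    weighted average of the value vectors.
    """
    d = len(keys[0])  # key dimensionality, used for the 1/sqrt(d) scaling
    outputs = []
    for q in queries:
        # Dot product of the query with every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        outputs.append([sum(w * v[j] for w, v in zip(weights, values))
                        for j in range(len(values[0]))])
    return outputs
```

When all keys are identical, the attention weights are uniform and the output is simply the mean of the value vectors; real Transformer layers add learned projections, multiple heads, and feed-forward sublayers on top of this basic operation.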

Published Feb. 17, 2023 11:24 AM - Last modified Feb. 27, 2023 9:59 AM