22.05.2025 (Thursday)

ST Singular learning theory and machine learning

regular seminar Simon Pepin Lehalleur

at:
14:00 - 16:00
KCL, Strand
room: K3.11
abstract:

Modern machine learning models are typically overparameterised and
singular: many different trained models achieve the same optimal
loss. Singular learning theory (SLT) is an approach to statistical
learning theory for such models, developed by Sumio Watanabe and
rooted in Bayesian statistics and singularity theory.
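
To make "singular" concrete, here is a minimal sketch (an
illustration, not from the talk): a two-parameter toy loss whose set
of minimisers is the union of the two coordinate axes, so infinitely
many distinct parameters attain the same zero loss, and the minimum
set has a singularity at the origin.

    import numpy as np

    # Toy singular loss: L(a, b) = (a*b)^2 vanishes on the whole set
    # {a*b = 0}, the union of the two coordinate axes, which is
    # singular (a normal crossing) at the origin.
    def loss(a, b):
        return (a * b) ** 2

    # Many distinct "trained models", one optimal loss:
    for a, b in [(0.0, 0.0), (0.0, 3.0), (-2.0, 0.0), (0.0, -0.7)]:
        print(loss(a, b))  # all print 0.0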

In the first introductory lecture, the main player will be the
local learning coefficient (LLC) and its applications. I will give
and motivate its definition from several interlocking perspectives:
free energy and model selection in Bayesian statistics, singularity
theory, information theory and statistical physics. We will see how
the LLC is estimated in practice using (stochastic) MCMC techniques.
I will then present empirical results showing how the LLC captures
important structural aspects of training and generalisation, both in
toy models and in LLMs.
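
As a rough illustration of how such estimation works (a minimal
sketch under stated assumptions, not the speaker's code): one
standard estimator samples a posterior localised at a trained
parameter w* at inverse temperature beta = 1/log n and sets
lambda_hat = n * beta * (E[L(w)] - L(w*)). The sketch below runs
full-batch Langevin dynamics on the toy loss above; practical
implementations instead run minibatch stochastic-gradient MCMC on
the empirical loss of a real network, and the hyperparameters here
(step size eps, localisation strength gamma, chain length) are
illustrative guesses.

    import numpy as np

    def loss(w):                      # toy loss from above, vector form
        return (w[0] * w[1]) ** 2

    def grad(w):                      # gradient of (a*b)^2
        a, b = w
        return np.array([2 * a * b * b, 2 * a * a * b])

    def estimate_llc(w_star, n=10_000, gamma=100.0, eps=1e-4,
                     steps=50_000, burn=10_000, seed=0):
        """Sample the localised tempered posterior
        exp(-n*beta*L(w) - (gamma/2)*|w - w_star|^2), beta = 1/log n,
        with (full-batch) Langevin dynamics, then return
        lambda_hat = n * beta * (mean sampled loss - L(w_star))."""
        rng = np.random.default_rng(seed)
        beta = 1.0 / np.log(n)
        w = w_star.copy()
        sampled_losses = []
        for t in range(steps):
            drift = -0.5 * eps * (n * beta * grad(w)
                                  + gamma * (w - w_star))
            w = w + drift + np.sqrt(eps) * rng.standard_normal(w.shape)
            if t >= burn:
                sampled_losses.append(loss(w))
        return n * beta * (np.mean(sampled_losses) - loss(w_star))

    w_star = np.zeros(2)          # most singular point of {a*b = 0}
    print(estimate_llc(w_star))   # true lambda here is 1/2; expect a
                                  # noisy, finite-n-biased estimate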


In the second lecture, I will explain some of the Bayesian statistics
and singularity theory that go into the main results of SLT. I will
also discuss various caveats involved in applying SLT to the
developmental interpretability of realistic deep learning models. I
will sketch some of the current research on bridging those gaps
between theory and practice, including extensions of the LLC (refined
LLC, susceptibilities), SLT for other models of computation such as
noisy Turing machines, and a dynamical interpretation of the LLC in
terms of jet schemes. This may also be an opportunity to relate SLT
to some other strands of modern deep learning research. Finally,
time permitting, I will argue for an SLT-inspired perspective on the
role of generalisation in AI safety.
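
For reference, the central asymptotic result behind both lectures is
Watanabe's free energy expansion, stated here informally (see
Watanabe's work for the precise hypotheses), with lambda the learning
coefficient and m its multiplicity:

    F_n = -\log \int e^{-n L_n(w)} \varphi(w)\, dw
        = n L_n(w_0) + \lambda \log n - (m - 1) \log\log n + O_p(1)

Here L_n(w) is the average negative log-likelihood of n samples, w_0
an optimal parameter and \varphi a prior. A regular model with d
parameters has \lambda = d/2 and m = 1; singular models have
\lambda \le d/2, which is how degeneracy lowers the effective
parameter count in Bayesian model selection.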

Keywords: Bayesian statistics, singularities