Project Overview

Project Code: CIT 21

Project name:

Dual spaces of neural networks: Creating maps between the parameter space and the input space

TUM Department:

CIT - Informatics

TUM Chair / Institute:

Scientific Computing

Research area:

Machine Learning

Student background:

Computer Science, Mathematics

Further disciplines:

Participation also possible online only:

Planned project location:

Boltzmannstraße 3, 85748 Garching bei München.

Project Supervisor - Contact Details


Title:

Professor

Given name:

Felix

Family name:

Dietrich

E-mail:

felix.dietrich@tum.de

Phone:

+49 89 289 18609

Additional Project Supervisor - Contact Details


Title:

Mr

Given name:

Erik Lien

Family name:

Bolager

E-mail:

erik.bolager@tum.de

Phone:

+49 89 289 18634


Project Description


Project description:

Background:

Deep learning methods have proven successful in many different fields,
such as image generation (DALL-E, Midjourney), natural language
processing (GPT), and operator learning (DeepONet, FNO), to name a few.
Gradient-based optimization via backpropagation has been a key factor in
their success. However, this training process and the resulting models
are still hard to understand, and the networks are often referred to as
"black-box models". Gaining a better understanding of the algorithms and
models of deep learning is therefore an important scientific task.

Training neural networks means finding good parameters based on data.
There are existing mathematical connections between the input space
(or data space) and the parameter space (or weight space); in
particular, the two are dual to each other [1]. With this in mind, a
method called "sampling where it matters" (SWIM) [2] was developed in
the group where the project will be conducted. SWIM makes the connection
between data and parameters even more explicit by constructing
feed-forward networks without backpropagation: for each neuron, one
samples a pair of points from the data set and constructs the weight and
bias parameters from this pair. The parameter space is therefore
directly related to the input space (specifically, to the space of pairs
of input points). Despite this seemingly reduced parameter space, the
resulting networks are as expressive and flexible as regular neural
networks (proven in [2]). This makes it possible to explore the
connection between the input space and the parameter space much more
explicitly.
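
To make the pair-based construction concrete, the following Python sketch builds one hidden layer from sampled pairs of data points. It is only an illustration: it samples pairs uniformly and treats the scaling constants s1 and s2 as free hyperparameters, whereas the actual SWIM algorithm in [2] samples pairs from a data-dependent distribution and fixes these constants depending on the activation function.

    import numpy as np

    def sample_layer_from_pairs(X, n_neurons, s1=1.0, s2=0.0, rng=None):
        # For each neuron, draw a pair of data points (x1, x2) and turn it
        # into a weight vector pointing from x1 to x2 (scaled by the squared
        # distance) and a bias that centers the activation between the pair.
        # Uniform pair sampling and fixed s1, s2 are simplifying assumptions.
        rng = np.random.default_rng(rng)
        n, d = X.shape
        W = np.empty((n_neurons, d))
        b = np.empty(n_neurons)
        for k in range(n_neurons):
            i, j = rng.choice(n, size=2, replace=False)
            diff = X[j] - X[i]
            W[k] = s1 * diff / (np.linalg.norm(diff) ** 2 + 1e-12)
            b[k] = -W[k] @ X[i] + s2
        return W, b

    # Usage: hidden activations of one sampled layer on toy data.
    X = np.random.default_rng(0).normal(size=(200, 3))
    W, b = sample_layer_from_pairs(X, n_neurons=50, rng=1)
    H = np.tanh(X @ W.T + b)

Each row of W is thus determined entirely by one pair of input points, which is what makes the parameter space directly comparable to the (pairs of the) input space.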

Specific project:

The focus of this work is to create a map from the parameter space found
through iterative optimization (SGD, Adam) to the parameter space
explicitly defined by the input-space sampling method (SWIM, [2]). This
is done by identifying pairs of points in the dataset that are important
for the training. Such a map helps us understand gradient-based
optimization techniques and explain the resulting model. It is also
possible to construct such maps during training and observe how the
emphasis on different data points changes over the iterations. Finally,
we are also interested in the inverse map, where we start from weights
constructed by SWIM and improve them by mapping them to the parameter
space induced by optimization.
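
As a simple baseline for such a forward map, one can ask, for each neuron of a conventionally trained network, which pair of data points would have produced the most similar weights under the pair-based construction. The brute-force sketch below (a hypothetical helper for illustration, not the method developed in [3]) makes this idea explicit; it scales quadratically in the number of data points, which is one reason more principled approaches are of interest.

    import numpy as np

    def closest_pair_for_neuron(w, b, X, s1=1.0, s2=0.0):
        # Exhaustively search for the data pair (x_i, x_j) whose pair-based
        # construction is closest to the trained neuron (w, b).
        best_pair, best_err = None, np.inf
        n = len(X)
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                diff = X[j] - X[i]
                w_ij = s1 * diff / (np.linalg.norm(diff) ** 2 + 1e-12)
                b_ij = -w_ij @ X[i] + s2
                err = np.sum((w_ij - w) ** 2) + (b_ij - b) ** 2
                if err < best_err:
                    best_pair, best_err = (i, j), err
        return best_pair, best_err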

The initial groundwork has already been done in a previous thesis [3],
which tested simple methods to create such maps for shallow neural
networks and low-dimensional input spaces. This work was extended in a
research internship, where algorithms from optimal transport theory were
applied to higher-dimensional input spaces. The student(s) will continue
this work by either extending these methods or creating new strategies
for deep neural networks and higher-dimensional input spaces. Depending
on the student's preference, this can be achieved through principled and
mathematically rigorous methods, such as optimal transport theory, or
through more heuristic approaches, such as computational experiments.
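
As one simple, heuristic starting point (an assumption on our part, not the algorithm used in the internship), matching a set of trained neurons to a set of pair-constructed candidate neurons can be phrased as a discrete optimal-assignment problem, a special case of optimal transport. The sketch below solves it with SciPy's Hungarian-algorithm implementation.

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    def match_neurons(W_trained, W_candidates):
        # cost[i, j] = squared distance between trained neuron i and
        # pair-constructed candidate neuron j (weights only, for brevity).
        cost = ((W_trained[:, None, :] - W_candidates[None, :, :]) ** 2).sum(axis=-1)
        # Optimal one-to-one assignment minimizing the total matching cost.
        row_idx, col_idx = linear_sum_assignment(cost)
        return col_idx, cost[row_idx, col_idx].sum()

More refined formulations (e.g. entropic regularization or transport between distributions over pairs) are exactly the kind of design choice the student can explore.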

Outcomes for the student:

1) Gain an understanding of the underlying structure of deep neural
networks and backpropagation, which is highly relevant in today's AI
landscape.

2) Dive deep into the mathematical foundations of mappings between
probability spaces, such as optimal transport theory.

3) Make a meaningful contribution to explaining deep learning and to
understanding its training phase and the resulting models. Such
understanding will be crucial for a world that relies more and more on
machine learning and AI.

4) Conduct research to explore whether a future in academia is of
interest, and gain valuable experience in a scientific working environment.

Citations:
[1] L. Spek, T. Heeringa, C. Brune, "Duality for Neural Networks through
Reproducing Kernel Banach Spaces", arXiv, 2022.

[2] E. Bolager, I. Burak, C. Datar, Q. Sun, F. Dietrich, "Sampling
weights of deep neural networks", NeurIPS, 2023.

[3] D. Bouassida, "Converting Neural Networks to Sampled Networks",
Bachelor's thesis, TUM, 2023. https://mediatum.ub.tum.de/1726944

Working hours per week planned:

40

Prerequisites


Required study level minimum (at time of TUM PREP project start):

3 years of bachelor studies completed

Subject related:

Machine learning, in particular the basics of neural networks. In addition, calculus, linear algebra, and gradient-based optimization.

It is preferable if the candidate has experience coding in Python and/or a mathematical background.

Other:
