Disentangled Representation Learning with Applications in Sequence Data

Key information:

Student	Nikos Chazaridis
Academic Supervisors	Christine Evers, Tim Norman, Rafael Mestre, Mohammad Belal
Cohort	3
Pure Link	Active Project

Abstract:

From a young age, humans are capable of dissecting complex ideas into finer-grained concepts, understanding the underlying structural elements of objects and their attributes, and discovering ways to manipulate the environment around them. This suggests that complex concepts may be more comprehensible through a mechanism that breaks them down. In this project we explore whether it is possible to equip artificially intelligent agents with a mechanism for decomposing complex data into distinct views that are meaningful for the tasks they intend to achieve. This investigation is conducted through the lens of representation learning in the field of machine learning, focusing specifically on learning disentangled representations. Disentangled representation learning aims to decompose data into meaningful components that artificially intelligent agents can exploit to achieve their goals and that make their behaviour more explainable to humans.

The objective of this project is to explore how sequential data can be decomposed into representations associated with attributes that operate at different temporal rates. We posit that examining temporal dependencies can provide insight into the distinct patterns present in sequences. To this end we focus on audio, specifically speech sequences. Speech signals are rich in information and carry attributes that operate at different temporal rates. For instance, information on the channel is present across the entire speech sequence, while speaker attributes can be pinpointed at the utterance level. To achieve this, we combine deep learning with dynamical systems theory, with the goal of modelling sequences with neural networks and disentangling their underlying attributes.

Contact: N.Chazaridis@soton.ac.uk