Understanding Risk Extrapolation (REx) and when it finds Invariant Relationships

Abstract

Generalizing models to new, unseen datasets is a common problem in machine learning. Algorithms that perform well on test instances drawn from the same distribution as their training data often perform poorly on datasets with a different distribution. This problem is caused by distributional shift between the domain a model is trained on and the domain it is applied to. This paper addresses whether and in what situations Risk Extrapolation (REx) can tackle this problem of out-of-distribution (OOD) generalization by exploiting invariant relationships. These relationships are based on features that are invariant across all domains. By learning these relationships, REx aims to capture the underlying concept of the problem being solved. We show in what situations REx can learn these invariant relationships and when it cannot. We translate the definition of an invariant relationship into a homoscedastic synthetic dataset with a covariate, confounded, anti-causal, or hybrid shift. We evaluate REx in experiments that vary the sample size, the number of training domains, and the distance between training domains. We show that REx achieves better invariant prediction with larger sample sizes and greater training-domain distance, and that, when these criteria are met, it performs equivalently under all four distributional shifts. We also compare REx to Invariant Risk Minimization (IRM) and Empirical Risk Minimization (ERM) and show that REx is less sensitive, and thus more robust, to shifts in the average distributional variance of the training domains, and that it asymptotically outperforms both methods under the more complex distributional shifts.
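For concreteness, the sketch below shows how the REx invariance penalty is commonly instantiated as V-REx (Krueger et al., 2021): the average empirical risk over the training domains plus a penalty on the variance of those risks, so that a low-variance solution is one whose risk is (nearly) invariant across domains. The linear model, the toy two-domain data, and the penalty weight `beta` are illustrative assumptions, not this paper's experimental setup.

```python
import torch
import torch.nn as nn

def vrex_loss(model, criterion, domains, beta=10.0):
    """V-REx objective: mean empirical risk across training domains
    plus beta times the variance of those risks. Minimizing the
    variance term encourages relationships whose risk is invariant
    across domains. (beta is an assumed, illustrative value.)"""
    risks = torch.stack([criterion(model(x), y) for x, y in domains])
    return risks.mean() + beta * risks.var()

# Hypothetical homoscedastic toy domains sharing the invariant
# mechanism y = x1 + noise, with domain-specific feature scales.
domains = []
for scale in (1.0, 2.0):
    x = torch.randn(100, 2) * scale
    y = x[:, :1] + 0.1 * torch.randn(100, 1)
    domains.append((x, y))

model = nn.Linear(2, 1)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for _ in range(500):
    optimizer.zero_grad()
    loss = vrex_loss(model, criterion, domains)
    loss.backward()
    optimizer.step()
```

Setting `beta = 0` recovers plain ERM over the pooled domains, which is one way to see the trade-off the abstract's comparison between REx and ERM is probing.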
