F.A. Oliehoek | TU Delft Repository

Policy Space Response Oracles

A Survey

Conference paper (2024) - A. Bighashdel (author), Yongzhao Wang (author), Stephen McAleer (author), Rahul Savani (author), F.A. Oliehoek (author)

Game theory provides a mathematical way to study the interaction between multiple decision makers. However, classical game-theoretic analysis is limited in scalability due to the large number of strategies, precluding direct application to more complex scenarios. This survey prov ...

An Analysis of Model-Based Reinforcement Learning From Abstracted Observations

Journal article (2024) - R.A.N. Starre (author), M. Loog (author), E. Congeduti (author), F.A. Oliehoek (author)

Many methods for Model-based Reinforcement learning (MBRL) in Markov decision processes (MDPs) provide guarantees for both the accuracy of the model they can deliver and the learning efficiency. At the same time, state abstraction techniques allow for a reduction of the size of a ...

Teacher-apprentices RL (TARL)

Leveraging complex policy distribution through generative adversarial hypernetwork in reinforcement learning

Journal article (2023) - Shi Yuan Tang (author), Athirai A. Irissappane (author), F.A. Oliehoek (author), Jie Zhang (author)

Typically, a Reinforcement Learning (RL) algorithm focuses on learning a single deployable policy as the end product. Depending on the initialization methods and seed randomization, learning a single policy could possibly lead to convergence to different local optima across diff ...

Safety Guarantees in Multi-agent Learning via Trapping Regions

Journal article (2023) - A.T. Czechowski (author), F.A. Oliehoek (author)

One of the main challenges of multi-agent learning lies in establishing convergence of the algorithms, as, in general, a collection of individual, self-serving agents is not guaranteed to converge with their joint policy, when learning concurrently. This is in stark contrast to m ...

What Lies beyond the Pareto Front? A Survey on Decision-Support Methods for Multi-Objective Optimization

Conference paper (2023) - Z. MS Osika (author), J. Zatarain Salazar (author), Diederik M. Roijers (author), F.A. Oliehoek (author), P.K. Murukannaiah (author)

We present a review that unifies decision-support methods for exploring the solutions produced by multi-objective optimization (MOO) algorithms. As MOO is applied to solve diverse problems, approaches for analyzing the trade-offs offered by MOO algorithms are scattered across fie ...

Bad Habits: Policy Confounding and Out-of-Trajectory Generalization in RL

Conference paper (2023) - M. Suau (author), M.T.J. Spaan (author), F.A. Oliehoek (author)

Reinforcement learning agents may sometimes develop habits that are effective only when specific policies are followed. After an initial exploration phase in which agents try out different actions, they eventually converge toward a particular policy. When this occurs, ...

Safe Multi-agent Learning via Trapping Regions

Conference paper (2023) - A.T. Czechowski (author), F.A. Oliehoek (author)

One of the main challenges of multi-agent learning lies in establishing convergence of the algorithms, as, in general, a collection of individual, self-serving agents is not guaranteed to converge with their joint policy, when learning concurrently. This is in stark contrast to m ...

A Survey on Scenario Theory, Complexity, and Compression-Based Learning and Generalization

Journal article (2023) - Roberto Rocchetta (author), Alexander Mey (author), F.A. Oliehoek (author)

This work investigates formal generalization error bounds that apply to support vector machines (SVMs) in realizable and agnostic learning problems. We focus on recently observed parallels between probably approximately correct (PAC)-learning bounds, such as compression and compl ...

Influence-aware memory architectures for deep reinforcement learning in POMDPs

Journal article (2022) - M. Suau (author), J. He (author), E. Congeduti (author), R.A.N. Starre (author), A.T. Czechowski (author), F.A. Oliehoek (author)

Due to its perceptual limitations, an agent may have too little information about the environment to act optimally. In such cases, it is important to keep track of the action-observation history to uncover hidden state information. Recent deep reinforcement learning methods use r ...

Overcoming Traffic Sensors Malfunctions with Deep Learning

Conference paper (2022) - V. Catalán Pastor (author), E. Congeduti (author), F.A. Oliehoek (author)

Constant growth of cities and their rapid urbanization contribute significantly to an increase in traffic congestion, leading to high costs both in terms of time and fuel consumption. Intelligent Transportation Systems (ITSs) play an important role in managing traffic in urban ar ...

BADDr

Bayes-Adaptive Deep Dropout RL for POMDPs

Conference paper (2022) - Sammie Katt (author), Hai Nguyen (author), F.A. Oliehoek (author), Christopher Amato (author)

While reinforcement learning (RL) has made great advances in scalability, exploration and partial observability are still active research topics. In contrast, Bayesian RL (BRL) provides a principled answer to both state estimation and the exploration-exploitation trade-off, but s ...

A Cross-Field Review of State Abstraction for Markov Decision Processes

Conference paper (2022) - E. Congeduti (author), F.A. Oliehoek (author)

Complex real-world systems pose a significant challenge to decision making: an agent needs to explore a large environment, deal with incomplete or noisy information, generalize the experience and learn from feedback to act optimally. These processes demand vast representation cap ...

Multi Robot Surveillance and Planning in Limited Communication Environments

Conference paper (2022) - V. Inna Kedege (author), A.T. Czechowski (author), Ludo Stellingwerff (author), F.A. Oliehoek (author)

Distributed robots that survey and assist with search & rescue operations usually deal with unknown environments with limited communication. This paper focuses on distributed & cooperative multi-robot area coverage strategies of unknown environments, having constrained co ...

Influence-Augmented Local Simulators

A Scalable Solution for Fast Deep RL in Large Networked Systems

Conference paper (2022) - M. Suau (author), J. He (author), M.T.J. Spaan (author), F.A. Oliehoek (author)

Learning effective policies for real-world problems is still an open challenge for the field of reinforcement learning (RL). The main limitation is the amount of data needed and the pace at which that data can be obtained. In this paper, we study how to build lightweight simul ...

Speeding up Deep Reinforcement Learning through Influence-Augmented Local Simulators

Conference paper (2022) - M. Suau (author), J. He (author), M.T.J. Spaan (author), F.A. Oliehoek (author)

Learning effective policies for real-world problems is still an open challenge for the field of reinforcement learning (RL). The main limitation is the amount of data needed and the pace at which that data can be obtained. In this paper, we study how to build lightweight simul ...

Back to the Future

Solving Hidden Parameter MDPs with Hindsight

Conference paper (2022) - C.T. Ponnambalam (author), Danial Kamran (author), T. D. Simão (author), F.A. Oliehoek (author), M.T.J. Spaan (author)

Best-Response Bayesian Reinforcement Learning with Bayes-adaptive POMDPs for Centaurs

Conference paper (2022) - M.M. Celikok (author), F.A. Oliehoek (author), Samuel Kaski (author)

Centaurs are half-human, half-AI decision-makers where the AI's goal is to complement the human. To do so, the AI must be able to recognize the goals and constraints of the human and have the means to help them. We present a novel formulation of the interaction between the human ...

Distributed Influence-Augmented Local Simulators for Parallel MARL in Large Networked Systems

Conference paper (2022) - M. Suau (author), J. He (author), Mustafa Mert Çelikok (author), M.T.J. Spaan (author), F.A. Oliehoek (author)

Due to its high sample complexity, simulation is, as of today, critical for the successful application of reinforcement learning. Many real-world problems, however, exhibit overly complex dynamics, which makes their full-scale simulation computationally slow. In this paper, we sh ...

Model-Based Reinforcement Learning with State Abstraction: A Survey

Conference paper (2022) - R.A.N. Starre (author), M. Loog (author), F.A. Oliehoek (author)

Model-based reinforcement learning methods are promising since they can increase sample efficiency while simultaneously improving generalizability. Learning can also be made more efficient through state abstraction, which delivers more compact models. Model-based reinforcement le ...

MORAL: Aligning AI with Human Norms through Multi-Objective Reinforced Active Learning

Conference paper (2022) - M. Peschl (author), A. Zgonnikov (author), F.A. Oliehoek (author), L. Cavalcante Siebert (author)

Inferring reward functions from demonstrations and pairwise preferences are auspicious approaches for aligning Reinforcement Learning (RL) agents with human intentions. However, state-of-the-art methods typically focus on learning a single reward model, thus rendering it difficul ...