C.M. Jonker | TU Delft Repository

The Impact of Initial Start Distribution Mismatch on Policy Evaluation in Behavior-agnostic Reinforcement Learning

Bachelor thesis (2024) - T. Sabău (author) , F.A. Oliehoek (mentor) , S.R. Bongers (mentor) , C.M. Jonker (graduation committee member)

Behavior-agnostic reinforcement learning is a rapidly expanding research area focusing on developing algorithms capable of learning effective policies without explicit knowledge of the environment's dynamics or specific behavior policies. It proposes robust techniques to perform ...

The Effect of State-visitation Mismatch on Off-policy Performance in Behaviour-agnostic Reinforcement Learning

Bachelor thesis (2024) - K.Y. Chen (author) , S.R. Bongers (mentor) , F.A. Oliehoek (mentor) , C.M. Jonker (graduation committee member)

Off-policy evaluation has some key problems with one of them being the “curse of horizon”. With recent breakthroughs [1] [2], new estimators have emerged that utilise importance sampling of the individual state-action pairs and reward rather than over the whole trajectory. With t ...

Use of sample-splitting and cross-fitting techniques to mitigate the risks of double-dipping in behaviour-agnostic reinforcement learning

Comparative Analysis

Bachelor thesis (2024) - Y. Aslan (author) , S.R. Bongers (mentor) , F.A. Oliehoek (mentor) , C.M. Jonker (graduation committee member)

This paper addresses the issue of double-dipping in off-policy evaluation (OPE) in behaviour-agnostic reinforcement learning, where the same dataset is used for both training and estimation, leading to overfitting and inflated performance metrics especially for variance. We intro ...

Impact of State Visitation Mismatch Methods on the Performance of On-Policy Reinforcement Learning

Bachelor thesis (2024) - H. Cho (author) , F.A. Oliehoek (mentor) , S.R. Bongers (mentor) , C.M. Jonker (graduation committee member)

In the field of reinforcement learning (RL), effectively leveraging behavior-agnostic data to train and evaluate policies without explicit knowledge of the behavior policies that generated the data is a significant challenge. This research investigates the impact of state visitat ...

SimuDICE: Offline Policy Optimization Through Iterative World Model Updates and DICE Estimation

Bachelor thesis (2024) - C. Brita (author) , F.A. Oliehoek (mentor) , S.R. Bongers (mentor) , C.M. Jonker (graduation committee member)

In offline reinforcement learning, deriving a policy from a pre-collected set of experiences is challenging due to the limited sample size and the mismatched state-action distribution between the target policy and the behavioral policy that generated the data. Learning a dynamic ...

A comparative study of Theory of Mind mechanisms and tasks between human and AI

Master thesis (2023) - J.W. van Rhenen (author) , M.L. Tielman (mentor) , C.M. Jonker (graduation committee member)

In order to develop artificial agents that can understand social interactions at a near-human level, it is required that these agents develop an artificial Theory of Mind; the ability to infer the mental state of others. However, developing this artificial Theory of Mind is a hig ...

Achieving Perceptually-Acceptable Early Renders in Spectral Progressive Rendering by Introducing Bias

Bachelor thesis (2022) - T.M. Đào (author) , E. Eisemann (mentor) , M. van de Ruit (mentor) , C.M. Jonker (graduation committee member)

Spectral Monte-Carlo rendering can simulate advanced light phenomena (e.g., dispersion, caustics, or iridescence), but requires significantly more samples compared to trichromatic rendering to obtain noise-free images. Therefore, its progressive variant typically exhibits an extr ...

Ambient Light Caching via Approximate Photon Mapping

Bachelor thesis (2022) - P. Makridis (author) , M. van de Ruit (mentor) , E. Eisemann (mentor) , C.M. Jonker (graduation committee member)

Indirect illumination is an essential part of realistic computer-generated imagery. However, accurate calculation of indirect illumination comes at high compute costs. To this end, we replace lengthy indirect illumination paths by employing an ambient light cache based on photon ...

Efficient Emitter Sampling for Spectral Path Tracing

Bachelor thesis (2022) - P.A. Deshmukh (author) , M. van de Ruit (mentor) , E. Eisemann (mentor) , C.M. Jonker (graduation committee member)

Spectral Monte-Carlo methods are powerful physically-based techniques for simulating wavelength-dependent phenomena such as dispersion. However, compared to tristimulus rendering, they involve sampling the spectral domain, which adds substantial overhead, requiring significantly ...

Identifying and Visualizing Computational Hotspots in Path Tracing

Bachelor thesis (2022) - M. Sundarrajan (author) , M. van de Ruit (mentor) , E. Eisemann (mentor) , C.M. Jonker (graduation committee member)

Path tracing is a well-known light transport algorithm used to render photo-realistic images. However, it is an expensive algorithm with an active area of research for improving its efficiency. In our work, we present a method to measure and visualize the regions of high computat ...

Efficient direct lighting calculations for area lights with light portals

Bachelor thesis (2022) - J. Romeu Huidobro (author) , M. van de Ruit (mentor) , E. Eisemann (mentor) , C.M. Jonker (graduation committee member)

Direct lighting calculation is an essential part of photorealistic rendering. Standard importance sampling techniques converge slowly in scenes where a light source is only visible through small openings as visibility is not considered. This problem is often addressed by manuall ...

Exploring the Effect of Automation Failure on the Human’s Trustworthiness in Human-Agent Teamwork

Master thesis (2022) - N.H. Bouman (author) , M.L. Tielman (mentor) , C. Centeio Jorge (mentor) , C.M. Jonker (graduation committee member) , J. Yang (graduation committee member)

Collaboration in teams composed of both humans and automations has an interdependent nature, which demands calibrated trust among all the team members. For building suitable autonomous teammates, we need to study how trust and trustworthiness function in such teams. In particular, ...

Adversarial Traffic Modifications for the Network Intrusion Detection Domain

A Practical Adversarial Network Traffic Crafting Approach

Master thesis (2021) - M. Simidžioski (author) , S.E. Verwer (mentor) , C.M. Jonker (graduation committee member) , D.A. Vos (graduation committee member)

Adversarial attacks pose a risk to machine learning (ML)-based network intrusion detection systems (NIDS). It is therefore important to explore to what degree these methods can viably be used by potential adversaries. The majority of adversarial techniques are designed for unconstrained domains such as the image recognition domain, where these methods apply alterations to the pixels in a picture. Therefore, the applicability of these techniques to the NIDS domain is very limited. Related work on adversarial techniques for NIDS generally considers feature-space techniques, which cannot be applied in a practical situation since only the extracted network traffic features are modified and not the actual network traffic. To address these limitations, a traffic-space approach for creating adversarial examples for evading ML-based NIDS is proposed and assessed with several classification models. The proposed constrained adversarial crafting method is based on the Iterative Fast Gradient Sign Method (IFGSM) and is called the Constrained Iterative Fast Gradient Sign Method (CIFGSM). A constraint set is added as a penalty term to the loss function of the optimization to ensure that the adversarial values remain within the valid space. Additionally, an L2 regularization term is used to minimize the distance between the original and adversarial network traffic samples. The proposed method is evaluated and shown to be an effective way of generating realistic and practical adversarial evasion packets. To achieve this, network packet components and their characteristics are defined as a constraint set which can be used for the optimization task, and a custom adversarial loss function is created that encapsulates the different elements of this optimization problem. Furthermore, multiple models are evaluated to test the transferability of this method.
Finally, the proposed method is evaluated in a realistic scenario, where adversarial packet captures are crafted and examined. Where other state-of-the-art works modify the network traffic features only in feature-space or on a connection level and do not apply their method in a real-world scenario, this work modifies the packet captures on a per-packet level, and the modified captures are subsequently used to evaluate flow-based classification models.
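
The idea of an iterative sign-gradient attack with a constraint penalty and an L2 distance term, as described in the abstract above, can be illustrated with a minimal sketch. This is not the thesis implementation: the toy logistic classifier, the box constraints standing in for the packet constraint set, and all names and parameters (`cifgsm_sketch`, `lam_c`, `lam_2`, `alpha`) are assumptions made for illustration only.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sign(v):
    return (v > 0) - (v < 0)

def score(x, w, b):
    """Toy detector (assumption): probability that x is malicious."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def cifgsm_sketch(x0, w, b, lo, hi, steps=50, alpha=0.05, lam_c=10.0, lam_2=0.1):
    """Hypothetical sketch: iterative sign-gradient ascent on
    J(x) = CE(x) - lam_c * box_penalty(x) - lam_2 * ||x - x0||^2."""
    x = list(x0)
    for _ in range(steps):
        p = score(x, w, b)  # computed once, so all features update simultaneously
        for i in range(len(x)):
            # Cross-entropy for the true label "malicious": d/dx[-log p] = -(1 - p) * w
            grad_ce = -(1.0 - p) * w[i]
            # Quadratic penalty for leaving the valid range [lo, hi] (soft constraint)
            grad_pen = 2.0 * (max(x[i] - hi[i], 0.0) - max(lo[i] - x[i], 0.0))
            # L2 term keeps the adversarial sample close to the original
            grad_l2 = 2.0 * (x[i] - x0[i])
            grad = grad_ce - lam_c * grad_pen - lam_2 * grad_l2
            x[i] += alpha * sign(grad)
    return x

x0 = [0.8, 0.6]
w, b = [2.0, 1.0], -1.0
x_adv = cifgsm_sketch(x0, w, b, lo=[0.0, 0.0], hi=[1.0, 1.0])
print(score(x0, w, b), score(x_adv, w, b))
```

Because the constraint enters only as a penalty, the sketch can overshoot the valid range by roughly one or two step sizes; a real traffic-space method would also need constraints that reflect packet semantics rather than simple per-feature bounds.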

Assessing the performance of the TDNN-BLSTM architecture for phoneme recognition of English speech

Bachelor thesis (2021) - I.A. Klom (author) , O.E. Scharenborg (mentor) , S. Feng (mentor) , C.M. Jonker (graduation committee member)

This research studies the Projected Bidirectional Long Short-Term Memory Time Delayed Neural Network (TDNN-BLSTM) model for English phoneme recognition. It contributes to the field of phoneme recognition by analyzing the performance of the TDNN-BLSTM model based on the TIMIT corp ...

Evaluation of phoneme recognition through TDNN-OPGRU on Mandarin speech

Bachelor thesis (2021) - J. van der Tang (author) , S. Feng (mentor) , O.E. Scharenborg (mentor) , C.M. Jonker (graduation committee member)

This research expands past research on implementing the TDNN-OPGRU network for Automatic Phoneme Recognition on Dutch speech by implementing and testing the TDNN-OPGRU network on Mandarin speech. The goal of this research is to investigate the performance of the TDNN-OPGRU archit ...

Evaluating the performance of TDNN-BLSTM on Mandarin read and spontaneous speech

Bachelor thesis (2021) - M. Chiroşca (author) , S. Feng (mentor) , O.E. Scharenborg (mentor) , C.M. Jonker (graduation committee member)

A limitation of current ASR systems is the problem of so-called out-of-vocabulary words. The solution to overcome this limitation is to use APR systems. Previous research on Dutch APR systems identified Time Delayed Bidirectional Long-Short Term Memory Neural Network (TDNN-BLSTM) as one of ...

Smart Teddy: Elderly monitoring and support system using ambient intelligence

Human Interaction and Integration

Bachelor thesis (2021) - L.E. Croes (author) , S. Haggerty (author) , Z. Al-Ars (mentor) , Hani Al-ers (mentor) , C.M. Jonker (graduation committee member) , M. Möller (graduation committee member)

In September 2018, the Smart Teddy project was founded by a group of researchers within the Hague University of Applied Sciences in the Netherlands. The Smart Teddy project is a multidisciplinary project aiming to create an interactive system, using a teddy bear as a focus point, which collects the data needed in order to enable seniors with dementia to live independently for a longer period of time. Over the last three years, three prototypes of the Smart Teddy have been developed. The Smart Teddy project was introduced as a final project for students following the BSc program Electrical Engineering at the Delft University of Technology. Starting in April 2021, a team of six students attempted to further develop the Smart Teddy over the course of 11 weeks. This thesis contains the Human Interaction & Integration subdomain of the Smart Teddy thesis project, where Human Interaction refers to the aspects of the Teddy that encourage interaction with the user, and Integration refers to the combination of all subdomains into one fully functioning prototype. In this thesis, the design choices, implementation methods and verification are discussed. The contributions to the prototype regarding Human Interaction & Integration are the addition of a movement system using pneumatics, the implementation of a flexible touch sensor, and the ability for the Teddy to produce audio, to communicate wirelessly with the Base Station, and for all components in the Teddy to communicate with the main controller. The final prototype has been implemented using the Raspberry Pi Pico microcontroller, which was mounted on a custom PCB. All control is provided by the Pico, which uses I2C, SPI, UART, and analog and digital inputs to communicate with the sensors and actuators. These sensors and actuators were implemented using off-the-shelf breakout boards and drivers, to allow for fast design and test iterations.
The movement of the Teddy has been implemented using air pumps and molded silicone rubber, and the tail wagging is implemented using the same principle used for soft robotic grippers. The final prototype is fully functional and meets 16 of the 20 requirements - the requirements concerning speech recognition and the noise produced by the pumps have not been met.

The Smart Teddy Project

Design of a data acquisition system to monitor seniors with dementia and detect dangerous situations

Bachelor thesis (2021) - A. Hamo (author) , T.N. van der Spijk (author) , Z. Al-Ars (mentor) , Hani Al-ers (graduation committee member) , C.M. Jonker (graduation committee member) , M. Möller (graduation committee member)

The number of people dealing with dementia is rising globally. The number of caretakers, however, is not. Therefore, technological aids are needed to support people dealing with dementia and relieve the stress on their caretakers. Current solutions provide tracking of people with ...

Exploring the effects of conditioning Independent Q-Learners on the sufficient plan-time statistic for Dec-POMDPs

Master thesis (2020) - A.V. Mandersloot (author) , F.A. Oliehoek (mentor) , A.T. Czechowski (graduation committee member) , C.M. Jonker (graduation committee member) , M.M. de Weerdt (graduation committee member)

The Decentralized Partially Observable Markov Decision Process is a commonly used framework to formally model scenarios in which multiple agents must collaborate using local information. A key difficulty in a Dec-POMDP is that in order to coordinate successfully, an agent must de ...

Modeling Artificial Personalities through the Expression of Emotions in Narrative Games

Master thesis (2019) - M.J. Otte (author) , D.J. Broekens (mentor) , Rafael Bidarra (graduation committee member) , C.M. Jonker (graduation committee member)

Personality modeling is important in order to create character variation in games. Character variation favors replayability and is an important aspect of game design. The effect of artificial personalities through the expression of emotions is evaluated in this research. ...