SB

S. Bratus

1 records found

We investigate the generalization performance of predictive models in model-based reinforcement learning when trained using maximum likelihood estimation (MLE) versus proper value equivalence (PVE) loss functions. While the more conventional MLE loss aims to fit models to predict ...