J. Gao | TU Delft Repository

Deep Reinforcement Learning for Ride-hailing Systems

An experimental study on optimizing matching radius for ride-hailing systems using Deep Reinforcement Learning

Master thesis (2024) - H. Zhao (author), J. Gao (mentor), M. Mai (mentor), O. Cats (graduation committee member), Oded Cats (graduation committee member), J. Yang (graduation committee member)

In the field of public transportation, environmentally friendly and convenient transportation modes are the future trends. The ride-hailing services is an important component of them. However, current ride-hailing systems, particularly the matching systems, still have issues rela ...

In the field of public transportation, environmentally friendly and convenient transportation modes are the future trends. The ride-hailing services is an important component of them. However, current ride-hailing systems, particularly the matching systems, still have issues related to low system efficiency and bad user experience. Although existing ride-hailing rider-driver matching system can allocate travel demands and drivers to a certain extent, they still have deficiencies in certain scenarios. For example, they cannot ensure effective rider-driver matching during peak hours, or they cannot find a good balance between pick-up distance and matching rate. As Reinforcement learning (RL) has been proven in many studies to be applicable and effective in solving complex and dynamic optimization problems. This study aims to explore how Reinforcement Learning (RL) can be adapted to the ride-hailing matching system to optimize system efficiency and user experience through a dynamic matching radius policy. The research objective of this study is to simulate an actual ride-hailing system and use RL to train a policy. This policy can output an optimized dynamic matching radius in real-time based on real-time rider-driver demand-supply relationship, hence achieving a higher matching rate, a shorter average pick-up distance, and a higher driver utilization rate of the ride-hailing system.
Adapting Reinforcement Learning (RL) to optimize the ride-hailing system's matching radius has several difficulties and challenges due to the uncertainties in the real-world rider-hailing market. Traditional approaches are normally static, solving the matching problem at specific times through mathematical models. However, these methods often perform inconsistently when dealing with fluctuating ride-hailing supply-demand relationships, particularly during peak hours. On the other hand, the dynamics and complexity of the ride-hailing market and the ride-hailing environment also make it difficult to model the ride-hailing system. The ride-hailing market is easily affected by many variables, such as weather conditions and local traffic conditions. When quantitatively optimizing the matching radius of the ride-hailing matching system, it is critical to reasonably control irrelevant variables. To address these challenges, this study models the ride-hailing matching problem as a Markov Decision Process (MDP). Based on the defined MDP, a ride-hailing matching simulator is developed. Some assumptions and simplifications are also made to ensure high realism while reasonably controlling irrelevant variables and uncertainties. Multi-replay-buffer Deep Deterministic Policy Gradient (MDDPG) algorithm is then applied to handle the optimization problem of the ride-hailing matching radius. Through the interactions between the MDDPG agent and the developed simulator, feedback rewards are received for the agent to improve the policy. The proposed method is then validated in a case study showcasing the application of the developed simulator and the RL algorithm in a real-world scenario in Austin, Texas. The case study includes an analysis of the current ride-hailing market in Austin, how to apply the simulator based on it, the implementation details of the RL algorithm, and the resulting performance improvements. The results of the case study show that the actions obtained from the proposed method outperform all the baselines in multiple scenarios, highlighting the benefits of using Reinforcement Learning to improve ride-hailing efficiency and user experience.
To conclude, the optimization method proposed in this study applies an advanced Reinforcement Learning approach to the ride-hailing system, successfully improving overall efficiency and user experience. The results of this research demonstrate the potential of Reinforcement Learning in optimizing ride-hailing matching systems, offering a promising direction for further exploration. This study lays a solid foundation for future research to build upon, encouraging the development of more optimization methods with RL technologies that can enhance the effectiveness and adaptability of ride-hailing system in increasingly complex and dynamic environments.

An integrated approach to implementing opportunity charging for electric buses

A case study of Rotterdam

Master thesis (2024) - S.B. van Noort (author), J. Gao (mentor), B. van Arem (graduation committee member), Gonçalo Correia (graduation committee member), Wijnand Veeneman (graduation committee member), J Spoelstra (graduation committee member)

The introduction of electric buses (EBs) is required in order to achieve global greenhouse emission goals. However, operating an electric bus fleet (EBF) requires careful consideration of both the mobility and energy systems in charging strategies. In literature and in real life ...

The introduction of electric buses (EBs) is required in order to achieve global greenhouse emission goals. However, operating an electric bus fleet (EBF) requires careful consideration of both the mobility and energy systems in charging strategies. In literature and in real life many cases show careful consideration of only one of the systems and an oversimplification of the other, resulting in failed implementations. This paper focuses on the need for additional charging of EBs during the day using on-route opportunity charging during the on- and offboarding of passengers in residential neighbourhoods. This research aims to provide an integrated approach to bus service design, including both the mobility and energy systems. Additionally, this paper presents insights into the effect of traffic management on the feasibility of on-route opportunity charging. Two simulation tools are used for the integrated decision-making approach: SUMO and Gaia. The Simulation of Urban MObility (SUMO) software provides results regarding the energy consumption behaviour and charging needs of the EBs. Gaia provides results regarding the transformer load rate in the residential distribution grid. This paper proposes five strategies with a varying configuration of charging capacities, charging times and the introduction of traffic management via traffic priority. The five strategies are tested against three requirements: two from the mobility system and one from the energy system. The research presented uses a case study of bus line 36 in Rotterdam, the Netherlands. The results of this research show that residential neighbourhoods are often unable to handle chargers with the maximum capacity of the buses in the case study of 450 kW. A mid-range charger of 200 kW is possible in most cases. With a 200 kW charger, the on- and offboarding time of passengers is insufficient to complete daily operations. Therefore, additional charging time is necessary. Without the introduction of traffic priority, the timetable cannot be completed. With the introduction of traffic priority, travel time and energy consumption are reduced by 9.8% and 4.5%, respectively, and daily operations can be completed while satisfying the requirements of the mobility and energy systems. The results presented in this thesis suggest that an EB route design with opportunity charging is impossible without considering both mobility and energy systems simultaneously. From the mobility perspective, opportunity charging is feasible only if the charging infrastructure can meet the buses' energy needs without disrupting the schedule. From the energy perspective, it is essential that opportunity charging is implemented where and when sufficient charging capacity is available. The introduction of traffic priority allows for the satisfaction of all requirements and displays an example of the connection between mobility and energy systems.

Determining the Location of Charging Station for a One-Way Electric Car-Sharing System Under Demand Uncertainty

Master thesis (2024) - Zhaoyu YAN YAN (author), O. Cats (graduation committee member), Oded Cats (graduation committee member), J. Gao (mentor), M.Y. Maknoon (mentor)

This thesis explores the optimization of charging station location within a one-way electric car-sharing system, addressing the inherent challenges of demand uncertainty. We introduce a novel deep learning-based stochastic programming framework (LMSP Framework) to tackle this iss ...

This thesis explores the optimization of charging station location within a one-way electric car-sharing system, addressing the inherent challenges of demand uncertainty. We introduce a novel deep learning-based stochastic programming framework (LMSP Framework) to tackle this issue. This framework integrates two key components:
1. A deep learning model (LSTM-MLP-MDN), composed of Long Short-Term Memory (LSTM), Multilayer Perceptron (MLP), and Mixture Density Network (MDN) architectures, which predicts the probability distribution of traffic demand using historical data.
2. A two-stage stochastic programming model, designed to strategically optimize the locations of charging stations and the initial number of vehicles at each station under demand uncertainty.
The primary goal of this framework is to improve the profitability and operational efficiency of the car-sharing system by optimizing both the location and capacity of charging stations, effectively solving the Charging Station Location Problem (CSLP).
We validate the effectiveness, adaptability, and feasibility of our framework through a comprehensive case study in Manhattan, utilizing historical traffic data to ensure the reliability of the deployment plan. Our results demonstrate that integrating deep learning techniques with stochastic programming significantly enhances both the accuracy of demand forecasting and the consistency of resource allocation in the optimization process for charging station locations. Specifically, the LMSP Framework achieves higher operational efficiency in the short term, as evidenced by superior metrics like Demand Satisfaction Ratio (DSR) and Charging Station Utilization Rate (CSU) compared to traditional methods. This ensures more balanced and efficient resource allocation across different demand scenarios. However, traditional approaches tend to perform better in short-term financial indicators, such as profit and Return on Investment (ROI), as they employ more aggressive resource allocation strategies based on higher demand forecasts. Despite this, the LMSP Framework's focus on operational efficiency positions it as a more viable option for long-term profitability and user satisfaction, offering a sustainable solution for urban mobility.
Furthermore, our analysis provides valuable recommendations for future charging station deployments. These findings have important implications for the planning and operation of electric car-sharing systems, potentially contributing to more sustainable and efficient urban mobility solutions for all stakeholders involved.

Demand responsive transport to replace a fixed-line bus service

A case study in Voorne-Putten Rozenburg

Master thesis (2024) - L. Krudde (author), N. Yorke-Smith (mentor), J. Gao (mentor), P.K. Murukannaiah (graduation committee member), Willem Jan Tempelaar (graduation committee member), Marco Hennipman (graduation committee member)

Conventional fixed-line bus services are ineffective and costly in areas of low demand. Bus operators are motivated to scale down their service to reduce costs. Consequently, many rural areas in the Netherlands suffer from diminishing accessibility to jobs, hospitals, and educati ...

Conventional fixed-line bus services are ineffective and costly in areas of low demand. Bus operators are motivated to scale down their service to reduce costs. Consequently, many rural areas in the Netherlands suffer from diminishing accessibility to jobs, hospitals, and education using public transport. Demand responsive transport (DRT) systems can offer a solution, but have been struggling in adoption by travellers because the need to make a reservation is perceived as an obstacle. Furthermore, the performance of DRT systems is complex and requires evaluation before implementation. In this context, this thesis considers the replacement of three low-demand bus lines by a DRT system in Voorne-Putten Rozenburg, Netherlands. Two scenarios are formulated: the replacement of a single bus line and the replacement of all three low-demand bus lines. The objective of this thesis is to evaluate the performance of a DRT system in serving the travel demand currently served by the bus lines. To obtain the travel demand, individual trips from the existing bus line were generated from aggregated data on boarding and alighting in 2023. The travel demand in a DRT system was evaluated using an agent-based simulation approach. To address the perceived obstacle of making reservations, the DRT system also considered on-demand requests and a proposed stop-based request where the user can request the DRT service physically at the bus stop with the consequence that the DRT system initially only knows the users' origin and not the destination. The main results suggest that a DRT system is an improvement over fixed-line buses in terms of waiting time and travel time, and CO₂ emissions when considering current travel demand in both scenarios. However, the costs are comparable. In the case that travel demand increases more than twofold, fixed-line buses are to be preferred. Considering the different request methods, only a small difference in user times was observed. This thesis concludes that replacing the bus lines with a DRT system can lead to an increase in accessibility. The proposed stop-based request method was shown to be a feasible alternative to the request methods that require the use of an app or phone call and could potentially lower the obstacle to using DRT.