Machine learning approaches exploring the optimal number of driver profiles based on naturalistic driving data
Abstract
Driver behavior analytics plays a significant role in understanding road crashes. This paper investigates the optimal number of driver profiles in order to identify the characteristics that best differentiate drivers and to assess the value of different clustering approaches for profile recognition. To this end, two machine learning clustering algorithms, K-Means and OPTICS, are applied to driving data from a large naturalistic experiment comprising almost 18,000 trips recorded from 130 drivers. The results revealed three profiles: the less risky drivers, the modest drivers and the more aggressive drivers. Clustering was based on three driving behavior characteristics, namely the number of speeding, headway and harsh events per 100 km. The less risky profile was revealed by both algorithms, whereas K-Means distinguishes drivers of higher aggressiveness according to the driving feature that dominates the rest. The OPTICS algorithm showed that many drivers, especially the aggressive ones, exhibit unique behavior that cannot be grouped with that of other drivers. The interpretability of the driver profiles produced by these unsupervised learning techniques worsens as the number of clusters increases. The association between driver profiles and individual characteristics leads to the conclusion that aggressiveness is driven mainly by personality traits and less by specific characteristics such as gender, age or past accident history. The results of this study could be used to develop profile-specific applications that provide feedback to drivers and reduce their crash risk.
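To make the clustering setup concrete, the following is a minimal sketch of grouping drivers into three profiles from per-100 km event rates. It is an illustration under stated assumptions, not the study's implementation: the driver records are invented, the helper names (`events_per_100km`, `maximin_init`, `kmeans`) are hypothetical, and a plain Lloyd's-style K-Means with a deterministic farthest-point initialization stands in for whatever library and settings the authors used. OPTICS is not sketched here.

```python
from math import dist  # Euclidean distance, Python 3.8+


def events_per_100km(event_count, distance_km):
    """Normalise an event count by the distance driven."""
    return 100.0 * event_count / distance_km


def maximin_init(points, k):
    """Deterministic start: first point, then repeatedly the point
    farthest from all centroids chosen so far (an assumption made
    here for reproducibility, not taken from the paper)."""
    centroids = [points[0]]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(dist(p, c) for c in centroids))
        centroids.append(farthest)
    return centroids


def kmeans(points, k, max_iter=100):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    centroids = maximin_init(points, k)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each driver goes to the nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster in place
        if updated == centroids:
            break  # converged
        centroids = updated
    return centroids, labels


# Invented records: (speeding, headway, harsh events, distance in km).
drivers = [
    (2, 1, 1, 400), (3, 2, 1, 500), (1, 1, 0, 300),
    (20, 15, 5, 400), (25, 18, 6, 500), (15, 12, 4, 300),
    (60, 50, 20, 400), (80, 60, 22, 500), (45, 40, 15, 300),
]

# Three features per driver: speeding, headway and harsh events per 100 km.
features = [
    tuple(events_per_100km(count, km) for count in (speeding, headway, harsh))
    for speeding, headway, harsh, km in drivers
]

centroids, labels = kmeans(features, k=3)
for label, centroid in enumerate(centroids):
    print(label, [round(value, 2) for value in centroid])
```

Reading each centroid's three rates side by side is one way to name a profile after the fact, for example by checking which feature dominates. On real data the features would normally be scaled first, and the number of clusters would be chosen with a validity criterion rather than fixed in advance.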