Modern systems generate a tremendous amount of data, making manual investigations infeasible and calling for automated analysis. However, running automated log analysis pipelines is far from straightforward, because software ecosystems change constantly as they adapt to evolving user requirements. In practice, such pipelines consist of a series of steps that collectively turn raw logs into actionable insights. The first step is log parsing, which abstracts raw logs into structured information. Log parsing is paramount, as it influences the performance of all downstream tasks that rely on its output. Although previous works have investigated log parsing performance, the increase in data heterogeneity witnessed over the past decades calls the validity of current estimates into question, as it remains unclear how log parsing methods perform in modern contexts. Investigating the field, we discover that misleading metrics are in use, producing incomplete performance estimates. Furthermore, motivated by an industry use case within the infrastructure of a large international financial institution, we find that the current log parsing paradigm is not aligned with what practice requires. To address these limitations, this work makes the following contributions. We (1) evaluate the field of log parsing, (2) propose a new log parsing paradigm and create a benchmark dataset to facilitate future research, and (3) propose and evaluate a machine learning model that solves log parsing within the new paradigm.
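To make the log parsing step concrete, the following is a minimal, hypothetical sketch of what abstracting raw logs into structured templates looks like: variable parts of each message (numbers, IP addresses, hex identifiers) are masked with a placeholder so that messages produced by the same logging statement collapse to one template. The regex rules and the `<*>` placeholder are illustrative assumptions; real parsers infer templates from the data rather than from fixed rules.

```python
import re

# Illustrative masking rules (an assumption, not a production rule set):
# each pattern replaces one class of variable token with a placeholder.
MASKS = [
    (re.compile(r"\b\d+\.\d+\.\d+\.\d+\b"), "<*>"),  # IPv4 addresses
    (re.compile(r"\b0x[0-9a-fA-F]+\b"), "<*>"),      # hex identifiers
    (re.compile(r"\b\d+\b"), "<*>"),                 # plain integers
]

def parse(raw: str) -> str:
    """Return the structured template for one raw log line."""
    template = raw
    for pattern, placeholder in MASKS:
        template = pattern.sub(placeholder, template)
    return template

logs = [
    "Connection from 10.0.0.5 port 5432 failed",
    "Connection from 10.0.0.7 port 5433 failed",
]
# Both lines collapse to the same template,
# "Connection from <*> port <*> failed"
templates = {parse(line) for line in logs}
print(templates)
```

Downstream tasks (anomaly detection, failure diagnosis) then operate on the templates and extracted parameters instead of the raw text, which is why errors at this step propagate through the whole pipeline.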