Automated log analysis is widely applied in modern software-intensive systems to ensure resilience and sustainability. Log parsing is a vital initial step: it converts unstructured logs into structured data for downstream analysis. However, traditional log parsing algorithms are designed to process logs from a single domain. As cross-domain dependencies and interactions between sub-modules of software systems increase, these algorithms struggle with multi-domain log inputs, and their parsing accuracy declines significantly on heterogeneous logs. Additionally, current solutions for heterogeneous log parsing require extensive manual labeling effort. In this paper, we propose Domain-aware Parser (DA-Parser), a framework that uses a domain-aware head to identify the source domains of heterogeneous logs and thereby converts the multi-domain log parsing problem into a series of single-domain parsing problems. The domain-aware head is pretrained on a corpus of logs from 16 domains, which allows it to classify the source domains of most heterogeneous log sets without additional human labeling. Source domain tags predicted by the domain-aware head serve as a constraint that limits the template extraction process to logs from the same domain. Empirical evaluation is conducted on a multi-domain dataset containing logs from 7 domains. DA-Parser is compatible with existing single-domain algorithms and can be integrated with them, achieving an average improvement of 9.26% in parsing accuracy over the single-domain algorithms alone.
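The routing idea described above (tag each log with a source domain, then extract templates only within each domain) can be illustrated with a minimal Python sketch. This is not the DA-Parser implementation: the keyword table, `predict_domain`, `extract_template`, and `parse_heterogeneous` are all hypothetical names introduced here for illustration. A keyword lookup stands in for the pretrained domain-aware head, and a simple digit-masking rule stands in for a real single-domain parser.

```python
import re
from collections import defaultdict

# ASSUMPTION: hypothetical keyword rules standing in for the pretrained
# domain-aware head; the paper's head is a learned classifier.
DOMAIN_KEYWORDS = {
    "hdfs": ("blk_", "DataNode"),
    "ssh": ("sshd", "Failed password"),
}


def predict_domain(line):
    """Toy stand-in for the domain-aware head: return a source-domain tag."""
    for domain, keywords in DOMAIN_KEYWORDS.items():
        if any(keyword in line for keyword in keywords):
            return domain
    return "unknown"


def extract_template(line):
    """Toy single-domain parser: replace any token containing a digit with <*>."""
    return " ".join(
        "<*>" if re.search(r"\d", token) else token for token in line.split()
    )


def parse_heterogeneous(lines):
    """Group logs by predicted domain, then extract templates per domain."""
    by_domain = defaultdict(list)
    for line in lines:
        by_domain[predict_domain(line)].append(line)
    # Template extraction only ever sees logs that share a domain tag.
    return {
        domain: sorted({extract_template(line) for line in group})
        for domain, group in by_domain.items()
    }


logs = [
    "Received block blk_123 of size 67108864 from 10.0.0.1",
    "sshd[2201]: Failed password for root from 192.168.1.5 port 22",
    "Received block blk_456 of size 1024 from 10.0.0.2",
]
result = parse_heterogeneous(logs)
for domain, templates in result.items():
    print(domain, templates)
```

In a realistic setting the per-domain step would be an existing single-domain parsing algorithm applied to each group; the grouping step is what keeps logs from different domains out of the same template.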
Tao et al. (Thu,) studied this question.