What question did this study set out to answer?

This research aims to create a cost-effective method for generating synthetic troubleshooting dialogues using a knowledge graph.

March 17, 2026 · Open Access

Knowledge-Graph-Guided Synthetic Dialogue Generation for Network Troubleshooting

Key Points

Constructed a Domain Knowledge Graph from historical support tickets and manuals.
Implemented transformer-based neural generation with explicit control signals.
Used named-entity recognition to extract network-specific entities.
Evaluated across five network domains with a simulated deployment.
Achieved 0% hallucinations compared to a 24.7% baseline.
Obtained a BLEU score of 0.44 ± 0.02 with 94.3% factual accuracy.
Observed a 27.1% reduction in Mean Time To Repair (MTTR).
Improved first-call remediation success rates by 15.2%.
Reduced costs by 50-200× compared to manual annotation.

Abstract

This paper presents a novel knowledge-graph-guided approach for generating synthetic troubleshooting dialogues in network operations environments. Traditional methods for creating training data for conversational AI systems rely on expert annotation, which often costs 50-200× more than automated approaches. Our technique constructs a Domain Knowledge Graph (DKG) from historical support tickets and technical manuals, encoding symptom-fault-diagnostic-remediation relationships. Combining this structured knowledge with transformer-based generation and explicit control signals, we generate synthetic dialogues that maintain a logical diagnostic flow while eliminating hallucinations.

KEY FINDINGS:
• Zero hallucinations when validated against the knowledge graph (0% vs. 24.7% baseline)
• BLEU score of 0.44 ± 0.02 with factual accuracy of 94.3% ± 1.1%
• 27.1% reduction in Mean Time To Repair (MTTR) in simulated deployment
• 15.2% improvement in first-call remediation success rates
• 50-200× cost reduction compared to manual expert annotation
• Generation speed of 165 ms per dialogue (~6,000 dialogues/hour)

TECHNICAL CONTRIBUTIONS:
The system integrates four components: named-entity recognition (NER) for extracting network-specific entities; a domain knowledge graph with 2,177 nodes (543 symptoms, 218 faults, 1,089 diagnostics, 327 remediations); transformer-based neural generation with control signals for domain, severity, and length; and multi-stage validation to ensure factual correctness. The system was evaluated across five network domains (RAN, Core, Transport, Access, IP) with 1,000 test dialogues and a 3-month simulated deployment with 50 operators. All operational improvements were statistically significant (p < 0.001).

APPLICATIONS:
• Training conversational AI for network troubleshooting
• Generating synthetic training data for specialized technical domains
• Reducing dependency on expensive expert annotation
• Improving AI assistant performance in telecom operations

This work establishes a paradigm for generating high-fidelity training data in domains where authentic data remains scarce due to confidentiality constraints or annotation costs.
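
To make the symptom-fault-diagnostic-remediation structure and the idea of validating a dialogue against the graph more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the class `DomainKnowledgeGraph`, its methods, the strict one-step edge ordering, and the example node names (`high_packet_loss`, `fiber_degradation`, and so on) are all hypothetical and do not come from the paper.

```python
from collections import defaultdict

# Assumed node types, ordered as in the abstract's
# symptom -> fault -> diagnostic -> remediation chain.
NODE_TYPES = ("symptom", "fault", "diagnostic", "remediation")


class DomainKnowledgeGraph:
    """Toy directed graph (hypothetical; not the paper's DKG code)."""

    def __init__(self):
        self.node_type = {}            # node name -> node type
        self.edges = defaultdict(set)  # node name -> set of successor names

    def add_node(self, name, node_type):
        if node_type not in NODE_TYPES:
            raise ValueError(f"unknown node type: {node_type}")
        self.node_type[name] = node_type

    def add_edge(self, src, dst):
        # Assumption: an edge must advance exactly one step along the chain.
        src_idx = NODE_TYPES.index(self.node_type[src])
        dst_idx = NODE_TYPES.index(self.node_type[dst])
        if dst_idx != src_idx + 1:
            raise ValueError(f"edge {src} -> {dst} breaks the chain order")
        self.edges[src].add(dst)

    def paths_from(self, symptom):
        """Yield every path from a symptom to a remediation (depth-first)."""
        stack = [[symptom]]
        while stack:
            path = stack.pop()
            last = path[-1]
            if self.node_type[last] == "remediation":
                yield path
                continue
            for nxt in sorted(self.edges.get(last, ())):
                stack.append(path + [nxt])

    def is_grounded(self, steps):
        """True if all steps are known nodes and each consecutive pair is an edge."""
        if not steps or any(s not in self.node_type for s in steps):
            return False
        return all(b in self.edges.get(a, ()) for a, b in zip(steps, steps[1:]))


def build_example_graph():
    """Build a four-node graph with made-up, illustrative entries."""
    g = DomainKnowledgeGraph()
    g.add_node("high_packet_loss", "symptom")
    g.add_node("fiber_degradation", "fault")
    g.add_node("check_optical_power", "diagnostic")
    g.add_node("replace_sfp_module", "remediation")
    g.add_edge("high_packet_loss", "fiber_degradation")
    g.add_edge("fiber_degradation", "check_optical_power")
    g.add_edge("check_optical_power", "replace_sfp_module")
    return g


if __name__ == "__main__":
    graph = build_example_graph()
    for path in graph.paths_from("high_packet_loss"):
        print(" -> ".join(path))
```

In this sketch, each enumerated path could serve as a skeleton for one generated dialogue, and `is_grounded` stands in for a validation stage: a dialogue whose extracted steps skip a link or mention an unknown entity is rejected. How the paper's multi-stage validation actually works is not specified in the abstract.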
