Your dashboards are green. Latency is stable. Error rates are low. But users are complaining: something feels off. In this talk, we'll explore how to detect the invisible. We'll go beyond traditional observability (logs, metrics, and alerts) and dive into the "gray zone" where failures begin to form before they make a sound. Through the story of a real-time voting system that silently degraded under load, we'll show how fuzzy signals, behavioral patterns, and anomaly detection helped us identify issues before they turned into outages. We'll walk through practical techniques like monitoring acknowledgment gaps, tracking reconnect patterns, and watching for statistical deviations to uncover weak signals early. If you've ever heard "it works for me" while your users struggle, this talk is for you.
Room: Room 2
Mon, Oct 27th, 15:40 - 16:10