Ever wondered how hyperscale cloud providers keep you informed during outages? In this deep-dive session from Microsoft Ignite, Rick Claus (Azure Engineering) and Tajinder Pal Singh Ahluwalia (Azure Marketing) reveal the secrets behind real-time notifications, transparency principles, and outage readiness strategies.
What You’ll Learn:
- How Azure communicates during incidents using five principles: speed, accuracy, discoverability, parity, and transparency
- Setting up Azure Service Health for personalized alerts
- How Azure uses "Brain", its AI-powered alerting system, to identify issues and craft updates
- Best practices for outage readiness and resource-level monitoring
- Post-incident reviews (PIRs) and live retrospectives for continuous improvement
Resources & Links:
📚 Azure Service Health: https://learn.microsoft.com/azure/service-health/overview
📺 AIR Videos: https://aka.ms/air/videos
Speakers:
Rick Claus – Principal Cloud Advocate
Tajinder Pal Singh Ahluwalia – Azure Infrastructure Team
Hashtags:
#Azure #CloudReliability #MSIgnite #DevOps #IncidentManagement #AzureServiceHealth #CloudComputing #ResilienceEngineering #MicrosoftDeveloper #ITPro
✅ Chapter Markers:
00:00 - Welcome & Session Overview
02:15 - Why Outage Communication Matters
06:40 - The Five Principles: Speed, Accuracy, Discoverability, Parity, Transparency
12:30 - AI-Powered Alerts: Meet Brain
18:45 - Azure Service Health Deep Dive
28:10 - Preparing for Incidents: Best Practices
36:50 - During an Outage: Communication at Scale
45:20 - Post-Incident Reviews & Live Retrospectives
52:00 - Lessons Learned from the Azure Front Door Outage
59:30 - Resources & Next Steps