Managing Incidents in Monitor

Step 1: View active incidents in the Incidents tab

Click the Incidents tab in the Monitor sidebar to see all ongoing and historical incidents. Active incidents are highlighted at the top with a red badge showing how long they have been open.

Incidents are automatically created the moment a monitor first fails
Filter by monitor, date range, or severity to find specific incidents
The incident list shows duration, affected monitors, and current status
Red badges indicate ongoing issues; gray badges indicate resolved incidents
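
The same filters can be mirrored offline, for example when working with an exported incident list. The sketch below is illustrative only: the `Incident` record, its field names, and `filter_incidents` are hypothetical and do not reflect StackBloom Monitor's actual schema or API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Incident:
    # Hypothetical fields for illustration; not the product's real schema.
    monitor: str
    severity: str
    started_at: datetime
    resolved_at: Optional[datetime] = None  # None means the incident is still ongoing

    @property
    def is_active(self) -> bool:
        return self.resolved_at is None


def filter_incidents(
    incidents: List[Incident],
    monitor: Optional[str] = None,
    severity: Optional[str] = None,
    start: Optional[datetime] = None,
    end: Optional[datetime] = None,
) -> List[Incident]:
    """Return incidents matching every filter that is set, active ones first."""
    matches = [
        i for i in incidents
        if (monitor is None or i.monitor == monitor)
        and (severity is None or i.severity == severity)
        and (start is None or i.started_at >= start)
        and (end is None or i.started_at <= end)
    ]
    # Active incidents sort to the top, then newest first within each group.
    return sorted(matches, key=lambda i: (not i.is_active, -i.started_at.timestamp()))
```

Leaving a filter unset means it is not applied, which matches how combining filters narrows the list in the Incidents tab.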

Step 2: Open an incident to see its timeline and affected monitors

Click any incident to open its detail view. The incident page shows a minute-by-minute timeline of check results, error messages returned by the target, and which check locations detected the failure.

The timeline graph visualizes when the service went down and came back
Error details (HTTP status code, response body snippet) are logged for each failed check
Multiple affected monitors are listed if a single alert rule covers several services
Response time trends before the incident can reveal performance degradation patterns

Step 3: Add incident notes and updates

Use the Add Note field to post updates during an active incident. Notes are timestamped and visible to all team members who have access to Monitor. If the incident is linked to a status page, subscribers to that page can also receive these updates automatically.

Record the root cause hypothesis as soon as it is identified
Document every action taken (restart, rollback, config change) with a note
Notes posted to linked status pages keep your users informed in real time
Team members are notified of new notes via their configured channels

Step 4: Resolve the incident when service is restored

Once the monitor returns to an Up state, the incident is automatically marked as resolved. If you have already confirmed the fix and do not want to wait for the next check to run, you can resolve the incident manually.

Click Resolve Incident to close it immediately
Add a resolution note explaining what fixed the issue
Notification channels will receive a "service restored" alert
Status page subscribers see the incident move to Resolved status
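
The lifecycle described in this guide (an incident opens on the first failed check and closes either on the next Up check or by manual resolution) can be pictured as a small state machine. This is a conceptual sketch, not StackBloom Monitor's implementation; `IncidentState`, `resolve`, and `apply_check` are hypothetical names.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class IncidentState:
    # Hypothetical model of one incident; field names are assumptions.
    opened_at: datetime
    resolved_at: Optional[datetime] = None
    notes: List[str] = field(default_factory=list)

    @property
    def status(self) -> str:
        return "resolved" if self.resolved_at else "ongoing"

    def resolve(self, when: datetime, note: Optional[str] = None) -> None:
        """Close the incident; a repeated call keeps the first resolution."""
        if self.resolved_at is None:
            self.resolved_at = when
            if note:
                self.notes.append(note)


def apply_check(
    incident: Optional[IncidentState], is_up: bool, when: datetime
) -> Optional[IncidentState]:
    """Open an incident on the first failed check; resolve it on the first Up check."""
    if not is_up and (incident is None or incident.status == "resolved"):
        return IncidentState(opened_at=when)
    if is_up and incident is not None and incident.status == "ongoing":
        incident.resolve(when)
    return incident
```

In this model, a manual `resolve` followed by a later Up check leaves the manual resolution time and note untouched, which is the behavior you would want when closing an incident ahead of the next check.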

Step 5: Review the post-incident report

After an incident resolves, StackBloom Monitor generates a post-incident report that summarizes total downtime and impacted services and includes a full timeline of events. Use this report for team post-mortems and reliability improvements.

Reports include total downtime duration and time-to-detect
Export the report as PDF or CSV for stakeholder communication
Use SLA tracking to see how the incident affected your uptime percentage
Historical reports are retained indefinitely for compliance purposes
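
To see how downtime translates into an uptime percentage, the arithmetic can be worked through by hand. The sketch below uses the common definition of uptime as the share of a reporting period not spent in downtime; whether StackBloom Monitor's SLA tracking uses exactly this formula is an assumption, and `uptime_percentage` is a hypothetical helper.

```python
from datetime import timedelta


def uptime_percentage(period: timedelta, downtime: timedelta) -> float:
    """Uptime as a percentage of the reporting period (assumed formula)."""
    if period <= timedelta(0):
        raise ValueError("period must be positive")
    # Clamp downtime so the result always stays between 0 and 100.
    downtime = min(max(downtime, timedelta(0)), period)
    return 100.0 * (1 - downtime / period)


# Example: a 30-day period with one incident lasting 43 minutes 12 seconds.
monthly = uptime_percentage(timedelta(days=30), timedelta(minutes=43, seconds=12))
print(f"{monthly:.3f}%")
```

A 30-day period contains 43,200 minutes, so 43.2 minutes of downtime is 0.1% of the period, which leaves roughly 99.9% uptime.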

💡 Tip: Add notes throughout the incident, not just at resolution. Capturing your investigation steps in real time makes post-mortem analysis much easier and helps your team respond faster to similar issues in the future.

Next Steps