Blast Radius Analysis
Before-the-fact impact analysis for cloud changes — every dependent resource, severity classification, risk score, and cost impact.
Every cloud change has consequences. Deleting a schedule might leave 20 servers running 24/7. Removing an autoscaler could lock instances at their current size during a traffic spike. Rightsizing a database might break the three services that depend on it.
Blast radius analysis answers one question: "What happens if I do this?" — before the action is taken. It identifies every resource that will be affected, how severe the impact is, and what it will cost.
A Real-World Example
Consider a schedule called "Business Hours" that stops 15 EC2 instances and 2 resource groups (containing 8 more instances) every night at 7 PM and starts them at 7 AM. This saves 42% on compute costs — about $2,400/month.
When a user attempts to delete this schedule, the following occurs:
- A confirmation dialog appears: "Are you sure you want to delete Business Hours?"
- Inside the dialog, a "View Blast Radius" link is shown
- Clicking it opens the architecture canvas with the blast radius overlay
- The view shows:
  - 15 resources marked as affected: they will run 24/7 without the schedule
  - 2 resource groups (expandable to see 8 individual members)
  - A cost panel showing savings dropping from 42% to 0%, with the message "Deleting this schedule removes all automated start/stop rules. Resources will run 24/7, losing 42% in savings."
  - Per-resource cost breakdown showing each instance's monthly cost and projected loss
- The user reviews the impact, clicks "Back to Schedules", and decides not to delete
How It Works Under the Hood
ZopNight's discoverer service maintains a dependency graph of all resources across connected cloud accounts. This graph is built during resource discovery by analyzing relationships like "this load balancer routes traffic to these EC2 instances" or "this RDS database is accessed by these Lambda functions".
When a blast radius analysis is requested, the system follows these steps:
- Resolve the target: identify the entity being acted on (a resource, a schedule, or an autoscaler policy).
- Find connected resources: traverse the dependency graph to find all resources within 1 hop of the target. For schedules, this includes directly attached resources and members of attached resource groups.
- Classify impact: assign each connected resource an impact level (affected, warning, or safe) based on the operation being performed and the type of relationship between the resources.
- Compute risk score: calculate a numeric score (0-100) from how many resources are affected, their environment (production vs dev), and how many teams are impacted. As of v2, the risk score is computed client-side from the backend's impact data (see computeRiskScore.js). API consumers can replicate the formula using the components described below.
- Return results: the frontend renders the graph with the target at the center and connected resources arranged in a radial layout, color-coded by impact level.
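The first three steps can be sketched for the schedule case as follows. This is a minimal illustration under stated assumptions, not ZopNight's implementation: the `Schedule` class and `blast_radius` function are hypothetical, while the edge-type labels (`schedule_direct`, `schedule_group`) and the operation-to-impact mapping are taken from elsewhere on this page.

```python
from dataclasses import dataclass, field

# Impact level per schedule operation, as listed in the
# "Schedule Impact Classification" table on this page.
SCHEDULE_IMPACT = {"modify": "warning", "delete": "affected"}


@dataclass
class Schedule:
    """Hypothetical in-memory view of a schedule and its attachments."""
    name: str
    resources: list[str] = field(default_factory=list)          # directly attached
    groups: dict[str, list[str]] = field(default_factory=dict)  # group name -> members


def blast_radius(schedule: Schedule, operation: str) -> list[dict]:
    """Collect every entity attached to the schedule and assign an impact level."""
    level = SCHEDULE_IMPACT[operation]
    entities = []
    # Directly attached resources.
    for uid in schedule.resources:
        entities.append({"id": uid, "edge": "schedule_direct", "level": level})
    # Attached groups, followed by each group's members.
    for group, members in schedule.groups.items():
        entities.append({"id": group, "edge": "schedule_group", "level": level})
        for uid in members:
            entities.append({"id": uid, "edge": "schedule_group",
                             "level": level, "parent": group})
    return entities


schedule = Schedule(
    name="Business Hours",
    resources=["i-1", "i-2"],
    groups={"Dev Servers": ["ns-1", "ns-2", "ns-3"]},
)
result = blast_radius(schedule, "delete")
print(len(result), {e["level"] for e in result})  # 6 {'affected'}
```

Group members are emitted alongside their parent group so that a UI could render the group collapsed or expanded from the same result.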
Where Blast Radius Appears
Blast radius appears automatically in three places. There is no separate page to navigate to — it is embedded in the workflow wherever an action is taken.
1. Recommendations
When ZopNight identifies a cost optimization opportunity (idle server, oversized database, orphaned disk), the blast radius can be viewed before acting on the recommendation.
Example: ZopNight recommends rightsizing an RDS instance from db.r5.2xlarge to db.r5.xlarge.
- Click "View Blast Radius" in the recommendation drawer
- The graph shows 3 connected resources:
  - An ECS service that reads from this database, marked as warning (may experience brief interruption during resize)
  - A Lambda function with read-only access, marked as warning
  - A CloudWatch alarm monitoring the database, marked as safe
- Risk score: 35/100 (medium: dev environment, 2 services affected)
- Estimated savings shown: $145/month
ZopNight classifies impact using a comprehensive resource-type behavior map covering 150+ types across AWS, GCP, and Azure. For example, it knows that resizing an RDS instance requires a restart (so dependent services see a brief outage), but updating an S3 bucket policy is an online operation (no impact on consumers).
2. Autoscaler Policies
When pausing, removing, or re-applying an autoscaler policy, a confirmation dialog shows the blast radius of that action.
Example: Removing an autoscaler policy from a production Auto Scaling Group.
- The confirmation dialog shows "View Blast Radius"
- The graph shows 4 EC2 instances managed by this policy:
  - All 4 marked as warning (they lose auto-scaling protection)
- Cost impact panel shows:
  - Current instances: 3 (min: 2, max: 10)
  - After removal: stays at 3 but will not scale up during traffic spikes or down during quiet periods
3. Schedules
Schedules control when cloud resources start and stop. Changing or deleting a schedule directly affects the cost savings for every resource attached to it. The schedule blast radius is the most detailed implementation, with full cost impact analysis.
Example: Updating a schedule from "weekdays 9-5" to "weekdays 9-9".
- Edit the schedule, change the cron times, click "Update"
- A confirmation dialog appears with "View Blast Radius"
- The graph shows:
  - 10 directly attached resources (EC2, RDS, ECS), all marked as warning
  - 1 resource group ("Dev Servers") with 5 members (click to expand and see individual resources)
- Cost impact panel shows:
  - Current Savings: 52% (resources stopped 12 hours/day)
  - After Update: 36% (resources stopped only 8 hours/day)
  - Message: "Updated schedule reduces savings by 16%. Resources will run for more hours per week."
- Click any resource node to see its floating info card:
  - Monthly Cost: $156.00
  - Current Savings: $81.12/mo
  - After Savings: $56.16/mo (yellow, because savings decreased)
Resource Groups in the Graph
If a schedule has resource groups attached, each group appears as a single node with a grid icon and a button showing the member count (e.g., "5 resources"). Click the resource group node to expand it: member resources fan out in an arc, each showing its own cost and impact details. Click again to collapse.
Schedule Impact Classification
| Operation | Impact Level | What Happens |
|---|---|---|
| modify (update) | warning | Start/stop timing changes. Resources follow the new schedule. Savings % may increase or decrease depending on the new hours. |
| delete | affected | Schedule is permanently removed. All attached resources lose their automated start/stop rules and will run continuously (24/7). All savings from this schedule are lost. |
Schedule Cost Data
The cost panel in the schedule blast radius shows the current savings percentage and what it will be after the change. This is calculated from the schedule's cron grid (how many hours per week resources are stopped) combined with per-resource monthly cost from the aggregator service.
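As a worked example of the per-resource figures, the sketch below applies a savings percentage to a monthly cost, using the numbers from the info card in the schedule example ($156.00/month, 52% before the update, 36% after). The function name and the use of `Decimal` are illustrative choices. How the savings percentage itself is derived from the cron grid is not shown here.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def monthly_savings(monthly_cost: Decimal, savings_pct: Decimal) -> Decimal:
    """Dollar savings for one resource at a given savings percentage."""
    return (monthly_cost * savings_pct / Decimal(100)).quantize(CENT, ROUND_HALF_UP)


cost = Decimal("156.00")
current = monthly_savings(cost, Decimal(52))  # 81.12
after = monthly_savings(cost, Decimal(36))    # 56.16
print(current, after, current - after)        # 81.12 56.16 24.96
```

The difference between the two values ($24.96/month for this resource) is the projected loss from the schedule change.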
Understanding Impact Levels
Every connected resource in the blast radius is assigned one of three impact levels. The level depends on both the operation being performed and the type of relationship between the resources.
| Level | Color | What It Means | Real-World Example |
|---|---|---|---|
| affected | Red | This resource will be directly and significantly impacted. Data loss, service outage, or permanent state change is likely. | Deleting an ECS cluster destroys all its services. Deleting a schedule means resources run 24/7. |
| warning | Yellow | This resource may experience a temporary disruption or behavior change. The change is usually recoverable. | Resizing a database causes a brief restart — connected services see a momentary outage. Updating a schedule changes when resources start/stop. |
| safe | Green | This resource is unlikely to be affected. The operation can be performed without disrupting this resource. | Modifying an S3 bucket's lifecycle policy does not affect the Lambda functions that read from it. |
How Classification Works (For Recommendations)
For recommendation-based blast radius, ZopNight uses the same resource-type behavior map referenced above to categorize every supported cloud resource into one of three modification behaviors:
| Behavior | Impact | Meaning | Resource Examples |
|---|---|---|---|
| onlineModify | safe | Resource can be modified without any downtime or restart | S3 buckets, CloudFront distributions, IAM policies, security groups |
| restartModify | warning | Modification requires a restart or brief outage | RDS instances, EC2 instances (instance type change), ElastiCache clusters |
| poolModify | warning | Modification affects a pool of resources (rolling update) | EKS/GKE node pools, Auto Scaling Groups, ECS services |
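The table above amounts to a two-step lookup: resource type to behavior, then behavior to impact level. A minimal sketch follows. The behavior names and impact levels come from the table, but the resource-type keys are hypothetical identifiers, only a handful of the 150+ types are shown, and returning `None` for an unmapped type is an assumption (the actual fallback is not documented here).

```python
from typing import Optional

# Behavior -> impact level, from the table above.
BEHAVIOR_IMPACT = {
    "onlineModify": "safe",
    "restartModify": "warning",
    "poolModify": "warning",
}

# A few of the table's example resources. Keys are hypothetical identifiers.
RESOURCE_BEHAVIOR = {
    "s3": "onlineModify",
    "cloudfront": "onlineModify",
    "rds": "restartModify",
    "elasticache": "restartModify",
    "eks-nodepool": "poolModify",
    "ecs-service": "poolModify",
}


def classify_modification(resource_type: str) -> Optional[str]:
    """Impact level for modifying a resource type, or None if it is unmapped."""
    behavior = RESOURCE_BEHAVIOR.get(resource_type)
    if behavior is None:
        return None  # assumption: real handling of unknown types is not documented
    return BEHAVIOR_IMPACT[behavior]


print(classify_modification("rds"))  # warning
print(classify_modification("s3"))   # safe
```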
Risk Score
The right panel of the blast radius view shows a risk score gauge from 0-100. This score provides a quick assessment of the overall risk without reading every individual resource's impact.
| Score Range | Level (UI badge) | Meaning |
|---|---|---|
| 0-25 | low | Few resources affected, mostly non-production. Safe to proceed in most cases. |
| 26-50 | medium | Several resources affected or some production resources involved. Review before proceeding. |
| 51-75 | high | Many resources affected, production environment, or cross-team impact. Proceed with caution. |
| 76-100 | critical | Widespread impact across production. Consider scheduling a maintenance window. |
How the Score Is Calculated
The score is the sum of three components, each with a cap:
| Component | Max Points | How It Works |
|---|---|---|
| Impact | 60 | Weighted impact ratio (affected = 1.0, warning = 0.5, safe = 0.0) multiplied by 50, plus a bonus based on the number of impacted resources: +4 pts if more than 2, +7 pts if more than 5, +10 pts if more than 10. |
| Environment | 15 | Detected from resource tags (env, environment). production = 15 pts, staging = 8 pts, dev = 3 pts, unknown = 5 pts. Set to 0 if no resources are impacted. |
| Ownership | 20 | Teams affected: 1 team = 5 pts, 2-3 teams = 10 pts, 4+ teams = 15 pts. Active schedules on the target resource add +5 pts. |
After summing all components, a pause operation applies a 0.4 multiplier to the final score (since pausing is less disruptive than deleting). The result is capped at 100.
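For API consumers who want to replicate the score, the components above can be sketched as below. This is not the contents of computeRiskScore.js. It is a reconstruction from the table, and it makes several assumptions the table leaves open: the "weighted impact ratio" is the weighted sum divided by the number of connected resources, "impacted" means affected or warning, only the highest matching bonus tier applies, zero teams scores 0 ownership points, and the final value is rounded to the nearest integer. All function names are hypothetical.

```python
IMPACT_WEIGHT = {"affected": 1.0, "warning": 0.5, "safe": 0.0}
ENV_POINTS = {"production": 15, "staging": 8, "dev": 3}  # any other value -> 5


def impact_points(levels: list[str]) -> float:
    """Weighted ratio x 50 plus a size bonus, capped at 60."""
    if not levels:
        return 0.0
    ratio = sum(IMPACT_WEIGHT[lv] for lv in levels) / len(levels)
    impacted = sum(1 for lv in levels if lv != "safe")
    bonus = 10 if impacted > 10 else 7 if impacted > 5 else 4 if impacted > 2 else 0
    return min(ratio * 50 + bonus, 60)


def environment_points(env: str, impacted: int) -> int:
    """Points from the env/environment tag; 0 when nothing is impacted."""
    return 0 if impacted == 0 else ENV_POINTS.get(env, 5)


def ownership_points(teams: int, has_active_schedule: bool) -> int:
    """Points from team count plus 5 for active schedules, capped at 20."""
    pts = 0 if teams == 0 else 5 if teams == 1 else 10 if teams <= 3 else 15
    return min(pts + (5 if has_active_schedule else 0), 20)


def risk_score(levels, env, teams, has_active_schedule=False, operation="delete"):
    """Sum the three components, apply the pause multiplier, cap at 100."""
    impacted = sum(1 for lv in levels if lv != "safe")
    total = (impact_points(levels)
             + environment_points(env, impacted)
             + ownership_points(teams, has_active_schedule))
    if operation == "pause":
        total *= 0.4
    return min(round(total), 100)


def risk_level(score: int) -> str:
    """Map a score to the UI badge ranges listed on this page."""
    return ("low" if score <= 25 else "medium" if score <= 50
            else "high" if score <= 75 else "critical")


levels = ["affected"] * 6
print(risk_score(levels, "production", teams=1))                     # 77
print(risk_score(levels, "production", teams=1, operation="pause"))  # 31
```

With six affected production resources owned by one team, the sketch yields 50 + 7 (impact) + 15 (environment) + 5 (ownership) = 77, which falls in the critical range. The same inputs under a pause operation give 77 x 0.4 = 30.8, rounded to 31 (medium).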
Using the Blast Radius Graph
The blast radius opens as a three-panel overlay on the architecture canvas:
Left Panel — Resource List
- Lists all connected resources with impact badges
- Search bar to find specific resources by name
- Click a resource to zoom to it on the graph and show its info card
- For resource groups: clicking a member auto-expands its parent group in the graph
Center — Interactive Graph
- Target node at the center (clock icon for schedules, resource type icon for others)
- Connected resources arranged in a radial layout, grouped by impact level
- Resource groups shown as expandable nodes — click to reveal members in an arc
- Floating info card — click any node to see details: provider, region, status, connection type, cost, and impact reason
- Impact legend at the bottom — click to filter by impact level (toggle affected/warning/safe)
- Zoom controls — scroll to zoom, drag to pan
- Collapse All button — resets all expanded groups
Right Panel — Summary
- Risk score gauge — 0-100 speedometer visualization
- Stat cards — Connected count, Affected count, Warning count
- Schedule Cost Impact (schedules only) — current savings % -> after savings % with contextual message
- Category breakdown — pie chart of affected resources by type (Compute, Database, Storage, etc.)
- Target details — provider, region, status, cloud account
- Ownership — teams, resource groups, and schedules associated with the target
API Reference
/orgs/{orgID}/resources/blast-radius
Computes the blast radius for a target entity. Returns all connected resources with impact classification and relationship details. The risk score is not part of the response; as of v2 it is computed client-side from this impact data.
Query Parameters
| Parameter | Required | Description |
|---|---|---|
| target_type | Yes | The type of entity being analyzed: resource (for recommendations), autoscaler_policy (for autoscaler actions), or schedule (for schedule update/delete) |
| target_id | Yes | The unique identifier: resource UID, autoscaler policy ID, or schedule ID |
| operation | No (defaults to delete) | The action being evaluated: delete, stop, modify, pause, archive, or remove |
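A client might assemble the query string and read the response along these lines. The sketch is illustrative only: the helper names are hypothetical, the host is the one used in this page's curl example, and the response field names (`connectedEntities`, `impact.level`, `entityDetails.memberDetails`) are taken from the sample response in this section. No network call is made here.

```python
import json
from collections import Counter
from urllib.parse import urlencode

BASE_URL = "https://api.zopnight.com"  # host as used in this page's curl example
TARGET_TYPES = {"resource", "autoscaler_policy", "schedule"}
OPERATIONS = {"delete", "stop", "modify", "pause", "archive", "remove"}


def blast_radius_url(org_id, target_type, target_id, operation="delete"):
    """Build the request URL, rejecting values the table above does not list."""
    if target_type not in TARGET_TYPES:
        raise ValueError(f"unknown target_type: {target_type}")
    if operation not in OPERATIONS:
        raise ValueError(f"unknown operation: {operation}")
    query = urlencode({"target_type": target_type,
                       "target_id": target_id,
                       "operation": operation})
    return f"{BASE_URL}/orgs/{org_id}/resources/blast-radius?{query}"


def impact_counts(payload: dict) -> Counter:
    """Tally connected entities by impact level from a decoded response body."""
    return Counter(e["impact"]["level"]
                   for e in payload["data"]["connectedEntities"])


def group_members(entity: dict) -> list:
    """memberDetails is a JSON-encoded string in the sample, so decode it again."""
    return json.loads(entity["entityDetails"].get("memberDetails", "[]"))


print(blast_radius_url("my-org", "schedule", "d084b3d1-0cdf-4853-88bc-17f58ad7f8ac", "modify"))
```

The resulting URL would be sent with an `Authorization: Bearer <token>` header, as in the curl example.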
curl -H "Authorization: Bearer <token>" \
  "https://api.zopnight.com/orgs/{orgId}/resources/blast-radius?target_type=schedule&target_id=d084b3d1-0cdf-4853-88bc-17f58ad7f8ac&operation=modify"

{
  "data": {
    "targetType": "schedule",
    "targetId": "d084b3d1-0cdf-4853-88bc-17f58ad7f8ac",
    "targetName": "Business Hours",
    "targetDetails": {
      "timezone": "America/New_York",
      "resourceCount": "5",
      "groupCount": "1"
    },
    "operation": "modify",
    "connectedEntities": [
      {
        "entityType": "resource",
        "entityId": "i-014371492badb4769",
        "entityName": "prod-web-server",
        "entityDetails": {
          "resourceType": "ec2",
          "provider": "aws",
          "region": "us-east-1",
          "status": "running"
        },
        "relationship": {
          "edgeType": "schedule_direct",
          "direction": "incoming",
          "description": "Directly attached to this schedule"
        },
        "impact": {
          "level": "warning",
          "reason": "The schedule is being updated. This resource's start and stop times will change to match the new schedule."
        }
      },
      {
        "entityType": "resource_group",
        "entityId": "0851cb01-2107-45b7-80c5-935b870a96c9",
        "entityName": "Dev Servers",
        "entityDetails": {
          "memberCount": "3",
          "memberUIDs": "ns-1,ns-2,ns-3",
          "memberDetails": "[{\"uid\":\"ns-1\",\"name\":\"prometheus\",\"type\":\"gke-namespace\"}]"
        },
        "relationship": {
          "edgeType": "schedule_group",
          "direction": "incoming",
          "description": "Resource group attached to this schedule"
        },
        "impact": {
          "level": "warning",
          "reason": "The schedule is being updated. (3 member resources in this group)"
        }
      },
      {
        "entityType": "resource",
        "entityId": "ns-1",
        "entityName": "prometheus",
        "entityDetails": {
          "resourceType": "gke-namespace",
          "provider": "gcp",
          "region": "us-central1",
          "status": "active",
          "parentGroupId": "0851cb01-2107-45b7-80c5-935b870a96c9",
          "parentGroupName": "Dev Servers"
        },
        "relationship": {
          "edgeType": "schedule_group",
          "direction": "incoming",
          "description": "Via group: Dev Servers"
        },
        "impact": {
          "level": "warning",
          "reason": "The schedule is being updated. This resource's start and stop times will change to match the new schedule."
        }
      }
    ],
    "summary": {
      "totalEntities": 6,
      "byImpact": { "affected": 0, "warning": 6, "safe": 0 }
    },
    "targetTags": { "env": "production", "team": "platform" }
  }
}

Current Limitations
- 1-hop traversal only — blast radius shows direct neighbors, not cascading effects (e.g., if Service A depends on Database B which depends on Storage C, deleting C only shows B as affected, not A)
- Metadata-based relationships — connections are inferred from cloud metadata (tags, ARN references, subnet placement), not from actual network traffic. VPC flow log integration is planned.
- No redundancy awareness — the system does not know if a load balancer has 5 other healthy targets, so it may overestimate impact on individual targets
- Cost data requires billing — per-resource cost in the schedule blast radius requires cloud billing integration. Without it, the graph and impact classification still work, but dollar amounts will not appear.
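To make the 1-hop limitation concrete, the sketch below contrasts a 1-hop lookup with a full transitive walk on the Service A / Database B / Storage C example. The graph representation and function names are hypothetical. The transitive version is shown only for comparison and does not describe current product behavior.

```python
from collections import deque

# "X depends on Y" stored as Y -> [things that depend on Y].
DEPENDENTS = {
    "storage-c": ["database-b"],
    "database-b": ["service-a"],
    "service-a": [],
}


def one_hop(target: str) -> set[str]:
    """Direct dependents only, matching the current blast radius behavior."""
    return set(DEPENDENTS.get(target, []))


def transitive(target: str) -> set[str]:
    """Breadth-first walk that also collects dependents of dependents."""
    seen: set[str] = set()
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for dep in DEPENDENTS.get(node, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    seen.discard(target)  # guard against cycles leading back to the target
    return seen


print(sorted(one_hop("storage-c")))     # ['database-b']
print(sorted(transitive("storage-c")))  # ['database-b', 'service-a']
```

Deleting storage-c surfaces only database-b in the 1-hop view, while service-a appears only when the walk continues past the first neighbor.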