Blast Radius Analysis

Before-the-fact impact analysis for cloud changes — every dependent resource, severity classification, risk score, and cost impact.

Every cloud change has consequences. Deleting a schedule might leave 20 servers running 24/7. Removing an autoscaler could lock instances at their current size during a traffic spike. Rightsizing a database might break the three services that depend on it.

Blast radius analysis answers one question before the action is taken: "What happens if I do this?" It identifies every directly connected resource that will be affected, how severe the impact is, and what it will cost.

A Real-World Example

Consider a schedule called "Business Hours" that stops 15 EC2 instances and 2 resource groups (containing 8 more instances) every night at 7 PM and starts them at 7 AM. This saves 42% on compute costs — about $2,400/month.

When a user attempts to delete this schedule, the following occurs:

  1. A confirmation dialog appears: "Are you sure you want to delete Business Hours?"
  2. Inside the dialog, a "View Blast Radius" link is shown
  3. Clicking it opens the architecture canvas with the blast radius overlay
  4. The view shows:
    • 15 resources marked as affected — they will run 24/7 without the schedule
    • 2 resource groups (expandable to see 8 individual members)
    • A cost panel showing: 42% savings -> 0% with the message "Deleting this schedule removes all automated start/stop rules. Resources will run 24/7, losing 42% in savings."
    • Per-resource cost breakdown showing each instance's monthly cost and projected loss
  5. The user reviews the impact, clicks "Back to Schedules", and decides not to delete

How It Works Under the Hood

ZopNight's discoverer service maintains a dependency graph of all resources across connected cloud accounts. This graph is built during resource discovery by analyzing relationships like "this load balancer routes traffic to these EC2 instances" or "this RDS database is accessed by these Lambda functions".

When a blast radius analysis is requested, the system follows these steps:

  1. Resolve the target — identifies the entity being acted on (a resource, a schedule, or an autoscaler policy)
  2. Find connected resources — traverses the dependency graph to find all resources within 1 hop of the target. For schedules, this includes directly attached resources and members of attached resource groups.
  3. Classify impact — each connected resource is assigned an impact level (affected, warning, or safe) based on the operation being performed and the type of relationship between the resources.
  4. Compute risk score — a numeric score (0-100) is calculated based on how many resources are affected, their environment (production vs dev), and how many teams are impacted. As of v2, the risk score is computed client-side from the backend's impact data — see computeRiskScore.js. API consumers can replicate the formula using the components described below.
  5. Return results — the frontend renders the graph with the target at the center and connected resources arranged in a radial layout, color-coded by impact level.
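Steps 2 and 3 can be sketched for a schedule target as follows. This is a minimal illustration, not ZopNight's implementation: the `DependencyGraph` class, its fields, and the node ids are hypothetical, and only the schedule operations documented here (delete and modify) are covered.

```python
from dataclasses import dataclass, field


@dataclass
class DependencyGraph:
    """Toy stand-in for the discoverer's dependency graph (hypothetical shape)."""
    # Node id -> ids of directly connected nodes.
    edges: dict = field(default_factory=dict)
    # Resource group id -> ids of its member resources.
    group_members: dict = field(default_factory=dict)

    def connect(self, a: str, b: str) -> None:
        """Record an undirected relationship between two nodes."""
        self.edges.setdefault(a, set()).add(b)
        self.edges.setdefault(b, set()).add(a)


# Schedule operations and the impact level this document assigns to each.
SCHEDULE_IMPACT = {"delete": "affected", "modify": "warning"}


def connected_entities(graph: DependencyGraph, target: str) -> list:
    """Step 2: direct neighbors of the target, plus members of attached groups."""
    found = []
    for neighbor in sorted(graph.edges.get(target, set())):
        found.append(neighbor)
        # Members are reported even though they sit behind the group node.
        found.extend(graph.group_members.get(neighbor, []))
    return found


def schedule_blast_radius(graph: DependencyGraph, schedule_id: str, operation: str) -> dict:
    """Steps 2 and 3 for a schedule target: find entities, then classify them."""
    level = SCHEDULE_IMPACT[operation]
    return {entity: level for entity in connected_entities(graph, schedule_id)}


graph = DependencyGraph()
graph.connect("schedule-1", "ec2-web")
graph.connect("schedule-1", "group-dev")
graph.group_members["group-dev"] = ["ns-1", "ns-2"]
graph.connect("ec2-web", "lb-1")  # two hops from the schedule, so not reported

result = schedule_blast_radius(graph, "schedule-1", "delete")
print(result)
```

The load balancer `lb-1` is connected only through `ec2-web`, so it is left out of the result, which mirrors the 1-hop traversal described in step 2.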

Where Blast Radius Appears

Blast radius appears automatically in three places. There is no separate page to navigate to — it is embedded in the workflow wherever an action is taken.

1. Recommendations

When ZopNight identifies a cost optimization opportunity (idle server, oversized database, orphaned disk), the blast radius can be viewed before acting on the recommendation.

Example: ZopNight recommends rightsizing an RDS instance from db.r5.2xlarge to db.r5.xlarge.

  • Click "View Blast Radius" in the recommendation drawer
  • The graph shows 3 connected resources:
    • An ECS service that reads from this database — marked as warning (may experience brief interruption during resize)
    • A Lambda function with read-only access — marked as warning
    • A CloudWatch alarm monitoring the database — marked as safe
  • Risk score: 35/100 (medium: dev environment, 2 services affected)
  • Estimated savings shown: $145/month

ZopNight classifies impact using a comprehensive resource-type behavior map covering 150+ types across AWS, GCP, and Azure. For example, it knows that resizing an RDS instance requires a restart (so dependent services see a brief outage), but updating an S3 bucket policy is an online operation (no impact on consumers).

2. Autoscaler Policies

When pausing, removing, or re-applying an autoscaler policy, a confirmation dialog shows the blast radius of that action.

Example: Removing an autoscaler policy from a production Auto Scaling Group.

  • The confirmation dialog shows "View Blast Radius"
  • The graph shows 4 EC2 instances managed by this policy:
    • All 4 marked as warning — they lose auto-scaling protection
  • Cost impact panel shows:
    • Current instances: 4 (min: 2, max: 10)
    • After removal: stays at 4 but will not scale up during traffic spikes or down during quiet periods

3. Schedules

Schedules control when cloud resources start and stop. Changing or deleting a schedule directly affects the cost savings for every resource attached to it. The schedule blast radius is the most detailed implementation, with full cost impact analysis.

Example: Updating a schedule from "weekdays 9-5" to "weekdays 9-9".

  • Edit the schedule, change the cron times, click "Update"
  • A confirmation dialog appears with "View Blast Radius"
  • The graph shows:
    • 10 directly attached resources (EC2, RDS, ECS) — all marked as warning
    • 1 resource group ("Dev Servers") with 5 members — click to expand and see individual resources
  • Cost impact panel shows:
    • Current Savings: 52% (resources stopped 12 hours/day)
    • After Update: 36% (resources stopped only 8 hours/day)
    • Message: "Updated schedule reduces savings by 16%. Resources will run for more hours per week."
  • Click any resource node to see its floating info card:
    • Monthly Cost: $156.00
    • Current Savings: $81.12/mo
    • After Savings: $56.16/mo (yellow — savings decreased)

Resource Groups in the Graph

If a schedule has resource groups attached, they appear as a single node with a grid icon and a button showing the member count (e.g., "5 resources"). Click the resource group node to expand it — member resources fan out in an arc, each showing their own cost and impact details. Click again to collapse.

Schedule Impact Classification

| Operation | Impact Level | What Happens |
| --- | --- | --- |
| modify (update) | warning | Start/stop timing changes. Resources follow the new schedule. Savings % may increase or decrease depending on the new hours. |
| delete | affected | Schedule is permanently removed. All attached resources lose their automated start/stop rules and will run continuously (24/7). All savings from this schedule are lost. |

Schedule Cost Data

The cost panel in the schedule blast radius shows the current savings percentage and what it will be after the change. This is calculated from the schedule's cron grid (how many hours per week resources are stopped) combined with per-resource monthly cost from the aggregator service.
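The arithmetic behind the per-resource figures can be sketched as below. The exact formula is not given in this document, so `savings_percent` is an assumed model (savings equal to the share of the week a resource is stopped) and both function names are hypothetical. The dollar figures reuse the info card values from the schedule update example.

```python
HOURS_PER_WEEK = 24 * 7


def savings_percent(stopped_hours_per_week: float) -> float:
    """Assumed model: savings equal the share of the week a resource is stopped."""
    return 100.0 * stopped_hours_per_week / HOURS_PER_WEEK


def monthly_savings(monthly_cost: float, savings_pct: float) -> float:
    """Dollar savings for one resource, rounded to cents."""
    return round(monthly_cost * savings_pct / 100.0, 2)


# Figures from the info card in the schedule update example.
current = monthly_savings(156.00, 52)       # 81.12
after = monthly_savings(156.00, 36)         # 56.16
lost_per_month = round(current - after, 2)  # 24.96
```

The percentages the product displays may include factors beyond stopped hours, so treat the first function as a rough approximation only.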

Understanding Impact Levels

Every connected resource in the blast radius is assigned one of three impact levels. The level depends on both the operation being performed and the type of relationship between the resources.

| Level | Color | What It Means | Real-World Example |
| --- | --- | --- | --- |
| affected | Red | This resource will be directly and significantly impacted. Data loss, service outage, or permanent state change is likely. | Deleting an ECS cluster destroys all its services. Deleting a schedule means resources run 24/7. |
| warning | Yellow | This resource may experience a temporary disruption or behavior change. The change is usually recoverable. | Resizing a database causes a brief restart, so connected services see a momentary outage. Updating a schedule changes when resources start/stop. |
| safe | Green | This resource is unlikely to be affected. The operation can be performed without disrupting this resource. | Modifying an S3 bucket's lifecycle policy does not affect the Lambda functions that read from it. |

How Classification Works (For Recommendations)

For recommendation-based blast radius, ZopNight uses the same resource-type behavior map referenced above to categorize every supported cloud resource into one of three modification behaviors:

| Behavior | Impact | Meaning | Resource Examples |
| --- | --- | --- | --- |
| onlineModify | safe | Resource can be modified without any downtime or restart | S3 buckets, CloudFront distributions, IAM policies, security groups |
| restartModify | warning | Modification requires a restart or brief outage | RDS instances, EC2 instances (instance type change), ElastiCache clusters |
| poolModify | warning | Modification affects a pool of resources (rolling update) | EKS/GKE node pools, Auto Scaling Groups, ECS services |
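The behavior map can be pictured as a two-stage lookup: resource type to behavior, then behavior to impact level. The sketch below is illustrative only. The resource type keys are hypothetical identifiers chosen for readability, it covers just a handful of the examples listed above, and it ignores the relationship type, which also feeds into the final classification.

```python
from typing import Optional

# Behavior -> impact level, as listed in the table above.
BEHAVIOR_IMPACT = {
    "onlineModify": "safe",
    "restartModify": "warning",
    "poolModify": "warning",
}

# Hypothetical type keys; the real map is said to cover 150+ resource types.
RESOURCE_BEHAVIOR = {
    "s3-bucket": "onlineModify",
    "cloudfront-distribution": "onlineModify",
    "rds-instance": "restartModify",
    "elasticache-cluster": "restartModify",
    "eks-node-pool": "poolModify",
    "auto-scaling-group": "poolModify",
}


def modification_impact(resource_type: str) -> Optional[str]:
    """Return the impact level for modifying a resource, or None if the type is unmapped."""
    behavior = RESOURCE_BEHAVIOR.get(resource_type)
    if behavior is None:
        # How unmapped types are handled is not documented, so nothing is guessed here.
        return None
    return BEHAVIOR_IMPACT[behavior]
```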

Risk Score

The right panel of the blast radius view shows a risk score gauge from 0-100. This score provides a quick assessment of the overall risk without reading every individual resource's impact.

| Score Range | Level (UI badge) | Meaning |
| --- | --- | --- |
| 0-25 | low | Few resources affected, mostly non-production. Safe to proceed in most cases. |
| 26-50 | medium | Several resources affected or some production resources involved. Review before proceeding. |
| 51-75 | high | Many resources affected, production environment, or cross-team impact. Proceed with caution. |
| 76-100 | critical | Widespread impact across production. Consider scheduling a maintenance window. |

How the Score Is Calculated

The score is the sum of three components, each with a cap:

| Component | Max Points | How It Works |
| --- | --- | --- |
| Impact | 60 | Weighted impact ratio (affected = 1.0, warning = 0.5, safe = 0.0) multiplied by 50, plus a bonus based on the number of impacted resources: +4 pts if more than 2, +7 pts if more than 5, +10 pts if more than 10. |
| Environment | 15 | Detected from resource tags (env, environment). production = 15 pts, staging = 8 pts, dev = 3 pts, unknown = 5 pts. Set to 0 if no resources are impacted. |
| Ownership | 20 | Teams affected: 1 team = 5 pts, 2-3 teams = 10 pts, 4+ teams = 15 pts. Active schedules on the target resource add +5 pts. |

After summing all components, a pause operation applies a 0.4 multiplier to the final score (since pausing is less disruptive than deleting). The result is capped at 100.
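For API consumers who want to replicate the score, the components above can be sketched as follows. This is not the code in computeRiskScore.js. Several details are assumptions where the description is silent: the impact ratio is taken over all connected resources, "impacted" means affected plus warning, zero teams earns zero ownership points, and the result is rounded to the nearest integer. The `RiskInputs` type and function names are hypothetical.

```python
from dataclasses import dataclass

ENV_POINTS = {"production": 15, "staging": 8, "dev": 3}
UNKNOWN_ENV_POINTS = 5
PAUSE_MULTIPLIER = 0.4


@dataclass
class RiskInputs:
    affected: int
    warning: int
    safe: int
    environment: str          # any value not in ENV_POINTS counts as unknown
    teams: int
    has_active_schedule: bool = False
    operation: str = "delete"


def risk_score(inp: RiskInputs) -> int:
    total = inp.affected + inp.warning + inp.safe
    impacted = inp.affected + inp.warning  # assumption: safe resources are not "impacted"

    # Impact: weighted ratio times 50, plus a tiered bonus, capped at 60.
    ratio = (1.0 * inp.affected + 0.5 * inp.warning) / total if total else 0.0
    if impacted > 10:
        bonus = 10
    elif impacted > 5:
        bonus = 7
    elif impacted > 2:
        bonus = 4
    else:
        bonus = 0
    impact_pts = min(60.0, ratio * 50 + bonus)

    # Environment: zero when nothing is impacted.
    env_pts = ENV_POINTS.get(inp.environment, UNKNOWN_ENV_POINTS) if impacted else 0

    # Ownership: team tiers plus 5 for active schedules, capped at 20.
    if inp.teams >= 4:
        team_pts = 15
    elif inp.teams >= 2:
        team_pts = 10
    elif inp.teams == 1:
        team_pts = 5
    else:
        team_pts = 0  # assumption: the zero-team case is not documented
    ownership_pts = min(20, team_pts + (5 if inp.has_active_schedule else 0))

    score = impact_pts + env_pts + ownership_pts
    if inp.operation == "pause":
        score *= PAUSE_MULTIPLIER
    return min(100, round(score))  # assumption: round to nearest integer


def risk_level(score: int) -> str:
    """Map a score to the UI badge ranges listed above."""
    if score <= 25:
        return "low"
    if score <= 50:
        return "medium"
    if score <= 75:
        return "high"
    return "critical"
```

As a usage example, two warning resources and one safe resource in a dev environment, with two teams and an active schedule on the target (ownership inputs assumed here for illustration), give 16.67 + 3 + 15 = 34.67, which rounds to 35 and maps to the medium badge.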

Using the Blast Radius Graph

The blast radius opens as a three-panel overlay on the architecture canvas:

Left Panel — Resource List

  • Lists all connected resources with impact badges
  • Search bar to find specific resources by name
  • Click a resource to zoom to it on the graph and show its info card
  • For resource groups: clicking a member auto-expands its parent group in the graph

Center — Interactive Graph

  • Target node at the center (clock icon for schedules, resource type icon for others)
  • Connected resources arranged in a radial layout, grouped by impact level
  • Resource groups shown as expandable nodes — click to reveal members in an arc
  • Floating info card — click any node to see details: provider, region, status, connection type, cost, and impact reason
  • Impact legend at the bottom — click to filter by impact level (toggle affected/warning/safe)
  • Zoom controls — scroll to zoom, drag to pan
  • Collapse All button — resets all expanded groups

Right Panel — Summary

  • Risk score gauge — 0-100 speedometer visualization
  • Stat cards — Connected count, Affected count, Warning count
  • Schedule Cost Impact (schedules only) — current savings % -> after savings % with contextual message
  • Category breakdown — pie chart of affected resources by type (Compute, Database, Storage, etc.)
  • Target details — provider, region, status, cloud account
  • Ownership — teams, resource groups, and schedules associated with the target

API Reference

GET /orgs/{orgId}/resources/blast-radius

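Compute blast radius for a target entity. Returns all connected resources with impact classification and relationship details, plus a summary by impact level. The risk score is not part of the response; it is computed client-side from this impact data.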

Query Parameters

| Parameter | Required | Description |
| --- | --- | --- |
| target_type | Yes | The type of entity being analyzed: resource (for recommendations), autoscaler_policy (for autoscaler actions), or schedule (for schedule update/delete) |
| target_id | Yes | The unique identifier: resource UID, autoscaler policy ID, or schedule ID |
| operation | No | The action being evaluated: delete, stop, modify, pause, archive, or remove. Defaults to delete if omitted. |

Example — Schedule Modify · bash
curl -H "Authorization: Bearer <token>" \
"https://api.zopnight.com/orgs/{orgId}/resources/blast-radius?target_type=schedule&target_id=d084b3d1-0cdf-4853-88bc-17f58ad7f8ac&operation=modify"
Response · json
{
"data": {
  "targetType": "schedule",
  "targetId": "d084b3d1-0cdf-4853-88bc-17f58ad7f8ac",
  "targetName": "Business Hours",
  "targetDetails": {
    "timezone": "America/New_York",
    "resourceCount": "5",
    "groupCount": "1"
  },
  "operation": "modify",
  "connectedEntities": [
    {
      "entityType": "resource",
      "entityId": "i-014371492badb4769",
      "entityName": "prod-web-server",
      "entityDetails": {
        "resourceType": "ec2",
        "provider": "aws",
        "region": "us-east-1",
        "status": "running"
      },
      "relationship": {
        "edgeType": "schedule_direct",
        "direction": "incoming",
        "description": "Directly attached to this schedule"
      },
      "impact": {
        "level": "warning",
        "reason": "The schedule is being updated. This resource's start and stop times will change to match the new schedule."
      }
    },
    {
      "entityType": "resource_group",
      "entityId": "0851cb01-2107-45b7-80c5-935b870a96c9",
      "entityName": "Dev Servers",
      "entityDetails": {
        "memberCount": "3",
        "memberUIDs": "ns-1,ns-2,ns-3",
        "memberDetails": "[{\"uid\":\"ns-1\",\"name\":\"prometheus\",\"type\":\"gke-namespace\"}]"
      },
      "relationship": {
        "edgeType": "schedule_group",
        "direction": "incoming",
        "description": "Resource group attached to this schedule"
      },
      "impact": {
        "level": "warning",
        "reason": "The schedule is being updated. (3 member resources in this group)"
      }
    },
    {
      "entityType": "resource",
      "entityId": "ns-1",
      "entityName": "prometheus",
      "entityDetails": {
        "resourceType": "gke-namespace",
        "provider": "gcp",
        "region": "us-central1",
        "status": "active",
        "parentGroupId": "0851cb01-2107-45b7-80c5-935b870a96c9",
        "parentGroupName": "Dev Servers"
      },
      "relationship": {
        "edgeType": "schedule_group",
        "direction": "incoming",
        "description": "Via group: Dev Servers"
      },
      "impact": {
        "level": "warning",
        "reason": "The schedule is being updated. This resource's start and stop times will change to match the new schedule."
      }
    }
  ],
  "summary": {
    "totalEntities": 6,
    "byImpact": { "affected": 0, "warning": 6, "safe": 0 }
  },
  "targetTags": { "env": "production", "team": "platform" }
}
}
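
A consumer of this response might tally impact levels and unpack resource groups along the lines of the sketch below. The field shapes are inferred from the sample above only: `memberUIDs` is treated as a comma-separated string and `memberDetails` as a JSON-encoded string, which may not hold for every response. The `summarize` function and its output keys are hypothetical.

```python
import json
from collections import Counter

GROUP_ID = "0851cb01-2107-45b7-80c5-935b870a96c9"

# Trimmed copy of the sample response, keeping only the fields used here.
sample = {
    "data": {
        "connectedEntities": [
            {
                "entityType": "resource",
                "entityId": "i-014371492badb4769",
                "entityName": "prod-web-server",
                "entityDetails": {"resourceType": "ec2"},
                "impact": {"level": "warning"},
            },
            {
                "entityType": "resource_group",
                "entityId": GROUP_ID,
                "entityName": "Dev Servers",
                "entityDetails": {
                    "memberCount": "3",
                    "memberUIDs": "ns-1,ns-2,ns-3",
                    "memberDetails": '[{"uid":"ns-1","name":"prometheus","type":"gke-namespace"}]',
                },
                "impact": {"level": "warning"},
            },
            {
                "entityType": "resource",
                "entityId": "ns-1",
                "entityName": "prometheus",
                "entityDetails": {"resourceType": "gke-namespace", "parentGroupId": GROUP_ID},
                "impact": {"level": "warning"},
            },
        ]
    }
}


def summarize(response: dict) -> dict:
    """Count entities per impact level and collect members under their groups."""
    entities = response["data"]["connectedEntities"]
    by_impact = Counter(e["impact"]["level"] for e in entities)

    groups = {}
    for e in entities:
        if e["entityType"] != "resource_group":
            continue
        details = e["entityDetails"]
        groups[e["entityId"]] = {
            "name": e["entityName"],
            "member_uids": details["memberUIDs"].split(","),
            "member_details": json.loads(details["memberDetails"]),
            "listed_members": [],
        }

    # Attach entities that point back to a group via parentGroupId.
    for e in entities:
        parent = e["entityDetails"].get("parentGroupId")
        if parent in groups:
            groups[parent]["listed_members"].append(e["entityName"])

    return {"by_impact": dict(by_impact), "groups": groups}


summary = summarize(sample)
```

The per-level counts from `summarize` are the inputs a client-side risk score calculation would need.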

Current Limitations

  • 1-hop traversal only — blast radius shows direct neighbors, not cascading effects (e.g., if Service A depends on Database B which depends on Storage C, deleting C only shows B as affected, not A)
  • Metadata-based relationships — connections are inferred from cloud metadata (tags, ARN references, subnet placement), not from actual network traffic. VPC flow log integration is planned.
  • No redundancy awareness — the system does not know if a load balancer has 5 other healthy targets, so it may overestimate impact on individual targets
  • Cost data requires billing — per-resource cost in the schedule blast radius requires cloud billing integration. Without it, the graph and impact classification still work, but dollar amounts will not appear.