Difficulty: Advanced
Time Investment: 3-4 hours
Prerequisites: Understanding of security principles, IaC, agent skills framework
The Problem: Security is often treated as:
The Cost:
The Solution: Agent skills can automate security analysis:
The shift: From “security review before production” → “security built-in from the start”
Bolt-On Security (traditional):
Design → Implement → Security review → Fix issues → Deploy
Problem: Security issues found late (high cost to fix)
Secure-by-Design (with agent skills):
Design → Threat model (automated) → Implement with guardrails → Deploy
Benefit: Security issues prevented early (low cost)
threat-modeler
Purpose: Automates STRIDE analysis during the design phase
Triggers when:
Workflow:
graph TD
A[Architecture diagram provided] --> B[Parse diagram]
B --> C[Identify components, data flows, trust boundaries]
C --> D[Run STRIDE analysis]
D --> E[Generate Risk Register]
E --> F{High risks found?}
F -->|Yes| G[Suggest mitigations]
F -->|No| H[Document as low-risk]
G --> I[Present to architect]
H --> I
STRIDE Framework (Microsoft):
| Threat | What It Targets | Example |
|---|---|---|
| Spoofing | Authentication | Attacker impersonates legitimate user |
| Tampering | Integrity | Attacker modifies data in transit or at rest |
| Repudiation | Non-repudiation | User denies performing an action (no audit logs) |
| Information Disclosure | Confidentiality | Sensitive data exposed (PII, API keys) |
| Denial of Service | Availability | Attacker overwhelms system, making it unavailable |
| Elevation of Privilege | Authorization | Attacker gains admin access |
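The table above can be expressed as a small lookup that a script such as `stride_analysis.py` might consult. This is a minimal sketch, not the skill's actual implementation: the control names in `CONTROLS` and the function `uncovered_categories` are hypothetical, chosen only to illustrate pairing each STRIDE category with the property it targets.

```python
from enum import Enum


class Stride(Enum):
    """STRIDE categories, each paired with the security property it targets."""
    SPOOFING = "Authentication"
    TAMPERING = "Integrity"
    REPUDIATION = "Non-repudiation"
    INFORMATION_DISCLOSURE = "Confidentiality"
    DENIAL_OF_SERVICE = "Availability"
    ELEVATION_OF_PRIVILEGE = "Authorization"


# Hypothetical control names: one basic control per STRIDE category.
CONTROLS = {
    "authentication": Stride.SPOOFING,
    "integrity_checks": Stride.TAMPERING,
    "audit_logging": Stride.REPUDIATION,
    "encryption": Stride.INFORMATION_DISCLOSURE,
    "rate_limiting": Stride.DENIAL_OF_SERVICE,
    "least_privilege": Stride.ELEVATION_OF_PRIVILEGE,
}


def uncovered_categories(controls_present: set[str]) -> list[Stride]:
    """Return the STRIDE categories whose basic control is absent."""
    return [
        category
        for control, category in CONTROLS.items()
        if control not in controls_present
    ]


if __name__ == "__main__":
    # A component that only authenticates callers and encrypts data.
    for category in uncovered_categories({"authentication", "encryption"}):
        print(f"{category.name}: {category.value} is not covered")
```

A real analysis would look at data flows and trust boundaries rather than a flat set of controls; the lookup only shows how the six categories can drive a checklist.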
/threat-modeler/
├── SKILL.md                   # Instructions
├── references/
│   ├── stride-patterns/
│   │   ├── web-api.md         # Common threats for REST APIs
│   │   ├── kubernetes.md      # Common threats for K8s deployments
│   │   ├── serverless.md      # Common threats for Lambda, etc.
│   │   └── oidc-flows.md      # Common threats for OAuth/OIDC
│   └── mitigations.md         # Standard mitigation patterns
└── scripts/
    ├── parse_diagram.py       # Extract components from Mermaid/PlantUML
    └── stride_analysis.py     # Run STRIDE checks
Input (architecture description):
We're building a web API that:
- Accepts user uploads (images, up to 10MB)
- Stores uploads in S3
- Processes images using a Lambda function
- Updates metadata in PostgreSQL
- Returns signed URLs for download
Agent with threat-modeler skill:
I'll perform a STRIDE analysis on this architecture.
Components identified:
1. Web API (entry point)
2. S3 bucket (storage)
3. Lambda function (processing)
4. PostgreSQL (metadata)
5. Signed URL generator
Threat analysis:
SPOOFING:
- Risk: Unauthenticated users upload malicious files
- Severity: HIGH
- Mitigation: Require authentication (JWT, API key)
TAMPERING:
- Risk: Files modified in S3 after upload
- Severity: MEDIUM
- Mitigation: Enable S3 versioning, use object lock
INFORMATION DISCLOSURE:
- Risk: S3 bucket publicly accessible
- Severity: HIGH
- Mitigation: Ensure bucket policy blocks public access
- Risk: Signed URLs don't expire
- Severity: MEDIUM
- Mitigation: Set expiration (e.g., 1 hour)
DENIAL OF SERVICE:
- Risk: Attacker uploads large files (>10MB) to exhaust storage
- Severity: MEDIUM
- Mitigation: Enforce size limits at API layer, rate limit uploads
ELEVATION OF PRIVILEGE:
- Risk: Lambda has overly broad IAM permissions
- Severity: HIGH
- Mitigation: Use least-privilege IAM role (only S3 + RDS access)
RISK REGISTER:
High: 3 risks
Medium: 3 risks
Low: 0 risks
Recommended actions:
1. Add authentication to API endpoint
2. Review S3 bucket policy (ensure private)
3. Audit Lambda IAM role (remove unnecessary permissions)
4. Set expiration on signed URLs
Impact: Security analysis that would take a human 30-60 minutes is done in seconds.
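The risk register in the example output can be derived mechanically from the individual findings. The sketch below transcribes the six findings from the analysis and tallies them by severity; the function names are hypothetical. It also reports STRIDE categories that produced no finding, which is a useful completeness check on any generated threat model.

```python
from collections import Counter

STRIDE_CATEGORIES = [
    "Spoofing",
    "Tampering",
    "Repudiation",
    "Information Disclosure",
    "Denial of Service",
    "Elevation of Privilege",
]

# Findings transcribed from the example analysis: (category, risk, severity).
FINDINGS = [
    ("Spoofing", "Unauthenticated users upload malicious files", "HIGH"),
    ("Tampering", "Files modified in S3 after upload", "MEDIUM"),
    ("Information Disclosure", "S3 bucket publicly accessible", "HIGH"),
    ("Information Disclosure", "Signed URLs don't expire", "MEDIUM"),
    ("Denial of Service", "Large uploads exhaust storage", "MEDIUM"),
    ("Elevation of Privilege", "Lambda has overly broad IAM permissions", "HIGH"),
]


def risk_register(findings):
    """Count findings per severity level, including levels with no findings."""
    counts = Counter(severity for _, _, severity in findings)
    return {level: counts.get(level, 0) for level in ("HIGH", "MEDIUM", "LOW")}


def categories_without_findings(findings):
    """List STRIDE categories that the analysis did not cover."""
    covered = {category for category, _, _ in findings}
    return [c for c in STRIDE_CATEGORIES if c not in covered]


if __name__ == "__main__":
    print(risk_register(FINDINGS))
    print("No findings for:", categories_without_findings(FINDINGS))
```

Running the completeness check on this example shows that Repudiation has no entry, so a reviewer would want to ask the agent about audit logging before accepting the register.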
iac-security-auditor
Purpose: Real-time security scanning of Infrastructure-as-Code (Terraform, CloudFormation, Kubernetes manifests)
Triggers when:
Workflow:
graph TD
A[IaC file modified] --> B[Run security scanner]
B --> C{Violations found?}
C -->|No| D[Approve]
C -->|Yes| E[Check against safe defaults]
E --> F[Auto-fix if possible]
F --> G[Report violations]
G --> H[Suggest fixes]
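The branch in the workflow above, where some violations are fixed automatically and others are reported for a human decision, can be sketched as a simple triage step. Everything here is an assumption for illustration: the `Violation` shape, the rule identifiers, and the `AUTO_FIXABLE` set are hypothetical and do not correspond to any real scanner's rule IDs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Violation:
    rule: str       # hypothetical rule identifier
    resource: str   # e.g. a Terraform resource address
    severity: str


# Hypothetical: rules with a safe default that can be applied without asking.
AUTO_FIXABLE = {"s3-no-encryption", "s3-no-versioning", "ec2-no-monitoring"}


def triage(violations):
    """Split scanner output into auto-fixable items and items needing input."""
    if not violations:
        return {"status": "approved", "auto_fix": [], "needs_input": []}
    return {
        "status": "violations",
        "auto_fix": [v for v in violations if v.rule in AUTO_FIXABLE],
        "needs_input": [v for v in violations if v.rule not in AUTO_FIXABLE],
    }


if __name__ == "__main__":
    found = [
        Violation("s3-no-encryption", "aws_s3_bucket.data", "CRITICAL"),
        Violation("sg-open-to-internet", "aws_security_group.web", "CRITICAL"),
        Violation("s3-no-versioning", "aws_s3_bucket.data", "HIGH"),
        Violation("ec2-no-monitoring", "aws_instance.web", "HIGH"),
    ]
    result = triage(found)
    print(len(result["auto_fix"]), "auto-fixable;",
          len(result["needs_input"]), "need input")
```

Keeping the auto-fixable set explicit and small is a deliberate choice: anything not on the list falls through to a human, which matches the advice to require approval for changes that could break functionality.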
/iac-security-auditor/
├── SKILL.md
├── references/
│   ├── safe-defaults.json         # Company-specific secure configs
│   └── violations-library.md      # Common violations and fixes
└── scripts/
    ├── scan_terraform.sh          # Wrapper for tfsec
    ├── scan_cloudformation.sh     # Wrapper for checkov
    └── auto_fix.py                # Attempts to fix common violations
Input (Terraform file):
# Intentionally insecure example input
resource "aws_s3_bucket" "data" {
  bucket = "my-data-bucket"
}

resource "aws_instance" "web" {
  ami                    = "ami-12345678"
  instance_type          = "t2.micro"
  vpc_security_group_ids = [aws_security_group.web.id]
}

resource "aws_security_group" "web" {
  # Every TCP port is open to the whole internet
  ingress {
    from_port   = 0
    to_port     = 65535
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
Agent with iac-security-auditor skill:
I've scanned your Terraform configuration. Found 4 security violations:
CRITICAL:
1. S3 bucket has no encryption
- Resource: aws_s3_bucket.data
- Issue: server_side_encryption_configuration is missing
- Fix: Enable AES256 or KMS encryption
2. Security group allows all traffic from internet
- Resource: aws_security_group.web
- Issue: 0.0.0.0/0 on all ports (0-65535)
- Fix: Restrict to specific ports (e.g., 80, 443)
HIGH:
3. S3 bucket versioning not enabled
- Resource: aws_s3_bucket.data
- Issue: versioning block missing
- Fix: Enable versioning for data recovery
4. EC2 instance has no monitoring
- Resource: aws_instance.web
- Issue: monitoring = false (default)
- Fix: Enable CloudWatch monitoring
Checking against your company's safe defaults...
- S3 encryption: Required KMS key = arn:aws:kms:us-east-1:123456789:key/abc
- Security groups: Default deny all, explicit allow only
I can auto-fix issues 1, 3, and 4. Issue 2 requires your input
(which ports do you need open?).
Shall I apply the fixes?
Auto-fixed code:
resource "aws_s3_bucket" "data" {
  bucket = "my-data-bucket"

  # Note: in AWS provider v4 and later, these inline blocks are deprecated in
  # favour of the separate aws_s3_bucket_versioning and
  # aws_s3_bucket_server_side_encryption_configuration resources.
  versioning {
    enabled = true
  }

  server_side_encryption_configuration {
    rule {
      apply_server_side_encryption_by_default {
        sse_algorithm     = "aws:kms"
        kms_master_key_id = "arn:aws:kms:us-east-1:123456789:key/abc"
      }
    }
  }
}

resource "aws_instance" "web" {
  ami                    = "ami-12345678"
  instance_type          = "t2.micro"
  monitoring             = true
  vpc_security_group_ids = [aws_security_group.web.id]
}

resource "aws_security_group" "web" {
  # NEEDS YOUR INPUT: Which ports should be open?
  # Previous config allowed ALL ports from the internet (INSECURE).
  # Placeholder rule below opens HTTP only until you confirm.
  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
    description = "HTTP - PLEASE VERIFY THIS IS NEEDED"
  }
}
Common violations (Terraform on AWS):
| Violation | Risk | Fix |
|---|---|---|
| S3 bucket public access | Information disclosure | Set block_public_acls = true |
| Unencrypted EBS volumes | Data breach | Enable encrypted = true |
| RDS no backup | Data loss | Set backup_retention_period > 0 |
| Security group 0.0.0.0/0 | Unauthorised access | Restrict to specific IPs/CIDRs |
| IAM wildcard permissions | Privilege escalation | Use least-privilege policies |

Common violations (Kubernetes manifests):
| Violation | Risk | Fix |
|---|---|---|
| Containers run as root | Privilege escalation | Set runAsNonRoot: true |
| No resource limits | DoS | Set resources.limits |
| Secrets in env vars | Information disclosure | Use secret volumes instead |
| Host network enabled | Network exposure | Set hostNetwork: false |
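The Kubernetes checks in the table above are simple enough to express directly against a parsed manifest. The following is a minimal sketch that assumes the pod spec has already been loaded into a plain dictionary (for example from YAML); `check_pod_spec` is a hypothetical helper, not part of any scanner. Field names such as `runAsNonRoot`, `hostNetwork`, `resources.limits`, and `secretKeyRef` are standard Kubernetes pod spec fields.

```python
def check_pod_spec(pod_spec: dict) -> list[str]:
    """Return human-readable findings for the four violations in the table."""
    findings = []

    if pod_spec.get("hostNetwork", False):
        findings.append("Host network enabled")

    pod_ctx = pod_spec.get("securityContext", {})
    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")

        # Container-level setting overrides the pod-level one when present.
        ctx = container.get("securityContext", {})
        non_root = ctx.get("runAsNonRoot", pod_ctx.get("runAsNonRoot", False))
        if not non_root:
            findings.append(f"{name}: may run as root (runAsNonRoot not set)")

        if "limits" not in container.get("resources", {}):
            findings.append(f"{name}: no resource limits")

        for env in container.get("env", []):
            if "secretKeyRef" in env.get("valueFrom", {}):
                findings.append(
                    f"{name}: secret exposed via env var {env.get('name')}"
                )

    return findings


if __name__ == "__main__":
    risky = {
        "hostNetwork": True,
        "containers": [{
            "name": "web",
            "env": [{
                "name": "DB_PASSWORD",
                "valueFrom": {"secretKeyRef": {"name": "db", "key": "password"}},
            }],
        }],
    }
    for finding in check_pod_spec(risky):
        print(finding)
```

In practice a dedicated scanner covers far more rules; the value of a sketch like this is in showing what each table row means at the manifest level.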
Setup: No skill, just your brain
Task: Threat model a simple login flow
User → Web App → Auth Service → Database
Time it: How long does it take to identify threats across all STRIDE categories?
Baseline: Expect 20-30 minutes for a thorough analysis
Setup: Build the threat-modeler skill (or use Claude with manual prompts)
Task: Same login flow
Prompt:
Perform a STRIDE threat model on this architecture:
User submits credentials → Web App validates → Auth Service checks against Database → Returns JWT
Identify threats in all STRIDE categories.
Observe:
Setup: Install tfsec or checkov
Task: Scan a Terraform file with intentional vulnerabilities
Run:
tfsec .
Observe: How many violations does the scan report? How long does it take to fix them manually?
With agent skill: Ask agent to “scan this Terraform file and fix security issues”
Compare: Manual fixing vs. AI-assisted fixing (speed, accuracy)
Problem: Scanner flags everything, including non-issues
Solution: Tune rules to your environment; suppress known-safe patterns

Problem: Scanning legacy infrastructure → 1000+ violations
Solution: Start with new projects; grandfather old violations, fix incrementally

Problem: Agent “fixes” security issue but breaks functionality
Solution: Always review auto-fixes; require human approval for critical changes

Problem: New attack vectors emerge, skill doesn’t catch them
Solution: Update references/ regularly (quarterly security reviews)
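Two of the solutions above, suppressing known-safe patterns and grandfathering old violations, come down to filtering scanner output against a stored list. The sketch below assumes each violation is a dictionary with `rule` and `resource` keys; that shape, the rule names, and the helper names are hypothetical.

```python
def fingerprint(violation: dict) -> tuple:
    """Identify a violation by rule and resource so it can be recognised later."""
    return (violation["rule"], violation["resource"])


def new_violations(current, baseline, suppressed_rules=frozenset()):
    """Return only violations that are neither suppressed nor grandfathered.

    baseline: violations recorded when scanning was first introduced.
    suppressed_rules: rules the team has reviewed and judged safe here.
    """
    known = {fingerprint(v) for v in baseline}
    return [
        v for v in current
        if v["rule"] not in suppressed_rules and fingerprint(v) not in known
    ]


if __name__ == "__main__":
    baseline = [{"rule": "s3-no-versioning", "resource": "aws_s3_bucket.legacy"}]
    current = [
        {"rule": "s3-no-versioning", "resource": "aws_s3_bucket.legacy"},
        {"rule": "s3-no-versioning", "resource": "aws_s3_bucket.new"},
        {"rule": "ec2-no-monitoring", "resource": "aws_instance.dev"},
    ]
    for v in new_violations(current, baseline, {"ec2-no-monitoring"}):
        print(v["rule"], v["resource"])
```

Because the baseline is matched per resource, a legacy bucket stays grandfathered while the same rule still fires on anything newly added.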
Block PRs with critical violations:
# .github/workflows/security-scan.yml
name: IaC Security Scan
on: [pull_request]
jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run tfsec
        uses: aquasecurity/tfsec-action@v1.0.0
        with:
          soft_fail: false # Block PR if critical issues found
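If you want finer control over which severities block a pull request, the gating decision can be made in a small script run as a CI step. This is a sketch under stated assumptions: it takes scanner results that have already been parsed into dictionaries with a `severity` key, which is an assumed shape rather than any scanner's documented output, and `should_block` and `exit_code` are hypothetical names.

```python
SEVERITY_ORDER = ["LOW", "MEDIUM", "HIGH", "CRITICAL"]


def should_block(results, threshold="CRITICAL"):
    """True if any result is at or above the threshold severity."""
    minimum = SEVERITY_ORDER.index(threshold)
    return any(
        SEVERITY_ORDER.index(r["severity"].upper()) >= minimum
        for r in results
    )


def exit_code(results, threshold="CRITICAL"):
    """Map the gating decision to a process exit code for the CI step."""
    return 1 if should_block(results, threshold) else 0


if __name__ == "__main__":
    results = [
        {"rule": "s3-no-versioning", "severity": "HIGH"},
        {"rule": "ec2-no-monitoring", "severity": "low"},
    ]
    print("block on CRITICAL:", should_block(results))
    print("block on HIGH:", should_block(results, threshold="HIGH"))
```

A CI step would call `sys.exit(exit_code(results))` after loading the results, so a non-zero exit fails the job and blocks the merge.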
Not all violations are equal. Prioritise by:
Risk Score = Severity × Exploitability × Data Sensitivity
Examples:
- Public S3 bucket with PII: CRITICAL (10/10)
- Missing CloudWatch monitoring: LOW (2/10)
- Unencrypted dev database: MEDIUM (5/10)
An agent skill can calculate these risk scores automatically and order fixes by priority.
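The risk score formula above does not say how its factors are scaled. One possible reading, used in this sketch, rates each factor from 0.0 to 1.0 and multiplies the product by 10 to get a score out of 10. The factor ratings below are illustrative values picked to land near the examples, not figures from a real assessment, and the `Finding` class is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    name: str
    severity: float          # assumed scale: 0.0 (none) to 1.0 (worst)
    exploitability: float    # assumed scale: 0.0 to 1.0
    data_sensitivity: float  # assumed scale: 0.0 to 1.0

    @property
    def risk_score(self) -> float:
        """Severity x Exploitability x Data Sensitivity, scaled to 0-10."""
        product = self.severity * self.exploitability * self.data_sensitivity
        return round(10 * product, 1)


def prioritise(findings):
    """Order findings from highest to lowest risk score."""
    return sorted(findings, key=lambda f: f.risk_score, reverse=True)


if __name__ == "__main__":
    findings = [
        Finding("Missing CloudWatch monitoring", 0.6, 0.6, 0.6),
        Finding("Public S3 bucket with PII", 1.0, 1.0, 1.0),
        Finding("Unencrypted dev database", 0.8, 0.8, 0.8),
    ]
    for f in prioritise(findings):
        print(f"{f.risk_score:>4}  {f.name}")
```

Because the factors are multiplied, a low rating on any one of them pulls the whole score down sharply, which is why a missing control on non-sensitive data ranks well below the same gap on PII.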
Map security controls to compliance frameworks:
{
  "violation": "S3 bucket not encrypted",
  "compliance_failures": [
    "GDPR Article 32 (Data Security)",
    "PCI-DSS 3.4 (Encryption at Rest)",
    "SOC 2 CC6.7 (Encryption)"
  ]
}
Benefit: Automated compliance reporting (auditors love this).
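A compliance report is essentially this mapping inverted: for each control, which violations currently breach it. The sketch below reuses the single mapping from the JSON example; `COMPLIANCE_MAP` and `compliance_report` are hypothetical names, and a real mapping would need review by whoever owns compliance in your organisation.

```python
from collections import defaultdict

# Mapping copied from the JSON example; extend it per violation type.
COMPLIANCE_MAP = {
    "S3 bucket not encrypted": [
        "GDPR Article 32 (Data Security)",
        "PCI-DSS 3.4 (Encryption at Rest)",
        "SOC 2 CC6.7 (Encryption)",
    ],
}


def compliance_report(violations):
    """Group open violations under each compliance control they breach."""
    report = defaultdict(list)
    for violation in violations:
        # Violations with no mapping are skipped rather than guessed at.
        for control in COMPLIANCE_MAP.get(violation, []):
            report[control].append(violation)
    return dict(report)


if __name__ == "__main__":
    open_violations = ["S3 bucket not encrypted", "EC2 instance has no monitoring"]
    for control, items in compliance_report(open_violations).items():
        print(f"{control}: {', '.join(items)}")
```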
Security automation becomes critical at scale, but the approach evolves as teams grow. Here’s how to scale from departmental tooling to enterprise infrastructure.
Approach: Scripts + CI/CD checks
What works:
Limitations:
When to evolve: When a security engineer spends more than 50% of their time triaging violations, or a compliance audit requires centralised reporting
Approach: Central security policy repository + observability dashboard
What changes:
Infrastructure requirements:
Governance patterns:
Common challenges:
When to evolve: When managing 200+ developers, or multi-cloud/hybrid infrastructure
Approach: Security platform with advanced automation, federated enforcement
What this looks like:
encryption = true to S3 bucket)
Advanced capabilities:
Enterprise considerations:
Platform maturity requirements:
This is rare: Most organisations don’t reach this scale. If you’re here, security is a platform product with a dedicated team.
Anti-pattern 1: Block Everything
Anti-pattern 2: No Exception Process
Anti-pattern 3: Stale Policies
Anti-pattern 4: No Metrics
Cost of manual security review (50-200 developers):
Cost of automated security (50-200 developers):
Break-even: After 3-6 months, automation pays for itself by freeing the security engineer for higher-value work (architecture reviews, incident response)
Security automation shifts security left (design phase) and scales security expertise.
Without automation:
With automation (threat-modeler + iac-security-auditor):
ROI Calculation:
Start:
The goal: Make security the default, not an afterthought. Agents enforce security standards so architects can focus on novel threats.