A practical guide to AI/ML tools for Technical Architects. Organised by category with trade-off analysis.
| Tool | Type | Key Features | Best For | Trade-offs |
|---|---|---|---|---|
| Cursor | AI-native IDE (VSCode fork) | Composer mode (multi-file edits), voice-to-code, context-aware | Architects who want control + velocity | ✅ High control, iterative feedback ❌ Paid product, vendor lock-in |
| Claude Code | CLI agent | Agent skills, MCP integration, terminal-based | Architects building governance automation | ✅ Extensible (skills), runs locally ❌ Command-line only, learning curve |
| GitHub Copilot | Autocomplete plugin | Inline suggestions, works in any IDE | Quick productivity boost, low friction | ✅ IDE-agnostic, low learning curve ❌ Line-level only (no multi-file) |
| Replit Agent | Cloud IDE | Full-stack app generation, hosting included | Rapid prototyping, demos | ✅ Zero setup, instant deploy ❌ Cloud-only, less control |
| Devin | Autonomous agent | Fully autonomous (plans, codes, tests, deploys) | Experimental, high-risk projects | ✅ Minimal human input ❌ Low control, unpredictable |
Architect’s Decision Framework:
| Provider | Models | Strengths | Weaknesses | Pricing |
|---|---|---|---|---|
| Anthropic | Claude Opus 4.5, Sonnet 4.5, Haiku | Long context (200k), tool use, safety | Not best at math/code (vs GPT-4) | ££££ (Opus), £££ (Sonnet), £ (Haiku) |
| OpenAI | GPT-4o, GPT-4 Turbo, GPT-3.5 | Code generation, reasoning, broad training | Shorter context (128k), weaker safety guardrails (vs Claude) | ££££ (GPT-4), ££ (GPT-3.5) |
| Google | Gemini 2.0, Gemini Pro | Multimodal (video, audio), fast inference | Newer, less battle-tested | £££ (2.0), ££ (Pro) |
| Open Source | Llama 3.1, Mixtral, Qwen | Free, runs locally, no data sent to cloud | Lower quality, need infrastructure | Free (compute cost only) |
Key Insight: the “best model for coding” changes monthly. Follow the community (Twitter, Reddit) to track the current leader.
Architect’s Decision:
| Database | Type | Best For | Trade-offs |
|---|---|---|---|
| Pinecone | Managed, cloud-native | Production, scale quickly | ✅ Easy setup, auto-scaling ❌ Vendor lock-in, cost at scale |
| Weaviate | Open-source, self-hosted | Need control, hybrid search (vector + keyword) | ✅ Flexible, open-source ❌ Operational overhead, slower setup |
| pgvector | Postgres extension | Already using Postgres, simple stack | ✅ No new infrastructure ❌ Less optimised for scale |
| Chroma | In-memory, local | Prototyping, small datasets | ✅ Fast setup, free ❌ Not production-ready, in-memory by default |
| Qdrant | Open-source, Rust-based | High performance, self-hosted | ✅ Fast, efficient ❌ Smaller community, newer tool |
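
For the pgvector option above, “no new infrastructure” means similarity search lives inside the Postgres you already run. A minimal sketch, assuming the pgvector extension is installed and the `psycopg`, `pgvector`, and `numpy` Python packages are available; the connection string, table name, and toy 3-dimensional vectors are illustrative:

```python
# Minimal pgvector sketch: store and query embeddings in an existing Postgres instance.
import numpy as np
import psycopg
from pgvector.psycopg import register_vector

conn = psycopg.connect("postgresql://localhost/appdb", autocommit=True)
conn.execute("CREATE EXTENSION IF NOT EXISTS vector")
register_vector(conn)  # teach psycopg how to send/receive the vector type

conn.execute("""
    CREATE TABLE IF NOT EXISTS documents (
        id bigserial PRIMARY KEY,
        content text,
        embedding vector(3)  -- dimension must match your embedding model
    )
""")

# In practice the embedding comes from an embedding model; a toy vector here.
conn.execute(
    "INSERT INTO documents (content, embedding) VALUES (%s, %s)",
    ("Example architecture decision record", np.array([0.1, 0.2, 0.3])),
)

# `<->` is pgvector's Euclidean distance operator; closest rows first.
rows = conn.execute(
    "SELECT content FROM documents ORDER BY embedding <-> %s LIMIT 5",
    (np.array([0.1, 0.2, 0.3]),),
).fetchall()
print(rows)
```

When query volume or index size outgrows a single Postgres instance, that is the signal to revisit the managed or purpose-built options in the table.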
Architect’s Decision:
| Platform | Type | Best For | Trade-offs |
|---|---|---|---|
| AWS SageMaker | Managed, cloud | AWS-native, integrated with AWS services | ✅ Built-in features, auto-scaling ❌ Vendor lock-in, higher cost |
| Azure ML | Managed, cloud | Azure-native, enterprise integration | ✅ Enterprise features, security ❌ Vendor lock-in, complex UI |
| GCP Vertex AI | Managed, cloud | GCP-native, good for BigQuery integration | ✅ Integrated ML tools ❌ Vendor lock-in, newer platform |
| MLflow | Open-source | Experiment tracking, model registry | ✅ Vendor-neutral, free ❌ No compute (need to integrate) |
| Kubeflow | Open-source, Kubernetes | Full ML pipelines on K8s | ✅ Cloud-agnostic, powerful ❌ High complexity, steep learning curve |
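
MLflow is the vendor-neutral row above, so it helps to see what “experiment tracking, model registry” looks like in practice. A minimal sketch, assuming the `mlflow` and `scikit-learn` packages; the tracking URI, experiment name, and model choice are assumptions for illustration:

```python
# Minimal MLflow tracking sketch: log parameters, metrics, and a model artefact.
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

mlflow.set_tracking_uri("file:./mlruns")        # or an http:// tracking server
mlflow.set_experiment("architecture-guide-demo")

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

with mlflow.start_run(run_name="baseline-rf"):
    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(X_train, y_train)

    mlflow.log_param("n_estimators", 100)
    mlflow.log_metric("test_accuracy", model.score(X_test, y_test))
    mlflow.sklearn.log_model(model, "model")     # versionable artefact for the registry
```

The same logging calls work against a local directory or a remote tracking server, which is what keeps MLflow vendor-neutral; the trade-off in the table stands, since compute and serving still have to come from somewhere else.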
Architect’s Decision:
| Tool | Purpose | Best For | Trade-offs |
|---|---|---|---|
| .cursorrules | Project conventions (Cursor-specific) | Coding standards, architecture decisions | ✅ Simple (one file), built-in to Cursor ❌ Cursor-only, limited to text rules |
| .clinerules | Project conventions (Claude Code) | Similar to .cursorrules but for Claude Code | ✅ Works with CLI agent ❌ Claude Code-only |
| Claude Projects | Persistent context (Anthropic) | Project-specific context across sessions | ✅ Cross-session memory ❌ Anthropic-only, cloud-based |
| ChatGPT Custom Instructions | User-level context (OpenAI) | Personal preferences, global rules | ✅ Works across all chats ❌ User-level (not project-level) |
| MCP Servers | Data connectivity | Connect LLMs to Slack, GitHub, databases (see sketch below) | ✅ Real-time data access ❌ Requires setup, security config |
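
MCP is the row that most often needs a concrete picture. A minimal sketch of an MCP server exposing internal data to an LLM client, assuming the official MCP Python SDK (`pip install mcp`) and its FastMCP helper; the server name, tool name, and the ADR lookup it performs are illustrative:

```python
# Minimal MCP server sketch: expose an internal lookup as a tool an LLM can call.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("architecture-records")

@mcp.tool()
def get_adr(adr_id: str) -> str:
    """Return the text of an Architecture Decision Record by its identifier."""
    # In a real server this would query a database or document store.
    adrs = {"ADR-001": "Use event-driven integration between billing and orders."}
    return adrs.get(adr_id, "ADR not found")

if __name__ == "__main__":
    mcp.run()  # serves over stdio so a desktop client can connect
```

A client such as Claude Desktop or Claude Code can then be configured to launch this script, and the model calls `get_adr` on demand instead of relying on stale pasted context.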
Architect’s Decision:
| Tool | Coverage | Best For | Trade-offs |
|---|---|---|---|
| tfsec | Terraform | Terraform-specific, fast | ✅ Fast, detailed reports ❌ Terraform-only |
| Checkov | Terraform, CloudFormation, K8s, ARM | Multi-platform IaC | ✅ Broad coverage, active development ❌ Slower than tfsec |
| Terrascan | Terraform, K8s, Docker | Policy-as-code | ✅ Custom policies ❌ Smaller community |
| Snyk IaC | Multi-platform | Enterprise, integrated with Snyk ecosystem | ✅ Enterprise features, good UI ❌ Paid product |
Architect’s Decision:
| Tool | Type | Best For | Trade-offs |
|---|---|---|---|
| Microsoft Threat Modeling Tool | GUI-based | Windows users, STRIDE methodology | ✅ Free, structured STRIDE ❌ Windows-only, manual process |
| OWASP Threat Dragon | Web/desktop | Cross-platform, open-source | ✅ Open-source, multi-platform ❌ Less mature than MS tool |
| IriusRisk | Enterprise platform | Large orgs, compliance tracking | ✅ Automated, enterprise features ❌ Expensive, complex setup |
| Agent Skills | AI-powered | Automated STRIDE during design phase | ✅ Fast, integrated into workflow ❌ Requires building the skill |
Architect’s Decision:
| Tool | Purpose | Best For | Trade-offs |
|---|---|---|---|
| OpenAPI | REST API schemas | REST APIs, code generation | ✅ Industry standard, tooling ecosystem ❌ Verbose for simple APIs |
| GraphQL | API query language + schema | Flexible queries, strong typing | ✅ Client-driven queries ❌ More complex backend |
| Avro | Schema for events | Kafka, event-driven systems | ✅ Schema evolution support ❌ Requires schema registry |
| Protobuf | Binary serialization + schema | gRPC, high-performance systems | ✅ Compact, fast ❌ Not human-readable |
| OpenAPI Diff | Breaking change detection | CI/CD validation | ✅ Automated detection ❌ CLI-only, basic reports |
| Pact | Consumer-driven contracts | Microservices integration testing | ✅ Consumer-driven, good for testing ❌ Setup overhead |
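
To make the Pact row concrete without committing to its API here, the sketch below shows the consumer-driven idea in plain Python: the consumer team records the interaction it relies on, and the provider’s CI replays it and fails on a breaking change. Pact automates exactly this (plus publishing contracts to a broker); the endpoint and field names are illustrative:

```python
# Plain-Python sketch of a consumer-driven contract check run in the provider's CI.
import requests

# The "contract": what the order-history UI expects from the orders API.
CONTRACT = {
    "request": {"method": "GET", "path": "/orders/42"},
    "response": {"status": 200, "required_fields": ["id", "status", "total"]},
}

def verify_provider(base_url: str) -> None:
    """Fail the build if the provider no longer satisfies the consumer's contract."""
    req = CONTRACT["request"]
    resp = requests.request(req["method"], base_url + req["path"])

    expected = CONTRACT["response"]
    assert resp.status_code == expected["status"], f"unexpected status {resp.status_code}"
    missing = [f for f in expected["required_fields"] if f not in resp.json()]
    assert not missing, f"breaking change: missing fields {missing}"

if __name__ == "__main__":
    verify_provider("http://localhost:8080")
```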
Architect’s Decision:
| Tool | Type | Best For | Trade-offs |
|---|---|---|---|
| OPA (Open Policy Agent) | Policy engine | Enforcing architectural rules in code | ✅ Flexible, declarative policies ❌ Learning curve (Rego language) |
| Conftest | OPA for configs | Testing IaC, K8s manifests | ✅ Easy integration with CI/CD ❌ Limited to static analysis |
| Kyverno | Kubernetes-native | K8s policy enforcement | ✅ Simpler than OPA for K8s ❌ K8s-only |
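
These tools share the same loop: load a config artefact, evaluate rules, fail the pipeline on a violation. The sketch below shows that loop in plain Python so the idea is concrete; OPA and Conftest express the same check declaratively in Rego, and the manifest path and rule here are illustrative:

```python
# Plain-Python sketch of a policy-as-code check for a Kubernetes Deployment manifest.
import sys
import yaml  # pip install pyyaml

def check_resource_limits(manifest: dict) -> list[str]:
    """Rule: every container in a Deployment must declare resource limits."""
    violations = []
    containers = (
        manifest.get("spec", {})
        .get("template", {})
        .get("spec", {})
        .get("containers", [])
    )
    for c in containers:
        if "limits" not in c.get("resources", {}):
            violations.append(f"container '{c.get('name')}' has no resource limits")
    return violations

if __name__ == "__main__":
    with open(sys.argv[1]) as f:
        problems = check_resource_limits(yaml.safe_load(f))
    if problems:
        print("\n".join(problems))
        sys.exit(1)  # non-zero exit fails the pipeline
```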
Architect’s Decision:
| Tool | Focus | Strengths | Gaps |
|---|---|---|---|
| vFunction | Architectural drift detection | Visualises true architecture vs. intended, monitors drift in real-time | Observability-focused (tells you drift happened, doesn’t prevent it) |
| CodeRabbit | AI peer review | Deep, context-aware reviews beyond syntax | Recommender, not enforcer (no comprehension verification gate) |
| Qodo (Codium) | AI test generation + review | Good at generating tests, coverage analysis | Focuses on test coverage, not architectural comprehension |
| Traycer | Intent-based review | Detects when code “veers off intent,” monitors modularity | Still prioritises velocity over understanding |
| SonarQube | Code quality + security | Established platform, wide language support | Focuses on bugs/vulnerabilities, not architectural patterns |
Architect’s Decision:
| Tool | Purpose | Best For | Trade-offs |
|---|---|---|---|
| madge | Dependency graph (JS/TS) | Visualizing module dependencies, circular detection | ✅ Fast, simple ❌ JavaScript-only |
| dependency-cruiser | Advanced dependency rules | Enforcing architectural boundaries (layers, modules) | ✅ Powerful rule engine ❌ Complex configuration |
| jscodeshift | AST transformations (JS/TS) | Automated refactoring, codemod creation | ✅ Powerful for migrations ❌ JavaScript-only, steep learning curve |
| semgrep | Multi-language AST analysis | Finding anti-patterns, enforcing custom rules | ✅ Works across many languages ❌ Less architectural focus (more security) |
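
The boundary-enforcement use case (dependency-cruiser’s strength) is easiest to see with one concrete rule. Below is a plain-Python sketch of the kind of check these tools automate, using only the standard-library `ast` module; the directory layout and the “domain must not import infrastructure” rule are illustrative:

```python
# Plain-Python sketch of an architectural boundary check over import statements.
import ast
import pathlib
import sys

FORBIDDEN_PREFIX = "infrastructure"   # domain code may not depend on this package

def violations(domain_dir: str = "src/domain") -> list[str]:
    found = []
    for path in pathlib.Path(domain_dir).rglob("*.py"):
        tree = ast.parse(path.read_text())
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                names = [a.name for a in node.names]
            elif isinstance(node, ast.ImportFrom):
                names = [node.module or ""]
            else:
                continue
            for name in names:
                if name.startswith(FORBIDDEN_PREFIX):
                    found.append(f"{path}:{node.lineno} imports {name}")
    return found

if __name__ == "__main__":
    problems = violations()
    print("\n".join(problems) or "no boundary violations")
    sys.exit(1 if problems else 0)
```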
Architect’s Decision:
| Provider | Model | Input (per 1M tokens) | Output (per 1M tokens) | Use Case |
|---|---|---|---|---|
| Anthropic | Claude Opus 4.5 | £15 | £75 | Complex reasoning, long context |
| Anthropic | Claude Sonnet 4.5 | £3 | £15 | Balanced (most use cases) |
| Anthropic | Claude Haiku | £0.25 | £1.25 | High-volume, simple tasks |
| OpenAI | GPT-4o | £5 | £15 | Code generation, general |
| OpenAI | GPT-3.5 Turbo | £0.50 | £1.50 | Cost-sensitive, simple tasks |
| Google | Gemini 2.0 | £4 | £12 | Multimodal, fast inference |
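
Prices above are per million tokens, so cost estimation is simple arithmetic: tokens × (price ÷ 1,000,000). A quick sketch using the table’s prices; the monthly traffic figures are assumptions for illustration:

```python
# Estimate monthly LLM spend from per-million-token prices.
PRICE_PER_M = {                      # (input £, output £) per 1M tokens, from the table
    "claude-sonnet": (3.00, 15.00),
    "claude-haiku": (0.25, 1.25),
    "gpt-4o": (5.00, 15.00),
}

def monthly_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    price_in, price_out = PRICE_PER_M[model]
    return input_tokens / 1_000_000 * price_in + output_tokens / 1_000_000 * price_out

# Example: 50M input / 10M output tokens per month.
for model in PRICE_PER_M:
    print(f"{model}: £{monthly_cost(model, 50_000_000, 10_000_000):,.2f}")
```

Running the numbers like this before committing to a model is the quickest way to see whether Haiku-class pricing is good enough for the high-volume, simple paths.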
Cost Optimization Tips:
Example stacks for common situations:

Start-simple stack (solo architect, minimal new infrastructure):
- Development: Cursor (AI IDE)
- Context: .cursorrules (conventions)
- LLM: Claude Sonnet (balanced cost/quality)
- RAG: pgvector (simple, no new infrastructure)

Governance-focused stack:
- Development: Claude Code (CLI agent)
- Context: Agent Skills (ADR, security, contracts)
- LLM: Claude Opus (safety, long context)
- RAG: Weaviate (self-hosted, control)
- IaC Security: Checkov (multi-platform)
- Schema: OpenAPI + Pact (contract-driven)

Managed cloud ML stack:
- MLOps: AWS SageMaker (managed)
- LLM: OpenAI GPT-4o (code generation)
- Vector DB: Pinecone (managed, scalable)
- Monitoring: MLflow (experiment tracking)

Open-source / budget stack:
- Development: VSCode + GitHub Copilot
- LLM: Llama 3.1 (self-hosted)
- RAG: Chroma → Qdrant (free, self-hosted)
- MLOps: MLflow (free) + Kubernetes
- IaC Security: tfsec (free)
When evaluating a new AI/ML tool, ask:
Model capabilities change monthly. This comparison is a snapshot (January 2026).
How to stay updated:
Red flags (tool might be declining):
There is no “one size fits all” tool. Your stack should match:
Start simple:
Iterate: Re-evaluate tools quarterly. The landscape changes fast.