Mastering AWS Costs: A CTO's Guide to FinOps

Cloud spend scales with product success—until it scales faster. A practical framework for cost visibility, accountability, and control.

Every growing product eventually hits the same inflection point: AWS spend starts climbing faster than revenue. The bill is not the problem—the gap between what you spend and what you understand about that spend is. FinOps gives engineering and finance a shared operating model for cost decisions without throttling delivery.

Why Cost Discipline Breaks Down

Cost management fails when it lives in a spreadsheet that finance maintains and engineering never sees. The people who make architecture decisions—engineers—rarely see cost data in context. The people who see the invoice—finance—cannot evaluate whether a given line item is waste or investment. FinOps bridges this by embedding cost awareness into engineering workflows, not by adding approval gates that slow down shipping.

The tension is real. Optimise too aggressively and you throttle delivery velocity—teams cannot ship features when every resource request triggers a procurement cycle. Ignore costs and you erode the margin that funds next quarter’s roadmap. The right approach treats cloud spend as an engineering metric: visible, measured, and discussed in the same forums where architecture choices happen.

The FinOps Operating Model

FinOps is not a tool or a dashboard. It is an operating model built on three iterative phases: Inform, Optimise, Operate. Most teams try to skip to Optimise—buying Savings Plans or right-sizing instances—without building the visibility that tells them whether those changes are working. That sequence is backwards.

Inform: Visibility comes before action. Tag every resource with team, service, and environment, and activate those keys as cost allocation tags in the Billing console so they show up in billing data. Use AWS Cost Explorer for quick exploration, and pipe Cost and Usage Reports (CUR) to Athena for SQL access to granular billing data. Build dashboards that engineering leads actually check; if nobody looks at the data, the tooling is theatre.
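As a rough illustration of what attribution by tag looks like, here is a minimal sketch in Python. The row shape, the `LINE_ITEMS` sample data, and the function names are hypothetical; a real CUR export has many more columns and would normally be queried in Athena rather than in memory.

```python
from collections import defaultdict

# Hypothetical, simplified billing rows (not the real CUR schema).
LINE_ITEMS = [
    {"cost": 120.0, "tags": {"team": "payments", "service": "api", "environment": "prod"}},
    {"cost": 45.5, "tags": {"team": "payments", "service": "worker", "environment": "prod"}},
    {"cost": 30.0, "tags": {"team": "search", "service": "indexer", "environment": "staging"}},
    {"cost": 18.25, "tags": {}},  # untagged resource
]


def spend_by_tag(line_items, tag_key):
    """Sum cost per value of tag_key; rows missing the tag land under 'untagged'."""
    totals = defaultdict(float)
    for item in line_items:
        value = item["tags"].get(tag_key) or "untagged"
        totals[value] += item["cost"]
    return dict(totals)


def untagged_share(line_items, tag_key):
    """Fraction of total spend that cannot be attributed via tag_key."""
    totals = spend_by_tag(line_items, tag_key)
    grand_total = sum(totals.values())
    if grand_total == 0:
        return 0.0
    return totals.get("untagged", 0.0) / grand_total


if __name__ == "__main__":
    print(spend_by_tag(LINE_ITEMS, "team"))
    print(f"untagged share: {untagged_share(LINE_ITEMS, 'team'):.1%}")
```

Tracking the untagged share over time is one simple way to tell whether visibility is improving before any optimisation work starts.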

Optimise: Start with the highest-impact, lowest-risk changes. Right-size instances using CloudWatch metrics: look for sustained CPU below 20% or memory below 30%. EC2 does not publish memory utilisation to CloudWatch by default, so the memory check requires the CloudWatch agent. Prefer Savings Plans over Reserved Instances for flexibility. Clean up unattached EBS volumes, idle load balancers, and forgotten NAT gateways. A single unused NAT gateway in eu-west-1 costs roughly €33/month for doing nothing, because the hourly charge applies whether or not traffic flows.
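The right-sizing thresholds above can be sketched as a small filter over utilisation samples. This is an illustrative sketch, not a CloudWatch client: the input shape, the function names, and the `min_fraction` definition of "sustained" (here, 95% of samples below the threshold) are assumptions made for the example.

```python
CPU_THRESHOLD = 20.0     # percent, taken from the text
MEMORY_THRESHOLD = 30.0  # percent, taken from the text


def is_sustained_below(samples, threshold, min_fraction=0.95):
    """True if at least min_fraction of the samples fall below threshold.

    min_fraction is an assumed definition of "sustained"; tune it to taste.
    An empty series never qualifies, so missing data is not treated as idle.
    """
    if not samples:
        return False
    below = sum(1 for sample in samples if sample < threshold)
    return below / len(samples) >= min_fraction


def rightsizing_candidates(metrics):
    """Return sorted instance ids whose CPU or memory is persistently low.

    metrics maps an instance id to {"cpu": [...], "memory": [...]} in percent.
    """
    candidates = []
    for instance_id, series in metrics.items():
        low_cpu = is_sustained_below(series.get("cpu", []), CPU_THRESHOLD)
        low_memory = is_sustained_below(series.get("memory", []), MEMORY_THRESHOLD)
        if low_cpu or low_memory:
            candidates.append(instance_id)
    return sorted(candidates)
```

A brief spike, as with a nightly batch job, keeps an instance off the list under this definition, which is usually the safer default for a first right-sizing pass.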

Operate: The hard part is not finding savings—it is keeping them. Set budget alerts per team. Include cost impact in architecture review templates. Run a monthly cost review where engineering leads present their team’s spend delta and explain the drivers. Make it a 15-minute standing agenda item, not a quarterly panic.
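The monthly review needs only two numbers per team: the spend delta and whether the team is approaching its budget. A minimal sketch follows, assuming spend and budgets are available as plain dictionaries keyed by team; the function names and the 80% alert ratio are illustrative choices, not AWS Budgets behaviour.

```python
def spend_deltas(previous, current):
    """Return {team: (absolute_delta, percent_delta)} between two months.

    percent_delta is None when the team had no spend in the previous month.
    """
    deltas = {}
    for team in sorted(set(previous) | set(current)):
        before = previous.get(team, 0.0)
        after = current.get(team, 0.0)
        percent = None if before == 0 else (after - before) / before * 100
        deltas[team] = (after - before, percent)
    return deltas


def teams_near_budget(current, budgets, alert_ratio=0.8):
    """Return sorted teams whose spend has reached alert_ratio of their budget.

    Teams without a budget entry are skipped rather than flagged.
    """
    return sorted(
        team
        for team, spend in current.items()
        if team in budgets and spend >= budgets[team] * alert_ratio
    )
```

A team that appears for the first time gets a `None` percentage rather than a division error, which makes new spend easy to spot in the review.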

Tagging: Boring and Non-Negotiable

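Untagged resources are invisible resources. If you cannot attribute spend to a team or service, every cost conversation devolves into guesswork. Enforce tagging at the CI/CD layer: reject deployments that create resources without the required tag set. Use AWS Organizations tag policies to block non-compliant tag keys and values at the API level. Tag policies validate the tags that are supplied; to also block resources created with no tags at all, teams commonly pair them with service control policies that deny create requests missing the required tags.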
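A CI/CD tag gate can be as small as the sketch below. It assumes the pipeline can produce a list of planned resources as dictionaries with a `name` and an optional `tags` mapping; that shape, the `REQUIRED_TAGS` tuple, and the function names are hypothetical and would need adapting to the output of whatever infrastructure-as-code tool is in use.

```python
REQUIRED_TAGS = ("team", "service", "environment")


def missing_tags(resource, required=REQUIRED_TAGS):
    """Return the required tag keys that are absent or blank on a resource."""
    tags = resource.get("tags") or {}
    missing = []
    for key in required:
        value = tags.get(key)
        if value is None or not str(value).strip():
            missing.append(key)
    return missing


def check_deployment(resources, required=REQUIRED_TAGS):
    """Return {resource_name: [missing keys]} for every non-compliant resource."""
    violations = {}
    for resource in resources:
        missing = missing_tags(resource, required)
        if missing:
            violations[resource["name"]] = missing
    return violations
```

In a pipeline step, a non-empty result from `check_deployment` would translate into a non-zero exit code, so the deployment fails before anything is created.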
