How a Leading Trading Company Cut Their Azure Costs by 50% with FinOps
In our introductory blog on how FinOps can save your cloud program, we showed how this approach to managing cloud can help with some of the biggest cloud headaches:
- Cloud spend spiralling out of control
- Not seeing the cost savings from your data centre lift-and-shift
- Spending too much on licenses and services that you can’t control
Today, I’m going to take you through a real-world example of how Contino used FinOps to save a leading trading company 50% on their cloud bill.
I’ll quickly remind you of the key principles of FinOps and how it can save you money before diving into the case study.
What Is FinOps and Cloud Financial Management?
FinOps is an approach to managing and operating cloud spend by breaking down the silos between engineering, finance and procurement. FinOps is meant to drive a shift in culture in cloud financial management, akin to how DevOps and SRE have driven a cultural change in engineering.
The cloud operates in a fundamentally different way to on-premises. It’s decentralised and scalable, so cloud spend fluctuates constantly, which makes it tricky to monitor and predict.
FinOps helps your finance teams to manage the shift from CapEx to OpEx spending, to monitor and track your fluctuating cloud spend as well as architect your platform, your teams and your processes with finance in mind.
If you want to know more, I’ve written in detail about the principles and benefits of FinOps elsewhere.
So how does it help you in real life? We recently started a FinOps engagement at a major trading company and are on track to cut their cloud bill by a massive 50%.
Let me take you through how we did it.
Cutting Cloud Bills by 50% with FinOps at a Leading Trading Company
The client is a leading trading company that is two years into a separation from its parent company. As part of this split, many old IT systems have to be replaced with new systems.
In order to take advantage of the compelling business benefits of the cloud, the organisation is redeploying its entire trading platform onto Microsoft Azure public cloud.
However, the organisation was unfamiliar with best practices for managing the cloud, and its monthly costs began spiralling out of control!
They brought in the Contino team to help establish FinOps and cloud financial management best practices and regain some control over their cloud costs so they can get the most benefit out of their new cloud platform.
Contino conducted an Azure cost optimisation assessment (in line with Microsoft CAF principles) over a four-week period, culminating in a comprehensive Cloud Cost Management Report for the organisation.
The aim of the cloud assessment was to understand the organisation’s cloud workloads, as well as its development practices and utilisation of infrastructure in order to regain control of overall cloud spend.
The assessment flagged that the rapid development of the platform was encouraging poor cloud practices:
- Prioritising deadlines over efficiency: time to release has been prioritised over cloud efficiencies and overall infrastructure spend
- Over-consuming databases: high consumption of premium-tier DTU SQL databases, including associated development activities
- Unnecessary copies of databases: multiple copies of large PROD SQL databases kept live and available with huge database instances > 1TB forcing higher cost tiers
- Replicating data centre consumption patterns: VMs running 24/7 with little load are replicating expensive data centre patterns in the cloud
- Unused services: services are left operational and at full-scale when unused or unloaded without capitalising on efficiencies offered by Microsoft
Contino put together a comprehensive FinOps strategy to address these issues, covering four areas:
- Improving cloud spend monitoring
- Optimising Azure resource usage and placement
- Short-term tactical optimisation
- Long-term strategic optimisation
Let's dive into the details.
1. Improving Cloud Spend Monitoring
Contino recommended an effective approach to cloud spend monitoring which will allow the organisation to both monitor and address cloud resource costs, helping to identify current issues and prevent future unplanned growth in costs.
- Pair the right workloads with the right subscriptions: non-production workloads are great candidates to run within a cheaper DEV/TEST assigned subscription
- Make use of reserved instances: this could save 20-70% for pre-planned, long-running workloads for compute and SQL databases
- Automate limits on resource sizes: using Azure Policy and Management Groups will restrict resource sizes for developer usage
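To make the last point more concrete, here is a minimal sketch of what a size limit could look like as an Azure Policy rule, built as a Python dictionary and printed as JSON. It is not the policy Contino deployed for the client: the allow-list of VM sizes and the function name `build_allowed_sku_policy` are assumptions for illustration. The rule shape mirrors the way Azure Policy restricts VM sizes, by denying any virtual machine whose SKU is not in an approved list.

```python
import json

# Hypothetical allow-list of small VM sizes for developer subscriptions.
# The real list would come from the organisation's own sizing decisions.
ALLOWED_DEV_SKUS = ["Standard_B2s", "Standard_B2ms", "Standard_D2s_v3"]


def build_allowed_sku_policy(allowed_skus):
    """Return a policy rule that denies VMs whose size is not in the allow-list."""
    return {
        "if": {
            "allOf": [
                # Only evaluate virtual machines.
                {"field": "type", "equals": "Microsoft.Compute/virtualMachines"},
                # Match any VM whose SKU name is NOT in the allow-list.
                {
                    "not": {
                        "field": "Microsoft.Compute/virtualMachines/sku.name",
                        "in": list(allowed_skus),
                    }
                },
            ]
        },
        # Matching resources are rejected at deployment time.
        "then": {"effect": "deny"},
    }


if __name__ == "__main__":
    print(json.dumps(build_allowed_sku_policy(ALLOWED_DEV_SKUS), indent=2))
```

Assigning a rule like this at a Management Group that contains the developer subscriptions would apply the limit to all of them at once, rather than subscription by subscription.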
2. Optimising Azure Resource Usage and Placement
The next step was to assess how to place workloads, and to determine which services and tier sizes would make the most cost-efficient use of the cloud shared responsibility model for the organisation.
- Use cloud-native services over traditional services: as-a-service workloads are more cost-efficient
- Start small then scale: provisioning the minimum size resource then scaling to suit load saves significant spend over maxed out resources running 24/7
- Refactor applications to optimise them for the cloud: enable workloads to take advantage of native Azure services, further exploiting the shared responsibility model to keep costs to a minimum
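The "start small then scale" idea can be sketched as a simple threshold rule. This is an illustrative sketch only: the function name `desired_instances`, the CPU thresholds and the instance limits are assumptions, not values from the assessment. In practice, a managed autoscale feature would normally make this decision for you.

```python
def desired_instances(current, cpu_percent, minimum=1, maximum=4,
                      scale_out_at=70.0, scale_in_at=30.0):
    """Return the instance count to run next, given average CPU load.

    Starts from the current count, adds one instance under heavy load,
    removes one under light load, and always stays within the limits.
    """
    if cpu_percent > scale_out_at:
        target = current + 1      # busy: add capacity
    elif cpu_percent < scale_in_at:
        target = current - 1      # quiet: release capacity
    else:
        target = current          # within the comfortable band
    # Never drop below the minimum or exceed the maximum.
    return max(minimum, min(maximum, target))


if __name__ == "__main__":
    for load in (10.0, 50.0, 85.0):
        print(load, desired_instances(current=2, cpu_percent=load))
```

The point of the sketch is the default: the resource runs at the minimum size unless load justifies more, instead of running at the maximum around the clock.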
3. Short-term Tactical Optimisation
The consumption model highlighted immediate tactical changes that the organisation can make.
- Switch off/downsize instances that are not in use: at least ~30% per calendar month (PCM) can be saved by better working practices
- Optimise unnecessarily large database sizes: optimising large SQL databases will save ~15% per month
- Replace VMs with App Service: this offers a more sustainable, maintenance-free host for applications in the cloud, saving at least 10% PCM on overall cloud spend
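As a rough illustration of the first item, the sketch below flags VMs whose average CPU usage is low enough that they are candidates for switching off or downsizing. The VM names, the sample values and the 5% threshold are all hypothetical. In a real engagement the utilisation figures would come from your monitoring tooling, not a hard-coded dictionary.

```python
from statistics import mean


def flag_underused(cpu_samples_by_vm, threshold_percent=5.0):
    """Return the names of VMs whose mean CPU usage is below the threshold.

    VMs with no samples are skipped, not flagged, because there is no
    evidence either way.
    """
    return sorted(
        name
        for name, samples in cpu_samples_by_vm.items()
        if samples and mean(samples) < threshold_percent
    )


if __name__ == "__main__":
    # Hypothetical CPU percentages sampled over a working week.
    samples = {
        "vm-build-01": [2.0, 1.5, 3.0, 2.5],
        "vm-trading-api": [45.0, 60.0, 52.0, 48.0],
        "vm-old-demo": [0.5, 0.4, 0.6, 0.5],
    }
    print(flag_underused(samples))
```

A list like this is only a starting point for a conversation with the owning team. Low CPU alone does not prove a machine is unused.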
4. Long-term Strategic Optimisation
Contino investigated a set of working practices that will reap cost savings over the long-term. These are:
- Predict production workloads: enable the most efficient long-term spend reductions using vCore reserved instances
- Automation and scaling: estimated ~30% cost saving for non-production workloads by downsizing/switching off un-/under-utilised resources
- Purchasing DEV/TEST SQL databases: DEV/TEST SQL databases on a vCore model can save ~66% spend compared to a stock equivalent DTU SQL database running 24/7
- Using a one-year reserved instance alongside DEV/TEST pricing: can save ~40% over the standard vCore pay-as-you-go spend
- Optimise billing: using Azure Cost Management and Billing can help to analyse, manage and optimise cloud spend
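To show what the percentages above mean in money terms, here is a small worked example. The baseline costs are hypothetical round numbers, not the client's figures. Note that the two savings are measured against different baselines (a DTU database running 24/7 versus vCore pay-as-you-go), so the sketch deliberately does not combine them.

```python
def cost_after_saving(baseline_monthly_cost, saving_percent):
    """Return the monthly cost that remains after a percentage saving."""
    if not 0 <= saving_percent <= 100:
        raise ValueError("saving_percent must be between 0 and 100")
    return round(baseline_monthly_cost * (100 - saving_percent) / 100, 2)


# Hypothetical baselines, chosen only to make the arithmetic easy to follow.
HYPOTHETICAL_DTU_MONTHLY_COST = 1000.0         # DTU database running 24/7
HYPOTHETICAL_VCORE_PAYG_MONTHLY_COST = 1000.0  # vCore pay-as-you-go

# ~66% saving from DEV/TEST vCore pricing versus the DTU baseline.
dev_test_vcore_cost = cost_after_saving(HYPOTHETICAL_DTU_MONTHLY_COST, 66)

# ~40% saving from a one-year reservation plus DEV/TEST pricing,
# measured against the vCore pay-as-you-go baseline.
reserved_dev_test_cost = cost_after_saving(HYPOTHETICAL_VCORE_PAYG_MONTHLY_COST, 40)

if __name__ == "__main__":
    print(dev_test_vcore_cost, reserved_dev_test_cost)
```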
How Much Money Will FinOps Save the Client?
Contino estimated that the above measures would save at least 50% on cloud spend for the client based on Q1/Q2 (2020) utilisation patterns. This equates to ~$1m a year savings!
And so what has been achieved so far?
Contino is halfway through implementing the key deliverables from the report. We have:
- Created infrastructure-as-code to enable autoscaling
- Set up scaling for dev, test and UAT environments
- Programmatically scaled down database sizes
- Automated shut down of VMs over the weekend
This has resulted in cost savings of ~$45k PCM. That equates to $540k savings per year!
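The arithmetic behind these figures, and behind a weekend shutdown, can be sketched as follows. The helper names are hypothetical and the ~$1m target is the rounded estimate quoted above, so the share of target is indicative only. The weekend calculation assumes a VM is fully deallocated from the start of Saturday to the end of Sunday, which is a simplification of any real schedule.

```python
from datetime import datetime

HOURS_PER_WEEK = 7 * 24
WEEKEND_HOURS = 2 * 24


def is_weekend(moment):
    """Return True on Saturday or Sunday (weekday() is 5 or 6)."""
    return moment.weekday() >= 5


def annualise(monthly_saving):
    """Convert a per-calendar-month saving into a yearly figure."""
    return monthly_saving * 12


# Figures quoted in this post.
achieved_per_year = annualise(45_000)              # ~$45k PCM so far
estimated_target_per_year = 1_000_000              # ~$1m a year estimate
share_of_target = achieved_per_year / estimated_target_per_year

# Share of weekly compute hours avoided by a full weekend shutdown.
weekend_share_of_hours = WEEKEND_HOURS / HOURS_PER_WEEK

if __name__ == "__main__":
    print(achieved_per_year, share_of_target, round(weekend_share_of_hours, 3))
    print(is_weekend(datetime(2020, 6, 6)), is_weekend(datetime(2020, 6, 8)))
```

A scheduler could call a check like `is_weekend` before deciding whether non-production VMs should be running. A full weekend accounts for 48 of the 168 hours in a week, which is why even this simple schedule makes a visible difference to compute costs.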
We are on track to deliver a 50% reduction in cloud spend by the end of the engagement.