Cloud Migration Strategy: 8 Steps to Migrating 1000s of Applications to the Cloud Successfully
Over the last few years, I’ve been engaged in a number of cloud migration programs at an enterprise scale. In this time, I’ve seen many migration programs fall apart for reasons that could have been avoided.
Some of the most common mistakes I’ve come across include:
- Leadership not defining the “why” behind the cloud migration
- Adopting a waterfall model and bureaucratic processes, which prolong the migration
- Not engaging a partner to enable the required skill sets
- Treating the cloud as just another data center
This blog sets out how I structure a cloud migration strategy for an enterprise with hundreds or thousands of applications, so that all of the above mistakes are avoided.
Cloud Migration Best Practices
1. Know your WHY
Before getting started on any cloud migration journey, it’s important to know the “why”.
Understanding why you’re taking the cloud leap is imperative for a program to succeed and has consequences for design and implementation decisions, such as cost, treatment plans, timelines, roadmaps and team structures.
A “why” statement could look like one of the following:
- Reduce costs and increase efficiencies: the organisation is migrating to the cloud as part of a data center exit strategy. We are aiming for a fast and inexpensive solution.
- Drive digital innovation: the organisation is uplifting and modernising strategic applications by taking advantage of cloud serverless infrastructure.
So, for example, if you decide that your “why” is to exit from your data center, your focus will be on costs and efficiencies. In that case, I would advise customers to set a target cloud budget for compute and storage per department, and to consolidate billing to enable inter-departmental charges.
Start by working with the organisation on defining “why” and sticking to that decision, educating all stakeholders on the benefits of cloud and how to maximise your migration effort return on investment.
2. Find your source of truth
In an enterprise world, engineering teams have hundreds and thousands of applications that would be targeted for cloud migration or a “data center exit” plan.
Ideally, such applications would be well defined in a CMDB-type database. However, we do not live in an ideal world. Realistically, servers are shared between teams, applications and various tasks and scripts.
In order to migrate effectively, you need to get a sound grasp on the reality of your IT estate on the ground: a source of truth!
My source of truth has always been a combination of technology leaders, SMEs, CMDB and engineering teams.
I start by interrogating CMDBs and cross-referencing them with hosting providers’ CMDB records, which are usually more accurate. I then run these findings by the SMEs, who know which servers their applications run on, and the business owners, who know which servers they are paying for.
The end result is an accurate server-to-application mapping which can be used for migration.
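The reconciliation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each source is assumed to be already exported as a dictionary of server name to a set of application names, and the function name, the sample servers and the rule "SME confirmation wins, otherwise both CMDBs must agree" are hypothetical choices, not a prescribed method.

```python
def build_server_app_map(internal_cmdb, hosting_cmdb, sme_confirmed):
    """Reconcile three sources into one server-to-application mapping.

    Each argument is a dict of server name -> set of application names.
    Returns (mapping, needs_review), where needs_review lists servers
    whose records disagree and have not been confirmed by an SME.
    """
    servers = set(internal_cmdb) | set(hosting_cmdb) | set(sme_confirmed)
    mapping = {}
    needs_review = []
    for server in sorted(servers):
        if server in sme_confirmed:
            # Assumption: an SME-confirmed record overrides both CMDBs.
            mapping[server] = set(sme_confirmed[server])
        elif internal_cmdb.get(server) == hosting_cmdb.get(server):
            # Both CMDBs hold the same record, so accept it.
            mapping[server] = set(internal_cmdb[server])
        else:
            # Missing from one CMDB or conflicting: take it to the SMEs.
            needs_review.append(server)
    return mapping, needs_review


# Hypothetical sample data, for illustration only.
internal = {"srv-01": {"payroll"}, "srv-02": {"intranet"}, "srv-03": {"crm"}}
hosting = {"srv-01": {"payroll"}, "srv-02": {"intranet", "batch-jobs"}}
sme = {"srv-02": {"intranet", "batch-jobs"}}

mapping, needs_review = build_server_app_map(internal, hosting, sme)
print(mapping)
print(needs_review)
```

The list of servers that need review is the useful by-product here: it is the agenda for the next conversation with SMEs and business owners.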
3. Understand different treatment plans
When implementing an AWS migration, there are six different treatment plans, applicable to all applications targeted in a program. Each treatment plan comes with its own efforts, TCO, ROI and skill sets.
One of the key tasks of the discovery squad (see point four) is selecting the right treatment plan for the right application(s).
Here is a brief summary of the six approaches and their strengths and weaknesses.
Rehosting is the fastest way of moving an application from an existing data center or hosting facility to AWS. It's ideal for legacy applications that are not adaptable to modern ways of working and implementations.
Some organisations may consider rehosting their applications as a stepping stone to the cloud while they figure out how to maximise cloud benefits.
Rehosting is not always a recommended approach as it does not address technical debt, particularly related to security and availability, such as the ability to cluster across multiple geographical locations for higher availability. Nor does it allow for you to make full use of the scalability and flexibility that makes the cloud so powerful.
Replatforming uplifts the platform an application runs on, for example by moving the application to a more modern operating system or moving a relational database to a managed service, such as RDS.
Replatforming is almost always a better solution than rehosting and benefits from cloud automation capabilities, such as automated, repeatable templates for processes like installation or patching.
This improves (that is, shortens) RTOs and RPOs in case of an incident or disaster and reduces the amount of work required by an engineering team to restore service.
Repurchasing an application is ideal when the engineering teams or SMEs do not have the relevant skills to maintain an application and the application vendor provides a SaaS option.
I would recommend repurchasing applications that are mission critical and require stringent SLAs, but do not by nature provide business value, so that those SLAs can be offloaded to a service provider.
An example is the corporate ERP platform, where the custom implementation can be ported to the SaaS provider if the TCO of the repurchase scenario is comparable to the TCO of a replatform or a refactor treatment.
Refactoring is the ideal target state for any application moving to the cloud. It provides maximum benefits while operating at minimal cost. It also requires the most effort, as engineering teams need to rebuild an application to operate on serverless technologies.
There are significant benefits to refactoring an application, such as not having to maintain a run-time platform. In addition, you get the full benefits of instant cloud scalability, hyper performance and elasticity.
I always advise my customers to re-architect their applications on serverless platforms where possible. Ideal candidates are usually in-house developed applications and custom built applications where the source code is available.
Ask yourself this: how many times have you come across a server or an application that has not been accessed or used, or has been down, for the last few months without anyone noticing?
Such applications are ideal candidates for retirement. Communicating the current state, utilisation and benefits of such applications using data points helps a cloud migration program make a strong recommendation for retiring them and reducing run-time costs, whether on-prem or in a cloud environment.
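One way to produce those data points is to flag anything that has not been accessed for a set period. The Python sketch below assumes you already have a last-access date per application (for example from access logs or monitoring); the function name, the 90-day threshold and the sample applications are hypothetical and should be tuned to your own estate.

```python
from datetime import date, timedelta


def retirement_candidates(last_access, today, idle_days=90):
    """Return applications not accessed within the last `idle_days` days.

    last_access: dict of application name -> date of last recorded access.
    The 90-day default is an assumption, not a recommended standard.
    """
    cutoff = today - timedelta(days=idle_days)
    return sorted(app for app, last in last_access.items() if last < cutoff)


# Hypothetical sample data, for illustration only.
last_access = {
    "legacy-reports": date(2023, 11, 2),
    "payroll": date(2024, 6, 28),
    "old-wiki": date(2024, 3, 1),
}
print(retirement_candidates(last_access, today=date(2024, 6, 30)))
```

A list like this is a starting point for a conversation with the business owner, not a decision in itself: some applications are only used at year end or during audits.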
Retaining (keeping workloads on-premises) is a necessary evil for a number of regulated industries and critical infrastructure operators, such as oil and gas.
Some applications have to sit in an “air gap” with no access to any outer world communication as they control operational technologies.
Recently, AWS released Outposts, a hybrid cloud service that provides the ability to retain applications on-prem while still gaining some benefits of the AWS management suite.
4. Build a discovery squad
A discovery squad is a cross-functional team dedicated to mapping, defining and tracking the various elements of your cloud migration.
It is critical to the successful migration of an enterprise to AWS!
Discovery is an ongoing task in a larger program and its job is to keep a healthy backlog of applications ready to hand over to engineering teams, depending on the treatment plan.
The discovery process has to be lean and agile but maintain accuracy, as it’s the front door for migration. It can also use a variety of tools available in the market to speed up the process.
The discovery squad’s responsibilities are:
- Identify applications, their relevant owners, vendors, stakeholders and SMEs
- Capture all relevant application details in a template agreed with engineering teams
- Define a treatment plan for each application
- Define the support model and landing zone, such as a self service account or an internal SRE managed landing zone
- Design a target state architecture, relevant to the application
- Provide the business with a TCO when required
- Allocate the application to the relevant team depending on skill sets and treatment plan
- Place the application on the migration roadmap
- Communicate the upcoming migration and its timelines to the business owners so they can allocate SME time for testing
- Define the relevant cut over approach and test suite
- Agree with stakeholders on hypercare arrangement, depending on who will be supporting the application
A well defined target state architecture and accurate TCO figures are essential for a successful platform migration.
5. Build a decision tree and a roadmap
To assist with consistency of migration treatment plans (lesson learnt), consider building a cloud decider for your application treatment plans. This will help the discovery squad achieve maximum consistency in treatment plans and give the enterprise board visibility into why a certain treatment plan was chosen for an application.
A cloud decider for treatment plans is a flowchart-type document that takes into consideration the application state, current architecture and abilities, as well as the relevant people and processes, tying the holy trinity of IT (people, process, technology) into a single lens.
At a high level, a cloud decider would look like the below. (Note: this is made for this blog only and is a simplified version of what a real-life decision tree would look like).
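A decider of this kind can also be expressed as code, which makes each decision repeatable and easy to explain. The following is a minimal Python sketch, not the decider used in any real program: the `AppProfile` fields, the `decide_treatment` name and, importantly, the order in which the questions are asked are all assumptions made for illustration, loosely based on the treatment plan descriptions in point three.

```python
from dataclasses import dataclass
from enum import Enum


class Treatment(Enum):
    RETIRE = "retire"
    RETAIN = "retain"
    REPURCHASE = "repurchase"
    REFACTOR = "refactor"
    REHOST = "rehost"
    REPLATFORM = "replatform"


@dataclass
class AppProfile:
    """Hypothetical discovery answers for one application."""
    name: str
    in_use: bool                 # accessed in recent months?
    must_stay_on_prem: bool      # e.g. air-gapped or regulated
    saas_available: bool         # vendor offers a SaaS option
    team_can_maintain: bool      # skills exist in-house
    source_code_available: bool  # in-house or custom built
    legacy_unadaptable: bool     # cannot adopt modern ways of working


def decide_treatment(app: AppProfile) -> Treatment:
    """Walk the questions in a fixed (assumed) order; first match wins."""
    if not app.in_use:
        return Treatment.RETIRE
    if app.must_stay_on_prem:
        return Treatment.RETAIN
    if app.saas_available and not app.team_can_maintain:
        return Treatment.REPURCHASE
    if app.source_code_available:
        return Treatment.REFACTOR
    if app.legacy_unadaptable:
        return Treatment.REHOST
    # Default when nothing above applies.
    return Treatment.REPLATFORM


# Hypothetical example application.
crm = AppProfile("crm", in_use=True, must_stay_on_prem=False,
                 saas_available=True, team_can_maintain=False,
                 source_code_available=False, legacy_unadaptable=False)
print(crm.name, "->", decide_treatment(crm).value)
```

A real decider would also weigh TCO, people and process factors. The value of writing it down, in a flowchart or in code, is that two people assessing the same application arrive at the same answer.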
Another very helpful tool is the roadmap.
When working with hundreds or thousands of applications targeted for the cloud, consider building a roadmap. Roadmaps are a great way to communicate to stakeholders, SMEs and business owners when work on an application is due to commence, how much time is expected to be spent on it and the resources needed to work on it. A simple cloud migration roadmap might look like this:
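As a rough illustration of how a first-draft roadmap could be generated from the discovery backlog, here is a small Python sketch. It assumes each backlog entry carries an application name, the team it was allocated to and an effort estimate in weeks, and it simply schedules each team's applications back to back. The names, teams and estimates are hypothetical, and a real roadmap would also account for dependencies and SME availability.

```python
from dataclasses import dataclass


@dataclass
class RoadmapItem:
    application: str
    team: str
    start_week: int  # first week of work, counting from week 1
    end_week: int    # last week of work, inclusive


def build_roadmap(backlog):
    """Schedule each team's applications sequentially, in backlog order.

    backlog: list of (application, team, estimated_weeks) tuples,
    already sorted by priority.
    """
    next_free_week = {}  # team -> first week the team is available
    roadmap = []
    for application, team, weeks in backlog:
        start = next_free_week.get(team, 1)
        end = start + weeks - 1
        roadmap.append(RoadmapItem(application, team, start, end))
        next_free_week[team] = end + 1
    return roadmap


# Hypothetical backlog, for illustration only.
backlog = [
    ("payroll", "replatform-squad", 4),
    ("intranet", "rehost-squad", 2),
    ("crm", "replatform-squad", 6),
]
for item in build_roadmap(backlog):
    print(f"{item.application:10} {item.team:18} "
          f"weeks {item.start_week}-{item.end_week}")
```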
6. Set up your teams according to your treatment plans
I find it best to set up teams based on application treatment plans and departmental engagement. The reason behind that is the skill set required for each application treatment plan is different.
Re-factoring teams are best embedded into the organisation business units. They should not operate in silos.
Re-factoring teams reverse engineer applications or port application code into a serverless framework. They require access to business context and functional testing. In a refactoring team, consider a structure similar to the below:
- Lead developer
- X amount of developers, depending on the application requirement
- Functional tester
- Technical business analyst (depending on the scope and size of the application)
Re-platforming and Re-hosting teams build infrastructure-as-code, pipelines, and landing zones/platforms. They can operate in an engineering function, provided they have access to non-functional testers or business teams where required, to keep their processes as Agile as possible.
This is also applicable to a shared services responsibility model or an SRE model, but that’s not relevant to this blog.
Stay agile, while maintaining regular team and squad cadences. Keep architecture modular for a more agile approach while reserving architecture at scale to the discovery squad. This enables delivery teams to adapt and transform as they see fit while staying inside approved guidelines.
7. Communication is key
One of the biggest showstoppers I come across in larger migration programs is lack of access to resources by business and IT teams.
Establishing a roadmap provides visibility but does not guarantee clear communication with the stakeholders. Also, roadmaps change often due to complexity.
Updating everyone through a cloud business office forum on a monthly or quarterly basis gives the business and service owners the ability to forecast when resources are required to migrate an application.
It also gives owners the ability to feed back changes to the roadmap due to SME availability or other factors, such as technical requirement changes and modernisation requirements.
Establish a monthly or quarterly Cloud Forum and replay roadmap status and current activities at an “Epic” level.
8. Address your landing zone, identity management and networks
Now to execute! Start by addressing your landing zone, identity management and networks. Those are the most important technical deliverables an organisation should get right from day zero because they affect how you move forward, and remediating them later can be lengthy and expensive.
Start by having a landing zone that complies with your internal security requirements as well as industry standards and best practices.
My recommendation is to design and implement a landing zone before migrating any workload to the cloud. A compliant landing zone removes the undifferentiated heavy lifting: compliance and governance policies are implemented at the environment level, so applications you create in or migrate to the cloud comply with best practices and regulatory requirements by default.
Identity management is the cornerstone of cloud security. Having the right identity management and access controls allows an organisation to manage assets and resources to best practices while implementing secure, least-privilege authorisation.
As one of the first tasks, I would federate identity management, keeping the existing directory service as the source of truth, while maintaining a mapping between its authorisation groups and the cloud IAM service for all services.
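The group mapping idea can be illustrated with a small Python sketch. This is a conceptual model only, not a call into any cloud provider's API: the directory group names, the role names and the `roles_for` function are hypothetical. The point it shows is the least-privilege default, where a group with no mapping grants nothing.

```python
# Hypothetical mapping of directory groups to cloud IAM role names.
# In practice this would be kept in version control and reviewed.
GROUP_TO_ROLES = {
    "cloud-platform-engineers": {"PlatformAdmin"},
    "payroll-developers": {"PayrollDeveloper"},
    "auditors": {"ReadOnlyAudit"},
}


def roles_for(user_groups, group_to_roles=GROUP_TO_ROLES):
    """Resolve the cloud roles a user may assume from directory groups.

    Groups without an entry in the mapping grant no roles, so access
    has to be added explicitly rather than inherited by accident.
    """
    roles = set()
    for group in user_groups:
        roles |= group_to_roles.get(group, set())
    return roles


# A user in two mapped groups and one unmapped group.
print(sorted(roles_for(["payroll-developers", "auditors", "facilities"])))
```

Keeping the directory as the single place where group membership changes means joiners, movers and leavers are handled once, and cloud access follows automatically.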
Networks are the glue of the universe and the foundation for all things security, performance, reliability and accessibility. Getting your networks right is essential at the start.
One of the first lessons I learnt is the need for a Direct Connect link to the cloud provider to minimise latency and maximise security and performance.
A hub-and-spoke model in an AWS implementation using Transit Gateway at the core of network termination offloads much of the hard work and provides enterprise grade stability and performance. If cost is an issue, a failover to a VPN solution can be implemented, as long as the primary circuit is a Direct Connect.
In conclusion, structuring a cloud migration program is certainly not a simple task but can be very rewarding.
Always start by defining your why, because it’s at the heart of every decision being made by discovery, engineering and support.
Find your source of truth and reconcile information with stakeholders for a seamless decision making framework.
Understand your treatment plan and build your delivery teams accordingly, setting up a strong discovery squad. This is foundational and can make or break a migration program.
Engaging a partner for migration is something that most organisations tend to consider after failing their first attempt. Consider bringing a partner to explore your options and structure your ways of working as well as accelerate your migration plans.