8 Massive Cloud Migration Challenges 
As more and more enterprises jump on the cloud bandwagon, many will discover that the journey to cloud is not always a smooth ride. The same challenges arise time and time again, often preventing organisations from reaping the real benefits of the cloud.
Are you making the most of what the cloud has to offer? Or have your cloud initiatives come to a grinding halt in the face of these challenges?
8 Cloud Migration Challenges for 2020
In this post, we’ll go over the biggest enterprise cloud migration challenges and how to successfully overcome them.
Challenge #1: What Cloud Provider Do I Choose?
When kick-starting the journey towards enterprise cloud migration, organisations often get the buy-in to progress with a Cloud Centre of Excellence or to build an entire product in a public cloud provider. However, many of these organisations will struggle at the next step: which basket do I put my eggs in? i.e. what cloud provider do I choose?
Traditionally, organisations do not have the enterprise support structures to onboard providers like Amazon and Google (this is not the case with Microsoft, who usually have well-established support relationships). Instead, they either progress with an RFP or choose the supplier that’s easiest to onboard. Both of these approaches fail to cater for the impact on engineering and the developer experience.
Contino’s approach to overcoming this challenge has been to avoid the marketing gospel behind the major cloud providers. Instead, we look at organisational factors that are going to make cloud adoption at enterprise scale successful.
Perhaps you already have a small team of engineers who are trained in a certain cloud provider. Perhaps you have a product that needs developing that could best use the services provided by a certain cloud provider. Or perhaps the regions in which your business operates and your customers are based are best aligned to the regional availability of a certain cloud provider.
In all cases, choosing one initial provider to prove out your organisational maturity for adoption at scale is critical. Which brings us onto our next challenge: trying to take on too much cloud at once!
Challenge #2: Multi-Cloud - The Temptation of Trying to Do Too Much Too Soon
This is a challenge that we come across time and time again. Most organisations at the initial stages of their cloud adoption journey try to do too much too soon. They can be quick to jump towards a multi-cloud approach to avoid vendor lock-in. However, by adopting a multi-cloud strategy from day one, an organisation burdens itself with multiple cloud providers and sub-par engineering standards, while lacking the organisational maturity to support their adoption at scale.
A multi-cloud approach definitely has its merits, especially within regulated environments. However, this decision should be made based on your engineering and organisational maturity and experience as opposed to the marketing-driven fears around vendor lock-in. Too often, organisations fail to understand the regulator’s stance around cloud adoption and make decisions based on emotional factors rather than practical regulatory concerns.
The Contino approach is to focus on one cloud provider first in order to build an organisation’s tech intensity (tech adoption ^ tech capabilities) and prove out the ability to host production-grade workloads in line with the organisational and regulatory frameworks that surround it. Once an organisation has the people and process aspects in place and proven to some degree, slotting in additional cloud platforms merely becomes a question of scale and technical capability as opposed to organisational ability.
Additionally, organisations are set to ramp up their multi-cloud efforts significantly in 2020, driven by either increased regulatory pressure or the need for a specialist cloud platform for certain use cases. Being able to do one cloud really well before scaling to another, and applying the lessons learnt, therefore becomes of utmost importance.
The end goal is for the second cloud to be as good as the first, but not immediately. This way you are moving towards cloud neutrality in the long-term, but in an incremental fashion that reduces risk and minimises cost.
Challenge #3: Multi-Cloud - Engineering Down to the Lowest Common Denominator
When adopting multiple cloud providers, organisations often end up limiting the success of these providers by trying to shoehorn a common set of approaches and technology choices that fail to capitalise on the promised benefits of cloud adoption at scale. This often leads to expensive cloud solutions, a sub-par engineering experience, and above all introduces a level of technical complexity that could have been avoided completely.
This is a challenge that, at Contino, we tackle with an engineering-first mindset. Firstly, we recommend evaluating every cloud provider at face value, on the best they have to offer, and avoiding the common mistake of implementing identical solutions across cloud providers. Secondly, we pair that evaluation with a set of cloud-agnostic engineering principles and practices, providing guidance for any engineers who want to consume any cloud platform. These principles are frequently re-evaluated to ensure they make it simpler for engineering teams to adopt multi-cloud platforms.
Here are a few examples:
- Bad Principle: Building a set of standardised platform services across multiple cloud providers on IaaS, due to the perceived lack of lock-in that this approach provides.
- Bad Practice: Building a Redis cache service on AWS, Azure and Google using infrastructure-as-a-service, providing a “consistent” service for application teams.
- Good Principle: Every cloud platform will be provisioned using our chosen infrastructure-as-code platform (e.g. Terraform), assuming it provides the integration required to build the products needed. This will be complemented with the cloud provider’s native provisioning APIs.
- Good Practice: Use Terraform to provision Azure Cache for Redis, Amazon ElastiCache and Google Cloud Memorystore for Redis, simplifying the initial build and future operations.
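As a rough illustration of the good practice above, the sketch below maps each provider to the Terraform resource type for its native managed Redis service and emits a skeleton in Terraform's JSON configuration syntax. The function name, the provider keys and the resource name `app_cache` are hypothetical, and the resource arguments are deliberately left empty because each provider exposes different settings.

```python
import json

# Terraform resource types for each provider's managed Redis offering.
# The dictionary keys ("azure", "aws", "gcp") are labels chosen for this sketch.
NATIVE_REDIS_RESOURCE = {
    "azure": "azurerm_redis_cache",
    "aws": "aws_elasticache_cluster",
    "gcp": "google_redis_instance",
}


def redis_resource_skeleton(provider: str, name: str) -> dict:
    """Return a Terraform JSON skeleton for the provider's native Redis service.

    Arguments are left empty on purpose: the point is to use each provider's
    own service rather than force all three into one lowest-common-denominator
    shape.
    """
    try:
        resource_type = NATIVE_REDIS_RESOURCE[provider]
    except KeyError:
        raise ValueError(f"no native Redis mapping for provider {provider!r}") from None
    return {"resource": {resource_type: {name: {}}}}


if __name__ == "__main__":
    for provider in NATIVE_REDIS_RESOURCE:
        print(json.dumps(redis_resource_skeleton(provider, "app_cache"), indent=2))
```

The tooling is shared (one provisioning workflow), while the service behind it stays native to each cloud.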
Challenge #4: Cloud Brokers - Multi-Cloud Managers
It seems strange to have this on the list of challenges and blockers to wide-scale cloud adoption as we approach 2020 but it still manages to lurk around. As a natural continuation from the point above, the next step organisations often see as vital (once they’ve forced their engineering teams into a rigid multi-cloud framework) is to look at multi-cloud management brokers. These provide yet another abstraction framework and an inefficient API set to target in order to provide a “service catalogue”.
The history of cloud brokers has shown that this typically ends in either an expensive bill from the broker or a convoluted engineering mesh that hinders scalability and often leads to frustration within the engineering teams.
As a continuation of the above approach, providing engineering teams with a loosely coupled framework that lets them explore and consume the best that cloud providers have to offer has proven to be the only approach that scales. This can then be complemented with certain domain-specific tools that enable a more effective governance model, without hindering engineering creativity.
- Bad example: In order for an organisation to provide a standardised service catalogue for all their internal cloud consumers, they use a multi-cloud broker to create templated service catalogue items for engineering teams to consume.
- Good example: In order for an organisation’s FinOps team to effectively manage, review and optimise their financial posture around cloud consumption, they use multi-cloud cost management tools like Cloudability.
- Good example: In order for an organisation’s Risk, Governance and Security function to effectively manage, review and report on their compliance posture across multiple cloud providers, they use a multi-cloud compliance and governance tool.
- Good example: In order for an organisation to simplify the set of tools used for infrastructure and platform provisioning, they use Terraform to create a set of approved internal modules and use a private registry to share a common set of modules with engineering teams.
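To make the FinOps example above more concrete, here is a minimal sketch of the kind of cross-provider roll-up that cost management tooling performs. It is not the API of Cloudability or any other product; the record shape, team names and budget figures are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CostRecord:
    provider: str  # e.g. "aws", "azure", "gcp" (labels assumed for this sketch)
    team: str      # owning team, assumed to come from a resource tag or label
    amount: float  # cost, assumed already converted to one shared currency


def cost_by_team(records):
    """Sum spend per team across all providers."""
    totals = defaultdict(float)
    for record in records:
        totals[record.team] += record.amount
    return dict(totals)


def teams_over_budget(records, budgets):
    """Return the sorted names of teams whose total spend exceeds their budget.

    Teams without an entry in `budgets` are not flagged.
    """
    totals = cost_by_team(records)
    return sorted(
        team for team, spent in totals.items()
        if spent > budgets.get(team, float("inf"))
    )


if __name__ == "__main__":
    records = [
        CostRecord("aws", "payments", 100.0),
        CostRecord("azure", "payments", 50.0),
        CostRecord("gcp", "search", 30.0),
    ]
    print(cost_by_team(records))
    print(teams_over_budget(records, {"payments": 120.0, "search": 40.0}))
```

Grouping by team rather than by provider is the design choice that matters here: it gives the FinOps function one view of spend regardless of where the workload runs.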
Challenge #5: Cloud Security - Lift and Shift Techniques from On-Premises
The illusory truth effect tells us that if you say something enough times, even if it’s false, people will believe you. This seems to have been the case when it comes to cloud security.
As organisations have woken up to the importance of security when consuming cloud at scale, security teams have continued to tell us that the on-premises approach to cloud security is the safest approach.
In reality, applying this data centre thinking to the cloud does nothing to improve an organisation’s security posture.
Instead, it leaves behind a cloud environment that isn’t suitable, or flexible enough, for engineering teams to consume due to the restrictive perimeter-based policies that are in place. Additionally, these traditional security approaches bring with them solutions that aren’t designed to use the native services that cloud service providers have to offer. Defaulting to an IaaS based deployment approach results in a bill at the end of the month that negates the business case for cloud.
Our approach to cloud security enables an organisation to hit refresh on years of antiquated practices and adopt modern practices. Not only does this encapsulate modern tools, it also spans the skill sets that are required within the team as well as the engineering techniques leveraged throughout the entire stack being built. Some of Contino’s engineering techniques to tackle cloud security, amongst many others, include:
- Policy-as-Code: Having your environment defined as code has a plethora of advantages. One of them is the ability to define an organisation’s guardrails in a policy engine (as code) and then enforce those policies across the estate, proactively preventing violations.
- Identity-based and least-privilege security: With the increased number of services and devices that need managing, simply relying on your security perimeter isn’t enough (this is often a practice that is heavily relied upon with on-prem). Identity and least-privilege based approaches force users and services to rely on techniques such as MFA and granular role-based access controls, and have a route to live that is consistent and well managed. Additionally, modern cloud-based identity providers are capable of learning and adapting to user behaviours, providing risk-based scores on the access being granted.
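The two techniques above can be sketched together as a tiny policy engine: guardrails are plain functions, and one of them encodes a least-privilege rule. This is an illustrative toy, not any specific policy product; the resource kinds, setting names and rule wording are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    name: str
    kind: str  # hypothetical kinds used below: "storage_bucket", "iam_role"
    settings: dict = field(default_factory=dict)


def no_public_storage(resource):
    """Guardrail: storage buckets must not be publicly readable."""
    if resource.kind == "storage_bucket" and resource.settings.get("public", False):
        return "storage buckets must not be public"
    return None


def no_wildcard_actions(resource):
    """Least-privilege guardrail: roles must not grant every action."""
    if resource.kind == "iam_role" and "*" in resource.settings.get("actions", []):
        return "roles must list explicit actions, not '*'"
    return None


POLICIES = [no_public_storage, no_wildcard_actions]


def evaluate(resources, policies=POLICIES):
    """Return (resource name, message) pairs for every violated guardrail."""
    violations = []
    for resource in resources:
        for policy in policies:
            message = policy(resource)
            if message is not None:
                violations.append((resource.name, message))
    return violations


if __name__ == "__main__":
    estate = [
        Resource("logs", "storage_bucket", {"public": True}),
        Resource("ci", "iam_role", {"actions": ["*"]}),
        Resource("app", "iam_role", {"actions": ["cache:Read"]}),
    ]
    for name, message in evaluate(estate):
        print(f"{name}: {message}")
```

Run as a step in a deployment pipeline, a non-empty result would block the change before it reaches the estate, which is the preventative behaviour described above.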
Challenge #6: Data Governance Approach - Leaving Your ‘Digital Oil’ On-Prem
While the initial use cases for cloud consumption were primarily focused on scalable compute and modern application development environments that weren’t feasible on-premises, any long-term enterprise use cases for cloud now have data as a core pillar.
The challenge comes from the fact that the (on-premises) data is scattered, unorganised and unusable. So, although enterprises have vast amounts of data, the business value remains untapped.
With all of this data, it’s extremely difficult to balance democratisation of insight with governance and compliance.
We apply the same rigour of DevOps automation, CI/CD, and infrastructure-as-code across data operations, data pipelines and data landing zones. Our approach builds backwards based on measurable business need and data maturity. We don’t just build a data lake or data platform for the sake of it. This enables us to focus on holistic data transformation, starting from data strategy, to data operating model, governance and upskilling. As my colleague Mark Pybus always says: “big data is worthless but big knowledge is priceless”.
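One way to picture DevOps-style automation applied to a data pipeline is a quality gate that runs on every load, much as tests run on every commit. The sketch below is an assumption-laden illustration rather than a description of Contino's tooling: the field names, the row format and the 5% threshold are all hypothetical.

```python
def check_rows(rows, required_fields):
    """Split rows into (valid, rejected), where each rejected entry is a
    (row, missing_fields) pair. A field counts as missing if it is absent,
    None or an empty string."""
    valid, rejected = [], []
    for row in rows:
        missing = [f for f in required_fields if row.get(f) in (None, "")]
        if missing:
            rejected.append((row, missing))
        else:
            valid.append(row)
    return valid, rejected


def quality_gate(rows, required_fields, max_reject_ratio=0.05):
    """Return the valid rows, or raise if too many rows fail the checks.

    The default threshold of 5% is an arbitrary value chosen for this sketch.
    """
    if not rows:
        raise ValueError("no rows to check")
    valid, rejected = check_rows(rows, required_fields)
    ratio = len(rejected) / len(rows)
    if ratio > max_reject_ratio:
        raise ValueError(f"{ratio:.0%} of rows rejected, above the allowed threshold")
    return valid


if __name__ == "__main__":
    rows = [
        {"id": 1, "email": "a@example.com"},
        {"id": 2, "email": ""},
    ]
    print(check_rows(rows, ["id", "email"]))
```

Failing the load loudly, rather than silently passing bad rows downstream, is what lets governance rules live inside the pipeline instead of in a separate review step.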
Challenge #7: Build vs Buy - Confusing Undifferentiated-Heavy-Lifting with Business Value
Often accompanied by the marketing gospel of multi-cloud is the fear message of vendor lock-in. This usually leads organisations down the path of building a raft of platform services using infrastructure-as-a-service as the common denominator, effectively treating their cloud providers as glorified data centres. This is usually known in the industry as undifferentiated heavy lifting. A direct consequence is that cloud fails to demonstrate the ROI that was initially promised, and scalability is hindered because a substantial team is needed to manage said infrastructure.
Whilst adhering to an organisation's architectural standards and the loosely-coupled engineering framework mentioned above, organisations should realise that vendor lock-in (to a certain extent) is unavoidable. We advocate using the native services that a cloud provider has to offer; the speed and agility gained by doing so far outweigh the potential risk of lock-in. Additionally, the three major cloud providers have a consistent set of managed services that an organisation can use as their chosen platform services without the need to build them from scratch, e.g. Redis, Kubernetes and SQL.
Challenge #8: Changing Your Operating Model
By and large, one of the biggest mistakes that organisations make, and the hardest challenge to overcome, is the lack of an operating model that enables an organisation to scale their cloud ambitions across their business.
Too often, organisations fail to cater for the engineering experience that is required in a cloud operating model. By adopting their existing on-prem operating models around security, service delivery, change and release management etc., they hamstring the success of their cloud adoption. This usually leads to a cloud environment that is no faster to deploy than on-prem, impacting the mean time to deployment. It can also result in a higher attrition rate of engineering talent due to an organisational model that fails to provide an optimal engineering experience.
Spanning all the functional domains of an organisation, a cloud operating model has to cater for the various phases of cloud adoption as well as the buy-in required from the various domains (e.g. People & Talent, Finance, Engineering, Ops, Security, Risk, Data etc.).
Contino’s view when it comes to a cloud operating model is to take a product-focused, rather than a project-focused, approach. We take an end-to-end slice of an organisation’s domain and look at how this can be incorporated into the development of a product lifecycle. We do this with short, measurable projects called Lighthouse Projects that serve as a model for similar projects within the wider digital transformation initiative. This allows an organisation to ensure no stone is left unturned. More on that here.
Additionally, our years of experience have taught us that there are certain aspects/models that an organisation will have to inevitably move towards in order to guarantee success in scaling their cloud adoption journey. These include the following:
- Product-focused, not project-focused: Adopt product-focused thinking and allow the organisation to focus on an end-to-end value stream. Not only does this provide an organisation with a better ROI, but it also gives your engineering teams a sense of purpose. Here’s an example of how Contino adopted this mindset with Direct Line Group to help launch their new InsureTech, Darwin.
- Site Reliability Engineering for Cloud Operations: By focusing on the core aspect of reliability from day one and using metrics to link business outcomes to day to day engineering activities, you can provide an organisation’s engineering team with a sense of ownership and purpose around the products they design, build and run. Not only does this circumvent the traditional IT challenges of a team throwing a product over the fence to the IT Run team, it also further drives the notion of product focused thinking by aligning SREs to product teams. More on this here.
- FinOps: Cloud cost management has always been a key pillar of architecting a cloud environment well. However, with cloud spend in 2019 and 2020 showing no sign of slowing, paired with the implied shift from CapEx to OpEx, a quarterly review of your cloud consumption is no longer enough. FinOps is meant to drive a cultural shift in cloud financial management, akin to the change DevOps drove in engineering. Combining aspects of financial management, cloud operations and the core disciplines of procurement, FinOps helps organisations build a more efficient cloud cost management and ROI framework whilst aligning the cloud adoption journey to business value.
- DevSecOps: Building an enduring DevSecOps capability with the authority to make design decisions enables the accelerated delivery of your cloud products. This capability will reside either as an extension of your cloud operations function or as a virtual function that spans across all product teams. Building an effective Cloud Security function is of paramount importance and by no means a trivial task, and as such shouldn’t be taken lightly.
- DataOps: The importance of data and data engineering teams should not be overlooked. In order to reap the benefits of modern engineering practices, you need a combination of a strong data operating model and the capability to apply DevOps techniques to your data. Implementing a strong governance model that codifies data management, quality, discoverability and access policies as part of the data lifecycle and data pipelines is critical in allowing the data to be exploited. Utilising data at scale in this way provides innovation across an organisation, delivering “the data-driven business”, and allowing the truly transformative abilities inherent in modern Machine Learning and Artificial Intelligence approaches to flourish.
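The Site Reliability Engineering point above, using metrics to link business outcomes to day-to-day engineering activity, is commonly expressed as an error budget derived from a service level objective (SLO). The following is a minimal sketch under that assumption; the function name, the returned keys and the example figures are hypothetical.

```python
def error_budget(slo_target, total_requests, failed_requests):
    """Summarise how much of an availability error budget has been used.

    slo_target is the fraction of requests that must succeed, e.g. 0.999.
    """
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be between 0 and 1 (exclusive)")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    # Failures the SLO tolerates over this window.
    allowed_failures = (1 - slo_target) * total_requests
    return {
        "allowed_failures": allowed_failures,
        "consumed": failed_requests / allowed_failures,  # 1.0 means fully spent
        "exhausted": failed_requests > allowed_failures,
    }


if __name__ == "__main__":
    # Hypothetical month: 1,000,000 requests against a 99.9% availability SLO.
    print(error_budget(0.999, 1_000_000, 250))
```

A product team with budget remaining can keep shipping; once the budget is exhausted, reliability work takes priority. That gives the product team and its aligned SREs a shared, measurable target.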
So in summary, for an organisation to successfully scale their cloud adoption in 2020, they have to:
- Cut through the marketing FUD and ensure that their cloud/multi-cloud strategy is based on their engineering discipline and maturity
- Embrace the best that each cloud provider has to offer and provide their engineering teams with a loosely coupled framework to adopt these whilst leaving room for engineering creativity
- Spend less time building platforms and environments that have long been considered undifferentiated heavy lifting
- Ensure that their cloud operating model is fit for purpose and caters for modern ways of working in order to scale the organisation’s functional domains for cloud adoption, namely cloud operations, DevSecOps and DataOps.