Fernando Villalba

Good Platform Engineering: What It Is and How to Make It Work For You

Picture this. You are a new developer in an organisation and you are ready to get started.

A colleague points you to a wiki and a series of repositories that are poorly documented and sprawled throughout multiple folders and organisations. You spend a whole week trying to make sense of it all.

After digging around, you finally get your hands on some code and you realise what your colleagues have been doing is copying code from an existing service and extrapolating a “template” from it, often leaving details from a previous service in the next one.

Finally, you’re ready to start testing it in infrastructure. However, you don’t have access to the infrastructure and you have to ask an operations team to provision it for you. You may be faced with waiting weeks to get your infrastructure ready.

You could of course create the infrastructure yourself if given access to your cloud provider. But then how do you ensure that you have created the infrastructure according to your company’s guidelines? Do you do it by clicking on the console? Or do you use Terraform? What if you don’t know how to use Terraform?

And what about Kubernetes to deploy your services? What if you know very little about it? How can you ensure that you can deploy to Kubernetes with minimal knowledge and without breaking anything there?

In this post, we’ll explore the options you have available to you and introduce platform engineering as a solution for the above challenges.

What Is Platform Engineering?

Platform engineering aims to give developers in an organisation the capacity to self-serve their infrastructure needs, as well as deploy, operate, browse, monitor and template all services available in your company.

The discipline is focused on providing good visibility of the relationship between services, documentation and who manages them. Platform engineers do this by creating an abstraction layer, the platform (AKA internal developer platform), that’s easy to use and requires minimal knowledge of infrastructure from developers.

This abstraction layer is very important because infrastructure is complex. You may have some developers who are great at using Kubernetes, but even for them it is a burden and a cognitive load that reduces the amount of meaningful work they can deliver.

Even if all your teams were great at both coding and infrastructure and had limitless energy to implement it all themselves, you may end up with a situation where everyone is doing things in very different ways. A platform helps by standardising all services and tools and bringing order from chaos.

How Does Platform Engineering Solve Key Software Engineering Challenges?

In the early days of software engineering, developers wrote applications and operations teams were tasked with deploying, running, operating and monitoring them. The DevOps movement was instrumental in shifting that responsibility left, so that operations enabled developers to deploy and be responsible for their own code. However, creating new infrastructure, among other responsibilities, was still very much the domain of operations teams, who had to do it on demand to serve the developers.

Even when the developers were given leeway to create their own infrastructure you had a whole new set of problems. How can you expect all your developers to understand infrastructure as well as you? How can you teach all of them how to work with Kubernetes and AWS?

And what if you had so much money that you could afford to hire those rare few who understand everything about both infrastructure and coding? How do you manage to keep their work coordinated and standardised? How do you visualise what everyone is working on? How do you set guardrails and uphold compliance without being overly restrictive?

Platform engineering aims to solve this by creating and maintaining a platform that developers can use to self-serve all their needs without depending on other teams. A platform team sets guardrails and creates a platform that abstracts all the complexity, and the developers are consumers.

Instead of a messy sprawl of services, you have a neat catalogue that shows each service’s documentation and metadata (coding language, author, quality, repository, pipeline and much more). Instead of everyone provisioning infrastructure in their own opinionated ways, or your operations team running around doing it for them, the platform provisions it according to the guidelines the platform team specifies. Templates are generated for all common services and constraints are implemented in the platform. Platform engineering allows you to set guardrails and enforce compliance, but it is less about forcing standards than about regulating autonomy.

Instead of an operations team running around doing repetitive tasks, the platform does it for you and your operations team can work on making the platform better serve the needs of your developers.
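To make the idea of a catalogue more concrete, here is a minimal sketch in Python of what a catalogue record and a couple of queries might look like. Everything in it is an assumption made for illustration: the field names, the `ServiceCatalogue` class and the example services are hypothetical and do not correspond to any particular product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceEntry:
    """One catalogue record. Field names are illustrative, not a real schema."""
    name: str
    owner: str
    language: str
    repository: str
    pipeline: str
    docs_url: str
    depends_on: tuple[str, ...] = ()


class ServiceCatalogue:
    """A hypothetical in-memory catalogue keyed by service name."""

    def __init__(self) -> None:
        self._entries: dict[str, ServiceEntry] = {}

    def register(self, entry: ServiceEntry) -> None:
        if entry.name in self._entries:
            raise ValueError(f"service {entry.name!r} is already registered")
        self._entries[entry.name] = entry

    def find(self, **criteria: str) -> list[ServiceEntry]:
        """Return entries whose attributes equal every given criterion."""
        return [
            entry for entry in self._entries.values()
            if all(getattr(entry, key) == value for key, value in criteria.items())
        ]

    def dependents_of(self, name: str) -> list[str]:
        """Names of services that declare a dependency on `name`."""
        return sorted(
            entry.name for entry in self._entries.values()
            if name in entry.depends_on
        )


catalogue = ServiceCatalogue()
catalogue.register(ServiceEntry(
    name="payments", owner="team-billing", language="go",
    repository="git.example.com/billing/payments",
    pipeline="payments-ci", docs_url="docs.example.com/payments",
))
catalogue.register(ServiceEntry(
    name="checkout", owner="team-shop", language="python",
    repository="git.example.com/shop/checkout",
    pipeline="checkout-ci", docs_url="docs.example.com/checkout",
    depends_on=("payments",),
))

go_services = [entry.name for entry in catalogue.find(language="go")]
payment_dependents = catalogue.dependents_of("payments")
```

A real catalogue would sit behind an API and a UI and pull this metadata from repositories and pipelines automatically, but the shape of the data is the point here: one record per service, with ownership and relationships that can be queried.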

10 Principles of Good Platform Engineering

Good platform engineering should aim to create a platform that satisfies most, if not all, of the following requirements:

1. Create a product (the platform) that everyone loves to use:

  • Has an elegant API.
  • Intuitive user interface (UI). The UI should offer observability, show the relationships between services and help guide and teach new users, but it must be possible to do anything and everything in code.
  • Good command line interface (CLI).
  • Easy to apply changes by use of simple manifest files—making it easy for developers to follow a GitOps workflow and apply all changes from their integrated development environment (IDE).
  • Teams can easily contribute to its extensibility.
  • It serves the developers more than it serves operations—developers are the clients of the platform team.
  • Treated as a product with a product owner to ensure the above are always true.

2. Centralises standards, documentation, templates and infrastructure in one place without making it into a rigid process.

3. Implements security and compliance implicitly in all the services and infrastructure you create with it.

4. Provides plenty of options that developers can pick from and makes it easy to add additional ones to support future business needs and edge cases.

5. Provides observability, such as logging, metrics and the relationships between services, or can easily point you to where to find it.

6. Abstracts away complexity and reduces cognitive load, ideally simplifying the underlying Kubernetes platform and cloud provider using something like PAWS or OAM.

7. It allows your developers to create cloud infrastructure you whitelist, with your own guardrails, and it ensures that it stays the way it was declared in the config file.

8. You can define role-based access control (RBAC) for your users for different levels of permissions.

9. You can extend the capabilities of the platform with community driven plugins to suit your needs and expand the observability and functionality of your services.

10. Your developers can quickly get an end-to-end environment for a new service via the platform, including pipelines, repositories, config templates, best practices, etc.
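Principles 7 and 8 are easier to picture with a small example. The following Python sketch shows one way a platform might check a requested resource against an allow-list (the whitelist of principle 7), detect drift from the declared config, and apply role-based permissions. The resource kinds, limits, roles and function names are all hypothetical; real platforms implement these checks in very different ways.

```python
# Hypothetical guardrails: resource kinds the platform team allows, with limits.
ALLOWED_RESOURCES = {
    "postgres": {"max_storage_gb": 100},
    "bucket": {"max_storage_gb": 500},
}

# Hypothetical roles mapped to the actions they may perform.
ROLE_PERMISSIONS = {
    "viewer": {"read"},
    "developer": {"read", "deploy", "provision"},
    "platform-admin": {"read", "deploy", "provision", "edit-guardrails"},
}


def is_allowed(role: str, action: str) -> bool:
    """RBAC check: unknown roles get no permissions at all."""
    return action in ROLE_PERMISSIONS.get(role, set())


def guardrail_violations(resource: dict) -> list[str]:
    """Return the reasons a requested resource breaks the guardrails."""
    kind = resource.get("kind")
    rules = ALLOWED_RESOURCES.get(kind)
    if rules is None:
        return [f"resource kind {kind!r} is not on the allow-list"]
    problems = []
    if resource.get("storage_gb", 0) > rules["max_storage_gb"]:
        problems.append(
            f"storage_gb exceeds the limit of {rules['max_storage_gb']}"
        )
    return problems


def detect_drift(declared: dict, actual: dict) -> dict:
    """Map each drifted key to a (declared value, actual value) pair."""
    return {
        key: (value, actual.get(key))
        for key, value in declared.items()
        if actual.get(key) != value
    }


# What the developer declared in the config file...
declared = {"kind": "postgres", "storage_gb": 50, "version": "15"}
# ...and what a (hypothetical) cloud API reports is actually running.
actual = {"kind": "postgres", "storage_gb": 50, "version": "14"}

violations = guardrail_violations(declared)
drift = detect_drift(declared, actual)
```

In this sketch the declared database passes the guardrails, while the drift check reports that the running version no longer matches the config file. A platform would typically reconcile that difference automatically rather than just report it.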


4 Benefits of Platform Engineering

  1. It makes undesirable patterns harder for your developer teams to implement, for example by having your platform create twelve-factor app services, favour trunk-based development workflows or keep everything in code.
  2. It will help you achieve business outcomes dramatically faster because your devs can focus exclusively on developing and don’t need a dedicated team to provision infrastructure for them.
  3. It can help solve or mitigate organisational problems once the platform becomes the source of truth to find documentation, infrastructure, templates and state of your services.
  4. It dramatically reduces the cognitive load of your developers.

4 Patterns to Avoid With Platform Engineering

Here I describe some patterns that can sometimes be confused with platform engineering, but in reality they are more of a hacky implementation or a partial solution. These do not amount to a good internal developer platform and do not meet all the principles outlined above.

1. Using Jenkins as a platform

Using Jenkins as a platform is not the best idea. Jenkins allows you to do many things, but it should only be used as a CI tool (or ideally not at all, as there are far simpler CI tools now).

Just because you can use a Swiss Army knife to cut your steak doesn’t mean you should. I recommend this article written by Port on the topic.

2. Templating Terraform and Kubernetes

Creating Helm packages, Terraform modules and other templates and then passing them to the developers is not a bad practice and it is certainly better than nothing. But this in itself is not platform engineering because you still need to ensure that your developers understand and work with infrastructure as code (IaC) tooling and Kubernetes, and these require time to master and understand.

You also have very little control over what your developers will do with those templates, meaning it is harder to implement best practices, compliance, etc.

To be clear, doing this is great, but on its own it is not enough to give you an internal developer platform.
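As a small illustration of what templating does buy you, the sketch below renders a service definition from explicit parameters using Python’s standard `string.Template`. The template contents and parameter names are made up for this example. Because `substitute()` raises a `KeyError` when a placeholder has no value, a half-filled template fails loudly instead of quietly carrying details over from a previous service.

```python
from string import Template

# Hypothetical service template: the placeholders are the only
# service-specific parts, so nothing is copied from an older service.
SERVICE_TEMPLATE = Template(
    "name: $service_name\n"
    "owner: $team\n"
    "port: $port\n"
)


def render_service(**params: str) -> str:
    """Fill in the template; raises KeyError if a placeholder is missing."""
    return SERVICE_TEMPLATE.substitute(**params)


rendered = render_service(service_name="orders", team="team-checkout", port="8080")
```

Note what this does not do: it does not stop a developer from editing the rendered output afterwards, and it does not provision anything. That gap is the difference between handing out templates and running a platform.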

3. Setting up Guardrails and Account Vending in your Cloud Provider

Using something like Control Tower and Account Factory for Terraform (AFT) and then giving the power to developers to vend accounts is also not great platform engineering for a couple of reasons:

  1. It is harder to enforce patterns of working and templating even if guardrails restrict what you can do.
  2. Some cloud providers are not the most intuitive or easy to use even for infrastructure professionals, much less for developers.

Account vending and guardrails are useful and you should have a process for them, but this in itself is not platform engineering. That being said, there are situations where giving a team an account or project to play with is handy, especially if the team needs to try multiple options before deciding on something.

4. Creating Other Sandboxes

Other than creating sandbox accounts or projects in your cloud provider, there are really cool tools like vcluster that let you create a cluster inside a cluster. You then hand this to a team or developer and they can play as much as they want within the confines of this sandbox. There’s also a set of tools described as environment as a service. These are not really platforms either, but a way to let your teams play around with ideas or test environments.

Being able to do this can be a very good strategy for experimentation and certain types of work patterns, but it in itself is not really good platform engineering.

Does Your Company Need Platform Engineering?

Talking about having all of the above is easy; doing it is trickier. First you must decide if you actually need to create a platform for your teams. If you are a small company, chances are you can use simpler tools more suitable for your needs.

For example, you could use Heroku, or tools like Render, Vercel, Netlify, Railway or even subsets of bigger platforms like GCP Cloud Run. These tools make it very easy to deploy and monitor your workloads and are well suited to the needs of most smaller companies. In this sense you are already using a platform, albeit one with reduced features. This is often referred to as Platform as a Service (PaaS), but it is not what we usually mean when we talk about internal developer platforms.

PaaS offerings have the advantage of being very easy to use and get started with, but they are limited in how far you can customise them to fit your corporate needs.

If you are a big organisation with many teams working on large distributed projects that require complexity, a platform that can abstract away infrastructure complexity can help you save thousands of dollars and hours in the long run whilst delivering much better results for your customers.

Platform engineering always requires a dedicated and competent team; it’s not as easy as implementing some tools and being done. The tools are complex, they need to be configured, and lots of decisions need to be made about how to go about it. In other words, platform engineering doesn’t completely eliminate complexity; it puts the burden of simplifying it on the platform team, and depending on which tools that team picks, this will be more or less hard to achieve.

Defining Internal Developer Platform (IDP)

This article focuses exclusively on the service catalogue, infrastructure and workload orchestration parts of the platform: essentially, the tools that allow you to browse, explore and create workloads.

Components that would also be part of your platform, such as observability, source control and continuous integration tools, are not explored in detail in this article, but it is implicitly assumed you would use them in conjunction with the tools covered here.

Service Catalogues and Internal Developer Platforms

When researching this topic I tried to find categories for the various tools that you can use to build a platform, but this proved a bit elusive, as definitions are still in flux at this stage and various tools approach the goal differently.

Two terms in circulation when defining platforms are service catalogue and internal developer platform (IDP). The definitions I found online were as follows:

A service catalogue enables you to browse, visualise and template the services in your organisation, whereas an internal developer platform lets you create, operate and maintain your infrastructure.

However, the term internal developer platform (IDP), originally coined by Humanitec, seems to have evolved to encompass the entire platform, including the service catalogue. As of the writing of this article, the documentation for this has not yet been changed to reflect it.

Humanitec was originally branded as an internal developer platform, and on its website, as of the writing of this post, many other tools are listed as one. Even then I was a little confused, because many of the tools listed only loosely followed the definition outlined there. This is no longer the case, however, and Humanitec now defines itself as a platform orchestrator.

Humanitec’s team was kind enough to try and clarify for me and even give me access to an early draft of a blog post they are planning to release explaining everything in detail.

It seems that the updated definition of internal developer platform has been embraced by Gartner, hence I will refer to the platform as an internal developer platform here and I will try to explain how to build your own.

Kubernetes Abstraction Layer

It’s worth mentioning that, generally speaking, modern platform engineering and its tooling leverage Kubernetes and abstract its complexity away from developers. If you are thinking of creating your own IDP and are not using Kubernetes as the underlying infrastructure for your services, perhaps you should consider it.

Using these tools doesn’t free you from having to learn Kubernetes, but at least you will only need Kubernetes experts in the platform team. It is highly desirable that your platform can also create resources outside of Kubernetes. For this, your platform can leverage anything from Pulumi, bespoke resource drivers or Terraform to, ideally, Crossplane.

For example, Humanitec allows you to create resources outside of Kubernetes, such as databases and a few others, but at its core it is designed to make Kubernetes deployments easier. This doesn’t mean that Humanitec is purely an abstraction of Kubernetes, but it does make Kubernetes easier to use and environments easier to manage. Mia Platform is another example of an IDP product that’s a Kubernetes abstraction.
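To show what such an abstraction can look like in its simplest form, here is a Python sketch in which a developer declares only a name, an image and a port, and the platform expands that into a Kubernetes `apps/v1` Deployment manifest. The `Workload` class, its fields and the `to_deployment` function are hypothetical and are not how Humanitec, Mia Platform or any other product works internally. Only the manifest structure follows the standard Kubernetes Deployment format.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """What a developer declares. Field names are hypothetical."""
    name: str
    image: str
    port: int
    replicas: int = 2  # a default chosen by the (imaginary) platform team


def to_deployment(workload: Workload) -> dict:
    """Expand a workload into a Kubernetes apps/v1 Deployment manifest (as a dict)."""
    labels = {"app": workload.name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": workload.name, "labels": dict(labels)},
        "spec": {
            "replicas": workload.replicas,
            # The selector must match the pod template labels.
            "selector": {"matchLabels": dict(labels)},
            "template": {
                "metadata": {"labels": dict(labels)},
                "spec": {
                    "containers": [
                        {
                            "name": workload.name,
                            "image": workload.image,
                            "ports": [{"containerPort": workload.port}],
                        }
                    ]
                },
            },
        },
    }


manifest = to_deployment(
    Workload(name="orders", image="registry.example.com/orders:1.4.2", port=8080)
)
```

The developer writes three values; the platform team owns everything else, including defaults such as the replica count. Real tools add far more (services, ingress, secrets, resources outside the cluster), but the division of labour is the same.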

To Be Continued

Check out Platform Engineering: Creating your Internal Developer Platform (Part 2), where we explore how to create your internal developer platform with both open source and commercial tools, and look at the pros and cons of some of the choices you can use to build your IDP.
