Candies and the Cloud

Coming from a background of mostly Terraform, some CloudFormation/SAM, and a sample of CDK, I took a look at Pulumi (Python) - a comparable Infrastructure-as-code solution using general purpose programming languages rather than its own - by way of an example project. This article is about my experiences with it. Other people have already covered the general case of going from Terraform to Pulumi pretty well. Because of this I am only looking at the bits that stood out to me in this case. Short version - Pulumi is really nice, especially for intermediate features like programming logic and data manipulation.

The Sample Project

I decided to go for an AWS EC2 Autoscaling group running a ‘hello world’ webpage, in an existing (default) VPC, with a DNS entry and a load balancer handling HTTP-to-HTTPS redirect and TLS termination, reachable via SSM Session Manager. This use case is broadly similar to my historically most popular Terraform module, and I felt that this would be about the right degree of complexity to give a reasonable comparison with it. I chose to use Python due to familiarity and expected ease of use, although several other languages are supported:

  • TypeScript & JavaScript (Node.js)
  • Python
  • Go
  • C#, VB, F# (.NET)
  • Java
  • Pulumi YAML

Just Different

Some things are ‘just different’, not necessarily ‘better’ or ‘worse’:

Initialising the Project

I had initially started with a ‘plain’ one-file Python project (a main.py in a directory), which I then had to import into a properly initialised Pulumi project (pulumi new thing) once I realised that this is required to ensure that config and supporting directories are in the right locations. Terraform these days winds up in a similar situation, partly by convention but partly by design, with terraform init (or tofu init). There was also the fun of the Python virtual environment, and adding an ‘import’ to my main.py for my variables seemed a little ‘different’ coming from Terraform, but this was all quite minor.

Specifying Resources

To start with, describing the resources for my stack was very like Terraform: the same resources with the same attributes but a slightly different syntax. Internal referencing was also very similar.

Debugging the Errors

Obviously my initial stack didn’t work, but it was relatively straightforward to understand the error message, check the relevant documentation, update, and move on to the next error. Arguably this was clearer than Terraform error messages of the past but those have improved a lot in recent versions.

Speed of Operation

The speed at which resources were reported as created was comparable to Terraform and, again like Terraform, resources were created in dependency order with a realtime status update. If (part of) the stack broke then, as with Terraform, the rest remained, and it was much quicker to fix and move on than with SAM/CloudFormation: no need (in most cases) to wait for a rollback, or to be forced to delete the whole stack (and wait for that to finish) and then redeploy, which is a familiar experience when developing CloudFormation stacks.

Conditional Resource Creation

In order to conditionally create a resource, Pulumi and Terraform can both use a variable value to determine whether to create a resource or not.

In Pulumi Python this can be a simple if or if not:

if variables.create_dns_record:

or

if not variables.create_dns_record:

In Terraform the ternary has to explicitly evaluate the condition, e.g.:

count = var.role_policy != null ? 1 : 0

or

count = local.hostport_whitelisted ? 1 : 0

Outputs on Conditionally Created Resources

In order to render an output on a conditional resource in Terraform, I have to combine it with something that will always exist and/or create a list and then render from that, e.g. (simple examples!):

output "service_dns_entry" {
  value       = join("", aws_route53_record.bastion_service.*.name)
}

output "policy_example_for_parent_account_empty_if_not_used" {
  value = [
    local.assume_role_yes_bool ? local.sample_policies_for_parent_account : ""
  ]
}

In Pulumi Python I can simply reference the same variable as for the creation of the resource:

# Export the DNS Record
if variables.create_dns_record:
    pulumi.export("dnsRecord", dns_record.fqdn)

This is a welcome relief - the Terraform way is clearly a legacy-influenced behaviour that is always confusing for newcomers.
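The idea of one flag gating both the resource and its output can be sketched in plain Python. This is only an illustration under stated assumptions: DnsRecord and build_stack are hypothetical stand-ins (not Pulumi APIs) so that the snippet runs without a cloud account, and the returned dictionary stands in for calls to pulumi.export.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class DnsRecord:
    """Hypothetical stand-in for a Pulumi DNS record resource."""
    fqdn: str


def build_stack(create_dns_record: bool, dns_name: str) -> Dict[str, str]:
    """Return the outputs a stack would export for the given flag."""
    outputs: Dict[str, str] = {}
    dns_record: Optional[DnsRecord] = None

    # The resource is only declared when the flag is set.
    if create_dns_record:
        dns_record = DnsRecord(fqdn=dns_name)

    # The output is guarded by the same flag, so no list/join workaround is needed.
    if create_dns_record and dns_record is not None:
        outputs["dnsRecord"] = dns_record.fqdn  # stands in for pulumi.export(...)

    return outputs


with_dns = build_stack(True, "hello.example.com")
without_dns = build_stack(False, "hello.example.com")
print(with_dns)
print(without_dns)
```

With the flag off, the output simply does not exist, rather than being an empty string as in the Terraform examples.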

The Surprises

Dependency Graph

Terraform manages its own dependency graph. Pulumi (Python) does not seem to do so in the same way, and so I had to move some resources around in my file in order to satisfy dependencies at the Python level - I couldn’t reference an attribute of something that had not ‘yet’ been declared, reading from the top of my Python file. This is not as clean as Terraform, where resources can be grouped logically together without this concern, but it is not the end of the world. The actual deployment dependency was handled a little differently.
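The ordering constraint is ordinary Python behaviour rather than anything Pulumi-specific, as this minimal sketch suggests. Plain dictionaries stand in for resources here, and the names load_balancer and listener are hypothetical.

```python
# Plain dictionaries stand in for Pulumi resources (hypothetical names).
wrong_order_failed = False
try:
    # The listener refers to the load balancer before Python has seen it.
    listener = {"load_balancer_arn": load_balancer["arn"]}
except NameError:
    wrong_order_failed = True

# Declaring the dependency first satisfies Python's top-to-bottom execution.
load_balancer = {"arn": "arn:aws:elasticloadbalancing:example"}
listener = {"load_balancer_arn": load_balancer["arn"]}

print(wrong_order_failed)
print(listener["load_balancer_arn"])
```

In a real program the fix is the same: anything whose attributes are referenced has to appear earlier in the file, or be wrapped in a function that is called later.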

Tagging Resources

With modern Terraform it is possible in AWS (not usually other providers) to apply provider default tags - a standard set of tags applied to every resource that supports them. I did not find that for Pulumi in the simple case although I suppose I could have created a custom wrapper/function or explored Pulumi’s automation API. Here I specified tags for each resource individually. Of course this then meant removing them again from those resources that don’t support tags. The big surprise was the different formats needed for different resources.

I knew I would need to do something special for my Autoscaling group tags as it’s always a special case with "propagate_at_launch": "True". In all, starting with my variable:

standard_tags = {
    "project": "pulumi-aws-ec2-asg",
    "owner": "Joshua",
    "Name": "pulumi-aws-ec2-asg"
}

This is similar to a map of resource tags in Terraform, and was fine for most resources as it was. For the ASG I converted my standard_tags dictionary to a list of dictionaries, each with a propagate_at_launch key, using:

asg_tags = [{"key": k, "value": v, "propagate_at_launch": "True"} for k, v in standard_tags.items()]

This was expected and would be roughly equivalent (in effect) to this Terraform:

  dynamic "tag" {
    for_each = merge(data.aws_default_tags.this.tags, var.tags)
    content {
      key                 = tag.key
      value               = tag.value
      propagate_at_launch = true
    }
  }

What I had not expected was that I would also need to do almost the same thing for my Load Balancer tags. I don’t know if this is because I was using the aws_native provider rather than aws for that resource (the Terraform equivalent would be awscc vs. aws) but I was surprised that this would be different. On the plus side, since it’s Python, it was fairly straightforward in each case here to do the data structure transformation:

def convert_tags_dict_to_array(tags_dict):
    return [{"key": k, "value": v} for k, v in tags_dict.items()]

lb_tags = convert_tags_dict_to_array(standard_tags)
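Since the two conversions differ only in the extra key, they could be folded into one helper. The following is a sketch rather than the code used in the project: tags_as_list is a hypothetical name, and passing a Python bool for propagate_at_launch is an assumption about what the provider accepts.

```python
from typing import Dict, List, Optional


def tags_as_list(
    tags: Dict[str, str],
    propagate_at_launch: Optional[bool] = None,
) -> List[Dict[str, object]]:
    """Convert a tag dictionary into a list of key/value dictionaries.

    When propagate_at_launch is given, it is added to every entry, which is
    the shape used for Autoscaling group tags.
    """
    entries: List[Dict[str, object]] = []
    for key, value in tags.items():
        entry: Dict[str, object] = {"key": key, "value": value}
        if propagate_at_launch is not None:
            entry["propagate_at_launch"] = propagate_at_launch
        entries.append(entry)
    return entries


standard_tags = {
    "project": "pulumi-aws-ec2-asg",
    "owner": "Joshua",
    "Name": "pulumi-aws-ec2-asg",
}

lb_tags = tags_as_list(standard_tags)
asg_tags = tags_as_list(standard_tags, propagate_at_launch=True)
print(lb_tags)
print(asg_tags)
```

Keeping a single source dictionary means that a new tag only has to be added in one place, whichever shape a given resource expects.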

Variable Sources

This was another odd one for me. Generally with Terraform I put ‘personal’ or ‘secret’ values in env vars so that I can avoid having them in committed code. Terraform supports this in a way that means your code doesn’t need to change, using the traditional hierarchy of command line argument, environment variable, config file, defaults. With Pulumi I wound up passing vars to my resources like

name=variables.dns_name,
zone_id=os.environ.get('ROUTE53_ZONE_ID'),

Yes, I could have set up my pulumi.config like

import os

import pulumi

config = pulumi.Config()

# DNS name can be set via Pulumi Config, or fallback to an environment variable, or default
dns_name = config.get("dns_name") or os.environ.get('DNS_NAME') or "default-dns-name"

# Zone ID can be set via Pulumi Config, or fallback to an environment variable, or default
zone_id = config.get("route53_zone_id") or os.environ.get('ROUTE53_ZONE_ID') or "default-zone-id"

I felt that this would be potentially fragile or non-obvious, since the config is part of the ‘project stack’ rather than the code proper and I did not want to be shipping my own project stack publicly. Yes, there would have been other possibilities too, which could be great in a production environment, although here I was looking for something simple, discrete and modular.
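For comparison, the config-then-environment-then-default lookup can be written as a small helper. This is a minimal sketch with assumptions: resolve_setting is a hypothetical name, a plain dictionary stands in for the Pulumi config object, and the environment is passed in as a mapping so that the example is repeatable.

```python
import os
from typing import Mapping, Optional


def resolve_setting(
    key: str,
    env_name: str,
    default: str,
    config: Mapping[str, str],
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the config value, else the environment value, else the default."""
    env = os.environ if environ is None else environ
    if config.get(key) is not None:
        return config[key]
    if env.get(env_name) is not None:
        return env[env_name]
    return default


# Hypothetical values, used only to show the lookup order.
stack_config = {"dns_name": "hello.example.com"}
fake_env = {"ROUTE53_ZONE_ID": "Z123EXAMPLE"}

dns_name = resolve_setting("dns_name", "DNS_NAME", "default-dns-name", stack_config, fake_env)
zone_id = resolve_setting("route53_zone_id", "ROUTE53_ZONE_ID", "default-zone-id", stack_config, fake_env)
fallback = resolve_setting("not_set", "NOT_SET", "default-value", stack_config, fake_env)
print(dns_name, zone_id, fallback)
```

Unlike a chain of or expressions, the explicit None checks mean that an intentionally empty string is not silently replaced by the next source.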

Weighing up Pulumi as an Option

Based on this small project, my thoughts on Pulumi (Python):

Compared to CloudFormation/SAM

I would put Pulumi on a par with Terraform when compared against CloudFormation, a comparison I’ve covered before, and I hold the same view for Pulumi. This specific stack would not have been a good match for SAM, so I don’t see much point in considering it separately from CloudFormation. As I have said previously:

Overall, while CloudFormation has specific strengths, broader infrastructure needs are often better served by alternative tools that provide greater flexibility, simplicity, and support.

Pulumi also have an article on this: Migrating from AWS CloudFormation to Pulumi

Compared to CDK

This is perhaps where it gets a little more interesting, since both offer the ability to define resources in the same ‘real code’ language. Pulumi again have their own article: Pulumi vs AWS Cloud Development Kit (CDK). I have yet to use either CDK or Pulumi at scale and so cannot comment in depth here. I suppose that arranging your stack as a class (where supported) rather than as separate object definitions may be preferable for some, BUT the elephant in the room is that CDK is an abstraction over CloudFormation - whatever cool stuff you might do at the ‘Real Programming Language’ level, you’ll be transpiling locally to CloudFormation before deploying, along with all that that entails. I’ve linked this article before, but it is worth highlighting again this quote from SST’s article ‘Moving away from CDK’ describing ‘Ion’:

Ion is a code name for a new engine for deploying SST applications. The constructs (or components) are defined using Terraform providers and deployed using Pulumi; as opposed to CDK and CloudFormation (CFN).

Compared to Terraform

This is the real competition, since Pulumi and Terraform are so directly equivalent. As you might expect Pulumi themselves publish a detailed comparison with Terraform. For me both offer:

  • A free/open source version or fork
  • A subscription-based managed option with a SaaS backend
  • The possibility to put reusable code sections in modules and to import these
  • Providers for multiple platforms (although Terraform has far more than Pulumi)
  • A very close and direct match to the target APIs, as opposed to an API for a configuration tool that uses those APIs

Going only on what I have seen here, the key differentiators for me would be:

  • Richness of the ecosystem - whilst both have module registries, Terraform/OpenTofu appears to have a more established third-party services ecosystem including, e.g., Terragrunt, Atlantis, and helper tooling such as infracost and terraform-docs. Plainly Pulumi is improving in this area, as per the SST article above, and some of those third-party services are themselves a two-edged sword of course, e.g. Terragrunt definitely ‘encourages’ a particular approach/style.
  • Intermediate features - things like (nested) loops, conditionals, restructuring data objects, and error handling. I would have to say that Pulumi has the edge here overall, and by some margin.

A little more on each of these intermediate features:

Loops

These can be complex in Terraform and the environment is quite restrictive - see linked article in ‘Data Object Manipulation’ below. Pulumi has a clear edge here.
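As a rough illustration of why loops feel easier here, a nested loop over two lists is an ordinary comprehension in Python. The port and CIDR values below are hypothetical, and in a Pulumi program each dictionary would be passed to a resource constructor rather than printed.

```python
# Hypothetical inputs: every port should be opened to every CIDR block.
ports = [80, 443]
cidr_blocks = ["10.0.0.0/16", "192.168.0.0/24"]

# One entry per (port, CIDR) combination, each with a unique name.
ingress_rules = [
    {"name": f"ingress-{port}-{index}", "port": port, "cidr": cidr}
    for port in ports
    for index, cidr in enumerate(cidr_blocks)
]

for rule in ingress_rules:
    print(rule["name"], rule["port"], rule["cidr"])
```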

Conditionals

Whilst handling simple conditional resource creation in Python is less verbose, I think that this is comparable to comparing PowerShell scripts to Bash scripts - it doesn’t really matter. What does make the difference is doing anything else subsequently with those resources. Clearly the Terraform way is a workaround for historic behaviour and not something that anyone would logically argue for on its own merits.

Data Object Manipulation

Whilst I have not gone into detail about it in this article, manipulating objects to handle data in a ‘graceful’ way in Terraform is not always straightforward. It’s something that I have previously devoted a dedicated article to just for a single use case! Python, as discussed here, is great for this sort of thing but with Pulumi you have the choice of several well established languages.

Error Handling

Whilst Terraform error handling has been difficult in the past, it has improved over recent years. Whilst Pulumi’s error handling here was good, I could not say at this point which was ‘better’. Terraform does have a habit of concealing the underlying API errors in a way that I did not immediately see Pulumi Python doing in this project.

Untried Features

For reasons of time and the length of this article there were some features I didn’t explore here. Some key standouts:

I didn’t try to import Terraform state or code, or cloud resources, into Pulumi for this article, although Pulumi advises that an import can produce working code rather than leaving things unmanaged as with Terraform: Migrating from Terraform to Pulumi. This could be a compelling advantage.

The idea of using Terraform providers for some platforms with Pulumi, as described by Pulumi themselves in Using Terraform Providers and referenced in SST’s article ‘Moving away from CDK’, is intriguing.

I did not look at other language options for Pulumi here, or consider IDE/testing support integrations in any depth, although Pulumi do highlight these features and they could be valuable in some settings.

I cannot speak of using Pulumi at scale at this point.

Conclusions

Pulumi offers the advantages of using a ‘real’ programming language with more robust intermediate features and direct API access, but with an ecosystem split across different languages. The language variety for Pulumi is a two-edged sword of course - great choice and support for different languages, but you may find that third-party modules are written in a different one to that which you’re using. The import and code conversion features seem intriguing but I have not as yet looked into these.

Terraform/OpenTofu offers the advantage of a popular community with a vast range of modules, providers and third-party services, again with direct API access, and a single language (for the most part!). Those third-party modules are a massive asset considering the undifferentiated heavy lifting that is common in infrastructure work - generally you are better off using a thoroughly battle-tested module to create e.g. an AWS VPC than trying to create your own.