19. October 2022 By Attila Papp
AWS CDK – Three things I like and three things I don't
Introduction
Modern IT has been full of buzzwords and groundbreaking approaches in the past few years. Infrastructure as Code (IaC) belongs to the latter group. IaC is the practice of defining your infrastructure through code templates rather than through a GUI or manual effort. It brings several benefits, such as better reusability and CI/CD integration, but most notably it helps you move faster by treating infrastructure like 'cattle', not 'pets': resources are disposable and can be recreated from code at any time instead of being nursed by hand.
While IaC tools have come a long way since they first emerged, and a few purpose-built languages have also popped up, the wish to describe infrastructure in traditional programming languages has never disappeared. Pulumi and the AWS CDK are such tools, and in this blog post we will explore the latter. I have been working with CDK for almost a year, and here are my thoughts.
What is AWS CDK?
There are a couple of approaches to describing infrastructure, with notable differences. It is impossible to discuss this topic without mentioning Terraform and HCL. Terraform is an IaC tool that uses HCL (HashiCorp Configuration Language) for infrastructure templates. HCL is a purpose-built language that sits between traditional programming languages and markup approaches (like AWS CloudFormation). While Terraform supports advanced features like variables and loops, these features are sometimes limited and not as complete as you might expect.
In the past years, infrastructure engineers have found that configuration languages (like HCL) and markup languages (like the JSON and YAML used by CloudFormation) are not flexible enough and impose notable compromises when building dynamic infrastructure. Thus, they have been looking for ways to improve the IaC experience for developers who don't want to learn a new language like HCL or work through CloudFormation's complex variable system.
CDK is a relatively new IaC tool that lets you describe AWS infrastructure in traditional programming languages: JavaScript, TypeScript, Python, Java, C#, and Go. It empowers cloud engineers to use regular programming features such as classes, variables, and loops when working with IaC, and in the end the code is converted into CloudFormation stacks. As a result, it can be immensely elegant at times and undeniably cumbersome at others. Let's look at the positives and negatives based on my experience with it.
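To make the "regular code in, CloudFormation out" idea concrete, here is a minimal sketch in plain TypeScript. It deliberately does not use the real CDK library: the `Resource` and `Template` types and the `synthBuckets` function are hypothetical names invented for illustration, and the bucket names are made up. It only shows how an ordinary loop and a conditional can expand into a CloudFormation-shaped template.

```typescript
// Toy model only: these types and helpers are hypothetical and are NOT the CDK API.
// They mimic the idea of building a CloudFormation-shaped template from ordinary code.

interface Resource {
  Type: string;
  Properties: Record<string, unknown>;
}

interface Template {
  Resources: Record<string, Resource>;
}

// Build one bucket resource per environment using a plain loop.
function synthBuckets(environments: string[]): Template {
  const template: Template = { Resources: {} };
  for (const env of environments) {
    // Logical IDs must be alphanumeric, so capitalise the environment name.
    const logicalId = `DataBucket${env.charAt(0).toUpperCase()}${env.slice(1)}`;
    template.Resources[logicalId] = {
      Type: "AWS::S3::Bucket",
      Properties: {
        BucketName: `example-data-${env}`, // made-up naming scheme
        // An ordinary conditional decides the setting per environment.
        VersioningConfiguration: { Status: env === "prod" ? "Enabled" : "Suspended" },
      },
    };
  }
  return template;
}

// Usage: three environments produce three bucket resources.
const template = synthBuckets(["dev", "staging", "prod"]);
console.log(JSON.stringify(template, null, 2));
```

In a real CDK app the library does this synthesis for you; the point here is only that the loop lives in your code, not in the template.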
Positive
- 1. Full support for programming language features: variables, functions, loops, inheritance, interfaces, dependency inversion, etc. All the power of a programming language is here. A Java programmer takes these things for granted, but before CDK they were far from a given in IaC code. No need to say more.
- 2. Imports and references: stack imports and references to other resources are much more straightforward. There is no need for the uncomfortable syntax of CloudFormation; you just work with them through regular variables, and CDK handles the rest. Be careful with circular dependencies, though!
- 3. Constructs: CDK has introduced the concept of Constructs. These are like LEGO building blocks, which you can combine to create higher-level constructs. They make reusability very easy, especially for common components in a project, and help you write elegant and concise code at scale. Constructs are powerful. For example, we have a custom construct that describes a whole data pipeline and can be parametrized for various needs. So instead of having a bunch of YAML templates for Step Functions, Lambdas, Glue Crawlers, etc., we just create a data pipeline that includes all of this. This is probably my favorite feature of CDK, and it's not hard to understand why. I hate boilerplate code.
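The construct idea and the variable-based references from the list above can be sketched in a few lines of plain TypeScript. This is a toy model under loud assumptions: `ToyStack`, `ToyResource`, `DataPipeline`, and its `sourcePrefix` and `memoryMb` parameters are hypothetical names, not the CDK API and not our actual pipeline construct, and the resource properties are trimmed far below what CloudFormation would require.

```typescript
// Toy model only: ToyStack, ToyResource and DataPipeline are hypothetical names,
// NOT the CDK API. They mimic how a construct bundles related resources.

interface ToyResource {
  Type: string;
  Properties: Record<string, unknown>; // trimmed; real resources need more properties
}

class ToyStack {
  readonly resources: Record<string, ToyResource> = {};

  // Register a resource and return its logical ID so other resources can refer to it.
  add(logicalId: string, resource: ToyResource): string {
    if (this.resources[logicalId] !== undefined) {
      throw new Error(`Duplicate logical ID: ${logicalId}`);
    }
    this.resources[logicalId] = resource;
    return logicalId;
  }
}

interface DataPipelineProps {
  sourcePrefix: string; // hypothetical parameter: S3 prefix to crawl
  memoryMb?: number;    // hypothetical parameter: function memory, defaults to 256
}

// A reusable building block: one call creates a bucket, a function and a crawler.
class DataPipeline {
  readonly bucketId: string;
  readonly functionId: string;
  readonly crawlerId: string;

  constructor(stack: ToyStack, id: string, props: DataPipelineProps) {
    this.bucketId = stack.add(`${id}Bucket`, { Type: "AWS::S3::Bucket", Properties: {} });

    // The bucket is referenced through a plain variable; the model emits a Ref for it.
    this.functionId = stack.add(`${id}Transform`, {
      Type: "AWS::Lambda::Function",
      Properties: {
        MemorySize: props.memoryMb ?? 256,
        Environment: { Variables: { BUCKET: { Ref: this.bucketId } } },
      },
    });

    this.crawlerId = stack.add(`${id}Crawler`, {
      Type: "AWS::Glue::Crawler",
      Properties: {
        Targets: {
          S3Targets: [
            { Path: { "Fn::Join": ["", ["s3://", { Ref: this.bucketId }, "/", props.sourcePrefix]] } },
          ],
        },
      },
    });
  }
}

// Usage: two pipelines, each a single line, yield six resources in the stack.
const stack = new ToyStack();
const orders = new DataPipeline(stack, "Orders", { sourcePrefix: "raw/" });
const clicks = new DataPipeline(stack, "Clicks", { sourcePrefix: "events/", memoryMb: 512 });
console.log(orders.bucketId, clicks.bucketId, Object.keys(stack.resources).length);
```

The design choice worth noticing is that callers only see `DataPipelineProps`; the wiring between bucket, function, and crawler stays inside the class, which is where the boilerplate savings come from.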
Negative
- 1. Relatively immature: since its introduction in 2019, it has been open-sourced on GitHub, seeking the community's support to maintain the project. Although this is a welcome and proven approach (see Kubernetes), it has had mixed results in this case. You rely on the community to introduce new features and iron out bugs, which is excellent in theory when the project attracts enough interest, but a dangerous game when it does not. You cannot contact AWS Support with bugs; they will just direct you to the community. CDK also sometimes lags behind CloudFormation with new features, although I ran into this only once in my past year of using it. CDK is essentially an abstraction on top of CloudFormation, driven by an open-source community. If it loses the wind in its sails, it's questionable how it will continue.
- 2. Speed: CloudFormation was already relatively slow, but CDK is even slower. Be prepared to grab a cup of coffee when you deploy an application with several stacks.
- 3. Construct Hub is not that rich: the idea of having a hub of constructs made by the community is fantastic. However, you might find only a few things that are worth taking on an additional dependency for.
Conclusion
I must admit that the idea of using imperative programming languages to define things that were traditionally defined by declarative code felt a bit unnatural to me. So, initially, I had objections to the core trait of CDK: using programming languages to declare infrastructure. I agree YAML has shortcomings, but somehow it felt more natural and readable to use an actual configuration language to declare configuration.
But I must also admit that CDK grew on me after a year of using it, and my objections no longer stand. Elegant and concise code with minimal boilerplate is something you get by design rather than something you have to strive for. Moreover, keeping a high-quality and unified configuration is easier with reusable constructs. Thus, I recommend it, as long as you keep the disadvantages in mind.