V2
AWS CDK: A Review
All of the Things
-
How I learned to Keep Calm and Trust the CDK
-
want the nuance of cloudformation, the expressivity of javascript, the type safety of java and the ability to create infrastructure 10x faster?
In that case, I like to introduce the AWS CDK. At the risk of sounding like a snake oil salesman, after using the CDK over the last several months, I admit that I'm in love.
The CDK is having the cake and eating it to and then being served a nine course meal with custom wine parings afterwards (because meals should always start off with cake). Okay, I'm done with the admiration
now, let get into some specifics
Overview
The AWS Cloud Development kit (CDK), simply put, is AWS's version of Terraform. The CDK allows users to use softare to define and provision infrastructure. The term infrastructure as code is a popular one heard in devops circles and typically has meant checking in your provisioning scripts or cloudformation templates into a github repo to manage. With CDK, your infrastructure is actually code. Specifically, it is typescript code, with support to other languages added using jsii. Currently supported languages include C#/.NET, Java, JavaScript, Python, and TypeScript
Here's an example to create a lambda based cron job (the below snippet is taken from the aws-cdk-examples repository
import events = require('@aws-cdk/aws-events');
import targets = require('@aws-cdk/aws-events-targets');
import lambda = require('@aws-cdk/aws-lambda');
import cdk = require('@aws-cdk/cdk');
import fs = require('fs');
export class LambdaCronStack extends cdk.Stack {
constructor(app: cdk.App, id: string) {
super(app, id);
const lambdaFn = new lambda.Function(this, 'Singleton', {
code: new lambda.InlineCode(fs.readFileSync('lambda-handler.py', { encoding: 'utf-8' })),
handler: 'index.main',
timeout: 300,
runtime: lambda.Runtime.Python27,
});
// Run every day at 6PM UTC
// See https://docs.aws.amazon.com/lambda/latest/dg/tutorial-scheduled-events-schedule-expressions.html
const rule = new events.Rule(this, 'Rule', {
scheduleExpression: 'cron(0 18 ? * MON-FRI *)',
});
rule.addTarget(new targets.LambdaFunction(lambdaFn));
}
}
const app = new cdk.App();
new LambdaCronStack(app, 'LambdaCronExample');
app.run();
Something that you'll notice in the above example is that nowhere did you see YAML, JSON, or some custom config language. Its just code. The equivalent of the above in cloudformation:
TODO: generate cloudformation code
TODO: architecture
Traditionally, I've used raw cloudformation and SAM to provision infrastructure and coming from that world into the CDK feels like the same leap one makes when going from C++ to python
TODO: XKCD
CDK Primer
To get started with CDK, download the cdk
client and use it to generate your first app.
npm i -g aws-cdk
mkdir hello-cdk
cd hello-cdk
cdk init app --language=typescript (or --language=java, ...)
cdk deploy
CDK Constructs
These are basic cloud components and can represent a single service (eg. S3 Bucket) or a set of services that represent a logical entity (eg. a Code Pipeline with four stages). Constructs are like modules in a programming language, which means you can create your own, create constructs that are the composition of multiple sub construcuts, use inheritance and create re-usable sharable constructs for common infrastructure.
The AWS CDK provides two types of constructs - Low and High Level.
High Level constructs will fill in most of the details for you and is akin to using the console on AWS where a lot of the nitty gritty with IAM permissions and dependent resources are created for you automagically.
- cloudtrail example #TODO (Private)
Low level constructs start with Cfnxxx
prefix and there is a direct one-to-one mapping between a low level construct and the cloudformation properties of a particular service
- cloudtrail Cfn example #TODO (Private)
Usage
I started using the CDK tentatively at first for some smaller client projects, building out ci/cd pipelines. I've actually been meaning to use it ever since it came out.
Review
Love
I don't use the word love lightly but here are some things I just abbsolutely love about the cdk.
fantastic tooling
Tooling is usually the thing that is saved for last in backend projects where the focus tends to be more about features then ux (eg. git). That's why I was delighted that I find the cdk great.
cdk init
will standup a cdk project skeleton to build out a new projectcdk diff
will create a DIFF of your current stack vs the one that you are currently deploying on- in case that last one skipped you by, the CDK will CREATE A DIFF OF NEW STACK CHANGES
- apologies for the caps but this is a big stinkin deal because running cloudformation against existing production stacks in the past was always an ulcer inducing moment as you never knew if this was going to be the deploy that would accidentally replace your database
- cloudformation introduced changesets in 2016 so that you can do a diff but doing so was always painful
# in case there was a previous change set that was created but not deployed
aws cloudformation delete-change-set --stack-name $MY_STACK --change-set-name $MY_CHANGESET
# create changeset
# note that all parameters of a changeset have to be specified
# even if they are still using the same as the previous
aws cloudformation create-change-set --stack-name $MY_STACK --template-body "$(cat cloudformation.yaml)" --parameters ParameterKey=CrossAccountCondition,ParameterValue=true ParameterKey=ProjectName,ParameterValue=MyProject ParameterKey=S3Bucket,UsePreviousValue=true ParameterKey=CMKARN,UsePreviousValue=true ParameterKey=ProductionAccount,UsePreviousValue=true ParameterKey=TestAccount,UsePreviousValue=true --capabilities CAPABILITY_NAMED_IAM --change-set-name $MY_CHANGESET
# finally get the diff, and this is in json form
aws cloudformation describe-change-set --stack-name $MY_STACK --change-set-name $MY_CHANGESET
- now look at the same in the CDK
TODO
cdk changeset
cdk synth
generates the raw cloudformation template- this is great just to verify that the code is creating the infrastructure that I want
- provides an
eject
functionality so I can use native cloudformation if I don't want to use the cdk
cdk deploy
validates your cloudformation, creates a changeset and deploys it to your production stackcdk ls
returns all your cloudformation stacks currently in your template
use a programming language
There's imense power from being able to use a programming language to create your infrastructure. This means you can use loops, conditionals, string interpolation, modules and all the other features that make programming languages great.
Some examples: TODO
- autoscalin with conditionals and string interpolation
const asg = new autoscaling.AutoScalingGroup(scope, `ASG-${stage}`, {
vpc: vpc,
minCapacity: 1,
desiredCapacity: 1,
// keep max caapacity low for non-prod stage
maxCapacity: (stage === 'gamma' ? 2 : 5),
// use smaller instance sizes for non-prod stage
instanceType: new ec2.InstanceTypePair(ec2.InstanceClass.Burstable2,
(stage === 'gamma' ? ec2.InstanceSize.Small: ec2.InstanceSize.Medium)
),
machineImage: new ec2.AmazonLinuxImage(),
});
There generally is a tradeoff where you either use a config file like YAML, XML or JSON and declarative declare your resources to enable validation, drift detection and whatnot but it comes at the cost of your developers hating themsleves because writing and debugging YAML is something that no one wakes up excited to do.
Conversely, you have programming languages which are wonderfully expressive and can be tested and put through a whole integration pipeline but that very expressivity makes it hard to generate consistend builds of resources.
The CDK uses the expressivity of typescript to generate a configuration, cheating the tradeoff and developing both.
higher level constructs
AKA sane defaults. AKA how AWS is supposed to work if it weren't complicated.
Using the CDK is like using the console but in a good way. AWS suffers from a reproducibly problem because while it is generally straightforward to test a service using the console, it can be fiendishly hard to reproduce that in cloudformation and replicating that setup requires lots of tenacity, scouring the docs, checking your cloudtrail forums and having access to a local AWS expert. This is because AWS services usually require a whole bunch of dependent resources to get started, usually in the form of S3 buckets, IAM roles, Networking constructs, etc. Using the console, this is done for you atuomatically but it means that you're left scratching your head when tryinng to do the same in cloudformation.
A recent example - [cognito user pools](TODO: link) (not to be confused with [cognito federated identities](TODO: link)) is a managed service that lets customers implement both 1st and 3rd party authentication systems.
User pools have a lambda triggers feature which enable customers to create hooks for specific auth events (eg. run lambda to create an dynamodb entry when a customer first signs up). Using the console, just select the lambda and everything just works.
TODO: image
Use cloudformation, and you will get an access denied.
TODO: snapshot docs
Tursn out that you need to add a resource based policy to the lambda so that cognito can invoke it.
When you use CDK and instantiate a user pool, any lambda that you associate with the pool will automatic have the permissions added.
- cdk code
TODO: user pool
- resulting cloudformation
TODO: cloudformation
typescript
Typescript has, and very much to my surprise, become my new favorite programming language. The feature is in the name - typescript does types and does them really well. When you use typescript with CDK, you get auto completion, documentation and runtime checks about your infrastructure. The importance of this cannot be overstated
1st class cloudformation integration
This is where the CDK really shines because at the end of the day, if the library doesn't support a particular service, users can always drop down to cloudformation itself.
They can do this in two ways: either in the code by adding properties to a high level construct or by using low level constructs to begin with
- high level construct example, add acl to bucket
TODO
- low level construct example
TODO
Finally, customers can always eject
by taking the generated cloudformation file and using it as is
Edges
Like all software, the CDK isn't all magic and rainbows and there very definite pitfalls to take into consideration.
developer preview
AWS has the following disclaimer when you visit the CDK developer guide
This documentation is for the developer preview release (public beta) of the AWS Cloud Development Kit (AWS CDK). Releases might lack important features and might have future breaking changes.
Currently, what this means is that all releases come with breaking changes. As of this writing, we are on version 0.33
of the CDK. Over the past 3 weeks, the CDK changed from 0.31
to 0.32
and now to 0.33
. Each bump has come with breaking changes, usually in the form of renaming constructs (eg. VpcNetwork -> Vpc
). The CDK reference documentation might not point to the latest version (it was completely unavailable for some time) and even then, a lot of the functionality of the library is mostly undocumented.
For the adventures, try figuring out what the following do: TODO
- eg.
cdk synth --no-staging
poor documentation
some core constructs like Token
are barely documented and can lead to frustrating debugging
TODO: token documentation TODO: my github issue
cloudformation override
While it is possible to go into cloudformation, it would be nice if that weren't the case. But some common actions are still not possible
poor multi account support
- implementing multi account pipeline is poor
aws specific
Unlike other third party solutions, the CDK is an AWS specific cloudformation generation tool. While this means first party AWS support, there is zero support for any other clouds. Depending on what side of the cloud divide you're in, this can either be a non-issue or a deal breaker.
hard to deviate from public constructs
Having best practices setup is nice but might not make sense for everyone. For example, a fully redundant multi az rds cluster might not make sense if you are running a low traffic wordpress site or have a legacy single point of failure in a single AZ anyways.
For example, the default VPC construct with create public and private subnets per Availability Zone with a NAT gateway per private subnet.
If you stick with this default in us-west-2, the CDK will construct 3 NAT gateways. The price of a NAT gateway is $0.045/h, regardless if you are sending traffic through it. That means that even if you don't send any traffic, this default VPC will run you ~$97/mo.
It's hard to change this in the high level construct, and there's an open issue here: https://github.com/awslabs/aws-cdk/issues/1305
Your only alternative is to use low level constructs but constructing a vpc, routing tables, subnets, is a painful undertaking and exactly the sort of work that the CDK is supposed to prevent.
ux
programatic ids
CDK best practice is to let the CDK auto generate names for all your resources. This is necessary for certain changes that require a REPLACE
operation where the CDK needs to tear down an old resource and stand up a new one but results in all your
nested stacks
CDK doesn't support nested stacks. You can cheat by using the cloudformation construct but at that point, you're in the same circle in hell as the people that use metaprogramming in java. This is okay per say because you're cdk uses different paradigm but you might run into issues if you're cloudformation gets too bib, especially if you are doing multi accounts.
Use Cases
-
use cdk for all the places you would have used terraform/sam/serverless/troposphere/etc
-
create re-usable, tested, common components
-
create dashboards (possible in AWS but cdk makes it easier)
-
create email subscriptions to sns programatically
CDK vs *
The CDK comes to the infrastructure provisioning in a crowded market. There's a saying at AWS - if something is worth doing, its worth doing (at least) twice. If you look at AWS's own efforts around declarative infrastructure provisioning on AWS, you'll first party services such as cloudformation
, Amplify
and the Serverless Application Model
.
The problem with all these solutions is because there are too complicated (eg. cloudformation) or too limited (eg. SAM). My evaluation of IaaS software falls under the following criteria:
- easy to do the easy things
- easy to do the hard things
There are numerous third party providers, a few listedf below:
- terraform
- troposphere: like CDK but community supported and written in python
- pulumi
- amplify
- sam
- serverless
- etc.
Conclusion
The AWS CDK is the next iteration in the steady trend of infrastructure as code. In that regards, it actually delivers and makes it possible to write, provision, and test infrastructure, as code.
Yes, there are still lots of edges to be smoothed out and it is still in developer preview (aka things could break all the time) but the value that it delivers far outweights any of the setbacks.
Using the CDK has made me realize that the cake is not a lie and that it is possible to have all things all the time - in this case its the expressivity of a programming language combined with the consistency and declarativity of a complicated configuration. The CDK is my favorite new hammer of 2019 and I look forward to seeing the development of both the CDK and best practices around it. In the meanwhile, I've made up some of my own and will be posting it and other updates about the CDK in upcoming posts so stay tuned :)
Other
Currently it is not very exciting and
Resources:
CDKMetadata:
Type: AWS::CDK::Metadata
Properties:
Modules: aws-cdk=0.33.0,@aws-cdk/cdk=0.33.0,@aws-cdk/cx-api=0.33.0,jsii-runtime=node.js/v8.13.0
Saying the CDK provisions AWS infrastructure is like saying a rocketship can fly - it fails to do justice to something that is out of this world. With the CDK, you can initialize AWS services that fill in missing values with sane defaults and also use AWS best practices to do so. High Level constructs will fill in most of the details for you and is akin to using the console on AWS where a lot of the nitty gritty with IAM permissions and dependent resources are created for you automagically.
And because the CDK generates cloudformation, I can always verify what resources get created and unlike other frameworks that try to simplify cloudformation development(cough* SAM), I get to use the full power of cloudformation.
Backlinks