$68,004 spent on AWS in October - A full breakdown of ConvertKit's AWS bill

Kris Hamoud
Kris Hamoud is an Infrastructure Engineer who enjoys building simple and scalable solutions.

Overview

We spent $68,004.70 on AWS in October. This is up 4% from September and is 4.3% of October MRR. We migrated most of our CDN needs off CloudFront and onto Cloudflare. EC2 spend increased because of the changes we made in late September, and S3 spend decreased as we begin to realize more data transfer wins from our migration to Cloudflare.

High-level breakdown:

  1. EC2-Instances - $26,305.25 (+25%)
  2. Relational Database Service - $20,080.50 (+7%)
  3. S3 - $7,463.20 (-25%)
  4. EC2-Other - $5,527.62 (+2%)
  5. Support - $4,405.16 (0%)
  6. EC2-ELB - $1,458.94 (+7%)
  7. CloudFront - $165.43 (-93%)
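
The percentages above can be turned back into approximate dollar changes. The following is a minimal sketch, assuming each percentage is the month-over-month change relative to September and that it was rounded to a whole percent, so the results are estimates rather than figures from the bill. The function names are illustrative.

```python
# October amounts and month-over-month percent changes, copied from the list above.
october = {
    "EC2-Instances": (26305.25, 25),
    "Relational Database Service": (20080.50, 7),
    "S3": (7463.20, -25),
    "EC2-Other": (5527.62, 2),
    "Support": (4405.16, 0),
    "EC2-ELB": (1458.94, 7),
    "CloudFront": (165.43, -93),
}


def implied_previous(current, pct_change):
    """Back out last month's amount from this month's amount and its percent change."""
    return current / (1 + pct_change / 100)


def implied_delta(current, pct_change):
    """Approximate dollar change versus last month (positive means spend went up)."""
    return current - implied_previous(current, pct_change)


for service, (amount, pct) in october.items():
    print(f"{service:30s} {implied_delta(amount, pct):+10,.2f}")
```

Because the published percentages are rounded, the smaller line items carry the largest relative error. The CloudFront estimate in particular moves by a few hundred dollars depending on whether the true change was closer to 92.5% or 93.5%.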

EC2-Instances - $26,305.25 (+25%)

We felt the full effect of the 4 new clusters we added in September. A $5,000 increase in EC2 spend is noticeable, but we don’t have any upcoming infrastructure projects that will affect this in a major way. The most likely increase in EC2 spend will come from adding storage to the Elastic Stack we introduced in September. We’ve used about 2.5TB of storage in 15 days, out of 5.7TB of total storage in our current cluster. If we double our storage we should have enough room to grow for the foreseeable future.
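
As a rough way to reason about that storage headroom, here is a small sketch that extrapolates the fill rate. It assumes usage keeps growing linearly at the observed average and that nothing is deleted; a log retention policy, which this post does not describe, would cap growth and extend the runway. The function name is hypothetical.

```python
def days_until_full(total_tb, used_tb, days_elapsed):
    """Days of headroom left, assuming storage keeps filling at the observed average rate."""
    rate_tb_per_day = used_tb / days_elapsed
    return (total_tb - used_tb) / rate_tb_per_day


# Figures from the paragraph above: 2.5TB used in 15 days, 5.7TB total.
current = days_until_full(total_tb=5.7, used_tb=2.5, days_elapsed=15)

# Same fill rate with the cluster's storage doubled to 11.4TB.
doubled = days_until_full(total_tb=11.4, used_tb=2.5, days_elapsed=15)

print(f"current cluster: {current:.1f} days, doubled storage: {doubled:.1f} days")
```

Under the no-deletion assumption the runway is measured in weeks, so how far doubled storage actually goes depends mostly on how long logs are retained.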

Other than that, we should see this bill decrease in the future as we begin our migration to Kubernetes and start consolidating compute resources.

Service breakdown

  1. USE2-HeavyUsage:i3.2xlarge - $7,595.86 (+17%)
    • These are our reserved Cassandra and Elasticsearch clusters.
    • We use Cassandra to store massive amounts of data.
    • We use Elasticsearch to search through massive amounts of data and now we also use it to store our logs.
    • The increase comes from our new Elastic Stack that will eventually be used to phase out our old logging provider.
  2. USE2-BoxUsage:c5.2xlarge - $3,628.93 (+10%)
    • These are on-demand instances.
    • The C family instances have become pillars in our infrastructure in the last few months.
    • We use these because of their high compute power.
    • Many of our workloads either store or aggregate data, and these instances have pulled through for us, making this work much easier and faster.
    • We can find billing wins here by introducing spot instances, by considering more reservations, or by looking into the new AWS Savings Plans.
  3. USE2-HeavyUsage:c5.2xlarge - $1,928.45 (+3%)
    • These are reserved instances.
    • We use the C family instances for web servers, Logstash servers, high-throughput Sidekiq instances, and many more compute-intensive workloads.
    • Unless we reserve more of these instances, we will pay this much every month until Q3 2020.
  4. HeavyUsage:i3.2xlarge - $2,865.89 (+486%)
    • These are reserved Cassandra instances.
    • We introduced a secondary Cassandra cluster in us-east-1 in September as a backup to our main Cassandra cluster in us-east-2.
    • We’ll pay this much every month until Q3 2020.
  5. USE2-DataTransfer-Out-Bytes - $1,664.97 (+16%)
    • This is the cost of our services to communicate with the internet.
    • The increase comes with an increase in email sending volume as well as logging volume.
    • We should see some billing wins here in the future when we migrate off our logging provider.
  6. USE2-BoxUsage:t3.medium - $1,445.99 (-1%)
    • These are on-demand instances.
    • We use these instances for everything from email tracking to Elasticsearch indexing.
    • They are useful for their burst performance.
    • We get good use out of most of them but as we move towards Kubernetes we’ll get billing wins as we consolidate jobs so our instances spend less time idle. (Chart: t3.medium CPU Usage)
  7. USE2-BoxUsage:t3.xlarge - $1,361.82 (+15%)
    • These are on-demand instances.
    • We use these instances for a variety of jobs that rely on burstable CPU.
    • We get pretty good use out of these instances but we can downsize some of them to t3.larges. (Chart: t3.xlarge CPU Usage)
  8. USE2-BoxUsage:t3.large - $1,052.31 (+61%)
    • These are on-demand instances.
    • We use these for a variety of miscellaneous jobs.
    • These instances have not gotten very good usage and we should move them to t3.medium instances. (Chart: t3.large CPU Usage)

Relational Database Service - $20,080.50 (+7%)

The most important thing to note for this service is the 29% increase in backup charges. I mentioned in September that our disaster recovery backups hadn’t been working for part of the month and we had fixed it by the middle of the month. In October our DR backups were working 100% of the month and our costs were steady. Other than that, there were no major changes. The rest of the increases come from October being longer than September.

Service breakdown

  1. USE2-HeavyUsage:db.r5.12xl - $4,949.68 (+3%)
    • This instance is reserved.
    • This is our master MySQL database.
    • We’ll continue to pay this much until Q3 2020.
  2. RDS:ChargedBackupUsage - $3,628.50 (+29%)
    • These are our disaster recovery backups.
    • We take additional backups and send them to a different region in case of emergencies.
    • We fixed our backups in September after having inconsistencies from CloudWatch.
    • From August through September you can see how our backups weren’t being charged at a normal or flat rate. (Chart: RDS Backup Charges)
  3. USE2-InstanceUsage:db.r4.8xlarge - $2,856.96 (+3%)
    • This is an on-demand instance.
    • This replica is being kept around because we will need it to maintain a healthy application until the end of 2019.
    • At that time we can consider getting rid of it or downsizing and reserving it and using it for other purposes within the company.
  4. USE2-RDS:ChargedBackupUsage - $2,665.82 (+2%)
    • These are our normal backups.
    • These have been functioning properly forever.
  5. USE2-RDS:Multi-AZ-GP2-Storage - $1,751.22 (+4%)
    • These are daily charges.
    • We paid more because October is longer than September.
  6. USE2-HeavyUsage:db.r4.8xlarge - $1,649.89 (+3%)
    • This instance is reserved.
    • This is our MySQL replica.
    • We’ll continue to pay this much until Q3 2020.
  7. USE2-RDS:GP2-Storage - $1,269.60 (0%)
    • We haven’t increased our MySQL storage since September.
    • These costs will remain steady until we add more storage.
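
Several of the +3% and +4% changes above are attributed to October being longer than September. A quick sketch of that effect for a charge billed at a flat daily rate follows; the year 2019 is inferred from the post's reference to the end of 2019, and the function name is illustrative.

```python
import calendar


def month_length_change(year, prev_month, month):
    """Percent change in a flat daily charge caused only by the two months' lengths."""
    prev_days = calendar.monthrange(year, prev_month)[1]
    days = calendar.monthrange(year, month)[1]
    return (days / prev_days - 1) * 100


# September (30 days) to October (31 days).
change = month_length_change(2019, 9, 10)
print(f"{change:.1f}%")
```

A usage-based line item that grows by roughly this much month over month is consistent with flat daily usage, while a noticeably larger change points to real growth.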

S3 - $7,463.20 (-25%)

This is one of the most exciting changes we saw in October. We migrated many of our links from CloudFront to Cloudflare and decreased our S3 bill by almost $2,500. This is a win we should continue to see as long as we serve our assets from Cloudflare. So far these changes apply to only a single bucket, so we should see further wins from our legacy bucket as we migrate some of our older links.

Service breakdown

  1. DataTransfer-Out-Bytes - $2,754.45 (-24%)
    • We saw some small wins here when we migrated our links behind Cloudflare.
    • We should see more wins as we migrate the rest of our legacy links behind Cloudflare. (Chart: S3 Data Transfer)
  2. USE2-DataTransfer-Out-Bytes - $1,572.49 (-63%)
    • This was an awesome win for the amount of data that we transfer.
    • Looking at how much data we were requesting before and after migrating is such a cool thing to see. (Chart: S3 Data Transfer)
    • Seeing the impact this had on how we’re billed is also incredibly cool. (Chart: S3 Data Transfer)
  3. USE2-TimedStorage-ByteHrs - $1,464.90 (+9%)
    • This is the steady growth of our backups coming from our Cassandra and Elasticsearch clusters.
    • This will continue to grow as the amount of data we store grows.
  4. TimedStorage-ByteHrs - $996.07 (+427%)
    • This increase comes from the introduction of backups to our secondary Cassandra cluster.
    • It will continue to grow as the amount of data we store grows. (Chart: Cassandra Backups)

EC2-Other - $5,527.62 (+2%)

We increased our regional data transfer a little with the introduction of a secondary Cassandra cluster. We decreased our EBS use by migrating some of our Redis servers from EBS-backed AOF files to SSD ephemeral storage.

Service breakdown

  1. USE2-NatGateway-Bytes - $1,770.35 (+4%)
    • We use a NAT gateway for our services to communicate with the internet.
    • Our daily bytes stayed pretty flat.
    • The bill increased because October is longer than September.
  2. USE2-DataTransfer-Regional-Bytes - $1,529.34 (+10%)
    • We have a little extra data transfer because of the secondary Cassandra cluster, though it’s barely noticeable.
  3. USE2-EBS:VolumeUsage.gp2 - $1,347.99 (-5%)
    • We had a small decrease here because we got rid of some EBS volumes we used for AOF files for our Redis servers.
    • Moving those Redis servers increased our EC2 bill, but it decreased our EBS cost and improved our AOF file sync performance.

Support - $4,405.16 (0%)

  1. 7% of monthly AWS usage from $10K-$80K - $2,924.70 (+5%)
    • This is the cost of only our production account.
  2. 10% of monthly AWS usage for the first $0-$10K - $1,480.45 (-8%)
    • This is the cost of our production account and billing account.
    • We could save money by turning off support for our billing account.
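
The two support lines above follow a tiered percentage of each account's monthly usage. Here is a minimal sketch of that calculation using only the two tiers that appear on this bill. The per-account usage figures are not stated in the post; they are backed out from the line items under the assumption that both accounts are on the same plan and that the production account fills the entire first tier.

```python
# Tier boundaries and rates as they appear on the bill lines above.
# Usage above $80K is not modeled here because this bill never reaches that tier.
TIERS = [
    (0, 10_000, 0.10),
    (10_000, 80_000, 0.07),
]


def support_charge(monthly_usage):
    """Sum the per-tier charges for one account's monthly usage."""
    total = 0.0
    for lower, upper, rate in TIERS:
        if monthly_usage > lower:
            total += (min(monthly_usage, upper) - lower) * rate
    return total


# Implied usage (assumptions derived from the bill, not stated in the post).
production_usage = 10_000 + 2924.70 / 0.07  # first tier filled, rest billed at 7%
billing_usage = (1480.45 - 1000.00) / 0.10  # what remains of the 10% line

total = support_charge(production_usage) + support_charge(billing_usage)
print(f"production: {support_charge(production_usage):,.2f}")
print(f"billing:    {support_charge(billing_usage):,.2f}")
print(f"total:      {total:,.2f}")
```

Under these assumptions the total lands within a cent of the $4,405.16 on the bill, and the billing account's share of the 10% line is what turning off its support would save.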

EC2-ELB - $1,458.94 (+7%)

These are the load balancers that sit in front of our application. They distribute requests across our servers and run health checks to ensure that requests are routed to healthy servers. A majority of this cost comes from data transfer. The more popular our application gets, the more we’ll have to pay in load balancer costs. We can decrease this cost in the future as we replace some of our internal load balancers with services in Kubernetes.

CloudFront - $165.43 (-93%)

This is the most exciting change in our infrastructure for October, in my opinion. With some simple changes to our links that route them through Cloudflare instead of CloudFront, we have saved more than $2,000 in one month. The bill isn’t zero because we had a few days at the beginning of October where we were still sending requests through CloudFront. We also still have some staging domains using CloudFront distributions. Their cost is near zero but it’s still important that we move the remainder of our CDN needs to Cloudflare. (Chart: CloudFront Costs)

Conclusion

The biggest changes we had in October came from the introduction of the Elastic Stack, the secondary Cassandra cluster, and the CloudFront/S3 migration to Cloudflare. We have a lot of work to do to cut down on our EC2 spend, but I’m very excited about the infrastructure around our data transfer. A lot of the heavy lifting is done, and in October alone we have already saved about $5,500 between S3 and CloudFront. These savings should recur every month because the amount we’re spending on data transfer is going down for the first time ever.