If your ElastiCache Redis or Memcached cluster runs around the clock at on-demand rates, you're overpaying by roughly 40%. Here's how to automate Reserved Node purchases and tracking with Terraform.
Here's a painful truth: If your ElastiCache cluster has been running for more than a month, you've already overpaid.
Most teams deploy Redis or Memcached, set it, forget it, and never think about reserved pricing.
Let's fix that.
The On-Demand Tax
Here's what a typical ElastiCache setup costs on-demand:
cache.r7g.large (Redis, Multi-AZ)
On-Demand: $0.252/hour × 730 hours = $184/month
1-Year RI: $0.150/hour × 730 hours = $110/month
3-Year RI: $0.102/hour × 730 hours = $74/month
That's 40% savings (1-year) or 60% savings (3-year) for the exact same cluster doing the exact same thing.
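To check the arithmetic yourself, here is a minimal sketch that reproduces the monthly figures and savings percentages. The hourly rates are the ones quoted above, not values fetched from AWS, and the function names are just illustrative.

```python
HOURS_PER_MONTH = 730  # the same monthly-hours figure used above

# Hourly rates for cache.r7g.large as quoted in this article (assumed, not looked up).
RATES = {"on_demand": 0.252, "ri_1yr": 0.150, "ri_3yr": 0.102}


def monthly_cost(hourly_rate: float) -> float:
    """Cost of one node running every hour of the month."""
    return hourly_rate * HOURS_PER_MONTH


def savings_pct(on_demand_rate: float, reserved_rate: float) -> float:
    """Percentage saved by paying the reserved rate instead of on-demand."""
    return (1 - reserved_rate / on_demand_rate) * 100


for plan, rate in RATES.items():
    pct = savings_pct(RATES["on_demand"], rate)
    print(f"{plan}: ${monthly_cost(rate):.2f}/month ({pct:.1f}% below on-demand)")
```

Swap in the rates from the AWS pricing page for your own node type and region before relying on the output.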
Scale that across a real environment:
| Setup | On-Demand | 1-Year RI | 3-Year RI |
|---|---|---|---|
| 1 node (cache.r7g.large) | $2,208/yr | $1,320/yr | $888/yr |
| 3-node cluster | $6,624/yr | $3,960/yr | $2,664/yr |
| 3-node + 2 replicas | $11,040/yr | $6,600/yr | $4,440/yr |
A 3-node cluster with replicas saves $4,440/year with 1-year RIs. No changes to your application. Zero downtime. Just cheaper.
When Should You Reserve?
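The table rows are just the rounded per-node monthly cost times 12 times the node count. A small sketch, assuming the rounded monthly figures from this article ($184, $110, $74) and treating "3-node + 2 replicas" as 5 nodes:

```python
# Rounded per-node monthly costs from this article (assumed inputs).
MONTHLY_PER_NODE = {"on_demand": 184, "ri_1yr": 110, "ri_3yr": 74}


def annual_cost(nodes: int, plan: str) -> int:
    """Yearly cost in dollars for `nodes` identical nodes on the given plan."""
    return MONTHLY_PER_NODE[plan] * 12 * nodes


def annual_savings(nodes: int, plan: str) -> int:
    """Yearly dollars saved versus running the same nodes on-demand."""
    return annual_cost(nodes, "on_demand") - annual_cost(nodes, plan)


for nodes in (1, 3, 5):
    print(
        f"{nodes} node(s): on-demand ${annual_cost(nodes, 'on_demand'):,}/yr, "
        f"1-year RI saves ${annual_savings(nodes, 'ri_1yr'):,}/yr, "
        f"3-year RI saves ${annual_savings(nodes, 'ri_3yr'):,}/yr"
    )
```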
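The break-even point for a 1-year No Upfront RI is roughly 7-8 months: a full year at the reserved rate costs about the same as 7-8 months at the on-demand rate. If you expect the cluster to keep running at least that long, reserving wins. And if it has already been running for 8+ months without a reservation, you're burning money.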
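The break-even figure falls out of one division. This sketch assumes a No Upfront reservation (so the whole commitment is the monthly rate times the term) and uses the rounded monthly costs from this article as inputs:

```python
def break_even_months(on_demand_monthly: float,
                      reserved_monthly: float,
                      term_months: int = 12) -> float:
    """Months of on-demand usage that cost as much as the entire reserved term.

    If the node will run longer than this during the term, reserving is cheaper.
    """
    committed_total = reserved_monthly * term_months
    return committed_total / on_demand_monthly


# $184/month on-demand vs. $110/month reserved (figures quoted in this article)
print(f"{break_even_months(184, 110):.1f} months")
```

The same function works for a 3-year term by passing `term_months=36`, though the answer is then measured against 36 months of expected usage rather than 12.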
Reserve when:
- The cluster has been stable for 3+ months
- You don't plan to change node types soon
- It's a production workload running 24/7
- You're using consistent node families (e.g., r7g, m7g)
Don't reserve when:
- It's a dev/test cluster that gets torn down
- You're actively testing different node sizes
- The cluster is less than 3 months old
- You're planning a migration to a different engine or service
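If you want to apply these rules consistently across many clusters, they can be encoded as a simple check. This is only a sketch: `ClusterProfile` and its fields are hypothetical names invented here, and you would fill them in from your own inventory data.

```python
from dataclasses import dataclass


@dataclass
class ClusterProfile:
    """Hypothetical summary of one cluster; not an AWS data structure."""
    age_months: float
    is_production: bool
    runs_24x7: bool
    node_type_change_planned: bool
    migration_planned: bool


def should_reserve(cluster: ClusterProfile) -> tuple[bool, list[str]]:
    """Return (ok, blockers), where blockers lists every rule the cluster fails."""
    blockers = []
    if cluster.age_months < 3:
        blockers.append("cluster is less than 3 months old")
    if not (cluster.is_production and cluster.runs_24x7):
        blockers.append("not a production workload running 24/7")
    if cluster.node_type_change_planned:
        blockers.append("node type or size is still being changed")
    if cluster.migration_planned:
        blockers.append("migration to another engine or service is planned")
    return (not blockers, blockers)


ok, blockers = should_reserve(ClusterProfile(8, True, True, False, False))
print("reserve" if ok else f"wait: {blockers}")
```

Returning the list of blockers rather than a bare boolean makes it easy to report why a cluster was skipped.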
Terraform Implementation
Step 1: Deploy Your ElastiCache Cluster
# modules/elasticache/main.tf
variable "environment" {
  type = string
}

variable "node_type" {
  type    = string
  default = "cache.r7g.large"
}

variable "num_cache_clusters" {
  type    = number
  default = 3
}

resource "aws_elasticache_replication_group" "redis" {
  replication_group_id = "${var.environment}-redis"
  description          = "${var.environment} Redis cluster"
  node_type            = var.node_type
  num_cache_clusters   = var.num_cache_clusters
  engine               = "redis"
  engine_version       = "7.1"
  port                 = 6379
  parameter_group_name = "default.redis7"

  # Multi-AZ for production
  automatic_failover_enabled = var.environment == "prod"
  multi_az_enabled           = var.environment == "prod"

  # Encryption
  at_rest_encryption_enabled = true
  transit_encryption_enabled = true

  # Maintenance
  maintenance_window       = "sun:05:00-sun:07:00"
  snapshot_retention_limit = var.environment == "prod" ? 7 : 0
  snapshot_window          = "03:00-05:00"

  tags = {
    Environment  = var.environment
    ManagedBy    = "terraform"
    ReserveReady = "true" # Tag for RI tracking
  }
}
Step 2: Purchase Reserved Nodes with Terraform
# reserved-instances/elasticache.tf
resource "aws_elasticache_reserved_cache_node" "redis_prod" {
  reserved_cache_nodes_offering_id = data.aws_elasticache_reserved_cache_node_offering.redis.offering_id
  cache_node_count                 = 3 # Match your cluster size
}

data "aws_elasticache_reserved_cache_node_offering" "redis" {
  cache_node_type     = "cache.r7g.large"
  duration            = "P1Y"        # 1 year (P3Y for 3-year)
  offering_type       = "No Upfront" # or "Partial Upfront", "All Upfront"
  product_description = "redis"
}
Important: Running `terraform apply` on reserved node resources commits you to a purchase. There's no undo. Always run `terraform plan` first and review carefully.
Step 3: Payment Options Compared
# Option A: No Upfront (most flexible, least savings)
# Pay monthly; the reservation can't be cancelled, so you're committed for the full term
offering_type = "No Upfront"
# Savings: ~33-36%
# Option B: Partial Upfront (balanced)
# Pay some upfront + reduced monthly
offering_type = "Partial Upfront"
# Savings: ~38-41%
# Option C: All Upfront (maximum savings)
# Pay everything upfront, nothing monthly
offering_type = "All Upfront"
# Savings: ~40-44%
My recommendation: Start with No Upfront 1-Year. You get most of the savings with maximum flexibility. Graduate to Partial/All Upfront once you're confident in your setup.
Automated RI Coverage Monitoring
Don't let reservations expire silently. This Lambda checks coverage and alerts you:
# monitoring/ri-coverage.tf
resource "aws_lambda_function" "ri_monitor" {
  filename      = data.archive_file.ri_monitor.output_path
  function_name = "elasticache-ri-monitor"
  # The role is defined elsewhere; it needs elasticache:DescribeCacheClusters,
  # elasticache:DescribeReservedCacheNodes, and sns:Publish.
  role             = aws_iam_role.ri_monitor.arn
  handler          = "index.handler"
  runtime          = "python3.12"
  timeout          = 30
  source_code_hash = data.archive_file.ri_monitor.output_base64sha256

  environment {
    variables = {
      SNS_TOPIC_ARN = aws_sns_topic.cost_alerts.arn
    }
  }
}

data "archive_file" "ri_monitor" {
  type        = "zip"
  output_path = "${path.module}/ri_monitor.zip"

  source {
    content  = <<-PYTHON
      import boto3
      import os
      from datetime import datetime, timedelta

      def handler(event, context):
          ec = boto3.client('elasticache')
          sns = boto3.client('sns')

          # Count running nodes per type and engine (paginated for large accounts)
          running_nodes = {}
          for page in ec.get_paginator('describe_cache_clusters').paginate():
              for c in page['CacheClusters']:
                  key = f"{c['CacheNodeType']}|{c['Engine']}"
                  running_nodes[key] = running_nodes.get(key, 0) + c['NumCacheNodes']

          # Count active reservations and collect those expiring within 30 days
          reserved = {}
          expiring_soon = []
          for page in ec.get_paginator('describe_reserved_cache_nodes').paginate():
              for r in page['ReservedCacheNodes']:
                  if r['State'] != 'active':
                      continue
                  key = f"{r['CacheNodeType']}|{r['ProductDescription']}"
                  reserved[key] = reserved.get(key, 0) + r['CacheNodeCount']
                  # Duration is in seconds, so expiry = start + duration
                  end_time = r['StartTime'] + timedelta(seconds=r['Duration'])
                  if end_time - datetime.now(end_time.tzinfo) < timedelta(days=30):
                      expiring_soon.append({
                          'id': r['ReservedCacheNodeId'],
                          'type': r['CacheNodeType'],
                          'expires': end_time.strftime('%Y-%m-%d')
                      })

          # Find node types with more running nodes than reservations
          unreserved = []
          for key, count in running_nodes.items():
              reserved_count = reserved.get(key, 0)
              if count > reserved_count:
                  node_type, engine = key.split('|')
                  unreserved.append(
                      f"  {node_type} ({engine}): "
                      f"{count - reserved_count} unreserved of {count} total"
                  )

          # Build the alert and publish only when there is something to report
          alerts = []
          if unreserved:
              alerts.append("UNRESERVED NODES (wasting money!):\n"
                            + "\n".join(unreserved))
          if expiring_soon:
              alerts.append("EXPIRING WITHIN 30 DAYS:\n" + "\n".join(
                  f"  {e['id']} ({e['type']}) expires {e['expires']}"
                  for e in expiring_soon
              ))
          if alerts:
              sns.publish(
                  TopicArn=os.environ['SNS_TOPIC_ARN'],
                  Subject='ElastiCache RI Coverage Alert',
                  Message="\n\n".join(alerts)
              )
          return {'unreserved': len(unreserved), 'expiring': len(expiring_soon)}
    PYTHON
    filename = "index.py"
  }
}
# Run weekly
resource "aws_cloudwatch_event_rule" "weekly_ri_check" {
  name                = "elasticache-ri-check"
  schedule_expression = "rate(7 days)"
}

resource "aws_cloudwatch_event_target" "ri_monitor" {
  rule = aws_cloudwatch_event_rule.weekly_ri_check.name
  arn  = aws_lambda_function.ri_monitor.arn
}

resource "aws_lambda_permission" "allow_eventbridge" {
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.ri_monitor.function_name
  principal     = "events.amazonaws.com"
  source_arn    = aws_cloudwatch_event_rule.weekly_ri_check.arn
}

resource "aws_sns_topic" "cost_alerts" {
  name = "elasticache-cost-alerts"
}
Once you subscribe an email address to the SNS topic, you'll get an alert whenever nodes are unreserved or reservations are about to expire. No more surprise bills.
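The expiry check is the part most worth testing before you deploy, and it can be exercised without any AWS calls. Below is a sketch that pulls the same logic into a standalone function. The dictionary keys mirror the fields the Lambda reads from `describe_reserved_cache_nodes`; the function name and the sample reservation are made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def expiring_reservations(reservations, now, window_days=30):
    """Return (id, end_time) for active reservations ending within window_days of now."""
    soon = []
    for r in reservations:
        if r["State"] != "active":
            continue
        # Duration is expressed in seconds, so expiry = start + duration.
        end_time = r["StartTime"] + timedelta(seconds=r["Duration"])
        if end_time - now < timedelta(days=window_days):
            soon.append((r["ReservedCacheNodeId"], end_time))
    return soon


# Hypothetical reservation: started 2024-01-01 UTC with a 365-day term.
sample = [{
    "ReservedCacheNodeId": "ri-example",
    "State": "active",
    "StartTime": datetime(2024, 1, 1, tzinfo=timezone.utc),
    "Duration": 365 * 24 * 3600,
}]

for rid, end in expiring_reservations(sample, now=datetime(2024, 12, 10, tzinfo=timezone.utc)):
    print(f"{rid} expires {end:%Y-%m-%d}")
```

Passing `now` in as an argument, rather than calling `datetime.now()` inside the function, is what makes the check deterministic in a unit test.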
Quick Audit: Are You Wasting Money Right Now?
Run these CLI commands to check your current RI coverage:
# List all running ElastiCache nodes
aws elasticache describe-cache-clusters \
  --query 'CacheClusters[].{ID:CacheClusterId,Type:CacheNodeType,Engine:Engine,Nodes:NumCacheNodes}' \
  --output table

# List active reservations (expiry = Start + DurationSeconds)
aws elasticache describe-reserved-cache-nodes \
  --query 'ReservedCacheNodes[?State==`active`].{Type:CacheNodeType,Count:CacheNodeCount,Start:StartTime,DurationSeconds:Duration}' \
  --output table
If the first table has more nodes than the second, you're overpaying.
Implementation Checklist
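Comparing two tables by eye gets tedious with more than a few node types. As a sketch, if you rerun both commands with `--output json` instead of `--output table`, the rows can be compared per node type like this. The `Type`, `Nodes`, and `Count` keys are the aliases from the queries above; the function name and sample rows are illustrative only.

```python
from collections import Counter


def coverage_gaps(running_rows, reserved_rows):
    """Return {node_type: unreserved_count} for types with more nodes than reservations."""
    running = Counter()
    for row in running_rows:
        running[row["Type"]] += row["Nodes"]

    reserved = Counter()
    for row in reserved_rows:
        reserved[row["Type"]] += row["Count"]

    # A Counter returns 0 for missing keys, so unreserved types are handled too.
    return {t: n - reserved[t] for t, n in running.items() if n > reserved[t]}


# Made-up rows in the shape the two CLI queries produce with --output json.
running_rows = [
    {"ID": "prod-redis-001", "Type": "cache.r7g.large", "Engine": "redis", "Nodes": 1},
    {"ID": "prod-redis-002", "Type": "cache.r7g.large", "Engine": "redis", "Nodes": 1},
    {"ID": "prod-redis-003", "Type": "cache.r7g.large", "Engine": "redis", "Nodes": 1},
]
reserved_rows = [{"Type": "cache.r7g.large", "Count": 2}]

print(coverage_gaps(running_rows, reserved_rows))
```

Note that this simplified comparison ignores the engine, because the reservation query above does not return it.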
- Audit: Run the CLI commands above to find unreserved nodes
- Identify stable clusters: Production clusters running 3+ months
- Start conservative: 1-Year, No Upfront for your first reservation
- Deploy monitoring: Set up the Lambda to catch gaps and expirations
- Review quarterly: Reassess node types and reservation coverage
Pro Tips
- Reservations are region-specific: A reservation in us-east-1 won't cover nodes in eu-west-1
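- Node family must match: An r7g reservation won't cover m7g nodes. Whether a cache.r7g.large RI also applies to cache.r7g.xlarge depends on AWS's size-flexibility rules within a family, so check the current ElastiCache reserved node documentation before assuming either way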
- Reservations apply automatically: Once purchased, billing adjusts immediately. No cluster changes needed
- Combine with Graviton: If you haven't migrated to r7g/m7g yet, do that first (20% cheaper), then reserve the Graviton nodes for compounding savings
TL;DR
| Action | Savings | Effort |
|---|---|---|
| 1-Year No Upfront RI | ~36% | 5 minutes |
| 1-Year All Upfront RI | ~42% | 5 minutes |
| 3-Year All Upfront RI | ~60% | 5 minutes |
| + Graviton migration | +20% on top | 5 minutes |
Bottom line: If your Redis has been running for 8+ months and you haven't reserved, you're throwing away 40% of that bill. Fix it today.
Running ElastiCache without Reserved Nodes is like paying rent monthly when the landlord offers 40% off for signing a lease. Same apartment, just cheaper.