DEV Community

Sesank Munukutla (Naga)


How to Implement Just-In-Time SSH Access for AWS EC2 (Stop Leaving Port 22 Open!)

The problem: Many EC2 instances have permanent SSH access: port 22 open to the world, long-lived keys floating around, and security groups that nobody remembers creating.

The reality: You get brute-force attempts within hours. Keys get leaked. Forgotten access stays forever.

I spent last week implementing Just-In-Time (JIT) access for EC2, and it completely changed how I think about server access. Here's what I built and what I learned.

What You'll Learn

  • ✅ Eliminate permanent SSH access to EC2
  • ✅ Build auto-expiring access with native AWS services (no third-party tools)
  • ✅ Set up full audit trails with CloudTrail
  • ✅ Use Lambda to automate security group changes
  • ✅ Apply least-privilege IAM design

Why JIT Access Matters

Permanent SSH access to EC2 is one of the most common and dangerous cloud misconfigurations.

Here's what happens when you leave SSH open:

  • 🚨 Open port 22 = instant target for attackers
  • 🚨 Long-lived SSH keys = credential sprawl
  • 🚨 Forgotten security group rules = standing access nobody uses
  • 🚨 No audit trail = "who logged in last month?"

JIT access solves this:

  • SSH access granted only when needed
  • Access is time-bound (expires automatically)
  • Revocation is automatic (no manual cleanup)
  • Every action is fully auditable

Architecture Overview

The solution uses only native AWS services:

  • Amazon EC2 – The target instance
  • AWS Lambda – JIT automation engine
  • IAM – Least-privilege roles
  • CloudTrail – Audit evidence
  • CloudWatch – Execution logs

Design principle:

No standing access. Access exists only while the automation is running.


Step 1: Lock Down the EC2 Instance (Baseline)

Start with zero access by default.

The EC2 instance is launched with:

  • No inbound SSH rules
  • No 0.0.0.0/0 on port 22
  • Only necessary outbound access

Why this matters:

  • Eliminates permanent exposure
  • Forces all access through controlled mechanisms
  • Creates a secure baseline
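
To check that a security group really starts from this baseline, you can scan its inbound rules for anything that reaches port 22. The sketch below is one way to do that, not part of the original project. It assumes the rules arrive in the `IpPermissions` shape that boto3's `describe_security_groups` returns, and the function name `ssh_exposures` is made up for illustration. It only looks at CIDR sources, so rules that reference other security groups or prefix lists would need separate handling.

```python
from typing import Any

SSH_PORT = 22


def ssh_exposures(ip_permissions: list[dict[str, Any]]) -> list[str]:
    """Return every source CIDR that can reach port 22 under these inbound rules."""
    exposed = []
    for rule in ip_permissions:
        protocol = rule.get("IpProtocol")
        if protocol == "-1":
            # "-1" means all protocols and all ports, so SSH is included
            covers_ssh = True
        elif protocol == "tcp":
            covers_ssh = rule.get("FromPort", 0) <= SSH_PORT <= rule.get("ToPort", 65535)
        else:
            covers_ssh = False
        if not covers_ssh:
            continue
        exposed += [r["CidrIp"] for r in rule.get("IpRanges", [])]
        exposed += [r["CidrIpv6"] for r in rule.get("Ipv6Ranges", [])]
    return exposed


# A locked-down group has no inbound rules, so nothing is exposed
print(ssh_exposures([]))

# The classic misconfiguration: SSH open to the world
risky = [{"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
          "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}]
print(ssh_exposures(risky))
```

An empty result for the instance's security group is the baseline this step is aiming for.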

EC2 Security Group with no inbound SSH rules
Security group showing zero SSH access by default - no port 22 open


Step 2: EC2 IAM Role – SSM Only

The EC2 instance gets an IAM role with only the SSM managed policy attached (on current AWS accounts this is typically AmazonSSMManagedInstanceCore).

Key point: The instance cannot modify its own security group.

This prevents:

  • Self-escalation attacks
  • Instance-level privilege creep
  • Unauthorized security group changes

Separation of duties is critical here.
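
One way to verify the "cannot modify its own security group" claim is to inspect the policy documents attached to the instance role. The following is a rough sketch under simplifying assumptions: it only looks at `Allow` statements with an `Action` key, and ignores `NotAction`, explicit denies, conditions, and permission boundaries. The helper name `allows_sg_changes` is hypothetical.

```python
from fnmatch import fnmatchcase

FORBIDDEN = ("ec2:AuthorizeSecurityGroupIngress", "ec2:RevokeSecurityGroupIngress")


def _as_list(value):
    """IAM accepts a single string or a list for Action; normalize to a list."""
    return [value] if isinstance(value, str) else list(value)


def allows_sg_changes(policy_document: dict) -> bool:
    """True if any Allow statement matches a security group ingress action."""
    statements = policy_document.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]
    for statement in statements:
        if statement.get("Effect") != "Allow" or "Action" not in statement:
            continue
        for pattern in _as_list(statement["Action"]):
            # IAM action names are case-insensitive and may contain * and ? wildcards
            if any(fnmatchcase(action.lower(), pattern.lower()) for action in FORBIDDEN):
                return True
    return False


# An SSM-style statement passes; a broad ec2:* grant does not
ssm_only = {"Statement": [{"Effect": "Allow",
                           "Action": ["ssm:UpdateInstanceInformation"],
                           "Resource": "*"}]}
too_broad = {"Statement": [{"Effect": "Allow", "Action": "ec2:*", "Resource": "*"}]}
print(allows_sg_changes(ssm_only), allows_sg_changes(too_broad))
```

For an authoritative answer, the IAM policy simulator evaluates the full policy logic. This check is just a quick guard against obvious wildcards.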

IAM Role attached to EC2 instance with SSM policy only
EC2 instance role with only SSM permissions - cannot modify security groups


Step 3: CloudTrail for Auditability

CloudTrail is configured to log management events only.

This captures:

  • ✅ Security group changes
  • ✅ Who made the change
  • ✅ When it happened
  • ✅ What was modified

Why not data events?

  • JIT access modifies infrastructure, not data
  • Precision logging beats noisy logging
  • Lower costs, better signal-to-noise ratio
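
If you configure the trail from code rather than the console, "management events only" comes down to a single event selector with no data resources. Here is a minimal sketch of that selector, using the field names boto3's CloudTrail `put_event_selectors` call accepts. The helper name and the trail name in the comment are placeholders, not values from this project.

```python
VALID_READ_WRITE_TYPES = {"All", "ReadOnly", "WriteOnly"}


def management_only_selector(read_write_type: str = "All") -> dict:
    """Build an event selector that logs management events and no data events."""
    if read_write_type not in VALID_READ_WRITE_TYPES:
        raise ValueError(f"read_write_type must be one of {sorted(VALID_READ_WRITE_TYPES)}")
    return {
        "ReadWriteType": read_write_type,
        "IncludeManagementEvents": True,
        "DataResources": [],  # empty list: no S3 object or Lambda invoke data events
    }


print(management_only_selector())

# Applying it would look roughly like this (trail name is hypothetical):
#   import boto3
#   boto3.client("cloudtrail").put_event_selectors(
#       TrailName="my-jit-trail",
#       EventSelectors=[management_only_selector()],
#   )
```

Security group changes are write operations, so `"WriteOnly"` would also capture them. `"All"` keeps read calls such as `DescribeSecurityGroups` in the trail as well.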

CloudTrail configured for management events only
CloudTrail logging configuration - management events capture security group changes


Step 4: JIT Lambda Function

The core automation: a Lambda function that controls the access lifecycle.

What it does:

  1. Grant – Add SSH rule for a single IP
  2. Wait – Hold access for configured duration
  3. Revoke – Remove the rule automatically

Lambda function:

import os
import time

import boto3

# Create the client once per execution environment, not on every invocation
ec2 = boto3.client('ec2')


def lambda_handler(event, context):
    # Read configuration from environment variables
    security_group_id = os.environ['SECURITY_GROUP_ID']
    allowed_ip = os.environ['ALLOWED_IP']
    duration = int(os.environ['DURATION'])  # seconds

    # One rule definition, reused for grant and revoke so the two always match
    ssh_rule = [{
        'IpProtocol': 'tcp',
        'FromPort': 22,
        'ToPort': 22,
        'IpRanges': [{'CidrIp': f'{allowed_ip}/32'}]
    }]

    # Grant SSH access
    ec2.authorize_security_group_ingress(
        GroupId=security_group_id, IpPermissions=ssh_rule)
    print(f'SSH access granted for {allowed_ip}/32')

    try:
        # Wait for configured duration
        time.sleep(duration)
    finally:
        # Revoke SSH access automatically, even if the wait raises an exception.
        # A Lambda timeout kills the process before this runs, so the function
        # timeout must be longer than DURATION.
        ec2.revoke_security_group_ingress(
            GroupId=security_group_id, IpPermissions=ssh_rule)
        print(f'SSH access revoked for {allowed_ip}/32')

Step 5: Lambda Configuration via Environment Variables

All JIT behavior is externalized:

  • SECURITY_GROUP_ID – Which security group to modify
  • ALLOWED_IP – Which IP gets access
  • DURATION – How long access lasts (seconds)

Why environment variables?

  • ✅ No hardcoding
  • ✅ Easy to audit
  • ✅ Safe to change duration without code changes
  • ✅ Works across environments (dev/staging/prod)
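
Externalized values are only safe if they are validated before use. The sketch below shows one possible validation step for the three variables above. The function name `load_jit_config`, the 30-second revoke margin, and the returned dictionary keys are illustrative assumptions. The 900-second ceiling is Lambda's maximum timeout.

```python
import ipaddress
import re

LAMBDA_MAX_TIMEOUT = 900  # seconds; the longest a single Lambda invocation can run
REVOKE_MARGIN = 30        # assumed headroom so the revoke call still has time to run


def load_jit_config(env: dict) -> dict:
    """Validate the JIT settings and return them in a ready-to-use form."""
    sg_id = env["SECURITY_GROUP_ID"]
    if not re.fullmatch(r"sg-[0-9a-f]+", sg_id):
        raise ValueError(f"not a security group ID: {sg_id!r}")

    # ip_address() raises ValueError for malformed input, including CIDR ranges
    ip = ipaddress.ip_address(env["ALLOWED_IP"])
    if ip.version != 4:
        raise ValueError("this sketch only handles IPv4, matching the /32 rule")
    if ip.is_unspecified:
        raise ValueError("0.0.0.0 would defeat the point of JIT access")

    duration = int(env["DURATION"])
    if not 0 < duration <= LAMBDA_MAX_TIMEOUT - REVOKE_MARGIN:
        raise ValueError(f"DURATION must be 1..{LAMBDA_MAX_TIMEOUT - REVOKE_MARGIN} seconds")

    return {"security_group_id": sg_id, "cidr": f"{ip}/32", "duration": duration}


config = load_jit_config({
    "SECURITY_GROUP_ID": "sg-0123456789abcdef0",
    "ALLOWED_IP": "203.0.113.45",
    "DURATION": "600",
})
print(config)
```

In the Lambda itself you would pass `os.environ` instead of a literal dictionary. Failing fast here means a bad value never reaches the security group.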

Lambda environment variables: SECURITY_GROUP_ID, ALLOWED_IP, DURATION
Lambda configuration externalized - no secrets hardcoded in the function


Step 6: Least-Privilege Lambda Execution Role

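The Lambda execution role can change ingress rules on exactly one security group. The only other permissions it needs are the CloudWatch Logs write permissions every Lambda uses for logging (for example, via the AWSLambdaBasicExecutionRole managed policy). The security group statement: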

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:AuthorizeSecurityGroupIngress",
        "ec2:RevokeSecurityGroupIngress"
      ],
      "Resource": "arn:aws:ec2:*:*:security-group/sg-xxxxxxxxx"
    }
  ]
}

This enforces:

  • Separation of duties
  • Minimal blast radius
  • Defense in depth

Lambda execution role with least-privilege IAM policy
IAM policy allowing only security group ingress modifications - nothing else


Step 7: JIT Execution (Grant → Revoke)

When you execute the Lambda:

Timeline:

00:00 – Lambda starts
00:01 – SSH access GRANTED for 203.0.113.45/32
05:00 – Access active (you can SSH in)
10:00 – SSH access REVOKED automatically
10:01 – Lambda completes

Key log evidence:

  • ✅ SSH access granted (CloudWatch)
  • ✅ SSH access revoked (CloudWatch)
  • ✅ Execution duration ≈ configured DURATION (and below the Lambda timeout)
  • ✅ No manual intervention

This confirms true JIT behavior, not manual cleanup.

CloudWatch logs showing SSH access granted and then revoked automatically
The smoking gun: Logs prove access was granted at 00:01 and auto-revoked at 10:00


Step 8: CloudTrail Proof (Audit Evidence)

CloudTrail records both sides of the access lifecycle:

Grant event:

{
  "eventName": "AuthorizeSecurityGroupIngress",
  "userIdentity": {
    "type": "AssumedRole",
    "principalId": "AROAXXXXXXXXX:jit-lambda-function"
  },
  "requestParameters": {
    "groupId": "sg-xxxxxxxxx",
    "ipPermissions": {
      "items": [{
        "ipProtocol": "tcp",
        "fromPort": 22,
        "toPort": 22,
        "ipRanges": ["203.0.113.45/32"]
      }]
    }
  }
}

Revoke event:

{
  "eventName": "RevokeSecurityGroupIngress",
  "userIdentity": {
    "type": "AssumedRole",
    "principalId": "AROAXXXXXXXXX:jit-lambda-function"
  }
}

What this proves:

  • 📊 Who made the change (Lambda role)
  • 📊 What was changed (security group rule)
  • 📊 When it happened (timestamp)
  • 📊 That it was revoked (not forgotten)

This is audit-grade evidence, not just console screenshots.
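
To turn those records into a repeatable check, you can pair each grant with the revoke that follows it and flag anything left open. This is a simplified sketch, not the project's tooling. It assumes each record carries `eventName`, `eventTime`, and `requestParameters.groupId` as CloudTrail records do, that failed calls carry an `errorCode` field, and that each grant adds a single rule per security group. The function name `audit_grants` is made up.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # CloudTrail eventTime, e.g. 2024-05-03T10:00:01Z
GRANT = "AuthorizeSecurityGroupIngress"
REVOKE = "RevokeSecurityGroupIngress"


def _when(event: dict) -> datetime:
    return datetime.strptime(event["eventTime"], TIME_FORMAT)


def audit_grants(events: list[dict]) -> tuple[list[tuple[str, int]], list[str]]:
    """Pair each grant with the next revoke on the same security group.

    Returns (closed, still_open): closed holds (group_id, seconds_open) pairs,
    still_open holds one group ID per grant that was never revoked.
    """
    # Keep only successful grant/revoke calls
    relevant = [e for e in events
                if e.get("eventName") in (GRANT, REVOKE) and "errorCode" not in e]
    pending: dict[str, list[datetime]] = {}
    closed = []
    for event in sorted(relevant, key=_when):
        group_id = event["requestParameters"]["groupId"]
        if event["eventName"] == GRANT:
            pending.setdefault(group_id, []).append(_when(event))
        elif pending.get(group_id):
            granted_at = pending[group_id].pop(0)
            closed.append((group_id, int((_when(event) - granted_at).total_seconds())))
    still_open = [group_id for group_id, grants in pending.items() for _ in grants]
    return closed, still_open


sample = [
    {"eventName": GRANT, "eventTime": "2024-05-03T10:00:01Z",
     "requestParameters": {"groupId": "sg-aaa"}},
    {"eventName": REVOKE, "eventTime": "2024-05-03T10:10:00Z",
     "requestParameters": {"groupId": "sg-aaa"}},
]
closed, still_open = audit_grants(sample)
print(closed, still_open)
```

Any entry in `still_open`, or a `seconds_open` value well above the configured duration, is a signal that the revoke step did not run as intended.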

CloudTrail event showing AuthorizeSecurityGroupIngress
CloudTrail proof #1: Lambda granted SSH access for specific IP at exact timestamp


What This Project Demonstrates

No standing SSH access – Zero permanent exposure

Time-bound, IP-restricted – Access expires automatically

Automatic revocation – Enforced by code, not humans

Full auditability – CloudTrail logging for compliance

Least-privilege IAM – Minimal permissions everywhere


The 5 Mistakes I Made (So You Don't Have To)

  1. Forgot to add CloudWatch Logs permissions – Lambda couldn't log execution
  2. Used 0.0.0.0/0 in testing – Defeated the entire point
  3. Didn't externalize duration – Had to redeploy Lambda to change timing
  4. Set the Lambda timeout too short – Lambda terminated before revoking access (the timeout must exceed DURATION, and Lambda caps it at 15 minutes)
  5. Didn't account for CloudTrail delay – Events can take roughly 5-15 minutes to appear

Final Takeaway

Security isn't about blocking access — it's about granting the right access, for the right time, with proof.

This project shows how JIT access can be implemented using native AWS services without introducing operational complexity or permanent risk.

No third-party tools. No manual cleanup. Just automatic, auditable, time-bound access.


What's Next?

In the next post, I'll cover:

  • Integrating this with AWS Systems Manager Session Manager (no SSH at all!)
  • Adding Slack notifications when access is granted
  • Building a self-service portal for developers

Have you implemented JIT access in your environment? What challenges did you face?

Drop a comment below or connect with me on LinkedIn.


Series: AWS Security Projects

This is part of my Friday Security Projects series, where I build hands-on cloud security projects focused on real-world failures and defense-in-depth design.

Previous projects:

  • Project 5: Zero-Trust EC2 Access with IAM, SSM, and GuardDuty
  • Project 4: Eliminating SSH with AWS Systems Manager
  • Project 3: Implementing Security Controls in Production
  • Project 2: Security Design Trade-offs
  • Project 1: Building a Secure Cloud Baseline
