Introduction
AWS Systems Manager (SSM) plays a critical role in managing EC2 instances at scale—handling automation, orchestration, and remote sessions seamlessly. However, over time, orchestration session logs can accumulate significantly on instances, consuming valuable disk space and potentially impacting system stability if left unmanaged.
This blog outlines a practical and automated retention strategy to manage SSM orchestration logs by compressing older session data and enforcing a defined cleanup policy—ensuring optimal disk utilization while maintaining necessary operational history.
Objective
The goals of this implementation are to:
Optimize disk usage on EC2 instances
Retain recent orchestration data for troubleshooting and audit needs
Automate cleanup without impacting SSM agent functionality
Retention Policy
Target Path: /var/lib/amazon/ssm/<instance-id>/session/orchestration/
Compression: Older than 1 day
Retention Period: 7 days
Deletion: Archives older than 7 days
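The day thresholds above depend on how find counts age when the policy is implemented with its -mtime test, as the script in this post does. -mtime works in whole 24-hour periods and discards the fraction, so -mtime +1 matches items at least two full days old and -mtime +7 matches items at least eight. The sketch below assumes GNU find and touch; DEMO_DIR and the age_* directory names are made up for the demonstration and are not part of the SSM layout.

```shell
#!/bin/bash
# Sketch: show which ages are selected by the -mtime tests used in the retention policy.
# DEMO_DIR and the age_<hours>h names are hypothetical, created only for this demo.
DEMO_DIR="$(mktemp -d)"

for hours in 12 36 60 180 200; do
  mkdir "$DEMO_DIR/age_${hours}h"
  # Backdate the directory's modification time (GNU touch syntax).
  touch -d "${hours} hours ago" "$DEMO_DIR/age_${hours}h"
done

echo "Matches -mtime +1 (compression candidates):"
find "$DEMO_DIR" -mindepth 1 -maxdepth 1 -type d -mtime +1 | sort

echo "Matches -mtime +7 (past retention):"
find "$DEMO_DIR" -mindepth 1 -maxdepth 1 -type d -mtime +7 | sort
```

The 12-hour and 36-hour directories match neither test, the 60-, 180-, and 200-hour directories match -mtime +1, and only the 200-hour directory matches -mtime +7. In practice "older than 1 day" therefore means at least 48 hours old. Remove the scratch directory afterwards with rm -rf "$DEMO_DIR".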
Solution Overview
The solution uses a lightweight Bash script that:
Dynamically identifies SSM orchestration directories across instance IDs
Compresses session folders older than one day
Retains compressed archives for seven days
Automatically deletes expired archives
Runs daily using a cron job
This approach is simple, scalable, and does not require any changes to AWS-side configurations.
Implementation Steps
Step 1 — Create Cleanup & Compression Script
The script dynamically identifies orchestration directories, compresses older session folders, and deletes expired archives.
sudo tee /usr/local/bin/ssm_orch_retention.sh > /dev/null <<'EOF'
#!/bin/bash
# Retention and compression policy for SSM Orchestration logs
# Applies dynamically to any instance ID directory under /var/lib/amazon/ssm/
BASE_DIR="/var/lib/amazon/ssm"
RETENTION_DAYS=7
# Find all orchestration paths across all instance IDs
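find "$BASE_DIR" -type d -path "*/session/orchestration" 2>/dev/null | while IFS= read -r ORCH_DIR; do
  [ ! -d "$ORCH_DIR" ] && continue
  echo "[INFO] Processing: $ORCH_DIR"
  # Compress session directories older than 1 day. There is no upper age bound,
  # so a backlog older than the retention period is picked up as well.
  find "$ORCH_DIR" -mindepth 1 -maxdepth 1 -type d -mtime +1 | while IFS= read -r d; do
    # Remove the source and report success only if tar actually succeeded
    if tar -czf "${d}.tar.gz" -C "$(dirname "$d")" "$(basename "$d")"; then
      # Copy the session's timestamp to the archive so retention is counted
      # from the session, not from the compression run
      touch -r "$d" "${d}.tar.gz"
      rm -rf -- "$d"
      echo "[COMPRESSED] $(basename "$d") -> ${d}.tar.gz"
    else
      echo "[ERROR] Failed to compress $d" >&2
      rm -f -- "${d}.tar.gz"
    fi
  done
  # Delete compressed archives older than retention period
  find "$ORCH_DIR" -maxdepth 1 -type f -name "*.tar.gz" -mtime +"$RETENTION_DAYS" -delete
done
EOF
sudo chmod +x /usr/local/bin/ssm_orch_retention.sh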
Step 2 — Manual Testing
Run the script manually, then confirm that eligible session directories were compressed and that expired archives were deleted.
sudo /usr/local/bin/ssm_orch_retention.sh
Expected Outcome
Session directories older than 1 day are compressed into .tar.gz archives
Compressed archives older than 7 days are removed automatically
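The expected outcome can be rehearsed on a throwaway directory tree before touching a real instance. The sketch below is a condensed stand-in for the compress-then-expire steps, not the installed script itself. The function run_retention, the SANDBOX location, the instance ID i-0123456789abcdef0, and the session-* names are all hypothetical, and GNU touch is assumed for backdating.

```shell
#!/bin/bash
# Sketch: exercise compress-then-expire logic against a scratch tree, not the real SSM path.
# run_retention, SANDBOX, and all instance/session names below are hypothetical.
run_retention() {
  local base_dir="$1" retention_days="${2:-7}"
  find "$base_dir" -type d -path "*/session/orchestration" 2>/dev/null |
  while IFS= read -r orch_dir; do
    # Compress session directories that find considers older than one day.
    find "$orch_dir" -mindepth 1 -maxdepth 1 -type d -mtime +1 |
    while IFS= read -r d; do
      if tar -czf "${d}.tar.gz" -C "$(dirname "$d")" "$(basename "$d")"; then
        rm -rf -- "$d"
      fi
    done
    # Remove archives past the retention period.
    find "$orch_dir" -maxdepth 1 -type f -name "*.tar.gz" -mtime +"$retention_days" -delete
  done
}

SANDBOX="$(mktemp -d)"
ORCH="$SANDBOX/i-0123456789abcdef0/session/orchestration"
mkdir -p "$ORCH/session-new" "$ORCH/session-old"
echo "sample output" > "$ORCH/session-old/stdout"
touch -d "3 days ago" "$ORCH/session-old"
# Empty placeholder file standing in for an archive that has aged out.
touch -d "10 days ago" "$ORCH/session-expired.tar.gz"

run_retention "$SANDBOX" 7
ls -l "$ORCH"
```

After the run, the listing should contain the session-new directory and session-old.tar.gz, while the session-old directory and session-expired.tar.gz are gone. Delete the scratch tree with rm -rf "$SANDBOX" when finished.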
Step 3 — Automate with Cron
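Schedule the script to run daily at 2:00 AM using cron. Entries in /etc/crontab need a user field between the schedule and the command, so the job is registered to run as root, and the entry must be written as a single line.
sudo bash -c 'echo "0 2 * * * root /usr/local/bin/ssm_orch_retention.sh >/var/log/ssm_orch_retention.log 2>&1" >> /etc/crontab'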
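Appending with >> adds another copy of the entry every time the command is repeated, which matters when the setup is rolled out by a script that may run more than once. One possible guard, sketched here as an assumption rather than part of the original procedure, is to append only when the exact line is missing. The helper name add_cron_entry is hypothetical, and the target file is a parameter so the helper can be tried on a scratch file first. The entry includes the user field (root) that the system crontab format requires.

```shell
#!/bin/bash
# Sketch: append the cron entry only if the exact line is not already present.
# add_cron_entry is a hypothetical helper, not an existing tool.
add_cron_entry() {
  local target="${1:-/etc/crontab}"
  local entry='0 2 * * * root /usr/local/bin/ssm_orch_retention.sh >/var/log/ssm_orch_retention.log 2>&1'
  # grep -F: fixed string, -x: match the whole line, -q: no output
  if grep -Fxq -- "$entry" "$target" 2>/dev/null; then
    echo "Entry already present in $target"
  else
    printf '%s\n' "$entry" >> "$target"
    echo "Entry added to $target"
  fi
}

# Dry run against a scratch file; calling it twice still leaves a single line.
SCRATCH="$(mktemp)"
add_cron_entry "$SCRATCH"
add_cron_entry "$SCRATCH"
cat "$SCRATCH"
```

Once the scratch run looks right, the same function can be run as root with no argument to update /etc/crontab.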
Automation Benefits
No manual intervention required
Each run's output captured in /var/log/ssm_orch_retention.log on the instance (overwritten by the next run)
Consistent retention enforcement
Verification Checklist
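After the first scheduled run, confirm that session directories older than one day have been replaced by .tar.gz archives, that no archive is older than seven days, and that /var/log/ssm_orch_retention.log contains output from the 2:00 AM run.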
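The file system part of this check can be scripted. The sketch below counts leftovers that the job should have handled and returns a non-zero status if it finds any. The function name check_retention is hypothetical, and it assumes the layout used in this post, where session directories sit four levels below the base directory.

```shell
#!/bin/bash
# Sketch: report items the retention job should already have processed.
# check_retention is a hypothetical helper; the base directory defaults to the SSM path.
check_retention() {
  local base_dir="${1:-/var/lib/amazon/ssm}" retention_days="${2:-7}"
  local stale expired
  # Session directories (<base>/<instance-id>/session/orchestration/<session>)
  # that are old enough for compression but are still uncompressed.
  stale=$(find "$base_dir" -mindepth 4 -maxdepth 4 -type d \
            -path "*/session/orchestration/*" -mtime +1 2>/dev/null | wc -l)
  # Archives that should already have been deleted.
  expired=$(find "$base_dir" -type f -path "*/session/orchestration/*.tar.gz" \
              -mtime +"$retention_days" 2>/dev/null | wc -l)
  echo "Uncompressed session directories older than 1 day: $stale"
  echo "Archives past ${retention_days}-day retention: $expired"
  # Succeed only when nothing was left behind.
  [ "$stale" -eq 0 ] && [ "$expired" -eq 0 ]
}
```

Running sudo bash -c with this function followed by check_retention shortly after the 2:00 AM job should print zero for both counts. Later in the day a directory may cross the age threshold and be reported until the next run. The cron side can be checked with grep -F ssm_orch_retention.sh /etc/crontab and sudo tail /var/log/ssm_orch_retention.log.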
Key Notes and Considerations
The script dynamically detects instance ID directories
Designed for minimal operational impact
Helps prevent disk exhaustion caused by unmanaged orchestration logs
Retains a 7-day compressed audit trail for troubleshooting
Safe to deploy across multiple EC2 instances
Conclusion
Unmanaged SSM orchestration logs are one of those quiet issues that only surface when disk space runs out—usually at the worst possible time. By introducing a simple, automated retention and compression strategy, you turn log sprawl into a controlled, predictable process.