AWS ASG Deployment

Deploy kmagent across an Auto Scaling Group with zero manual per-instance work. Configure the agent once in the Launch Template and every newly launched instance gets it automatically; existing instances are covered by the rollout in Step 5.


Prerequisites

  • AWS CLI configured with appropriate permissions
  • An existing ASG with a Launch Template
  • IAM Instance Profile attached to your ASG instances
  • SSM Agent running on instances (default on Amazon Linux / Ubuntu AMIs)

Step 1: Store API Key in SSM Parameter Store

Never hardcode secrets in user data. Use SSM Parameter Store with encryption:

aws ssm put-parameter \
  --name "/kloudmate/api-key" \
  --value "your-actual-km-api-key" \
  --type SecureString \
  --region ap-south-1
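
To confirm the parameter was stored encrypted without printing the key itself, a check along these lines may help. It is a minimal sketch: km_param_type is a hypothetical helper name, and it assumes the AWS CLI is configured for the same account and region used above.

```shell
#!/bin/bash
# km_param_type NAME REGION (hypothetical helper, not part of kmagent)
# Prints the stored parameter's type. Without --with-decryption the
# secret value is never returned in plain text.
km_param_type() {
  aws ssm get-parameter \
    --name "$1" \
    --region "$2" \
    --query "Parameter.Type" \
    --output text
}

# Usage: km_param_type "/kloudmate/api-key" ap-south-1
# A parameter created with --type SecureString should report SecureString.
```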

Step 2: IAM Permissions

Ensure your ASG's IAM Instance Profile has permission to read the SSM parameter:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "ssm:GetParameter",
      "Resource": "arn:aws:ssm:ap-south-1:YOUR_ACCOUNT_ID:parameter/kloudmate/*"
    }
  ]
}

If using SSM Run Command (Step 5, Option B), the IAM identity that runs the command (the operator, not the instance profile) also needs:

{
  "Effect": "Allow",
  "Action": [
    "ssm:SendCommand",
    "ssm:ListCommandInvocations"
  ],
  "Resource": "*"
}

Step 3: Create the User Data Script

Save this as userdata.sh:

#!/bin/bash
set -euo pipefail

# -----------------------------------------------
# Fetch API key from SSM (IMDSv2 compatible)
# -----------------------------------------------
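# Request the IMDSv2 session token first, then use it to read the region
TOKEN=$(curl -sf -X PUT "http://169.254.169.254/latest/api/token" \
  -H "X-aws-ec2-metadata-token-ttl-seconds: 60")
REGION=$(curl -sf -H "X-aws-ec2-metadata-token: $TOKEN" \
  http://169.254.169.254/latest/meta-data/placement/region)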

KM_API_KEY=$(aws ssm get-parameter \
  --name "/kloudmate/api-key" \
  --with-decryption \
  --query "Parameter.Value" \
  --output text \
  --region "$REGION")

# -----------------------------------------------
# Install kmagent
# -----------------------------------------------
curl -fsSL https://raw.githubusercontent.com/kloudmate/km-agent/main/install.sh \
  | bash -s -- \
    --api-key "$KM_API_KEY" \
    --collector-endpoint "https://otel.kloudmate.com"

# -----------------------------------------------
# Verify
# -----------------------------------------------
if systemctl is-active --quiet kmagent; then
  echo "[kloudmate] Agent installed and running"
else
  echo "[kloudmate] ERROR: Agent failed to start"
  journalctl -u kmagent --no-pager -n 20
  exit 1
fi
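
Because user data runs early in boot, the SSM call can occasionally fail before networking or credentials are ready. One possible hardening, sketched here as an assumption and not as part of the official script, is a small retry wrapper; retry is a hypothetical helper name.

```shell
#!/bin/bash
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
retry() {
  local max_attempts=$1 delay=$2 attempt=1
  shift 2
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "[kloudmate] giving up after $attempt attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Possible use in userdata.sh (5 attempts, 3 seconds apart):
# KM_API_KEY=$(retry 5 3 aws ssm get-parameter \
#   --name "/kloudmate/api-key" --with-decryption \
#   --query "Parameter.Value" --output text --region "$REGION")
```

The attempt count and delay are illustrative values; tune them to how long your instances typically take to become reachable.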

Step 4: Apply User Data to Launch Template

# Create a new Launch Template version with user data
aws ec2 create-launch-template-version \
  --launch-template-name your-template-name \
  --source-version '$Latest' \
  --launch-template-data '{
    "UserData": "'$(base64 -w0 userdata.sh)'"
  }'

# Point ASG to use the latest version
aws autoscaling update-auto-scaling-group \
  --auto-scaling-group-name your-asg-name \
  --launch-template "LaunchTemplateName=your-template-name,Version=\$Latest"

Alternatively, in the AWS Console:

  1. Go to EC2 → Launch Templates → your-template
  2. Click Actions → Modify template (Create new version)
  3. Scroll to Advanced details → User data
  4. Paste the contents of userdata.sh
  5. Click Create template version
  6. Go to Auto Scaling Groups → your-asg
  7. Click Edit → Launch Template → Version → Latest
  8. Save
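
Before creating the template version, it can be worth checking userdata.sh locally. The sketch below assumes GNU coreutils (base64 -w0, as used above); check_userdata is a hypothetical helper name. It checks bash syntax, the 16 KB limit EC2 places on raw user data, and that the file survives base64 encoding unchanged.

```shell
#!/bin/bash
# check_userdata FILE (hypothetical helper)
# Succeeds only if FILE parses as bash, fits in 16 KB of raw user data,
# and round-trips through base64 without modification.
check_userdata() {
  local file=$1
  bash -n "$file" || return 1
  [ "$(wc -c < "$file")" -le 16384 ] || return 1
  base64 -w0 "$file" | base64 -d | cmp -s - "$file"
}

# Usage: check_userdata userdata.sh && echo "userdata.sh looks OK"
```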

Step 5: Roll Out to Existing Instances

The new user data only executes on newly launched instances. To deploy to instances that are already running (the examples below assume a fleet of 50), choose one of the following options:

Option A: Instance Refresh (Rolling Replace)

Terminates and replaces instances in batches. Each new instance boots with the updated user data and gets kmagent automatically.

aws autoscaling start-instance-refresh \
  --auto-scaling-group-name your-asg-name \
  --preferences '{
    "MinHealthyPercentage": 90,
    "MaxHealthyPercentage": 110,
    "InstanceWarmup": 120
  }'

How it works:

  • Keeps 90% of instances healthy at all times
  • Replaces ~5 instances at a time (for a fleet of 50)
  • Each batch waits for health checks to pass before proceeding
  • Total rollout time: ~20-30 minutes depending on health check config
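
The batch size quoted above follows from the fleet size and MinHealthyPercentage. As a rough estimate only (the real pace also depends on MaxHealthyPercentage, warmup, and health checks), the arithmetic can be sketched as follows; refresh_batch_size is a hypothetical helper, and the floor of one instance per batch is an assumption.

```shell
#!/bin/bash
# refresh_batch_size FLEET_SIZE MIN_HEALTHY_PCT (hypothetical helper)
# Estimates how many instances may be out of service at once.
refresh_batch_size() {
  local fleet=$1 min_pct=$2
  local batch=$(( fleet * (100 - min_pct) / 100 ))
  # Assumption: at least one instance is replaced per batch.
  if [ "$batch" -lt 1 ]; then batch=1; fi
  echo "$batch"
}

refresh_batch_size 50 90   # prints 5
```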

Monitor progress:

aws autoscaling describe-instance-refreshes \
  --auto-scaling-group-name your-asg-name

Option B: SSM Run Command (In-Place Install, No Downtime)

Installs the agent on running instances without replacing them. Fastest path to full coverage.

aws ssm send-command \
  --targets "Key=tag:aws:autoscaling:groupName,Values=your-asg-name" \
  --document-name "AWS-RunShellScript" \
  --parameters 'commands=[
    "TOKEN=$(curl -sf -X PUT http://169.254.169.254/latest/api/token -H X-aws-ec2-metadata-token-ttl-seconds:60)",
    "REGION=$(curl -sf -H X-aws-ec2-metadata-token:$TOKEN http://169.254.169.254/latest/meta-data/placement/region)",
    "KM_API_KEY=$(aws ssm get-parameter --name /kloudmate/api-key --with-decryption --query Parameter.Value --output text --region $REGION)",
    "curl -fsSL https://raw.githubusercontent.com/kloudmate/km-agent/main/install.sh | bash -s -- --api-key $KM_API_KEY --collector-endpoint https://otel.kloudmate.com"
  ]' \
  --max-concurrency "10" \
  --max-errors "5" \
  --comment "Install kmagent on ASG fleet"

How it works:

  • Runs on 10 instances at a time (--max-concurrency 10)
  • All 50 instances done in 5 batches
  • Stops if more than 5 instances fail (--max-errors 5)
  • Zero downtime — no instance restarts required
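
The batch count above is a ceiling division of fleet size by --max-concurrency. A minimal sketch for other fleet sizes follows; ssm_batches is a hypothetical helper, and since Run Command keeps a rolling window of targets rather than strict batches, treat the result as an approximate number of waves.

```shell
#!/bin/bash
# ssm_batches FLEET_SIZE MAX_CONCURRENCY (hypothetical helper)
# Rounds up so a partial final wave is counted.
ssm_batches() {
  local fleet=$1 concurrency=$2
  echo $(( (fleet + concurrency - 1) / concurrency ))
}

ssm_batches 50 10   # prints 5
```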

Monitor progress:

# Get command ID from send-command output, then:
aws ssm list-command-invocations \
  --command-id "COMMAND_ID" \
  --details \
  --query "CommandInvocations[].{Instance:InstanceId,Status:Status}" \
  --output table
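
For scripting, the same listing can be reduced to a pass/fail summary. The sketch below assumes the query is changed to emit one "InstanceId Status" pair per line in text output; summarize_invocations is a hypothetical helper name.

```shell
#!/bin/bash
# summarize_invocations (hypothetical helper)
# Reads "InstanceId<TAB>Status" lines on stdin, prints how many invocations
# succeeded, and returns non-zero if any is not (or not yet) Success.
summarize_invocations() {
  awk '
    NF >= 2 { total++; if ($2 == "Success") ok++ }
    END {
      printf "%d/%d succeeded\n", ok + 0, total + 0
      exit ((ok + 0) != (total + 0))
    }'
}

# Usage:
# aws ssm list-command-invocations --command-id "COMMAND_ID" \
#   --query "CommandInvocations[].[InstanceId,Status]" \
#   --output text | summarize_invocations
```

Invocations that are still Pending or InProgress count as not succeeded, so re-run the check until the command has finished.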

Which Option to Choose?

Criteria       | Option A (Instance Refresh)  | Option B (SSM Run Command)
---------------|------------------------------|-----------------------------
Downtime       | Brief per-instance (rolling) | None
Speed          | ~20-30 min                   | ~5-10 min
Instance state | Fresh instances              | Existing instances preserved
Risk           | Lower (clean slate)          | Slightly higher (in-place)
Use when       | You want a clean rollout     | You need it now

Recommended approach: Use Option B (SSM Run Command) to install the agent on all existing instances right away. The Launch Template user data then covers every future instance automatically.


Verification

After deployment, verify the agent is running across your fleet:

# Check all instances via SSM
aws ssm send-command \
  --targets "Key=tag:aws:autoscaling:groupName,Values=your-asg-name" \
  --document-name "AWS-RunShellScript" \
  --parameters 'commands=["systemctl status kmagent | head -5"]' \
  --max-concurrency "50" \
  --output text

# Or SSH into any instance and check
systemctl status kmagent
journalctl -u kmagent -f        # follow logs
curl -s http://localhost:13133   # health check endpoint

Troubleshooting

Issue | Cause | Fix
------|-------|----
unable to locate credentials | IAM Instance Profile missing or no SSM permissions | Attach IAM role with ssm:GetParameter to your Launch Template
command not found: aws | AWS CLI not installed on instance | Add yum install -y aws-cli or apt install -y awscli before the SSM call in user data
kmagent failed to start | Port conflict or config error | Check journalctl -u kmagent -n 50 on the instance
SSM command stuck in Pending | SSM Agent not running | Ensure amazon-ssm-agent is installed and running
Instance refresh stuck | Health check failing on new instances | Check your ASG health check grace period and target group settings
