Troubleshoot SSM Domain Join

aws
01-Dec, 2016

For those who don't know, AWS has a really handy feature called SSM (Simple Systems Manager), which lets you perform simple actions against either Windows or Linux hosts.

I am attempting to domain join a Windows Server 2016 instance to an AWS AD Enterprise Directory Service directory and am not having any joy. This post details my experience and (I hope) the fix.

The SSM Document that I am using is pretty simple and is as follows:

{
    "description": "Join instances to an AWS Directory Service domain.",
    "runtimeConfig": {
        "aws:domainJoin": {
            "properties": {
                "directoryOU": "ou=Computers,ou=domain.local",
                "directoryId": "d-97673d0000",
                "directoryName": "domain.local"
            }
        }
    },
    "schemaVersion": "1.2"
}

Instance not joining the domain

I am provisioning the instance using CloudFormation, and it should join the domain on startup. The CloudFormation stack executes OK, and I can log on to the instance using the password obtained through the AWS console.

It is connected to SSM:

aws ssm describe-instance-information --instance-information-filter-list key=InstanceIds,valueSet=i-0a529828074260000

With the result:

{
    "InstanceInformationList": [
        {
            "IsLatestVersion": false,
            "ComputerName": "EC2AMAZ-GU425FN.WORKGROUP",
            "PingStatus": "Online",
            "InstanceId": "i-0a529828074260000",
            "IPAddress": "10.51.1.01",
            "ResourceType": "EC2Instance",
            "AgentVersion": "1.2.371.0",
            "PlatformVersion": "10.0.14393",
            "PlatformName": "Microsoft Windows Server 2016 Datacenter",
            "PlatformType": "Windows",
            "LastPingDateTime": 1480555932.076
        }
    ]
}
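If you want to script this check, the CLI output is plain JSON. Here is a minimal Python sketch (standard library only). The helper names `is_online` and `in_workgroup` are my own, and the sample is a trimmed copy of the output above. Treating a `.WORKGROUP` suffix on `ComputerName` as "not domain joined yet" is my reading of this output, not documented behaviour.

```python
import json

# Trimmed copy of the describe-instance-information output shown above.
SAMPLE = """
{
    "InstanceInformationList": [
        {
            "IsLatestVersion": false,
            "ComputerName": "EC2AMAZ-GU425FN.WORKGROUP",
            "PingStatus": "Online",
            "InstanceId": "i-0a529828074260000",
            "AgentVersion": "1.2.371.0"
        }
    ]
}
"""


def find_instance(cli_output, instance_id):
    """Return the entry for instance_id, or None if it is not listed."""
    entries = json.loads(cli_output).get("InstanceInformationList", [])
    for entry in entries:
        if entry.get("InstanceId") == instance_id:
            return entry
    return None


def is_online(cli_output, instance_id):
    """True if the instance is registered with SSM and pinging."""
    entry = find_instance(cli_output, instance_id)
    return entry is not None and entry.get("PingStatus") == "Online"


def in_workgroup(cli_output, instance_id):
    """Assumption: a '.WORKGROUP' suffix means the host has not joined a domain."""
    entry = find_instance(cli_output, instance_id)
    return entry is not None and entry.get("ComputerName", "").endswith(".WORKGROUP")


if __name__ == "__main__":
    print(is_online(SAMPLE, "i-0a529828074260000"))
    print(in_workgroup(SAMPLE, "i-0a529828074260000"))
```

In practice you would feed it the real CLI output instead of `SAMPLE`, for example by piping the command into a script that reads standard input.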

When I query the association, though, it has failed:

aws ssm describe-association --name SSMDocumentName --instance-id i-0a529828074260000

With the result:

{
    "AssociationDescription": {
        "InstanceId": "i-0a529828074260000",
        "Date": 1480552707.186,
        "Name": "SSMDocumentName",
        "Parameters": {},
        "Status": {
          "Date": 1480552953.0,
          "AdditionalInfo":
            "{\"lang\":\"en-US\",\"name\":\"amazon-ssm-agent-default\",\"os\":\"\",\"osver\":\"1\",\"ver\":\"\"}",
          "Message": "1 out of 1 plugin processed, 0 success, 1 failed, 0 timedout",
          "Name": "Failed"
        }
    }
}
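One thing to note in this output is that `AdditionalInfo` is itself a JSON document packed into a string, so it has to be decoded a second time. A rough Python sketch of pulling out the useful parts follows. The function name `summarize_association` and the keys of the dictionary it returns are my own choices, and the sample is a trimmed copy of the output above.

```python
import json
import re

# Trimmed copy of the describe-association output shown above.
# A raw string keeps the \" escapes intact for the JSON parser.
SAMPLE = r"""
{
    "AssociationDescription": {
        "InstanceId": "i-0a529828074260000",
        "Name": "SSMDocumentName",
        "Status": {
            "AdditionalInfo": "{\"lang\":\"en-US\",\"name\":\"amazon-ssm-agent-default\",\"os\":\"\",\"osver\":\"1\",\"ver\":\"\"}",
            "Message": "1 out of 1 plugin processed, 0 success, 1 failed, 0 timedout",
            "Name": "Failed"
        }
    }
}
"""


def summarize_association(cli_output):
    """Flatten the association status into a small dictionary."""
    status = json.loads(cli_output)["AssociationDescription"]["Status"]
    message = status.get("Message", "")
    failed = re.search(r"(\d+) failed", message)
    return {
        "status": status.get("Name"),
        "message": message,
        # Number of failed plugins, or None if the message has another shape.
        "failed_plugins": int(failed.group(1)) if failed else None,
        # Second decode: AdditionalInfo is JSON inside a JSON string.
        "agent": json.loads(status.get("AdditionalInfo") or "{}"),
    }


if __name__ == "__main__":
    summary = summarize_association(SAMPLE)
    print(summary["status"], summary["failed_plugins"], summary["agent"]["name"])
```

The status message only says that a plugin failed, not why, which is why the agent log on the instance is the next place to look.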

The SSM agent logs are stored at:

c:/ProgramData/Amazon/SSM/Logs

You may need to scroll down in the log, because old entries may not have been cleared out of the AMI that you used to provision your instance. In my case the error was as follows:

2016-12-01 01:12:56 ERROR [instanceID=i-0a529828074260000] [MessageProcessor] error when calling AWS APIs. error details - GetMessages Error: AccessDeniedException: User: arn:aws:sts::491253400000:assumed-role/HostRole-181YC69WRAQ22/i-0a529828074260000 is not authorized to perform: ec2messages:GetMessages on resource: *
status code: 400, request id: 4ae0f258-b763-11e6-86ad-dffe6caddfa2
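Since the log can be long, it may be easier to filter it for access-denied errors than to scroll. Below is a small Python sketch that does this with a regular expression. The pattern is written against the single error line above, so it is an assumption that other AccessDeniedException entries follow the same wording. The name `denied_actions` is my own.

```python
import re

# The error line from the log above, split only to keep the source readable.
LOG_LINE = (
    "2016-12-01 01:12:56 ERROR [instanceID=i-0a529828074260000] [MessageProcessor] "
    "error when calling AWS APIs. error details - GetMessages Error: "
    "AccessDeniedException: User: arn:aws:sts::491253400000:assumed-role/"
    "HostRole-181YC69WRAQ22/i-0a529828074260000 is not authorized to perform: "
    "ec2messages:GetMessages on resource: *"
)

# Assumed message shape, taken from the one example above.
DENIED = re.compile(
    r"AccessDeniedException: User: (?P<principal>\S+) "
    r"is not authorized to perform: (?P<action>\S+) "
    r"on resource: (?P<resource>\S+)"
)


def denied_actions(lines):
    """Yield (principal, action, resource) for each access-denied line."""
    for line in lines:
        match = DENIED.search(line)
        if match:
            yield match.group("principal", "action", "resource")


if __name__ == "__main__":
    for principal, action, resource in denied_actions([LOG_LINE]):
        print(action, "denied for", principal, "on", resource)
```

To run it against a real log you would pass an open file object instead of the one-element list, since iterating a file yields its lines.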

Clearly this is an IAM problem. The role attached to my host currently allows the following actions:

- ssm:DescribeAssociation
- ssm:GetDocument
- ssm:ListAssociations
- ssm:UpdateAssociationStatus
- ssm:UpdateInstanceInformation

I will update it to the following:

- ssm:DescribeAssociation
- ssm:GetDocument
- ssm:ListAssociations
- ssm:UpdateAssociationStatus
- ssm:UpdateInstanceInformation
- ec2messages:AcknowledgeMessage
- ec2messages:DeleteMessage
- ec2messages:FailMessage
- ec2messages:GetEndpoint
- ec2messages:GetMessages
- ec2messages:SendReply
- ec2:DescribeInstanceStatus
- ds:CreateComputer
- ds:DescribeDirectories

This modification fixed my problem and the host was able to join the domain.
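For reference, here is a sketch of how the two action lists compare and how the updated one might look as an IAM policy document, written in Python using only the standard library. The `build_policy` helper is my own, and using `"Resource": "*"` is an assumption based on the `on resource: *` in the error message; you may be able to scope it more tightly.

```python
import json

# The actions the host role had originally.
ORIGINAL_ACTIONS = [
    "ssm:DescribeAssociation",
    "ssm:GetDocument",
    "ssm:ListAssociations",
    "ssm:UpdateAssociationStatus",
    "ssm:UpdateInstanceInformation",
]

# The original actions plus the ones added to fix the domain join.
UPDATED_ACTIONS = ORIGINAL_ACTIONS + [
    "ec2messages:AcknowledgeMessage",
    "ec2messages:DeleteMessage",
    "ec2messages:FailMessage",
    "ec2messages:GetEndpoint",
    "ec2messages:GetMessages",
    "ec2messages:SendReply",
    "ec2:DescribeInstanceStatus",
    "ds:CreateComputer",
    "ds:DescribeDirectories",
]


def build_policy(actions, resource="*"):
    """Wrap a list of actions in a single-statement Allow policy."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": sorted(actions), "Resource": resource}
        ],
    }


if __name__ == "__main__":
    added = sorted(set(UPDATED_ACTIONS) - set(ORIGINAL_ACTIONS))
    print("Newly granted actions:", ", ".join(added))
    print(json.dumps(build_policy(UPDATED_ACTIONS), indent=4))
```

The printed JSON could then go into the role definition in the CloudFormation template, or be attached as an inline policy with `aws iam put-role-policy`.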
