Centralizing SSM Data from Multiple Accounts

Published on 11 May 2021

SSM Inventory is great, but when you have lots of accounts and need to find an instance that matches particular properties, or want to know which applications are installed across the estate, you don't want to check the SSM service in each account one by one, even programmatically.

Thankfully, there is a solution: copy your data to a centralized S3 bucket, and then query it with Athena (or even QuickSight). There are quite a few pages across various Amazon sites describing different ways to do this, but some of them are just not that clear.

Bucket Configuration

Get a list of the account IDs for all the accounts that will need to upload inventory data, and create an S3 bucket. You will then need to edit the bucket's permissions and apply something like the following policy. Replace the bucket name in both Resource sections, and use your own account IDs:

        {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Sid": "SSMBucketPermissionsCheck",
                    "Effect": "Allow",
                    "Principal": {
                        "Service": "ssm.amazonaws.com"
                    },
                    "Action": "s3:GetBucketAcl",
                    "Resource": "arn:aws:s3:::mybucket"
                },
                {
                    "Sid": "SSMBucketDelivery",
                    "Effect": "Allow",
                    "Principal": {
                        "Service": "ssm.amazonaws.com"
                    },
                    "Action": "s3:PutObject",
                    "Resource": [
                        "arn:aws:s3:::mybucket/*/accountid=266746823218/*",
                        "arn:aws:s3:::mybucket/*/accountid=912464427539/*",
                        "arn:aws:s3:::mybucket/*/accountid=040335542091/*",
                        "arn:aws:s3:::mybucket/*/accountid=157611232727/*"
                    ],
                    "Condition": {
                        "StringEquals": {
                            "s3:x-amz-acl": "bucket-owner-full-control"
                        }
                    }
                }
            ]
        }

That will allow SSM to write to the bucket from those accounts. Make a note of the region the bucket was created in, as you will need it when configuring the sync.
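
If the list of accounts is long, or changes often, editing the Resource list by hand gets error-prone. As a rough sketch (not part of the original setup), the same policy could be generated from a list of account IDs. The function name `build_bucket_policy` is made up for illustration, and the account IDs are passed as strings so that a leading zero is not lost:

```python
import json


def build_bucket_policy(bucket_name, account_ids):
    """Return the SSM resource data sync bucket policy as a dict."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "SSMBucketPermissionsCheck",
                "Effect": "Allow",
                "Principal": {"Service": "ssm.amazonaws.com"},
                "Action": "s3:GetBucketAcl",
                "Resource": bucket_arn,
            },
            {
                "Sid": "SSMBucketDelivery",
                "Effect": "Allow",
                "Principal": {"Service": "ssm.amazonaws.com"},
                "Action": "s3:PutObject",
                # One object ARN pattern per account allowed to upload.
                "Resource": [
                    f"{bucket_arn}/*/accountid={account_id}/*"
                    for account_id in account_ids
                ],
                "Condition": {
                    "StringEquals": {"s3:x-amz-acl": "bucket-owner-full-control"}
                },
            },
        ],
    }


if __name__ == "__main__":
    # Account IDs are strings: "040335542091" would lose its zero as a number.
    policy = build_bucket_policy("mybucket", ["266746823218", "040335542091"])
    print(json.dumps(policy, indent=4))
```

The printed JSON can be pasted into the bucket policy editor in the same way as the hand-written version.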

Configure SSM

You will need to do this in each of your accounts that you want to collect inventory from:

  1. Go to SSM
  2. Go to Fleet Manager
  3. Press Getting started (if you need to)
  4. Spend about five minutes pulling your hair out before realising that the item you want, Resource Data Syncs, for Inventory, is found by pressing the "Account Management" button and then looking for it at the bottom


  1. Choose "Create Resource Data Sync"
  2. Give it a name
  3. Enter the bucket name you created above, and choose the region
  4. Press create.


If you get a 403 or similar error at this point, it is probably the permissions on your bucket, so double/triple check those, then check again.
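
If clicking through the console in every account gets tedious, the same sync can also be created through the API. The following is a minimal sketch using boto3's `create_resource_data_sync`, assuming boto3 is installed and your credentials point at the account (and region) you want to sync from. The helper names and the sync name `central-inventory` are hypothetical:

```python
def sync_request(sync_name, bucket_name, bucket_region):
    """Build the arguments for SSM's CreateResourceDataSync call."""
    return {
        "SyncName": sync_name,
        "S3Destination": {
            "BucketName": bucket_name,
            "SyncFormat": "JsonSerDe",  # the format the API accepts for S3 syncs
            "Region": bucket_region,    # region of the central bucket
        },
    }


def create_sync(sync_name, bucket_name, bucket_region, session=None):
    """Create the sync in whichever account the credentials belong to."""
    import boto3  # imported here so sync_request() works without boto3

    session = session or boto3.Session()
    ssm = session.client("ssm")
    ssm.create_resource_data_sync(
        **sync_request(sync_name, bucket_name, bucket_region)
    )


if __name__ == "__main__":
    # Only prints the request; call create_sync() to actually create it.
    print(sync_request("central-inventory", "mybucket", "eu-west-1"))
```

You would still need to run `create_sync` once per account, for example by passing in a `boto3.Session` built for each account's credentials.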

Once that is done, you should be syncing your inventory data to your central bucket! If you haven't enabled inventory, go and do that too, and remember the basics (such as making sure the instance profile attached to your EC2 instances has a role with the permissions SSM needs).

Once done, you should pretty quickly see the data in your central bucket, woohoo!

Querying your Data

The data is stored in JSON format, so you have plenty of options here, but a good one is Athena, which lets you run SQL queries against data in S3. In the Athena console, first create a database by running the following command in the query window:

    CREATE DATABASE ssminventory

You can then use the following command to create your first table (changing the bucket/path):

    CREATE EXTERNAL TABLE IF NOT EXISTS ssminventory.AWS_InstanceInformation (
    `AgentType` string,
    `AgentVersion` string,
    `ComputerName` string,
    `IamRole` string,
    `InstanceId` string,
    `IpAddress` string,
    `PlatformName` string,
    `PlatformType` string,
    `PlatformVersion` string
    )
    PARTITIONED BY (AccountId string, Region string, ResourceType string)
    ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
    WITH SERDEPROPERTIES (
    'serialization.format' = '1'
    ) LOCATION 's3://mybucket/AWS:InstanceInformation/'

Finally, you need to load the partitions into the table:

    MSCK REPAIR TABLE ssminventory.AWS_InstanceInformation
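
To get a feel for what that command discovers: it walks the table's S3 location looking for `name=value` path segments and registers each combination as a partition. The sketch below mimics that on a single object key. The key itself is an assumption about what the sync writes (the region, resource type value and instance ID are made up), and `partitions_from_key` is a hypothetical helper, not anything Athena provides:

```python
def partitions_from_key(key):
    """Extract Hive-style name=value partition pairs from an S3 object key."""
    partitions = {}
    for part in key.split("/"):
        name, sep, value = part.partition("=")
        if sep:  # only path segments shaped like name=value are partitions
            partitions[name] = value
    return partitions


# Hypothetical object key; the region and instance ID are invented.
key = (
    "AWS:InstanceInformation/accountid=266746823218/"
    "region=eu-west-1/resourcetype=ManagedInstanceInventory/"
    "i-0123456789abcdef0.json"
)
print(partitions_from_key(key))
```

Those three names line up with the `PARTITIONED BY` columns in the table definition, which is why they can be used in a `where` clause like any other column. When new accounts or regions start syncing, the partitions need loading again before their data shows up.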

All that is left to do now is query the data:

    select computername, instanceid, ipaddress
    from ssminventory.aws_instanceinformation
    where platformtype = 'Windows'
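
The same query can be run outside the console too. Here is a rough sketch using boto3's Athena client, assuming boto3 is installed and that you have an S3 location Athena may write query results to (the `output_location` argument, which is not something set up earlier in this post). Both function names are hypothetical:

```python
import time


def rows_to_dicts(result):
    """Turn an Athena GetQueryResults page into a list of dicts.

    For SELECT queries the first row of the first page holds the column names.
    """
    rows = result["ResultSet"]["Rows"]
    if not rows:
        return []
    header = [cell.get("VarCharValue") for cell in rows[0]["Data"]]
    return [
        dict(zip(header, (cell.get("VarCharValue") for cell in row["Data"])))
        for row in rows[1:]
    ]


def run_query(sql, output_location, database="ssminventory", poll_seconds=1.0):
    """Run a query, wait for it to finish, return the first page of rows."""
    import boto3  # imported here so rows_to_dicts() works without boto3

    athena = boto3.client("athena")
    query_id = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )["QueryExecutionId"]

    # Athena queries are asynchronous, so poll until a final state is reached.
    while True:
        execution = athena.get_query_execution(QueryExecutionId=query_id)
        status = execution["QueryExecution"]["Status"]
        if status["State"] in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)

    if status["State"] != "SUCCEEDED":
        raise RuntimeError(status.get("StateChangeReason", status["State"]))
    return rows_to_dicts(athena.get_query_results(QueryExecutionId=query_id))
```

Note that this only reads the first page of results; for a large estate you would want to follow the `NextToken` in the response to fetch the rest.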


If you want fancy graphics, feel free to check out QuickSight. Also this page has some other tables.
