Asset inventory is considered an essential part of any security program. For example, it is listed as the very first item on the 18 CIS Critical Security Controls.
Collecting asset data for as-a-service environments can be challenging. Assets on these platforms, e.g. virtual machines on the hyperscalers, can be very dynamic: they can be created or deleted, or undergo configuration changes, rather frequently.
Building an inventory of these resources should therefore happen with a relatively high frequency, e.g. once a day. This makes the activity a good candidate for automation, executed regularly on a time-based schedule.
We are going to use the Argo Workflows based approach introduced in the previous post to implement asset data collection for a cloud provider environment (AWS).
To follow this blog post you should have a basic understanding of general AWS IAM concepts and of how AWS cross-account access using IAM roles works. You should also be familiar with Kubernetes service accounts.
Overview
The goal is to collect asset data once a day from an AWS cloud environment and store this data in a database. To achieve this we need the following components:
- A Kubernetes (K8S) cluster for running Argo Workflows. For simplicity we assume that our cluster is running in a (security team owned) AWS account.
- Argo Workflows: the workflow orchestration tool.
- A container image with all required tooling preinstalled.
- A tool that collects asset data from a source (like AWS) and writes it to a database. We will be using CloudQuery, which is explained in the Data Collection section below.
- A database server where the collected asset data can be stored: we are using PostgreSQL. You can deploy this into the K8S cluster using the Bitnami Helm chart for Postgres or something similar.
- Argo Workflow: the workflow that implements the logic for collecting asset data, executed by Argo. This workflow will run once a day.
The different components are visualized in the diagram below.
We assume that the K8S cluster, Argo Workflows, the Postgres server and the container image myrepo/tooling:1.0 have already been deployed/built. For the rest of this post we will focus on how to implement the workflow that performs the asset data collection.
Data Collection
How can we collect asset data from AWS?
AWS provides APIs for everything. You can not only create, update or delete resources via their API but you can also read resource metadata. For example, to retrieve the list of virtual machines (EC2 instances) that exist in an AWS account, you can call the DescribeInstances API. This returns a list of all machines with relevant metadata included (instance ID, private IP, attached disks, etc.).
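As a quick illustration, the same DescribeInstances API can be called from the command line. This sketch assumes the AWS CLI and valid credentials are available in the environment, and falls back to a notice otherwise; the `--query` projection is just an example:

```shell
# Sketch (assumption: AWS CLI v2 plus credentials are configured).
# Lists instance ID and private IP for all EC2 instances in the
# current account/region; prints a notice when the CLI is missing
# or the call fails.
if command -v aws >/dev/null 2>&1; then
  result=$(aws ec2 describe-instances \
    --query 'Reservations[].Instances[].[InstanceId,PrivateIpAddress]' \
    --output text 2>/dev/null || echo "describe-instances call failed")
else
  result="aws CLI not available"
fi
echo "$result"
```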
We do not want to implement these API calls ourselves though.
We are therefore going to use CloudQuery, an Extract, Load and Transform (ELT) tool that extracts data from cloud providers, SaaS apps and other APIs and stores the collected data in a destination.
This tool is based on an extensible source and destination plugin concept that supports a large variety of data sources and destinations. Check out their plugins documentation page for an overview of supported APIs & systems. You can also extend the tool with your own custom plugins.
Unfortunately, although CloudQuery is open source, certain plugins are not free. You will have to obtain an API key, and costs are incurred once the amount of data you collect exceeds the free tier threshold. They also offer a managed, as-a-service option. For our implementation we are following a self-hosted approach, where the CloudQuery CLI runs within our own infrastructure.
As we are collecting asset data from AWS, we will need CloudQuery with the AWS plugin.
AWS Organizations, IAM & Kubernetes Service Account
How can CloudQuery, running inside an Argo Workflow (a Kubernetes object), access different AWS accounts?
Companies usually own multiple AWS accounts. Those can be grouped into an AWS organization. CloudQuery can collect asset data from all accounts that exist within an organization. This works by granting CloudQuery access to a role in the organization management account, from where all accounts of the organization can be assumed. If you are not familiar with this concept, I suggest reading the AWS documentation on IAM roles.
We assume that our Argo workflow will be executed in a Kubernetes cluster that is located in a security team specific AWS account. We have to associate a Kubernetes service account with this workflow. That service account must in turn have the permissions to obtain a credential for the AWS IAM role that is available inside the AWS security account. This can be achieved by using either IAM roles for service accounts or alternatively EKS Pod Identities, assuming you are using AWS' managed Kubernetes service EKS.
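For illustration, with IAM Roles for Service Accounts (IRSA) the link between the Kubernetes service account and the security-account IAM role is a single annotation. Name, namespace, account ID and role name below are placeholders:

```yaml
# Hypothetical service account for the workflow pods. The
# eks.amazonaws.com/role-arn annotation is what IRSA uses to map
# the service account to an IAM role in the security team account.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: asset-inventory              # placeholder name
  namespace: security-automation     # placeholder namespace
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/AssetInventory
```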
We further assume that the security account IAM role has permissions to assume an IAM role in the organization management account. From there every account in the organization can be accessed with another IAM role that exists in each account. This role will need read-only permissions in each account.
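For this chain to work, the read-only role in each member account must trust the management-account role. A minimal trust policy could look like this (account ID and role name are placeholders):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:role/Argo"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
```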
All of this is shown in the diagram below.
And this is how CloudQuery can obtain the credentials and permissions required to retrieve asset information from all AWS accounts in the organization.
There are also other options available to achieve this, as described in the CloudQuery AWS plugin documentation. I strongly recommend following this approach though.
Asset Inventory Workflow
Now that we have explained what tool we are using and how the AWS accounts can be accessed, we can finally start defining the actual workflow and its implementation.
The (abstract) workflow model for our asset inventory looks as follows:
Yes, that is really it! Asset data collection with CloudQuery is so simple that our workflow consists of only one step.
Implementation
We can now start implementing the workflow, which consists of two Argo resource types, the CronWorkflow and the WorkflowTemplate.
Cron Workflow
The CronWorkflow defines the parameters of the workflow, the execution schedule and a reference to the actual workflow implementation.
apiVersion: argoproj.io/v1alpha1
kind: CronWorkflow
metadata:
  generateName: asset-inventory-aws-
spec:
  schedule: "01 06 * * *"
  timezone: "Etc/GMT"
  workflowSpec:
    arguments:
      parameters:
        - name: tables
          value: >-
            "aws_ec2_ebs_volumes",
            "aws_ec2_eips",
            "aws_ec2_instances",
            "aws_ec2_nat_gateways",
            "aws_ec2_security_groups",
            "aws_eks_clusters",
            "aws_elbv1_load_balancers",
            "aws_elbv2_listeners",
            "aws_elbv2_load_balancers",
            "aws_iam_users",
            "aws_iam_user_access_keys",
            "aws_s3_buckets"
        - name: aws-config
          value: |
            {
              "org-mgmt-account-id" : "123456789012",
              "org-mgmt-account-role" : "Argo",
              "account-iam-role" : "SecurityMonitoring"
            }
    workflowTemplateRef:
      name: asset-inventory-aws
1. Execute every day at 06:01 GMT.
2. The resource types that CloudQuery will collect. Refer to the CloudQuery AWS plugin documentation for more information.
3. We want to retrieve data for all accounts of an AWS organization. This configuration object provides the input parameters needed to implement this.
4. Reference to the WorkflowTemplate that contains the actual automation logic.
This defines how often the workflow will be executed (once a day) and what AWS resource metadata we are collecting. The CloudQuery terminology is "tables", as in "database tables". CloudQuery supports a large number of AWS resource types, but we are restricting our list to just a few entries to keep this example small.
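To see how the tables parameter travels through the workflow: the multi-line value folds into a single comma-separated string, which the worker script later wraps in brackets to form a YAML list. A quick sketch outside Argo (trimmed to two tables):

```shell
# Simulate the folded "tables" parameter value and the line the
# worker script writes into the CloudQuery source config.
TABLES='"aws_ec2_instances", "aws_s3_buckets"'
line="  tables: [${TABLES}]"
echo "$line"
# prints:   tables: ["aws_ec2_instances", "aws_s3_buckets"]
```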
We also specify some IAM related information, like the account ID and IAM role to be used for the management account, as well as the name of the IAM role used to access each account of the AWS organization.
Finally, the CronWorkflow contains a reference to the WorkflowTemplate that contains the automation logic.
Workflow Template
The actual workflow implementation is defined in a workflow template:
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: asset-inventory-aws
spec:
  entrypoint: cloud-assets
  templates:
    # overall workflow
    - name: cloud-assets
      inputs:
        parameters:
          - name: tables
          - name: aws-config
      dag:
        tasks:
          # obtain assets for each AWS org
          - name: worker
            template: worker
            arguments:
              parameters:
                - name: tables
                  value: "{{inputs.parameters.tables}}"
                - name: org-mgmt-account-id
                  value: "{{=jsonpath(inputs.parameters['aws-config'], '$.org-mgmt-account-id')}}"
                - name: org-mgmt-account-role
                  value: "{{=jsonpath(inputs.parameters['aws-config'], '$.org-mgmt-account-role')}}"
                - name: account-iam-role
                  value: "{{=jsonpath(inputs.parameters['aws-config'], '$.account-iam-role')}}"
    #############################################################################
    # collect cloud asset data
    - name: worker
      inputs:
        parameters:
          - name: tables
          - name: org-mgmt-account-id
          - name: org-mgmt-account-role
          - name: account-iam-role
      outputs:
        artifacts:
          - name: cloudquerylog
            path: /cache/cloudquery.log
      script:
        image: myrepo/tooling:1.0
        command: [bash]
        source: |
          set -e
          echo "Generating AWS config file"
          mkdir -p .aws
          echo "[profile org-mgmt-account]" >> .aws/config
          echo "role_arn=arn:aws:iam::{{inputs.parameters.org-mgmt-account-id}}:role/{{inputs.parameters.org-mgmt-account-role}}" >> .aws/config
          echo "Generating destination plugin config file"
          echo "kind: destination" >> postgres.yml
          echo "spec:" >> postgres.yml
          echo "  name: \"assetdb_aws\"" >> postgres.yml
          echo "  registry: \"cloudquery\"" >> postgres.yml
          echo "  path: \"cloudquery/postgresql\"" >> postgres.yml
          echo "  version: \"${POSTGRES_PLUGIN}\"" >> postgres.yml
          echo "  write_mode: \"overwrite-delete-stale\"" >> postgres.yml
          echo "  migrate_mode: \"forced\"" >> postgres.yml
          echo "  spec:" >> postgres.yml
          echo "    connection_string: \"postgresql://\${DB_USER}:\${DB_USER_PASS}@db.postgres.svc.cluster.local/aws?sslmode=prefer\"" >> postgres.yml
          #
          echo "Generating source plugin config file"
          echo "kind: source" >> aws.yml
          echo "spec:" >> aws.yml
          echo "  name: \"aws\"" >> aws.yml
          echo "  path: \"cloudquery/aws\"" >> aws.yml
          echo "  version: \"${AWS_PLUGIN}\"" >> aws.yml
          echo '  tables: [{{inputs.parameters.tables}}]' >> aws.yml
          echo '  skip_dependent_tables: true' >> aws.yml
          echo "  destinations: [\"assetdb_aws\"]" >> aws.yml
          echo "  spec:" >> aws.yml
          echo "    org:" >> aws.yml
          echo "      admin_account:" >> aws.yml
          echo "        local_profile: \"org-mgmt-account\"" >> aws.yml
          echo "      member_role_name: {{inputs.parameters.account-iam-role}}" >> aws.yml
          echo "Syncing data"
          ./cloudquery sync ./ --telemetry-level none
          echo "Sync completed."
        env:
          - name: AWS_PLUGIN
            value: "v27.23.1"
          - name: POSTGRES_PLUGIN
            value: "v8.6.2"
          - name: CLOUDQUERY_API_KEY
            valueFrom:
              secretKeyRef:
                name: cloudquery-api-key
                key: token
          - name: DB_USER
            valueFrom:
              secretKeyRef:
                name: db-credentials
                key: username
          - name: DB_USER_PASS
            valueFrom:
              secretKeyRef:
                name: db-credentials
                key: password
1. These parameters are required input for our workflow template.
2. The workflow step that implements asset data collection.
3. When the workflow step is completed, this file will be collected from the container file system by Argo Workflows and made available for download.
4. This section contains the actual implementation of this workflow step: we are specifying a script that will be executed within the container image myrepo/tooling:1.0.
5. Generate an AWS config file with IAM role information for accessing the organization management account.
6. Generate CloudQuery config files for the source and destination plugins.
7. Execute CloudQuery.
8. Environment variables that are available inside the script.
Our WorkflowTemplate has two input parameters (1):
- The CloudQuery tables to be synchronized: this defines what AWS resource metadata will be collected from AWS and written to our PostgreSQL server.
- Configuration parameters with AWS specific account & IAM information.
These parameters are provided as input to the workflow’s only step, the worker task (2). The actual automation logic is implemented as a bash script that is executed inside the specified container image (4).
First, CloudQuery will need read access to the AWS organization management account so it can collect resource information from all AWS accounts. For this reason we are generating (5) an AWS configuration file with a profile that specifies that the management account can be accessed via a particular IAM role. As a reminder, we assume that the workflow is being executed inside the security team AWS account and has access to an IAM role that can assume an IAM role in the management account.
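The config-file generation can be tried outside the cluster by substituting sample values for the Argo template parameters; the account ID and role name here are placeholders:

```shell
# Sketch: render the AWS config profile the worker script creates,
# using sample values in place of the Argo template parameters.
ORG_MGMT_ACCOUNT_ID="123456789012"
ORG_MGMT_ACCOUNT_ROLE="Argo"
mkdir -p .aws
: > .aws/config   # start from an empty file
echo "[profile org-mgmt-account]" >> .aws/config
echo "role_arn=arn:aws:iam::${ORG_MGMT_ACCOUNT_ID}:role/${ORG_MGMT_ACCOUNT_ROLE}" >> .aws/config
cat .aws/config
```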
Afterwards (6) we create the CloudQuery configuration files that specify the source and destination plugins and their configurations. For the destination plugin, Postgres, we define the server address and the database user credentials (via environment variables). For the source plugin, AWS, we specify what profile can be used to access the management account (defined in the previously generated AWS config file) and the name of the IAM role to be used to access every individual organization member account.
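With the example parameters from the CronWorkflow, the rendered source config would look roughly like this (tables trimmed for brevity; plugin version as pinned in the env section):

```yaml
kind: source
spec:
  name: "aws"
  path: "cloudquery/aws"
  version: "v27.23.1"
  tables: ["aws_ec2_instances", "aws_s3_buckets"]
  skip_dependent_tables: true
  destinations: ["assetdb_aws"]
  spec:
    org:
      admin_account:
        local_profile: "org-mgmt-account"
      member_role_name: SecurityMonitoring
```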
We then execute CloudQuery (7), which will access the organization management account, retrieve the list of AWS accounts that exist in the organization, and then access each account and collect the asset data that we specified (via the tables input parameter).
It is important to mention that we inject various environment variables into the running container (8), such as configuration parameters and credentials. For the latter it is assumed that the respective Kubernetes secret objects exist in the namespace where the workflow is being executed.
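For completeness, the two secrets referenced by the env section could be created from manifests like these (all values are placeholders; in practice you would source them from a secret manager):

```yaml
# Hypothetical secret manifests matching the secretKeyRef entries
# in the workflow template.
apiVersion: v1
kind: Secret
metadata:
  name: cloudquery-api-key
type: Opaque
stringData:
  token: "REPLACE_ME"          # CloudQuery API key
---
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
type: Opaque
stringData:
  username: "assetdb_writer"   # placeholder
  password: "REPLACE_ME"
```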
Deploying and Executing
This workflow can be deployed as follows, assuming that the Kubernetes namespace $NS already exists and has been configured in Argo WF for workflow execution:
$ argo template create $PATH_WORKFLOWTEMPLATE -n $NS
$ argo cron create $PATH_CRONWORKFLOW_FILE -n $NS --serviceaccount $K8S_SERVICE_ACCOUNT_NAME
We first create the workflow template and then deploy the CronWorkflow object that references the template. It is the CronWorkflow that will cause the workflow to be regularly executed, based on the defined schedule. We also have to associate a K8S service account with the CronWorkflow, so that CloudQuery can assume the AWS IAM roles, as mentioned in AWS Organizations, IAM & Kubernetes Service Account.
A screenshot from a successful workflow execution is shown below (this is a slightly more complex workflow collecting data from three different AWS organizations).
Once completed, the AWS resource metadata is available in the Postgres server.
Next Steps
With asset information stored inside a database, you can then use one of the CloudQuery provided dashboards to visualize those resources in a Grafana dashboard. An example is provided below.
An asset database is also useful for other automation use cases. I might write about those in a future post.
Summary
We have defined a CronWorkflow that collects asset data from all accounts of an AWS organization once a day. The actual implementation of the automation logic is a bash script inside a workflow step/task that is specified in the WorkflowTemplate.
This script first generates an AWS config file that specifies the IAM roles to be used by CloudQuery. The tool is then executed, collecting asset data via the AWS API and writing this data to a PostgreSQL server (it is assumed that the DB server is hosted in the same K8S cluster where the workflow is being executed).
That’s roughly 100 lines of code for fully automated AWS asset data collection.
This should give you an idea how workflows can be implemented in Argo WF. Most importantly, as you can see, this is not rocket science.