S3 Datasource Configuration

Enterprise OPA can pull data from any S3-compatible blob store, such as AWS S3 or Google Cloud Storage. You can push data to a common storage service and have Enterprise OPA keep it available and up to date for policy evaluations.

Example Configuration

The S3 integration is provided via the data plugin, and needs to be enabled in Enterprise OPA's configuration.

Minimal

# enterprise-opa-conf.yaml
plugins:
  data:
    all.users:
      type: s3
      url: s3://databucket/users.json
      access_id: "${AWS_ACCESS_KEY_ID}"
      secret: "${AWS_SECRET_ACCESS_KEY}"

With this minimal configuration, Enterprise OPA will pull in the users.json file from the databucket bucket on Amazon S3 every 5 minutes.

The polling interval, along with further settings for S3-compatible APIs such as a custom endpoint and region, can be set in an advanced configuration:

Advanced

# enterprise-opa-conf-advanced.yaml
plugins:
  data:
    all.users:
      type: s3
      url: s3://databucket/users.json
      endpoint: https://s3-api.internal:9000/
      access_id: "${AWS_ACCESS_KEY_ID}"
      secret: "${AWS_SECRET_ACCESS_KEY}"
      region: us-west-1
      polling_interval: 10s # minimum: 10s

With a configuration like this, Enterprise OPA will retrieve the file from the specified bucket location and attempt to parse it as one of the following formats:

XML
YAML
JSON

The result will then be available to all policy evaluations under data.all.users.

Using Google Cloud Storage

Google Cloud Storage is also available as an S3-compatible blob store. Follow the Google migrate tutorial to set up your project and generate the appropriate API keys. Then configure your data plugin as usual, replacing the protocol in your url key with gs://.

# enterprise-opa-conf.yaml
plugins:
  data:
    all.users:
      type: s3
      url: gs://databucket/users.json
      access_id: "${GS_ACCESS_KEY_ID}"
      secret: "${GS_SECRET_ACCESS_KEY}"

Example Call

If the referenced S3 bucket contains a users.json file with this content,

[
  {
    "username": "alice",
    "roles": [
      "admin"
    ]
  },
  {
    "username": "bob",
    "roles": []
  },
  {
    "username": "catherine",
    "roles": [
      "viewer"
    ]
  }
]

then Enterprise OPA's data.all.users will look like this:

curl "http://127.0.0.1:8181/v1/data/all/users?pretty"
{
  "result": [
    {
      "roles": [
        "admin"
      ],
      "username": "alice"
    },
    {
      "roles": [],
      "username": "bob"
    },
    {
      "roles": [
        "viewer"
      ],
      "username": "catherine"
    }
  ]
}

Note

The key below data in the configuration (all.users in the example) can be anything you want, and determines where the retrieved document will be found in Enterprise OPA's data hierarchy.
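To make the mapping from configuration key to query path concrete, here is a minimal Python sketch. It assumes that each dot-separated segment of the key becomes one path segment in the Data API URL, as in the example above (all.users is served under /v1/data/all/users). The helper names data_api_url and fetch_document are hypothetical and not part of Enterprise OPA.

```python
import json
import urllib.request


def data_api_url(base_url: str, config_key: str) -> str:
    """Build the Data API URL for a data plugin key such as "all.users"."""
    # Assumption: each dot-separated segment maps to one URL path segment.
    path = "/".join(config_key.split("."))
    return f"{base_url.rstrip('/')}/v1/data/{path}"


def fetch_document(base_url: str, config_key: str):
    """Fetch the document stored under the key.

    Returns the "result" field of the response, or None when the
    response carries no result (for example, before the first poll).
    """
    with urllib.request.urlopen(data_api_url(base_url, config_key)) as resp:
        body = json.load(resp)
    return body.get("result")


if __name__ == "__main__":
    # Only builds the URL; call fetch_document against a running instance.
    print(data_api_url("http://127.0.0.1:8181", "all.users"))
```

Calling fetch_document("http://127.0.0.1:8181", "all.users") against a running instance should return the same list shown in the curl output above.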

Data Transformations

The rego_transform attribute specifies the path to a rule used to transform data pulled from S3 into a different format for storage in Enterprise OPA.

rego_transform policies receive the incoming data as JSON via input.incoming and return the transformed JSON.

Example

Starting with the Enterprise OPA configuration and the example data above, and with rego_transform pointing at data.e2e.transform, the transform policy is:

package e2e

import rego.v1

transform.users := {name |
	some entry in input.incoming
	name := entry.username
}

transform.roles[id] := members if {
	some entry in input.incoming
	some role in entry.roles

	id := role

	members := role_members(id)
}

role_members(name) := {id |
	some entry in input.incoming
	name in entry.roles
	id := entry.username
}

The data retrieved by the S3 plugin would then be transformed by this policy into:

curl "${ENTERPRISE_OPA_URL}/v1/data/all/users?pretty"
{
  "result": {
    "roles": {
      "admin": [
        "alice"
      ],
      "viewer": [
        "catherine"
      ]
    },
    "users": [
      "alice",
      "bob",
      "catherine"
    ]
  }
}
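As a cross-check of the result above, the following Python sketch mirrors the logic of the Rego transform on the same example data. It is an illustration only, not how Enterprise OPA evaluates the policy; the function name transform and the variable names are chosen here for readability.

```python
def transform(incoming):
    """Mirror of the Rego policy: a set of usernames plus role -> members."""
    # transform.users: the set of all usernames.
    users = {entry["username"] for entry in incoming}

    # transform.roles[id]: for each role, the set of users holding it.
    roles = {}
    for entry in incoming:
        for role in entry["roles"]:
            roles.setdefault(role, set()).add(entry["username"])

    return {"users": users, "roles": roles}


incoming = [
    {"username": "alice", "roles": ["admin"]},
    {"username": "bob", "roles": []},
    {"username": "catherine", "roles": ["viewer"]},
]

result = transform(incoming)

# Rego sets are serialized as JSON arrays; sort them here for stable display.
printable = {
    "roles": {role: sorted(members) for role, members in sorted(result["roles"].items())},
    "users": sorted(result["users"]),
}
print(printable)
```

Note that bob appears under users but in no role, because his roles list is empty. This matches the transformed document above, where only admin and viewer have entries.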
