S3 Datasource Configuration
Enterprise OPA supports pulling in data from any S3-compatible blob store, e.g. AWS S3 and Google Cloud Storage. This allows you to push data to a common storage service and have it available and up to date for policy evaluation in Enterprise OPA.
Example Configuration
The S3 integration is provided via the data plugin, and needs to be enabled in Enterprise OPA's configuration.
Minimal
# enterprise-opa-conf.yaml
plugins:
  data:
    all.users:
      type: s3
      url: s3://databucket/users.json
      access_id: "${AWS_ACCESS_KEY_ID}"
      secret: "${AWS_SECRET_ACCESS_KEY}"
With this minimal configuration, Enterprise OPA will pull in the users.json file from the databucket bucket on Amazon S3 every 5 minutes.
All of this, along with further settings for S3-compatible APIs, can be configured using an advanced configuration:
Advanced
# enterprise-opa-conf-advanced.yaml
plugins:
  data:
    all.users:
      type: s3
      url: s3://databucket/users.json
      endpoint: https://s3-api.internal:9000/
      access_id: "${AWS_ACCESS_KEY_ID}"
      secret: "${AWS_SECRET_ACCESS_KEY}"
      region: us-west-1
      polling_interval: 10s # minimum: 10s
With a config like this, Enterprise OPA will retrieve the file from the specified bucket location, and attempt to parse it as any of the following formats:
- XML
- YAML
- JSON
The result will then be available to all policy evaluations under data.all.users.
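Policies can then reference the retrieved document directly. The following is a minimal sketch (the authz package, allow rule, input.user field, and admin check are illustrative, not part of the plugin), assuming the document is a list of user objects as in the example further below:

package authz

import rego.v1

default allow := false

# Allow requests from users that carry the "admin" role in the
# document the S3 plugin stored under data.all.users.
allow if {
  some user in data.all.users
  user.username == input.user
  "admin" in user.roles
}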
Using Google Cloud Storage
Google Cloud Storage is also available as an S3-compatible blob store. Follow the Google migrate tutorial to set up your project and generate the appropriate API keys. Then configure your data plugin as usual, replacing the protocol in your url key with gs://.
# enterprise-opa-conf.yaml
plugins:
  data:
    all.users:
      type: s3
      url: gs://databucket/users.json
      access_id: "${GS_ACCESS_KEY_ID}"
      secret: "${GS_SECRET_ACCESS_KEY}"
Example Call
If the referenced S3 bucket contains a users.json file with this content,
[
  {
    "username": "alice",
    "roles": [
      "admin"
    ]
  },
  {
    "username": "bob",
    "roles": []
  },
  {
    "username": "catherine",
    "roles": [
      "viewer"
    ]
  }
]
then Enterprise OPA's data.all.users will look like this:
curl "http://127.0.0.1:8181/v1/data/all/users?pretty"
{
"result": [
{
"roles": [
"admin"
],
"username": "alice"
},
{
"roles": [],
"username": "bob"
},
{
"roles": [
"viewer"
],
"username": "catherine"
}
]
}
The key below data in the configuration (all.users in the example) can be anything you want, and determines where the retrieved document will be found in Enterprise OPA's data hierarchy.
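For example, a configuration using a key of internal.users (an illustrative name, not required by the plugin) would make the same document available under data.internal.users, i.e. at /v1/data/internal/users:

# enterprise-opa-conf.yaml
plugins:
  data:
    internal.users:   # arbitrary key; determines the data path
      type: s3
      url: s3://databucket/users.json
      access_id: "${AWS_ACCESS_KEY_ID}"
      secret: "${AWS_SECRET_ACCESS_KEY}"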
Data Transformations
The rego_transform attribute specifies the path to a rule used to transform data pulled from S3 into a different format for storage in Enterprise OPA. rego_transform policies take incoming messages as JSON via input.incoming and return the transformed JSON.
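The attribute sits alongside the other plugin settings. The following sketch extends the minimal configuration above, assuming rego_transform takes the rule path as its value and pointing it at the data.e2e.transform rule used in the example below:

# enterprise-opa-conf.yaml
plugins:
  data:
    all.users:
      type: s3
      url: s3://databucket/users.json
      access_id: "${AWS_ACCESS_KEY_ID}"
      secret: "${AWS_SECRET_ACCESS_KEY}"
      rego_transform: data.e2e.transform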
Example
Starting with the Enterprise OPA configuration above and the example data above, our data.e2e.transform policy is:
package e2e

import rego.v1

transform.users := {name |
  some entry in input.incoming
  name := entry.username
}

transform.roles[id] := members if {
  some entry in input.incoming
  some role in entry.roles
  id := role
  members := role_members(id)
}

role_members(name) := {id |
  some entry in input.incoming
  name in entry.roles
  id := entry.username
}
Then the data retrieved by the S3 plugin would be transformed by the policy above into:
curl "${ENTERPRISE_OPA_URL}/v1/data/all/users?pretty"
{
"result": {
"roles": {
"admin": [
"alice"
],
"viewer": [
"catherine"
]
},
"users": [
"alice",
"bob",
"catherine"
]
}
}
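With the transformed shape, a role check becomes a direct lookup against data.all.users.roles instead of an iteration over the raw list. A minimal sketch (the authz package and user_is_admin rule are illustrative):

package authz

import rego.v1

# True if the given user appears under the "admin" role of the
# transformed document stored at data.all.users.roles.
user_is_admin(user) if user in data.all.users.roles.admin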