
AWS S3 Data Source

The AWS S3 data source pulls JSON and YAML files from an S3 bucket, recursing into its directories, and loads them into DAS. It can use Rego to transform or filter the data before it is loaded into DAS. It authenticates using an IAM access key ID and a secret access key that is stored as a secret in DAS.

Configure through the DAS GUI

The following section helps you configure styra-das-id.styra.com to access data stored in AWS S3 using the DAS GUI.

Create a DAS System

Go to <das-id>.styra.com. To add a new system, click the + next to SYSTEMS on the left side of the navigation panel.

Fill in the following fields:

  • System type (required): Select a system type from the drop-down list. For example, Custom.

  • System name (required): A human-friendly name so that you can distinguish between the different systems.

  • Description (optional): More details about this system.

  • Leave the Show errors switch ON to display the errors.

  • Click the Add system button.

Your DAS system now appears under SYSTEMS on the left side of the navigation panel.

Add a Data Source

After you create your system, click the three dots (⋮) next to it and select Add Data Source to start configuring the data source.

Figure 1 - Add Data Source

The Custom System >> Add Data Source dialog appears.

Figure 2 - Add Data Source Window

Complete the following fields in the Custom System >> Add Data Source dialog.

  1. Type: An editable data source that you fill in with JSON data and publish. Click the down arrow to select a different data source type. For example, select AWS S3 for JSON object import to pull a JSON object from a specific AWS S3 bucket. This data source is refreshed at a regular interval.

    Figure 3 - Data Source Type

  2. Path: Enter a new or existing path separated by /. For example, am/datasourcetypes.

  3. Data source name (required): Enter a name for the data source. For example, am-aws-s3.

  4. Description: This field is optional.

  5. AWS region (required): A string representing the AWS region. Select one of the regions listed in AWS Service Endpoints. For example, us-east-1.

  6. Bucket Name (and Path) (required): A string representing the bucket name, optionally followed by a path within that bucket. For example, aws-s3-bucket-testing. For more information on how to set up an AWS user and S3 bucket for secure DAS S3 access, see the AWS S3 Bucket Access page.

note
  • If only one file is returned from S3, the result contains the content of that file. For example, if the bucket name and path is tests3/test.json, the result is {"foo": "bar"}.

  • If multiple files are returned from S3, the result has additional layers that reproduce the full folder structure and file names to avoid collisions. For example, if the bucket name and path is tests3/data, the result is {"data": {"file.json": {"foo": "bar"}}}.

  7. Endpoint: A gateway endpoint. For more information, see AWS S3 Endpoints.

  8. Refresh interval: Enter the amount of time between polls of the S3 bucket. Default is s.

  9. Access Keys for IAM Users: Enter the following access key credentials.

    • Access Key ID (required): Enter the access key ID. For more information, see AWS IAM User Access Keys.

    • Secret Access Key (required): This DAS secret is required if you are using an S3 bucket within your own AWS account.

  10. Click the arrow to expand the Advanced field.

  11. Data transform: Specify a policy and a query to apply Rego transformations to the data before it is persisted. For example, select Custom and fill in the following fields:

    • Policy: Path to an existing policy, with segments separated by /. For example, transform/transform.rego.

    • Rego query: Path to the Rego rule to evaluate. For example, data.transform.query.

  12. Leave the Enable on-premise data source agent switch OFF.

Make sure you have filled in all the fields, similar to Figure 4.

Figure 4 - Completed Data Source Form

Finally, click the Add button to add the data source.

The following example shows the output that appears after the data source is created in DAS.

{
  "data": {
    "s3-test.json": {
      "foo1": "bar1"
    },
    "s3-test.yaml": {
      "foo3": "bar3"
    },
    "s3-test.yml": {
      "foo2": "bar2"
    }
  }
}
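
The layout rule described in the note (a single file yields its content directly, while several files are nested under their folder and file names) can be illustrated with a short Python sketch. This is not how DAS is implemented; the function names split_bucket_and_path and build_result are hypothetical, and the sketch assumes that object keys are relative to the bucket and that the files have already been parsed into dictionaries.

```python
import json


def split_bucket_and_path(value):
    """Split a 'Bucket Name (and Path)' value into (bucket, path).

    Hypothetical helper: the path is empty when only a bucket is given.
    """
    bucket, _, path = value.strip("/").partition("/")
    return bucket, path


def build_result(files):
    """Combine parsed files, keyed by S3 object key, into one document.

    Assumption: keys are relative to the bucket, e.g. 'data/file.json'.
    One file: its content is returned as-is.
    Several files: each is nested under its folder names and file name.
    """
    if len(files) == 1:
        return next(iter(files.values()))
    result = {}
    for key, content in sorted(files.items()):
        *folders, name = key.split("/")
        node = result
        for folder in folders:
            # Create one layer per folder so file names cannot collide.
            node = node.setdefault(folder, {})
        node[name] = content
    return result


# A single file: the result is the file content itself.
single = build_result({"test.json": {"foo": "bar"}})

# Several files under data/: the result mirrors the folder structure.
multiple = build_result({
    "data/s3-test.json": {"foo1": "bar1"},
    "data/s3-test.yaml": {"foo3": "bar3"},
    "data/s3-test.yml": {"foo2": "bar2"},
})

print(json.dumps(single))
print(json.dumps(multiple, indent=2))
```

With the three sample files, the second call produces the same structure as the example output above, which is one way to check what a data transform query should expect as its input.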