Operations

This page describes the various operations on a data source.

Create a Data Source

warning

Skip this section only when a data source has already been created for you within a system.

Data sources created through the API do not appear in the GUI; however, you can still use them in policies. With the introduction of Multi-File Policy Authoring for Custom systems, you can also create data sources through the GUI.

Data sources are created within systems, stacks, or libraries. To create a data source, you must first decide where you want to create it. Then run a PUT under v1/datasources and provide the location and hierarchical name of the data source.

The following examples show the API call to create a data source named foo/bar (referenced in policy as foo.bar) in each location.

tip

For system and stack, replace YYY with the ID of the corresponding system or stack.

  • System: PUT https://styra-das-id.styra.com/v1/datasources/systems/YYY/foo/bar.

  • Stack: PUT https://styra-das-id.styra.com/v1/datasources/stacks/YYY/foo/bar.

  • Library: PUT https://styra-das-id.styra.com/v1/datasources/global/foo/bar.

Technically, you can mount your data sources in locations other than Systems, Stacks, and Libraries; however, those data sources are not used by policies.

Concretely, run the following curl command to create the foo/bar data source in the Library.

curl  -H 'Authorization: Bearer XXX' -H 'Content-Type: application/json' \
-X PUT https://styra-das-id.styra.com/v1/datasources/global/foo/bar \
-d '{
"category": "rest",
"type": "push"
}'

The category and type fields describe what kind of data source you want to create. The data source described here requires that you push the data through the DAS REST API.

tip

To configure the data source agents for Kubernetes systems, see the documentation.

Limitations

The following limitations apply when you create a data source.

  • The data source name should not be a prefix of another data source name. For example, the data source foo should not be created if the data source foo/bar already exists (or vice versa).

  • The data source name should not be a prefix of a policy name, and vice versa. For example, if the policy package foo.bar exists, then the data source foo cannot be created.

Use a Data Source

Once your data source exists, you can read and write the JSON data through the DAS API by using v1/data instead of v1/datasources as follows:

1. Read the JSON data with GET https://styra-das-id.styra.com/v1/data/global/foo/bar. For example, run the following curl command to read the JSON data from the data source.

curl  -H 'Authorization: Bearer XXX' -H 'Content-Type: application/json' \
-X GET https://styra-das-id.styra.com/v1/data/global/foo/bar

2. Update the JSON data with PUT https://styra-das-id.styra.com/v1/data/global/foo/bar <data>. For example, run the following curl command to write the JSON data into the data source.

curl  -H 'Authorization: Bearer XXX' -H 'Content-Type: application/json' \
-X PUT https://styra-das-id.styra.com/v1/data/global/foo/bar \
-d '{"baz": {"qux": 7}}'

Additionally, when writing policies, you can use the data source with the data keyword. A slightly different Rego reference is used for systems, stacks, and libraries.

  • Systems: A DAS system provides a sandbox so that anything located under v1/data/systems/YYY/ appears as if it is in the root data namespace. A system's data sources (or policies) are not accessible by other systems. For example, the data source v1/datasources/systems/YYY/foo/bar is referenced as data.foo.bar.

  • Stacks: A DAS Stack exists in the global namespace so that it can be used across many systems. The data source v1/datasources/stacks/YYY/foo/bar is referenced from Rego as data.stack.YYY.foo.bar.

  • Libraries: The DAS Library is a global collection of Rego that all systems and all stacks may choose to import and use. The data source v1/datasources/global/foo/bar is referenced from Rego as data.global.foo.bar.

The following shows an example for a data source created in the Library, followed by a sketch of the corresponding system and stack references.

package rules

datasource_succeeded { data.global.foo.bar.baz.qux == 7 }
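
For comparison, the following is a minimal sketch of the equivalent checks against a foo/bar data source created under the system itself and under stack YYY, following the reference rules listed above (the rule names are illustrative):

package rules

# Inside the system's own policies, the system's data sources appear at the data root.
system_datasource_succeeded { data.foo.bar.baz.qux == 7 }

# Referencing the foo/bar data source created under stack YYY.
stack_datasource_succeeded { data.stack.YYY.foo.bar.baz.qux == 7 }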

Delete a Data Source

You can delete a data source by running a DELETE command on the data source's name located under v1/datasources. Delete the data source with DELETE https://styra-das-id.styra.com/v1/datasources/global/foo/bar.

For example, run the following curl command to delete the data source constructed earlier.

curl  -H 'Authorization: Bearer XXX' -H 'Content-Type: application/json' \
-X DELETE https://styra-das-id.styra.com/v1/datasources/global/foo/bar

Once the data source is deleted, subsequent PUT commands will return errors. However, a policy referencing the data source will not produce an error. Instead, the value of that data reference will be undefined in Rego.
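
If you prefer a policy to evaluate to false rather than undefined when the data source no longer exists, a common Rego pattern (a sketch, not something DAS requires) is to give the rule a default value:

package rules

# Without the default, datasource_succeeded is undefined once data.global.foo.bar
# is deleted; with the default, the rule evaluates to false instead.
default datasource_succeeded = false

datasource_succeeded { data.global.foo.bar.baz.qux == 7 }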

Git Data Sources

Another option for making JSON data available to policies is to store that data in Git and use a special data source that automatically reads the JSON out of Git. At present, the GUI will not show the files in that repository, but the rest of the DAS functionality, such as distributing those policies to OPA and evaluating them, will work properly.

Mount JSON Files

The Git data source described in this section can be used to mount JSON files inside the Library. For more details on mounting Git repositories, see the Git-mounting page.

When you mount a Git repository, choose a directory that contains only JSON files named data.json. The directory in which each data.json file resides determines the package under which it is loaded.

For example, suppose you have the following directory structure in a Git repository.

systemmount
└── a
    └── b
        └── data.json

The contents of data.json:

{
"foo": 17,
"bar": "a string"
}

Use the following steps to mount the contents of data.json within a system by choosing the mount point systems/<systemid>/myroot, so that you can reference it within Rego as data.myroot.a.b. The main difference from the earlier examples is that your Git repository likely requires credentials.

1. Create a DAS secret with credentials that will access the Git repository, if you have not created one already. To create a secret named alice/repos/data, run the following curl command.

curl  -H 'Authorization: Bearer XXX' -H 'Content-Type: application/json' \
-X PUT https://styra-das-id.styra.com/v1/secrets/alice/repos/data -d '{
"name": "alice",
"secret": "super-secret-Password-44321"
}'

2. Mount the Git repository using that secret. You must provide your bearer token for XXX and the system ID for YYY. The system ID is located on your Systems >> Settings >> General page.

To mount the directory systemmount within the Git repository to the mount point systems/<systemid>/myroot, run the following curl command.

curl  -H 'Authorization: Bearer XXX' -H 'Content-Type: application/json' \
-X PUT \
https://styra-das-id.styra.com/v1/datasources/systems/YYY/myroot -d '{
"category": "git/rego",
"type": "pull",
"url": "https://github.com/timothyhinrichs/gitsave.git",
"path": "systemmount",
"reference": "refs/heads/master",
"credentials": "alice/repos/data"
}'

3. Check the STATUS of that data source to make sure everything is working. For example, the STATUS of the data source should have the code set to Finished if everything is functioning properly.

curl  -H 'Authorization: Bearer XXX' -H 'Content-Type: application/json' \
https://styra-das-id.styra.com/v1/datasources/systems/YYY/myroot

4. Finally, add the following Rego to one of the policy files within your system. To verify that the mount completed successfully, preview the file and check that mounted_correctly is assigned true. (An optional check through the data API is sketched after the rule below.)

mounted_correctly {
data.myroot.a.b.foo == 17
}
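
As an additional check, you can read the mounted content back through the data API, following the same v1/data pattern shown earlier. The path below assumes the myroot mount point used in this example; the response should contain the data.json contents nested under a.b.

curl  -H 'Authorization: Bearer XXX' -H 'Content-Type: application/json' \
-X GET https://styra-das-id.styra.com/v1/data/systems/YYY/myroot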

S3 Data Sources

S3 data sources are similar to the Git data sources. Instead of automatically reading the JSON out of Git, they read the data from a bundle stored in an S3 bucket. As with the Git data sources, the GUI does not show the files in that bundle, but the rest of the DAS functionality works properly.

Mount JSON Files

For the required layout of the files stored in a bundle, see how to mount JSON files with the Git Data Sources. The root of the bundle should have the same contents as the directory pointed to by the Git data source "path" configuration.
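
For example, if you reuse the systemmount directory from the Git example, the bundle could be built and uploaded with standard tooling along the following lines. This is only a sketch: the gzipped-tarball packaging, bucket name, and object path are illustrative assumptions rather than requirements stated on this page.

# Package the directory contents as a gzipped tarball (assumed bundle format)
# and upload it to the bucket and object path used in the data source configuration.
tar -czf bundle.tar.gz -C systemmount .
aws s3 cp bundle.tar.gz s3://insert-s3-bucket-here/insert/object/path/bundle.tar.gz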

To mount the S3 bundle within a system:

1. Create a DAS secret with credentials that will access the S3 bucket. To create a secret named alice/buckets/data, run the following curl command. You must provide your bearer token for XXX and your AWS credentials.

curl  -H 'Authorization: Bearer XXX' -H 'Content-Type: application/json' \
-X PUT https://styra-das-id.styra.com/v1/secrets/alice/buckets/data -d '{
"name": "Insert AWS access key id",
"secret": "Insert AWS secret access key"
}'

2. Mount the S3 bundle using the DAS secret. Again, you must provide your bearer token for XXX and the system ID for YYY. The system ID is located on your Systems >> Settings >> General page. To mount the files within the bundle to the mount point systems/<systemid>/myroot, run the following curl command.

curl  -H 'Authorization: Bearer XXX' -H 'Content-Type: application/json' \
-X PUT https://styra-das-id.styra.com/v1/datasources/systems/YYY/myroot -d '{
"category": "bundle/s3",
"type": "pull",
"bucket": "insert s3 bucket here",
"path": "insert s3 object path to bundle here",
"region": "insert S3 bucket region",
"credentials": "alice/bundles/data",
"endpoint": "leave empty unless S3 API endpoint needs to be set"
}'

3. Check the STATUS of that data source to make sure everything is working. For example, the STATUS of the data source should have the code set to Finished if everything is functioning properly.

curl  -H 'Authorization: Bearer XXX' -H 'Content-Type: application/json' https://styra-das-id.styra.com/v1/datasources/systems/YYY/myroot

HTTP Data Sources

The HTTP data source is similar to the Git and S3 data sources. Instead of reading the data from storage, it reads data from an external server by making HTTP requests.

Configure HTTP Data Sources

The HTTP data source plugin supports both common and more advanced HTTP queries, with the ability to use custom HTTP headers.

To create an HTTP data source, run the following curl command:

curl -H 'Authorization: Bearer XXX' -H 'Content-Type: application/json' https://styra-das-id.styra.com/v1/datasources/http/url -X PUT -d '
{
"category": "http",
"type": "pull",
"url": "<datasource url>",
"polling_interval": 60,
"headers": [
{
"name": "<header name>",
"value": "<header value>",
"secret_id": "<id of the stored secret>"
}
],
"skip_tls_verification": true,
"ca_certificate": "<pem file>"
}'

  1. The url parameter is a link to an endpoint that returns data in JSON or YAML format.

  2. The polling_interval parameter specifies how often the endpoint is polled, in seconds (60 in this example). The value is a float.

  3. The headers parameter is a list of custom headers with the following information:

    - `name`: Name of the header. This field is mandatory.

    - `value`: Value of the header stored as plain text.

    - `secret_id`: Name of the secret stored in the system. For more information about secrets, see the [secrets API](https://test.styra.com/v1/docs/redoc.html#tag/secrets) page.
    important

    Set the value or secret_id, but not both.

  4. The skip_tls_verification parameter allows certificate verification to be skipped so that custom or invalid certificates are ignored. Default value: false.

  5. The ca_certificate parameter allows a custom CA certificate to be used. The certificate should be uploaded as plain text in PEM format (see the sketch following this list).
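
For instance, a hypothetical configuration that supplies a custom CA certificate instead of skipping verification could look like the following (the URL and certificate text are placeholders):

{
"category": "http",
"type": "pull",
"url": "https://internal.example.com/api/data",
"skip_tls_verification": false,
"ca_certificate": "-----BEGIN CERTIFICATE-----\n...\n-----END CERTIFICATE-----"
}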

Configuration Example

{
"category": "http",
"type": "pull",
"url": "https://www.example.com/api/test",
"headers": [
{
"name:": "Env",
"value": "QA"
},
{
"name:": "Authorization",
"secret_id": "auth/qa-token"
}
]
}

The secret stored under the ID auth/qa-token:

{
"description": "Bearer token",
"name": "qa-token",
"secret": "Bearer SUPER-QA-TOKEN"
}

The HTTP data source makes a request similar to the following curl command:

curl -XGET https://www.example.com/api/test -H'Content-Type: application/json, text/vnd.yaml, application/yaml, application/x-yaml, text/x-yaml, text/yaml, text/plain' -H'Env: QA' -H'Authorization: Bearer SUPER-QA-TOKEN'

Policy Filtering

A policy filter is used when you want to transform the information captured from a data source before storing it. Specifying a policy filter and query allows you to apply Rego transformations to the polled data before it is persisted. This mechanism is useful for filtering out data that you do not want to store, or for any other mutations that you want to perform.

It works by specifying a policy that is evaluated with the HTTP response from your data source as input. You also specify a query to apply to that policy and data. The result of that query is stored as data, instead of the raw response polled from the HTTP endpoint.

In addition to the standard options, you must specify the following when you create or update a data source:

  1. The policy_filter parameter is the ID of a policy you want to use for filtering.

  2. The policy_query parameter is the Rego query you want to evaluate.

The following is an example of a curl command with additional parameters:

curl -H 'Authorization: Bearer XXX' -H 'Content-Type: application/json' https://styra-das-id.styra.com/v1/datasources/http/url -X PUT -d '
{
"category": "http",
"type": "pull",
"url": "<datasource url>",
"headers": [
{
"name": "<header name>",
"value": "<header value>",
"secret_id": "<id of the stored secret>"
}
],
"skip_tls_verification": true,
"ca_certificate": "<pem file>",
"policy_filter": "/my/test/policy",
"policy_query": "<rego statement>"
}'

Policy Filtering Example

The following example shows how to specify a policy filter and query so that Rego transformations are applied to the polled data before it is persisted.

1. Data returned from <URL>:

{
"servers": [
{"id": "app", "protocols": ["https", "ssh"], "ports": ["p1", "p2", "p3"]},
{"id": "db", "protocols": ["mysql"], "ports": ["p3"]},
{"id": "cache", "protocols": ["memcache"], "ports": ["p3"]},
{"id": "ci", "protocols": ["http"], "ports": ["p1", "p2"]},
{"id": "busybox", "protocols": ["telnet"], "ports": ["p1"]}
],
"networks": [
{"id": "net1", "public": false},
{"id": "net2", "public": false},
{"id": "net3", "public": true},
{"id": "net4", "public": true}
],
"ports": [
{"id": "p1", "network": "net1"},
{"id": "p2", "network": "net3"},
{"id": "p3", "network": "net2"}
]
}

2. For this example, there is a relevant policy that contains the Rego shown below. You can retrieve it as follows:

curl  -H 'Authorization: Bearer XXX' -H 'Content-Type: application/json' \
-X GET https://styra-das-id.styra.com/v1/policies/example/networks

package example.networks

public_server[server] { # a server exists in the public_server set if...
some i, j
server := input.servers[_] # it exists in the input.servers collection and...
server.ports[_] == input.ports[i].id # it references a port in the input.ports collection and...
input.ports[i].network == input.networks[j].id # the port references a network in the input.networks collection and...
input.networks[j].public # the network is public.
}

3. To create a new data source, run the following curl command.

curl -H 'Authorization: Bearer XXX' -H 'Content-Type: application/json' \
-X PUT -d '{"category":"http", "type":"pull", "url":"https://<mycustomdatasource>/topology", "policy_filter":"/example/networks", "policy_query": "data.example.networks.public_server[results]"}' \
https://styra-das-id.styra.com/v1/datasources/systems/test/test-datasource

4. After polling occurs, the following shows the result of your query in the data:

curl -H 'Authorization: Bearer XXX' -H 'Content-Type: application/json' \
-X GET https://styra-das-id.styra.com/v1/data/systems/test/test-datasource

{
"request_id": "<request ID>",
"result": [
{
"id": "app",
"ports": [
"p1",
"p2",
"p3"
],
"protocols": [
"https",
"ssh"
]
},
{
"id": "ci",
"ports": [
"p1",
"p2"
],
"protocols": [
"http"
]
}
],
"revision": "<revision>"
}