Skip to main content

Writing valid data filtering policies

To be able to use Rego for data filtering, the policy needs to be constructed to

  1. correctly represent the desired row filtering logic
  2. properly be translatable into the target representation (such as SQL)

The subset of Rego that can be used to represent row filtering logic is colloquially referred to as the fragment.

What you will learn

You will develop an intuition for what is valid Rego for data filters depending on the target. Not every construct is supported for every target.

What is Partial Evaluation?

The translation of data policies into queries (like SQL WHERE clauses) is driven by partial evaluation (PE) of a Rego query.

With partial evaluation, callers specify that certain inputs or pieces of data are unknown. OPA evaluates as much of the policy as possible without touching parts that depend on unknown values.1

The unknown values that remain during partial evaluation represent the pieces of data that represent your filter.

tip

When only known values are used, you can use all of Rego.

Example Preamble

In our running example, we'll assume a table fruits with columns name, colour, and price. These unknown values are represented with input.<TABLE>.<COLUMN> e.g. input.fruits.name

Our data filters also depend on user information. These known values are represented with input.user

{
"name": "april",
"email": "april@corp.com",
"fav_colour": "yellow",
"budget": 10
}

Simple comparisons

The fragment supports simple comparisons, such as ==, !=, <, >, <=, >=, between unknown and known values. It is not important if the unknown is on the left-hand side ("lhs") or right-hand side ("rhs"), but it is critical that only one side is unknown:

OK
package filters

include if {
input.fruits.name == "banana" # lhs unknown, rhs constant
input.fruits.price <= input.user.budget # lhs unknown, rhs known
input.user.fav_colour == input.fruits.colour # lhs known, rhs unknown
}

SQL target: WHERE name = 'banana' AND price <= 10 AND colour = 'yellow'

As you can see the known values from input.user have been replaced.

NOT OK
package filters

include if {
input.fruits.name != input.fruits.colour # lhs and rhs unknown
input.fruits.price # plain unknown
}
SQL

For SQL translation targets, it's possible to have unknowns on both sides of the simple comparisons.

package filters

include if input.fruits.name != input.fruits.colour

SQL target: WHERE name <> colour

Built-in Functions

Certain built-in functions can be translated with certain restrictions:

  • startswith
  • endswith
  • contains
  • k in ... (not k, v in ...)

These built-in functions can only be used with unknowns on the left-hand side.

OK
package filters

include if {
startswith(input.fruits.name, "ba")
input.fruits.colour in {"blue", "green"}
}

SQL target: WHERE name LIKE 'ba%' AND colour IN ('blue', 'green').

NOT OK
package filters

include if {
endswith("apple", input.fruits.name) # rhs unknown
1, input.fruits.colour in ["blue", "green"] # k, v in ...
regexp.match(input.fruits.name, '^b[an]+$') # unsupported builtin (for unknown values)
}

Other built-in functions are not supported for usage with unknown values. If your filtering rules use other built-ins with known values, that's OK -- see below for an example.

Rules and functions

Many Rego constructs are available for building filters, with certain restrictions:

  1. every may not be used with unknown values
  2. default rules (or functions) may not be used in combination with unknown values

Nonetheless, you can use rules and functions to structure your policy, as long as these restrictions are observed:

OK
package filters

include if user_in_corp

include if {
not user_in_corp
apple_ish
}

# apple_ish rule does not use `default` or `every`
apple_ish if input.fruits.name == "pineapple"
apple_ish if input.fruits.name == "apple"

# user_in_corp only uses known values
default user_in_corp := false
user_in_corp if endswith(lower(input.user.email), "@corp.com")

SQL target: WHERE name = 'pineapple' if the user's email is not ending in @corp.com. If it does, the filter would be empty, not restricting the database query.

NOT OK
package filters

include if only_pineapples

default only_pineapples := false # default rule
only_pineapples if input.fruits.name == "pineapple"

not expressions

Expressions using not are permitted for simple expressions and built-in functions. not combined with a unknown value or a rule reference is not allowed.

OK
package filters

include if {
not input.fruits.name == "apple"
not input.fruits.colour in {"blue", "green"}
}

SQL target: WHERE (NOT name = 'apple' AND NOT colour IN ('blue', 'green')).

NOT OK
package filters

include if not input.fruits.name # plain unknown

include if not apple_ish # not + rule
apple_ish if endswith(input.fruits.name, "apple")
apple_ish if startswith(input.fruits.name, "apple")