Rego Built-in Function: regex.match
regex.match() is a commonly used built-in function that checks if a string matches a given regular expression pattern. The function returns true if the string matches the pattern and false otherwise.
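As a minimal sketch of the call shape (pattern first, value second), the following policy uses a hypothetical ID format and a hypothetical `input.id` field that are not part of the examples below:

```rego
package play

import rego.v1

# Hypothetical pattern: three digits, a hyphen, then four digits.
id_pattern := `^[0-9]{3}-[0-9]{4}$`

# true: the whole string fits the pattern.
id_ok := regex.match(id_pattern, "123-4567")

# false: letters are not allowed by the pattern.
id_bad := regex.match(id_pattern, "abc-4567")

# regex.match can also be used directly as a condition in a rule.
# valid_id is undefined when input.id is missing or does not match.
valid_id if regex.match(id_pattern, input.id)
```

Writing the pattern in a raw string (backticks) avoids having to double-escape backslashes.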
Some examples of policy use cases where regex.match() might be used include:
- Validating formats, such as ensuring an email address follows a specific pattern or checking if a credit card number matches common formats.
- Matching HTTP paths to specific patterns for routing or access control purposes.
Check out regex101.com and use the RE2 syntax to test your Rego patterns in a visual way.
Examples
Pattern-based email validation
Validating email addresses with regular expressions is a common policy task. Full email validation involves more than checking that an address matches a pattern, but a Rego policy is often the first point of contact, so a pattern-based check is still worthwhile: it surfaces mistakes to users early.
regex.match is well suited to this kind of check in Rego.
{
"email": "hello at example.com"
}
package play
import rego.v1
example_email_1 := "foo [at] example.com"
example_email_2 := "foo@example.com"
match_1 := regex.match(`^[^@]+@[^@]+\.[^@]+$`, example_email_1)
match_2 := regex.match(`^[^@]+@[^@]+\.[^@]+$`, example_email_2)
match_3 := regex.match(`^[^@]+@[^@]+\.[^@]+$`, input.email)
Rule | Output Value | Notes |
---|---|---|
match_1 | false | |
match_2 | true | |
match_3 | false | Depends on user input; false for the sample input above |
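To surface the problem to the user, the same check can feed a message rule. This is only a sketch: the rule name `email_errors` and the message wording are assumptions, not part of the original example.

```rego
package play

import rego.v1

email_pattern := `^[^@]+@[^@]+\.[^@]+$`

# Hypothetical rule: collect a readable message when the email fails the check.
# If input.email is missing, sprintf is undefined and no message is produced,
# so a separate rule would be needed to report a missing field.
email_errors contains msg if {
	not regex.match(email_pattern, input.email)
	msg := sprintf("%q does not look like an email address", [input.email])
}
```

With the sample input above, `email_errors` would contain one message naming "hello at example.com"; with a matching address it would be an empty set.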
Path-based access
Managing access control in web applications is crucial for security. The following example uses Rego's regex.match to define role-based access to different URL paths. By associating URL patterns with user roles like "intern" and "admin," it ensures that users only access authorized paths.
package play
import rego.v1
news_pattern := `^/news/.*`
admin_pattern := `^/admin/.*`
path_patterns := {
"intern": {news_pattern},
"admin": {news_pattern, admin_pattern},
}
default allow := false
allow if {
some pattern in path_patterns[input.role]
regex.match(pattern, input.path)
}
{
"role": "intern",
"path": "/admin/staff/123/salary"
}
Rule | Output Value | Notes |
---|---|---|
allow | false | Interns can't access /admin paths. |
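The patterns above start with `^` for a reason: regex.match succeeds if the pattern is found anywhere in the string, so an unanchored pattern can match more paths than intended. A small sketch with made-up paths illustrates the difference:

```rego
package play

import rego.v1

# true: unanchored, "/admin/" appears somewhere in the path.
unanchored := regex.match(`/admin/`, "/news/admin/archive")

# false: anchored at the start, and this path starts with /news/.
anchored := regex.match(`^/admin/.*`, "/news/admin/archive")

# true: anchored at both ends, the whole path must fit the pattern.
exact := regex.match(`^/news/[a-z]+$`, "/news/today")

# false: extra segments after "today" are rejected by the trailing $.
exact_extra := regex.match(`^/news/[a-z]+$`, "/news/today/../../admin")
```

For access control, anchoring at both ends is usually the safer default when the set of allowed paths is known.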
Validating user text input
Text provided by users is often unstructured and untrusted. To ensure that the data is both safe to use and error-free, regex.match() can be used to validate the data against a simple pattern.
package play
import rego.v1
name_pattern := `^(\p{L}+\s?)+\p{L}+$`
valid_name1 := regex.match(name_pattern, "Juan Pérez")
valid_name2 := regex.match(name_pattern, "张伟")
invalid_name1 := regex.match(name_pattern, "Juan ")
invalid_name2 := regex.match(name_pattern, "- 张伟")
Rule | Output Value | Notes |
---|---|---|
valid_name1 | true | |
valid_name2 | true | |
invalid_name1 | false | This name has a trailing space |
invalid_name2 | false | This name has - at the start |
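In a real policy the names would come from the input rather than from literals. The sketch below assumes a hypothetical input shape, `{"names": ["Juan Pérez", "Juan ", "张伟"]}`, and collects the entries that fail the same pattern:

```rego
package play

import rego.v1

name_pattern := `^(\p{L}+\s?)+\p{L}+$`

# Hypothetical input field: input.names is assumed to be a list of strings.
invalid_names contains name if {
	some name in input.names
	not regex.match(name_pattern, name)
}
```

For the assumed input, `invalid_names` would contain only "Juan ", the entry with the trailing space.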
Case insensitive matching
Sometimes data can be supplied in a variety of cases, and matches need to be the same regardless of case. One example of this is matching GitHub usernames. This is where the (?i) modifier comes in. In the following example we can see how repos with different cases are matched.
package play
import rego.v1
matching_repos contains repo if {
some repo, url in input.repos
regex.match(`(?i)^github\.com/styrainc/`, url)
}
{
"repos": {
"regal": "github.com/styrainc/regal",
"demos": "github.com/StyraInc/opa-sdk-demos",
"enterprise-opa": "github.com/styrainc/enterprise-opa",
"opa": "github.com/open-policy-agent/opa"
}
}
Rule | Output Value |
---|---|
matching_repos | ["demos","enterprise-opa","regal"] |
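When the check is a fixed prefix rather than a real pattern, one possible alternative is to normalize the case and compare strings using the `lower` and `startswith` built-ins. This is a sketch of that option, not a replacement for the example above:

```rego
package play

import rego.v1

# Same selection as the regex version, without a regular expression:
# lowercase the URL, then compare against a lowercase prefix.
matching_repos contains repo if {
	some repo, url in input.repos
	startswith(lower(url), "github.com/styrainc/")
}
```

This avoids having to escape characters such as `.` in the pattern, at the cost of only supporting literal prefixes.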
Here are the common modifiers for regular expressions:
Flag | Description |
---|---|
i | case-insensitive (default false) |
m | multi-line mode: ^ and $ match begin/end line in addition to begin/end text |
s | let . match \n (default false) |
Read more about the supported syntax on the RE2 wiki.
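The m and s flags from the table can be sketched in the same way as (?i). The sample text below is made up for illustration; it is written as a double-quoted string so that `\n` is a real newline.

```rego
package play

import rego.v1

text := "deny\nallow"

# false: without a flag, ^ and $ only match at the start and end of the text.
single_line := regex.match(`^allow$`, text)

# true: (?m) lets ^ and $ match at line boundaries as well.
multi_line := regex.match(`(?m)^allow$`, text)

# false: by default . does not match a newline.
dot_default := regex.match(`^deny.allow$`, text)

# true: (?s) lets . match the newline between the two words.
dot_all := regex.match(`(?s)^deny.allow$`, text)

# true: flags can be combined in one group, here i and s together.
combined := regex.match(`(?is)^DENY.ALLOW$`, text)
```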