Regular Expressions

Regular expressions (regex) are available for filtering on property values in J1QL.

Example Queries

All Administrator Roles

Find all roles named Administrator using a case-insensitive match on the role name:

FIND AccessRole WITH name = /administrator/i

All AWS Policies Allowing Update and Delete

Find all AWS entities that are related by an ALLOW policy whose permission flags include both Update and Delete:

FIND * WITH _integrationType = "aws"
THAT ALLOWS >> AS r *
WHERE r.permissionFlags = /..UD.../
RETURN TREE

Supported Features

The features available today are limited to what the upstream storage services support and what is considered safe in terms of regex performance and complexity.

Character Classes

Standard character classes are supported:

  • [0-9]
  • [a-zA-Z]
  • [a-zA-Z0-9]

Some shorthand classes are also supported:

  • \d
  • \D
  • \w
  • \W
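
As a rough illustration of using these classes in a filter, the query below looks for users whose username matches the literal text svc followed by three digits. The username property and the svc prefix are assumptions made for this sketch and are not taken from a documented schema.

```j1ql
FIND User WITH username = /svc\d\d\d/
```

The same pattern could be written with an explicit class, /svc[0-9][0-9][0-9]/, since \d and [0-9] are both listed as supported.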

The following are currently NOT supported:

  • \s and \S: these whitespace shorthand classes are not supported. Use a literal space inside a character class instead.
  • POSIX character classes such as [:digit:]
  • Literal whitespace outside a character class. To match the string john smith, write /john[ ]smith/.
Info: The limitation on \s, \S, and literal whitespace is due to the regex implementation in Elasticsearch. This limitation is expected to be resolved soon.
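
Placed in a full query, the whitespace workaround might look like the following sketch. It reuses the /john[ ]smith/ pattern from the list above and adds the i flag for case-insensitive matching; the User class and name property are borrowed from the other examples on this page.

```j1ql
FIND User WITH name = /john[ ]smith/i
```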

Anchor Tags

The start ^ and end $ anchors are not supported at this time. However, regex filters can be combined with the ^= (starts with) and $= (ends with) comparison operators:

FIND User WITH name ^= /john/i
FIND User WITH name $= /smith/i
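
To approximate a pattern anchored at both ends, the two operators can presumably be applied to the same property in one query. This is a sketch that assumes ^= and $= filters can be joined with AND in the same way as the other regex filters on this page; it has not been checked against a live account.

```j1ql
FIND User WITH (name ^= /john/i AND name $= /smith/i)
```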

Alternation

Regex alternation with | is not supported. However, multiple regex filters can be applied to the same field using the J1QL AND and OR operators:

FIND User WITH (name = /john/i OR name = /smith/i)
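
When the alternatives differ by only a single character, a character class may be able to stand in for alternation within one filter. The sketch below assumes that explicit character sets such as [ae] are accepted in the same way as the ranges listed under Character Classes, which the feature list does not confirm.

```j1ql
FIND User WITH name = /gr[ae]y/i
```

This is intended to cover both gray and grey without needing two filters joined by OR.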

Other Unsupported Features

Regex has many features. Additional features that are not currently supported include:

  • Lookarounds
  • Atomic Groups
  • Possessive Quantifiers

Named capture groups

Named capture groups are supported only in RETURN statements, and only through the REGEX function.

The REGEX function takes two parameters:

  1. The property to search
  2. The regex to search with

REGEX requires that the regex argument contain a named capture group. Anything else will fail to parse.

Capture group names

Capture group names must only be made up of letters. Anything else will fail to parse.
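
As a sketch of a name that satisfies this rule, the query below uses the letters-only group name prefix to capture the first three word characters of a username. The username property and the group name are illustrative assumptions.

```j1ql
FIND User AS u RETURN REGEX(u.username, '(?<prefix>\w\w\w).*')
```

By the rule above, names such as first_letter or prefix1 would be expected to fail to parse, since they contain an underscore and a digit respectively.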

Return values

By default, the REGEX function will return the matched group as its own column, with the capture group name as the column name. The entirety of the match will not be included unless the REGEX function itself is aliased.

In practice, that means that

FIND User as u RETURN REGEX(u.username, '(?<firstLetter>\w).*')

will return a single column firstLetter that contains the first letter of the match. The rest of the match is discarded.

FIND User as u RETURN REGEX(u.username, '(?<firstLetter>\w).*(?<lastLetter>\w)') as username

returns three columns: firstLetter, which contains the first letter of the match; lastLetter, which contains the last letter of the match; and username, which contains the entirety of the match.