Subqueries
J1QL supports using a subquery as a filter for a property on an outer query.
These queries allow for filtering results from an outer query by specifying one or more subqueries to use. This is a powerful way to build queries across disjoint sections of the graph, or to simplify queries that would otherwise require complex traversals.
Subqueries as Filters
J1QL subqueries can be used in place of property filter values for equality (=
) and inequality (!=
) checks. This allows for the filter to be dynamic, in contrast to using something like a query parameter which is a fixed value, or list of values.
Subquery Filter Syntax
Subqueries can be used in WHERE
and WITH
filters, and can be combined with other filters. The most important thing about writing a subquery is that it must return only a single property from which to build the filter.
Subqueries are contained in parentheses and should follow the following pattern:
FIND something AS s
WHERE s.property = ( FIND something_else AS se RETURN se.id )
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ This is the subquery
The supported operators are equality (=
) and inequality (!=
) only at this time. Subqueries do not support returning regex expressions, starts|endswith, or contains.
Example
In the example below, we will look at a relatively simple query that can be completed both with and without the use of subqueries. In this example, we are looking for all AWS S3 buckets where the bucket belongs to an AWS account that falls under the Prod
organizational unit in AWS.
J1QL without subqueries:
FIND aws_organizational_unit WITH name = "Prod"
THAT HAS aws_account
THAT HAS aws_s3
THAT HAS aws_s3_bucket AS bucket
RETURN bucket.displayName, bucket.arn, bucket.accountId
J1QL with subqueries:
FIND aws_s3_bucket AS bucket
WHERE bucket.accountId = ( FIND UNIQUE aws_organizational_unit WITH name = "Prod" THAT RELATES TO aws_account AS a RETURN a.accountId )
RETURN bucket.displayName, bucket.arn, bucket.accountId
In this example, both queries are of similar complexity, and the relationships do exist in the graph to build the query without using subqueries, but the examples are not always so simple!
We may now decide we want these same results, but only where the aws_account entity has an email address that does not correspond to a User in our identity system's security team.
Given our initial J1QL without subqueries:
FIND aws_organizational_unit WITH name = "Prod"
THAT HAS aws_account <-- We would need a tangential traversal to filter these entities
THAT HAS aws_s3
THAT HAS aws_s3_bucket AS bucket
RETURN bucket.displayName, bucket.arn, bucket.accountId
We would need to filter the aws_account nodes in this query based on the email addresses of the users in our security team. This is not possible for two reasons: we cannot combine in an optional/tangential traversal halfway through our query, and JupiterOne does not build relationships between aws_account and User entities based on the email contact.
How would we solve this with Subqueries?
FIND aws_s3_bucket AS bucket
THAT HAS aws_s3
THAT HAS aws_account WITH email != ( FIND google_group WITH name = "Security" THAT HAS google_user AS u RETURN u.email )
WHERE bucket.accountId = ( FIND UNIQUE aws_organizational_unit WITH name = "Prod" THAT RELATES TO aws_account AS a RETURN a.accountId )
RETURN bucket.displayName, bucket.arn, bucket.accountId
Here there are two subqueries being used. The first is specifically filtering S3 buckets that belong to accounts where the account email is NOT found in the Google "Security" user members' emails. Note that this is a relationship that does not exist in the graph and was previously impossible to join this data. The second subquery is as before, looking only for accounts that belong to the "Prod" organizational unit.
Important Considerations and Limits
The subqueries in J1QL have the following limitations that are good to understand as you write queries:
-
RETURN limit: The subquery is limited to return up to 5000 values for building the filter. This limit has been found to provide a good balance between usability and performance. It is not possible to adjust this limit on a per-customer basis, but in the future, JupiterOne hopes to be able to increase this limit.
It is recommended that you check your subqueries to ensure that you will not exceed the 5000 return limit. If your query does exceed the limit JupiterOne will inform the user that "The query results may be incomplete. One or more subqueries exceeded the maximum number of results".
-
Subquery Performance: The subquery must be performant. Because the filter must be built in a single query as part of the outer query execution flow, it is not possible to apply JupiterOne's graph pagination strategy to the subquery. The query timeout applies to both the subquery and the outer queries' execution time.