Subqueries
J1QL supports using a subquery as a filter for a property on an outer query.
These queries allow for filtering results from an outer query by specifying one or more subqueries to use. This is a powerful way to build queries across disjoint sections of the graph, or to simplify queries that would otherwise require complex traversals.
This feature is currently marked as BETA
. Please see the notes below regarding known limitations.
Subqueries as Filters
J1QL subqueries can be used in place of property filter values for equality (=
) and inequality (!=
) checks. This allows for the filter to be dynamic, in contrast to using something like a query parameter which is a fixed value, or list of values.
Subquery Filter Syntax
Subqueries can be used in WHERE
and WITH
filters, and can be combined with other filters. The most important thing about writing a subquery is that it must return only a single property from which to build the filter.
Subqueries are contained in parentheses and should follow the following pattern:
FIND something AS s
WHERE s.property = ( FIND something_else AS se RETURN se.id )
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ This is the subquery
The supported operators are equality (=
) and inequality (!=
) only at this time. Subqueries do not support returning regex expressions, starts|endswith, or contains.
Example
In the example below, we will look at a relatively simple query that can be completed both with and without the use of subqueries. In this example, we are looking for all AWS S3 buckets where the bucket belongs to an AWS account that falls under the Prod
organizational unit in AWS.
J1QL without subqueries:
FIND aws_organizational_unit WITH name = "Prod"
THAT HAS aws_account
THAT HAS aws_s3
THAT HAS aws_s3_bucket AS bucket
RETURN bucket.displayName, bucket.arn, bucket.accountId
J1QL with subqueries:
FIND aws_s3_bucket AS bucket
WHERE bucket.accountId = ( FIND UNIQUE aws_organizational_unit WITH name = "Prod" THAT RELATES TO aws_account AS a RETURN a.accountId )
RETURN bucket.displayName, bucket.arn, bucket.accountId
In this example, both queries are of similar complexity, and the relationships do exist in the graph to build the query without using subqueries, but the examples are not always so simple!
We may now decide we want these same results, but only where the aws_account entity has an email address that does not correspond to a User in our identity system's security team.
Given our initial J1QL without subqueries:
FIND aws_organizational_unit WITH name = "Prod"
THAT HAS aws_account <-- We would need a tangential traversal to filter these entities
THAT HAS aws_s3
THAT HAS aws_s3_bucket AS bucket
RETURN bucket.displayName, bucket.arn, bucket.accountId
We would need to filter the aws_account nodes in this query based on the email addresses of the users in our security team. This is not possible for two reasons: we cannot combine in an optional/tangential traversal halfway through our query, and JupiterOne does not build relationships between aws_account and User entities based on the email contact.
How would we solve this with Subqueries?
FIND aws_s3_bucket AS bucket
THAT HAS aws_s3
THAT HAS aws_account WITH email != ( FIND google_group WITH name = "Security" THAT HAS google_user AS u RETURN u.email )
WHERE bucket.accountId = ( FIND UNIQUE aws_organizational_unit WITH name = "Prod" THAT RELATES TO aws_account AS a RETURN a.accountId )
RETURN bucket.displayName, bucket.arn, bucket.accountId
Here there are two subqueries being used. The first is specifically filtering S3 buckets that belong to accounts where the account email is NOT found in the Google "Security" user members' emails. Note that this is a relationship that does not exist in the graph and was previously impossible to join this data. The second subquery is as before, looking only for accounts that belong to the "Prod" organizational unit.
Known Limitations
The subqueries in J1QL have the following limitations that are good to understand as you write queries:
RETURN limit: The subquery is limited to return up to 5000 values for building the filter. This value has been found to provide good performance in the research conducted during development. It is not possible to adjust this limit on a per-customer basis, but in the future, JupiterOne hopes to be able to increase this limit.
It is recommended that you check your subqueries to ensure that you will not exceed the 5000 return limit. At this time, JupiterOne cannot inform the user when a subquery has been truncated and results may therefore be incomplete. See: Release Status below.
Subquery Performance: The subquery must be performant. Because the filter must be built in a single query as part of the outer query execution flow, it is not possible to apply JupiterOne's graph pagination strategy to the subquery. The query timeout applies to both the subquery and the outer queries' execution time.
Release Status
At this time, the subqueries capability is released as a BETA
feature. This means that while we believe the feature is functionally complete, we are also aware of specific limitations or shortcomings that we wish to overcome, and we are expecting feedback from customers on the future of the feature.
The notable improvement that is currently pending, and the specific enhancement that will allow JupiterOne to make this feature Generally Available (GA), is query feedback. For the Known Limitations mentioned above, we need to be able to communicate to the user that those limitations have been hit, either that the subquery has timed out or returned too many results and was truncated.