
JupiterOne Collector

JupiterOne Collectors allow integrations to gather data from within your network, or "behind the firewall". The Collector configures and runs the integration workload within your infrastructure, giving you full control over which resources are accessed and how that data is transmitted to JupiterOne.

The JupiterOne Collector is installed and registered in your infrastructure. It then becomes available as a deployment option for integrations that are marked as compatible. You can deploy more than one JupiterOne Collector simultaneously.

Architecture

The JupiterOne Collector is built as an extension of the JupiterOne platform, with all configuration managed from within the JupiterOne application. The Collector registers itself in your environment and communicates with the JupiterOne platform to pull its configuration and job queue. The JupiterOne cloud service does not need access into your network, which keeps data collection secure.

The collector orchestrates a container runtime environment for the workloads and has straightforward installation requirements (see Installation). The JupiterOne collector is a lightweight custom container orchestration tool that runs the same integration workloads as the cloud service.

In normal operation the JupiterOne collector will be:

  • Sending regular heartbeats to the JupiterOne platform
  • Reading from its dedicated job queue

Once an integration task is scheduled for a Collector to run, it appears on the Collector's job queue. The job provides the collector with all the details required to configure and launch the integration task. The task is created as a one-time container that runs the integration workload.

Installation

The JupiterOne Collector runs as a container and requires no dependencies beyond the container runtime and appropriate network connectivity. Infrastructure sizing will depend on the integration workload that you need to accommodate.

System requirements

The JupiterOne Collector requires a modern Linux-based x86_64 environment with a container runtime such as Docker. The Beta/1.0 release of the Collector runs integration workloads sequentially, so the infrastructure should be sized based on the single largest integration job you wish to run.

A dedicated host should be allocated for the JupiterOne collector. The container environment must not be shared with other non-JupiterOne services.

Operating System: Any modern 64-bit systemd Linux distribution should work. The JupiterOne Collector has been specifically tested against:

  • Ubuntu Server 22.04 LTS
  • Ubuntu Server 23.04

Container Runtime: Whilst any Docker-compatible container runtime should work (e.g., Podman), only the official Docker engine has been tested and certified to work. See the Container Runtime section for installation instructions.

CPU: 2 vCPU, plus 1 vCPU per additional integration. For a collector running 4 integrations, 4-6 vCPU are recommended.

RAM: 2 GB minimum, 8-12 GB recommended.

Note: Most integration jobs operate with <1 GB of RAM, but the memory requirement depends on the number of entities being handled. Some integrations may require 2-10 GB of available memory.

Networking: The Collector needs to be able to connect to JupiterOne's services at *.us.jupiterone.io or *.eu.jupiterone.io over HTTPS/443. The JupiterOne Collector host also needs connectivity to the integration target; e.g., if running an Active Directory integration, it will need connectivity to the LDAP port of your AD servers.

Storage: 50 GB available local storage. JupiterOne integrations running on the Collector do not themselves need persistent storage. Only the JupiterOne Collector itself requires persistent storage to maintain a configuration file.

The Collector host should have sufficient local storage to hold logs for the various services.
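The sizing guidance above can be turned into a quick host check. The following is a minimal sketch, not an official JupiterOne tool: the function names are hypothetical, the CPU rule is read as "2 vCPU for the first integration, plus 1 for each additional one", and the disk check assumes the relevant storage is the root filesystem.

```shell
#!/bin/sh
# Sketch only: encodes the documented minimums; all function names are hypothetical.

# recommended_vcpus N -> prints the suggested vCPU count for N integrations
# (2 vCPU base, plus 1 vCPU per additional integration).
recommended_vcpus() {
  if [ "$1" -le 1 ]; then
    echo 2
  else
    echo $((2 + $1 - 1))
  fi
}

# meets_minimum ACTUAL REQUIRED -> succeeds when ACTUAL >= REQUIRED.
meets_minimum() {
  [ "$1" -ge "$2" ]
}

# preflight N -> compares this Linux host against the documented minimums
# for a collector that will run N integrations.
preflight() {
  status=0
  want_cpu=$(recommended_vcpus "$1")
  have_cpu=$(nproc)
  # MemTotal is reported in kB; round to the nearest GB so a "2 GB" VM passes.
  ram_gb=$(awk '/^MemTotal:/ {printf "%d", ($2 / 1024 / 1024) + 0.5}' /proc/meminfo)
  # Assumption: the 50 GB of local storage is expected on the root filesystem.
  disk_gb=$(df -Pk / | awk 'NR==2 {printf "%d", $4 / 1024 / 1024}')

  [ "$(uname -m)" = "x86_64" ] || { echo "arch: expected x86_64, found $(uname -m)"; status=1; }
  meets_minimum "$have_cpu" "$want_cpu" || { echo "cpu: want >= $want_cpu vCPU, found $have_cpu"; status=1; }
  meets_minimum "$ram_gb" 2 || { echo "ram: want >= 2 GB, found $ram_gb GB"; status=1; }
  meets_minimum "$disk_gb" 50 || { echo "disk: want >= 50 GB free, found $disk_gb GB"; status=1; }

  if [ "$status" -eq 0 ]; then
    echo "host meets the documented minimums for $1 integration(s)"
  fi
  return "$status"
}

# Usage (on the collector host):
#   preflight 4
```

For 4 integrations the rule gives 5 vCPU, which sits inside the recommended 4-6 vCPU range. The RAM check uses only the 2 GB minimum; hosts running large integrations should be sized against the 8-12 GB recommendation instead.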

Container Runtime

We recommend using the official Docker Engine. The installation instructions depend on the host OS, and can be found here: https://docs.docker.com/engine/install/

Note: You do not need to install Docker Desktop features. For Ubuntu, use these instructions: https://docs.docker.com/engine/install/ubuntu/

You can confirm you have a working container runtime environment on your host machine using a test command:

sudo docker run hello-world

This should produce output that includes the line "Hello from Docker!":

Example docker run result

This confirms that the Docker engine is running and able to pull and launch containers.
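If the test command fails, it can help to tell a missing Docker CLI apart from a daemon that is not running or not reachable by your user. The sketch below is one way to do that; the function names are hypothetical, and it relies only on the standard `docker info` and `docker version` commands.

```shell
#!/bin/sh
# Sketch only: distinguishes "CLI missing" from "daemon unreachable".

# have_cmd NAME -> succeeds when NAME is available on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# check_docker -> reports whether the Docker CLI and daemon are usable.
check_docker() {
  if ! have_cmd docker; then
    echo "docker CLI not found on PATH"
    return 1
  fi
  # 'docker info' exits non-zero when the client cannot talk to the daemon.
  if ! docker info >/dev/null 2>&1; then
    echo "docker CLI found, but the daemon is not reachable (try sudo, or check the service)"
    return 1
  fi
  echo "docker OK, server version $(docker version --format '{{.Server.Version}}')"
}

# Usage:
#   check_docker
```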

Deploying a Collector

To set up a collector, you will need a JupiterOne account with the Collector features enabled.

  1. Navigate to Integrations > Collectors and choose New collector. Provide a name for the new collector and select Create:

Creating a collector in JupiterOne

  2. Copy the terminal command shown and run it in your new collector instance.

Note: You may need to prefix the command with sudo, depending on how your permissions are configured and your current user.

Kicking off the collector setup process in JupiterOne

This will kick off the process to set up the collector.

  3. You can confirm that the collector is running using the docker ps command. You should see two containers running (daemon and runner):

Using the docker ps command to confirm collector is running

  4. Finally, examine the logs of the runner to ensure it's successfully connecting to the JupiterOne Collector service. The ID of the container can be read from the output of the docker ps command above:

Examining runner logs to ensure success in connecting to JupiterOne

This shows the runner polling for new integration jobs to run.
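The two verification steps above can be sketched as shell commands. This is an illustration under assumptions: the helper names are hypothetical, and it assumes the container names contain the words "daemon" and "runner", which the text implies but does not state exactly. Prefix the docker commands with sudo if your user requires it.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; container naming is an assumption.

# list_running -> prints "ID NAME STATUS" for each running container.
list_running() {
  docker ps --format '{{.ID}} {{.Names}} {{.Status}}'
}

# count_matching PATTERN -> counts stdin lines whose NAME column (2nd field)
# contains PATTERN.
count_matching() {
  awk -v pat="$1" 'index($2, pat) > 0 {n++} END {print n + 0}'
}

# Usage:
#   list_running                           # expect a daemon and a runner container
#   list_running | count_matching runner   # expect 1
#   docker logs --tail 50 <CONTAINER_ID>   # recent runner log lines
#   docker logs --follow <CONTAINER_ID>    # keep streaming while it polls
```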

At this point the collector is set up and running. You will see the collector as "Active" in the collector overview after a few minutes:

JupiterOne Collector status Active in the JupiterOne dashboard

Note: The collector containers run with a restart policy of "unless-stopped", so they restart automatically on failure and start automatically when the container runtime (re)starts, i.e., at system reboot time.
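The restart policy mentioned in the note can be confirmed on the host with `docker inspect`. The following is a small sketch with hypothetical helper names; it lists the policy of every running container and filters out any that are not "unless-stopped".

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical.

# restart_policies -> prints "NAME POLICY" for every running container.
# Note that docker inspect reports names with a leading slash.
restart_policies() {
  ids=$(docker ps -q)
  [ -n "$ids" ] || return 0
  # $ids is intentionally unquoted so each ID becomes a separate argument.
  docker inspect --format '{{.Name}} {{.HostConfig.RestartPolicy.Name}}' $ids
}

# unexpected_policies -> keeps only stdin rows whose policy is not "unless-stopped".
unexpected_policies() {
  awk '$2 != "unless-stopped"'
}

# Usage (no output means every running container uses unless-stopped):
#   restart_policies | unexpected_policies
```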

Assigning an Integration

Assigning an integration job to a collector first requires that there are collectors registered and available. Once collectors are available, the process for defining an integration job and assigning it to a collector is straightforward.

For integrations that are collector compatible, complete the integration configuration as normal. During configuration, you'll notice there's an additional option to choose where the integration should run.

Select Collector on the integration instance, and choose the collector on which you'd like the integration to run.

Choosing run on Collector within the JupiterOne integration instance configuration

Removing a Collector

A collector can be removed by completing all the following steps:

  1. Remove the collector from the JupiterOne console using the "Delete Collector" option.

    NOTE: A collector cannot be removed if it has configured integrations. Please reassign where the integration runs first.

  2. On the collector host machine, stop and remove (or force remove) the collector daemon and runner containers.

If you try deleting a collector that has integrations configured you will see a message like the following:

Removing a JupiterOne Collector
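Step 2 of the removal procedure can be sketched with standard Docker commands. The helper names are hypothetical, and the filter assumes the container names contain "daemon" or "runner"; review the selected IDs before removing anything.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; container naming is an assumption.

# collector_container_ids -> reads "ID NAME" lines on stdin and prints the IDs
# of containers whose name contains "daemon" or "runner".
collector_container_ids() {
  awk 'index($2, "daemon") > 0 || index($2, "runner") > 0 {print $1}'
}

# remove_container ID -> stops the container gracefully, then removes it.
remove_container() {
  docker stop "$1" && docker rm "$1"
}

# force_remove_container ID -> removes the container even if it is running.
force_remove_container() {
  docker rm --force "$1"
}

# Usage:
#   docker ps --all --format '{{.ID}} {{.Names}}' | collector_container_ids
#   remove_container <CONTAINER_ID>
```

Stopping before removing gives the container a chance to shut down cleanly; the force variant is the fallback when a container does not stop.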

Known Limitations (Beta)

There are some known limitations in the initial version of the JupiterOne Collectors. These are going to be addressed in future versions.

Unable to Migrate Integrations

It is not currently possible to migrate an existing integration job from the JupiterOne servers to a collector, or between collectors.

The most notable impact of this is that, should a collector be removed or become unavailable, the integration jobs will need to be reconfigured on a new collector.

Integration Jobs are run in Parallel

If a collector is configured to run multiple integration jobs at the same time, these will be run in parallel. The collector pulls its tasks from a job queue, and integration jobs can be placed on the queue simultaneously. When this happens the runner will read and start the jobs as soon as it sees them, irrespective of any jobs already running or the capacity available on the collector.

Multiple identical jobs will not queue up, nor will identical jobs run in parallel.

In a future version of the collectors we hope to use historical performance metrics (i.e., CPU and RAM usage) to run simultaneous integration tasks only where the collector capacity permits this.
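Until capacity-aware scheduling exists, you can watch the load that parallel jobs place on the host yourself. The sketch below uses `docker stats` to take a one-off snapshot and sums the CPU column; the helper names are hypothetical, and this is a monitoring aid rather than anything the Collector does itself.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical.

# cpu_snapshot -> prints "NAME CPU%" for each running container, once.
cpu_snapshot() {
  docker stats --no-stream --format '{{.Name}} {{.CPUPerc}}'
}

# total_cpu_percent -> sums the CPU% column (2nd field) of stdin.
# 100 corresponds to one fully used CPU core.
total_cpu_percent() {
  awk '{gsub(/%/, "", $2); sum += $2} END {printf "%.1f\n", sum}'
}

# Usage:
#   cpu_snapshot
#   cpu_snapshot | total_cpu_percent
```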

Limited Options for High Availability (HA)

The JupiterOne collectors are single nodes at this time. This means it's not currently possible to have multiple instances of a collector processing a single integration queue. If the collector becomes unavailable, all integrations assigned to that collector will be held in the queue, although no new duplicate integration tasks will be added to the queue.

In a future version, it will be possible to allocate multiple collectors to a "pool" and have them round-robin the integration jobs; meaning that should one collector instance fail or become unavailable the integration job queue will be processed by the remaining collectors.

Security Considerations

By running JupiterOne Collectors in your own infrastructure, you are entering into a shared responsibility model for the security of that infrastructure. Whilst we at JupiterOne do everything we can to protect your data and configurations, it is the customer's responsibility to ensure that the collector host machine is secure.

The following recommendations should be reviewed and considered:

  • Ensure that the host OS and container runtime are kept up to date.
  • Ensure that access to the collector host machine is properly managed.

The collector stores the private key used for encryption of the job configs in /etc/.j1config/. This directory is set to be readable only by root, but you should review these permissions in light of how you operate the container runtime.

Caution: If this key is accessed, the integration jobs for the collector can be decrypted, potentially allowing the credentials used for the integrations to be read. It's important that all integration accounts are minimal read-only accounts where possible.
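One way to review the permissions on the key directory is sketched below. The helper names are hypothetical, the script assumes GNU `stat` and `find` (as shipped on Ubuntu), and it needs to run as root because the directory is not readable by other users.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; assumes GNU stat and find.

# is_root_only MODE OWNER -> succeeds when OWNER is root and the octal MODE
# grants nothing to group or other (for example 700 or 600).
is_root_only() {
  case "$1" in
    0|*00) [ "$2" = "root" ] ;;
    *) return 1 ;;
  esac
}

# check_key_dir [DIR] -> reviews DIR (default /etc/.j1config) and its contents.
check_key_dir() {
  dir="${1:-/etc/.j1config}"
  info=$(stat -c '%a %U' "$dir") || return 1
  mode=${info% *}
  owner=${info#* }
  if is_root_only "$mode" "$owner"; then
    echo "ok: $dir is mode $mode, owned by $owner"
  else
    echo "review: $dir is mode $mode, owned by $owner"
  fi
  # List anything inside that has any group or other permission bit set.
  find "$dir" -perm /077 -print
}

# Usage:
#   sudo sh -c '. ./check_key_dir.sh; check_key_dir'
```

The script file name in the usage line is only an example of where you might save these functions.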

FAQ

Q: Can I run the Collector using EKS, ECS, or OpenShift?

No, the JupiterOne Collector is not just a container. The daemon and runner containers that get installed actually orchestrate the container environment using an approach similar to docker-on-docker. The JupiterOne Collector is essentially a mini container orchestration tool. The Collector should be installed on a dedicated machine (typically a small Ubuntu VM or similar) where the machine is dedicated to running integration workloads.

Q: Does J1 Collector do network scanning?

No, the JupiterOne collector does not do any additional scanning in your environment. The JupiterOne Collector can only run integration jobs that already exist.

What the Collector does provide is the ability for JupiterOne and the community to build new integration types that would not have been possible from a SaaS-only integration workload.

Q: The integration I want to configure doesn't have a collector option

Not all integrations are compatible with collectors, either because of the way they're built (e.g., the AWS integration and its use of AWS roles for authentication), or because the integration has not been packaged and tested yet.

If there is an integration that you would want to run on a collector that isn't currently compatible, please contact us.

Q: What network connectivity is required?

The collector needs to be able to reach the following domains:

  • ghcr.io, pkg-containers.githubusercontent.com and github.com: the GitHub package registry where the collector and integration images are hosted.
  • *.jupiterone.io, where the collector registers, sends its heartbeats, reads the integration job queue, and finally sends the data collected by the integration jobs. This can optionally be restricted to the region-specific instance, such as *.us.jupiterone.io or *.eu.jupiterone.io.
  • Whatever domains the integration itself needs to connect to, for example your local vSphere instance.
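Outbound connectivity to the domains above can be probed from the collector host before installation. The following is a minimal sketch that assumes `curl` is installed; the function names are hypothetical. Because the JupiterOne entry is a wildcard, you have to supply a concrete hostname under your region's domain yourself.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; requires curl.

# required_hosts J1_HOST -> prints the hosts to probe, one per line.
# J1_HOST is a concrete hostname you choose under *.us.jupiterone.io
# or *.eu.jupiterone.io.
required_hosts() {
  printf '%s\n' ghcr.io pkg-containers.githubusercontent.com github.com "$1"
}

# can_reach HOST -> succeeds when an HTTPS request to HOST:443 gets any
# HTTP response within 10 seconds (the status code itself is ignored).
can_reach() {
  curl --silent --output /dev/null --max-time 10 "https://$1/"
}

# check_collector_egress J1_HOST -> probes every required host and returns
# non-zero if any probe fails.
check_collector_egress() {
  failed=0
  for host in $(required_hosts "$1"); do
    if can_reach "$host"; then
      echo "ok      $host"
    else
      echo "FAILED  $host"
      failed=1
    fi
  done
  return "$failed"
}

# Usage (replace the argument with a real hostname for your region):
#   check_collector_egress <HOST>.us.jupiterone.io
```

A failed probe shows only that this particular request did not complete; a TLS-inspecting proxy, for instance, can cause failures even when the port is open. Integration targets such as a local vSphere instance need their own checks on the ports they use.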

Q: Do all features of all integrations work?

No, there are some limitations when the integration is running on a JupiterOne collector, but generally these limitations don't impact real use cases. For example, some integrations offer the ability to "Test Credentials". This operation only works if the integration target is reachable from the JupiterOne Cloud infrastructure, in which case you're probably not going to run that integration in a collector.

There may also be issues with some of the authentication methods, for example where the integration authenticates based on a shared trust between JupiterOne infrastructure and the target of the integration.

Q: Can I run the JIRA integration (e.g., create tickets)?

No, the JupiterOne Collector only runs ingest-based integration workloads. It does not include any action/alert-based integrations at this time.

Q: I installed a J1 Collector but I don’t see it registered in my JupiterOne console

This indicates that the JupiterOne Collector was unable to start or register with JupiterOne. The first step is to review the logs produced by the JupiterOne collector components.

Running the command docker ps --all will show which containers ran (or are running). You can expect to find references to "installer", "runner" and "daemon" containers. Use docker logs <CONTAINER_ID> to retrieve and review the logs associated with each container, and attach them to a support case if required.

Caution: Please be sure to redact any sensitive details prior to sending.
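Log collection and a first redaction pass can be sketched as follows. The helper names are hypothetical, and the redaction pattern is an assumption about what secrets might look like (lowercase key=value or key: value pairs); it is a starting point only and does not replace a manual review before sending.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; the redaction pattern is an
# assumption and will not catch every format (for example quoted JSON keys).

# redact -> masks the value after a few common secret-like keys on stdin.
redact() {
  sed -E 's/((password|secret|token|apikey|api_key)[=:][[:space:]]*)[^[:space:]]+/\1[REDACTED]/g'
}

# collect_logs [DIR] -> writes one redacted log file per container
# (running or exited) into DIR (default ./collector-logs).
collect_logs() {
  out_dir="${1:-./collector-logs}"
  mkdir -p "$out_dir"
  docker ps --all --format '{{.ID}} {{.Names}}' | while read -r id name; do
    # stdin is redirected so docker logs cannot consume the list being read.
    docker logs "$id" 2>&1 </dev/null | redact > "$out_dir/$name-$id.log"
  done
}

# Usage:
#   collect_logs ./collector-logs
```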