JupiterOne Collector

JupiterOne Collectors allow integrations to gather data from within your network, or "behind the firewall". The Collector configures and runs the integration workload on an appliance within your infrastructure, giving you full control over what resources are accessed and how that data is transmitted to JupiterOne.

The JupiterOne Collector is configured in your infrastructure and registered to JupiterOne. It then becomes available as a deployment option for integrations that are marked as compatible. You can deploy more than one JupiterOne Collector simultaneously, and each collector instance is capable of running multiple integrations.

Overview

The JupiterOne collector is intended to be installed on a dedicated appliance within your infrastructure. The collector is capable of running multiple integrations, and will coordinate the execution of these integrations using a job queue managed by the JupiterOne platform.

info

Although the collector uses container orchestration to run integrations, it is designed to be deployed on a dedicated appliance (typically a Linux VM). The collector is not designed to be run in a shared container environment such as a Kubernetes cluster or OpenShift.

The detailed installation instructions are provided below, but in summary the process is:

  1. Install the collector on a dedicated appliance within your infrastructure.
  2. Register the collector with JupiterOne.
  3. Assign integrations to the collector, and configure them to run on the collector.
  4. Monitor the collector and its integrations, and manage the collector configuration through the JupiterOne console.

Why is a container runtime required?

The JupiterOne Collector runs integration workloads as containers, so a container runtime is required. The collector is designed to be run on a dedicated appliance, and the container runtime is installed and managed there. It is not possible to run the collector in a shared container environment such as a Kubernetes cluster or OpenShift.

The JupiterOne collector uses the "docker in docker" approach to run the integration workloads, and the collector itself also runs its processes within containers. This provides a high level of isolation from the host machine and allows the collector to be more resilient to differences in the host environment.

Installation

The JupiterOne Collector runs as a container and requires no dependencies beyond the container runtime and appropriate network connectivity. Infrastructure sizing will depend on the integration workload that you need to accommodate. Most integration workloads require less than 1 GB of RAM and a single vCPU; however, sizing depends on the volume of data being processed rather than on the specific integration workload. The recommendation is to start small and scale up as needed.

System requirements

The JupiterOne Collector requires a modern Linux based x86_64 environment with a local container runtime such as Docker. The collector runs integration workloads sequentially, so the infrastructure should be sized based on the single largest integration job you wish to run.

A dedicated host should be allocated for the JupiterOne collector. It is recommended that the host appliance is not shared with other non-JupiterOne workloads.

Operating System
Any modern 64-bit systemd Linux distribution should work. The JupiterOne Collector has been specifically tested against the following operating systems using the official Docker Engine:
  • Ubuntu Server 22.04 LTS
  • Ubuntu Server 23.04
  • RedHat Enterprise Linux 9.4
  • RedHat Enterprise Linux 8.9

Container Runtime
Whilst any Docker-compatible container runtime should work (e.g., podman), only the official Docker Engine has been tested and certified to work. See the Docker installation instructions here. There are specific notes later in this document if you wish to use podman instead of Docker; see Podman Container Runtime.

CPU
4 vCPU. The CPU requirement is dependent on the integration workload, but 4 vCPU is a good starting point.

RAM
2 GB minimum, 8 - 12 GB recommended. Most integration jobs operate with less than 1 GB of RAM, but the memory requirement will depend on the number of entities being handled. Some very large integration instances, those handling millions of entities, may require 10 GB+ of available memory.

Networking
The Collector needs to be able to connect to JupiterOne's services at *.us.jupiterone.io or *.eu.jupiterone.io over HTTPS/443.

The JupiterOne Collector host also needs connectivity to the integration target; e.g., if running an Active Directory integration, it will need connectivity to the LDAP port of your AD servers.

The container images are pulled from the GitHub package registry at ghcr.io. The container images are all signed by JupiterOne; verifying the signatures requires connectivity to the cosign signing service, also hosted at GitHub.

Storage
50 GB available local storage.

JupiterOne integrations running on the Collector do not themselves need persistent storage. Only the JupiterOne Collector itself requires persistent storage to maintain a configuration file. The Collector host should have sufficient local storage to hold logs for the various services.

Container Runtime

We recommend using the official Docker Engine. The installation instructions depend on the host OS, and can be found here: https://docs.docker.com/engine/install/

note

You do not need to install Docker Desktop features. For Ubuntu you would use these instructions: https://docs.docker.com/engine/install/ubuntu/

You can confirm you have a working container runtime environment on your host machine using a test command:

sudo docker run hello-world

This command should produce output similar to the following:

Example docker run result
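
The exact wording varies by Docker version, but the output should include lines similar to the following (an illustrative excerpt of the standard hello-world message, not taken verbatim from the screenshot above):

Hello from Docker!
This message shows that your installation appears to be working correctly.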

This confirms that the docker engine is running and able to pull and launch containers. At this point you are ready to deploy a collector.

Deploying a Collector

To set up a collector, you will need a JupiterOne account.

  1. Navigate to Integrations > Collectors and choose New collector. Provide a name for the new collector and select Create:

Creating a collector in JupiterOne

  2. Copy the terminal command shown and run it on your new collector instance.
note

You may need to prefix the command with sudo depending on how your permissions are configured and your current user.

Kicking off the collector setup process in JupiterOne

Explanation of the command:

docker run -e JUPITERONE_AUTH_TOKEN='d03c294ffc3cd505608e97d96da27c70c1ff1ee278ddd86b7131373239363838XXXXXXXX' -e JUPITERONE_COLLECTOR_ID='3b445384-cccc-bbbb-aaaa-e0c3e17d6522' -e JUPITERONE_ACCOUNT_ID='j1dev' -e JUPITERONE_API_BASE_URL='https://api.dev.jupiterone.io' -v /etc/.j1config:/etc/.j1config -v /var/run/docker.sock:/var/run/docker.sock ghcr.io/jupiterone/collector-scripts/installer:latest
  • docker run is the command to run the installer container.
  • -e are environment variables that are passed to the container.
    • JUPITERONE_AUTH_TOKEN is the authentication token for the collector.
    • JUPITERONE_COLLECTOR_ID is the ID of the collector.
    • JUPITERONE_ACCOUNT_ID is the ID of your JupiterOne account.
    • JUPITERONE_API_BASE_URL is the base URL of the JupiterOne API, varying by region.
  • -v are volume mounts for the container.
    • /etc/.j1config:/etc/.j1config is the volume mount for the collector configuration. This file contains sensitive information and is stored on the host machine. See the Security Considerations section below for notes on securing it.
    • /var/run/docker.sock:/var/run/docker.sock is the volume mount for the Docker socket. This is what allows the collector to manage the container runtime and launch the integration jobs.
  • ghcr.io/jupiterone/collector-scripts/installer:latest is the container image to run.

Running this command on your Collector host will kick off the process to set up the collector.

  3. You can confirm that the collector is running using the docker ps command; you should see two containers running (daemon and runner):

Using the docker ps command to confirm collector is running

  • Daemon: This is the collector daemon; it manages the local state of the collector and performs upgrades, health checks, and so on.
  • Runner: This is the runner container; it manages the job queue and launches integration jobs.
  4. Finally, examine the logs of the runner to ensure it's successfully connecting to the JupiterOne Collector service and polling for integration jobs. The ID of the container can be read from the docker ps output above:

Examining runner logs to ensure success in connecting to JupiterOne

This shows the runner polling for new integration jobs to run.
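
As a quick sketch, the runner logs can be followed with standard Docker commands (the container ID below is a placeholder for the runner ID shown by docker ps):

sudo docker logs --follow <RUNNER_CONTAINER_ID>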

At this point the collector is set up and running. You will see the collector as "Active" in the collector overview after a few minutes:

note

The collector containers are running with a restart policy of "unless-stopped", so they will restart automatically on failure, and will start automatically when the container runtime (re)starts, i.e. at system reboot time.
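
If you want to verify the restart policy on your host, a standard Docker inspect query can be used (the container ID is a placeholder); it should print "unless-stopped":

sudo docker inspect --format '{{.HostConfig.RestartPolicy.Name}}' <CONTAINER_ID>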

JupiterOne Collector status Active in the JupiterOne dashboard

Podman Container Runtime

If you prefer to use podman instead of Docker, that is possible with some additional configuration and packages. The following instructions are based on testing with podman on RedHat Enterprise Linux 9.4 and assume that podman is already installed and working; installation of podman is outside the scope of this document.

The following are required to make podman work with the collector by introducing docker compatibility:

  1. Install the podman-docker and podman-remote packages:
sudo dnf install -y podman-docker podman-remote
  2. Enable the podman socket:
sudo systemctl enable --now podman.socket
  3. Create the JupiterOne configuration directory on the host machine:
sudo mkdir -p /etc/.j1config/
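
Before running the installer you can sanity-check the Docker compatibility layer; these are standard podman/systemd commands, and successful output is only a rough indication that the socket is active and the docker alias resolves to podman:

sudo systemctl status podman.socket
sudo docker info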

Then, when using the installer command, you will need to make two modifications: call podman rather than docker, and pass the --privileged flag to the installer. You may also need to prefix the command with sudo:

sudo podman run --privileged -e JUPITERONE_AUTH_TOKEN='d03c294ffc3cd505608e97d96da27c70c1ff1ee278ddd86b7131373239363838XXXXXXXX' -e JUPITERONE_COLLECTOR_ID='3b445384-cccc-bbbb-aaaa-e0c3e17d6522' -e JUPITERONE_ACCOUNT_ID='j1dev' -e JUPITERONE_API_BASE_URL='https://api.dev.jupiterone.io' -v /etc/.j1config:/etc/.j1config -v /var/run/docker.sock:/var/run/docker.sock ghcr.io/jupiterone/collector-scripts/installer:latest

caution

The --privileged flag is required to allow the collector to manage the container runtime when using podman. It is recommended that you review the security implications of running the collector with these privileges.

Assigning an Integration

Assigning an integration job to a collector first requires that there are collectors registered and available. Once collectors are available, the process for defining an integration job and assigning it to a collector is straightforward.

For integrations that are collector compatible, complete the integration configuration as normal. During configuration, you'll notice there's an additional option to choose where the integration should run.

Select Collector on the integration instance, and choose the collector on which you'd like the integration to run.

Choosing run on Collector within the JupiterOne integration instance configuration

Removing a Collector

A collector can be removed by completing all the following steps:

  1. Remove the collector from the JupiterOne console using the "Delete Collector" option.

    NOTE: a collector cannot be removed if it has configured integrations; please reassign the affected integrations to run elsewhere first.

  2. On the collector host machine, stop and remove (or force remove) the collector daemon and runner containers, for example using the commands sketched below.
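
A minimal sketch using standard Docker commands (container IDs are placeholders for the daemon and runner IDs shown by docker ps):

sudo docker ps
sudo docker stop <DAEMON_CONTAINER_ID> <RUNNER_CONTAINER_ID>
sudo docker rm <DAEMON_CONTAINER_ID> <RUNNER_CONTAINER_ID>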

If you try deleting a collector that has integrations configured you will see a message like the following:

Removing a JupiterOne Collector

Known Limitations

There are some known limitations in the initial version of the JupiterOne Collectors. These are going to be addressed in future versions.

Unable to Migrate Integrations

It is not currently possible to migrate an existing integration job from the JupiterOne servers to a collector, or between collectors.

The most notable impact of this is that, should a collector be removed or become unavailable, the integration jobs will need to be reconfigured on a new collector.

Integration Jobs are run in Parallel

If a collector is configured to run multiple integration jobs at the same time, these will be run in parallel. The collector pulls its tasks from a job queue, and integration jobs can be placed on the queue simultaneously. When this happens the runner will read and start the jobs as soon as it sees them, irrespective of any jobs already running or the capacity available on the collector.

It should be noted that multiple identical jobs will not queue up, nor will identical jobs run in parallel.

In a future version of the collectors we hope to use historical performance metrics (i.e., CPU and RAM usage) to run simultaneous integration tasks only where the collector capacity permits this.

Limited Options for High Availability (HA)

The JupiterOne collectors are single nodes at this time. This means it's not currently possible to have multiple instances of a collector processing a single integration queue. If the collector becomes unavailable, all integrations assigned to that collector will be held in the queue, although no new duplicate integration tasks will be added to the queue.

In a future version, it will be possible to allocate multiple collectors to a "pool" and have them round-robin the integration jobs; meaning that should one collector instance fail or become unavailable the integration job queue will be processed by the remaining collectors.

Security Considerations

By running JupiterOne Collectors in your own infrastructure, you are entering into a shared responsibility model for the security of that infrastructure. Whilst we at JupiterOne do everything we can to protect your data and configurations, it is the customer's responsibility to ensure that the collector host machine is secure.

The following recommendations should be reviewed and considered:

  • Ensure that the host OS and container runtime are kept up to date.
  • Ensure that access to the collector host machine is properly managed.

The collector stores the private key used for encryption of the job configs in /etc/.j1config/. This directory is set to be readable only by root, but depending on how you operate the container runtime these permissions should be reviewed.
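
As a rough sketch for reviewing and tightening these permissions (standard coreutils commands; adjust to your own operational model):

sudo ls -ld /etc/.j1config
sudo chmod 700 /etc/.j1config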

caution

If this key is accessed, the integration job configurations for the collector can be decrypted, potentially allowing the credentials used for the integrations to be read. It's important that all integration accounts are minimal, read-only accounts where possible.

Architecture Notes

The JupiterOne Collector is built as an extension of the JupiterOne platform, with configuration all managed from within the JupiterOne application. The Collector registers itself in your environment and will communicate with the JupiterOne platform to pull its configuration and job queue. There is no requirement for the JupiterOne cloud service to have access into your network, allowing for secure data collection.

The collector orchestrates a container runtime environment for the workloads, with very straightforward installation requirements; see Installation. The JupiterOne collector is a lightweight custom container orchestration tool, running the same integration workloads as in the JupiterOne cloud service.

In normal operation the JupiterOne collector will be:

  • Sending regular heartbeats to the JupiterOne platform
  • Reading from its dedicated job queue

Once an integration task is scheduled to run on a Collector, it will appear on the Collector's job queue. This job will provide the collector with all the details required to configure and launch the integration task. The task will be created as a one-time container that runs the integration workload.

FAQ

Q: Can I run the Collector using EKS, ECS, or OpenShift?

No, the JupiterOne Collector is not just a container. The daemon and runner containers that get installed actually orchestrate the container environment using an approach similar to docker-on-docker. The JupiterOne Collector is essentially a mini container orchestration tool. The Collector should be installed on a dedicated machine (typically a small Ubuntu VM or similar) where the machine is dedicated to running integration workloads.

Q: Does J1 Collector do network scanning?

No, the JupiterOne collector does not do any additional scanning in your environment. The JupiterOne Collector can only run integration jobs that already exist.

What the Collector does provide is the ability for JupiterOne and the community to build new integration types that would not have been possible with a SaaS-only integration workload.

Q: The integration I want to configure doesn't have a collector option

Not all integrations are compatible with collectors, either because of the way they're built (e.g., the AWS integration and its use of AWS roles for authentication), or because the integration has not been packaged and tested yet.

If there is an integration that you would want to run on a collector that isn't currently compatible, please contact us.

Q: What network connectivity is required?

The collector needs to be able to reach the following domains (a basic connectivity check is sketched below the list):

  • ghcr.io, pkg-containers.githubusercontent.com and github.com. The GitHub package registry where the collector and integration images are hosted.
  • *.jupiterone.io, where the collector registers, sends its heartbeats, reads the integration job queue, and finally sends the data collected by the integration jobs. This can be optionally restricted to the region-specific instance, such as *.us.jupiterone.io or *.eu.jupiterone.io.
  • Whatever domains the integration itself needs to connect to, for example your local vSphere instance.
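
As a basic connectivity check from the collector host (illustrative only; api.us.jupiterone.io is used here simply as an example endpoint matching the *.us.jupiterone.io pattern, swap in the EU hostname if applicable):

curl -sSI https://ghcr.io
curl -sSI https://api.us.jupiterone.io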

Q: Do all features of all integrations work?

No, there are some limitations when the integration is running on a JupiterOne collector, but generally these limitations don't impact real use cases. For example, you may find that some integrations offer the ability to "Test Credentials"; this operation will only work if the integration target is reachable from the JupiterOne Cloud infrastructure, in which case you're probably not going to run that integration on a collector.

There may also be issues with some of the authentication methods, for example where the integration authenticates based on a shared trust between JupiterOne infrastructure and the target of the integration.

Q: Can I run the JIRA integration (e.g., create tickets)?

No, the JupiterOne Collector only runs ingest-based integration workloads; it does not include any action/alert-based integrations at this time.

Q: I installed a J1 Collector but I don’t see it registered in my JupiterOne console

This indicates that the JupiterOne Collector was unable to start or register with JupiterOne. The first step is to review the logs produced by the JupiterOne collector components.

Running the command docker ps --all will show which containers ran (or are running). You can expect to find references to the "installer", "runner", and "daemon" containers. Please use docker logs <CONTAINER_ID> to retrieve the logs associated with the containers and review them, attaching the logs to a support case if required; example commands are sketched below.
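
A minimal sketch for gathering the logs (the container ID is a placeholder taken from the docker ps --all output; the output filename is just an example):

sudo docker ps --all
sudo docker logs <CONTAINER_ID> > collector-logs.txt 2>&1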

caution

Please be sure to redact any sensitive details prior to sending.