JupiterOne Collector
JupiterOne Collectors allow integrations to gather data from within your network, or "behind the firewall". The Collector configures and runs the integration workload within your infrastructure, giving you full control over which resources are accessed and how that data is transmitted to JupiterOne.
The JupiterOne Collector is installed and registered in your infrastructure. It then becomes available as a deployment option for integrations that are marked as compatible. You can deploy more than one JupiterOne Collector simultaneously.
Architecture
The JupiterOne Collector is built as an extension of the JupiterOne platform, with configuration all managed from within the JupiterOne application. The Collector registers itself in your environment and will communicate with the JupiterOne platform to pull its configuration and job queue. There is no requirement for the JupiterOne cloud service to have access into your network, allowing for very secure data collection.
The Collector orchestrates a container runtime environment for the workloads and has straightforward installation requirements (see Installation). The JupiterOne Collector is a lightweight custom container orchestration tool, running the same integration workloads as the cloud service.
In normal operation the JupiterOne collector will be:
- Sending regular heartbeats to the JupiterOne platform
- Reading from its dedicated job queue
Once an integration task is scheduled for a Collector to run, it appears on that Collector's job queue. The job provides the Collector with all the details required to configure and launch the integration task, which runs as a one-time container executing the integration workload.
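The steady-state behavior described above can be sketched as a simple loop. The function names, stub bodies, and bounded iteration below are illustrative assumptions only — the real protocol is internal to JupiterOne. The key point is that every call is outbound HTTPS from the collector; the platform never needs to connect into your network.

```shell
# Illustrative sketch of the collector's steady-state loop (all names are
# hypothetical stubs, not the real JupiterOne API).
send_heartbeat() { :; }               # stub: would POST a heartbeat over HTTPS/443
poll_job_queue() { echo "job-123"; }  # stub: would read the dedicated job queue
launched=0
for tick in 1 2 3; do                 # bounded here; the real loop runs continuously
  send_heartbeat
  job=$(poll_job_queue)
  if [ -n "$job" ]; then
    # On a real collector this would launch a one-time container for the job.
    launched=$((launched + 1))
  fi
done
echo "launched $launched one-time integration containers"
```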
Installation
The JupiterOne Collector runs as a container and requires no dependencies beyond the container runtime and appropriate network connectivity. Infrastructure sizing will depend on the integration workload that you need to accommodate.
System requirements
The JupiterOne Collector requires a modern Linux-based x86_64 environment with a container runtime such as Docker. The Beta/1.0 release of the Collector runs integration workloads sequentially, so the infrastructure should be sized for the single largest integration job you wish to run.
A dedicated host should be allocated for the JupiterOne collector. The container environment must not be shared with other non-JupiterOne services.
| Requirement | Details |
| --- | --- |
| Operating System | Any modern 64-bit systemd Linux distribution should work. The JupiterOne Collector has been specifically tested against: |
| Container Runtime | Whilst any Docker-compatible container runtime should work (e.g., Podman), only the official Docker Engine has been tested and certified to work. See the Container Runtime section below for installation instructions. |
| CPU | 2 vCPU, plus 1 vCPU per additional integration. For a collector running 4 integrations, 4-6 vCPU are recommended. |
| RAM | 2 GB minimum, 8-12 GB recommended. Note: most integration jobs operate with less than 1 GB of RAM, but the memory requirement depends on the number of entities being handled; some integrations may require 2-10 GB of available memory. |
| Networking | The Collector needs to be able to connect to JupiterOne's services at *.us.jupiterone.io or *.eu.jupiterone.io over HTTPS/443. The Collector host also needs connectivity to the integration target; e.g., if running an Active Directory integration, it needs connectivity to the LDAP port of your AD servers. |
| Storage | 50 GB of available local storage. JupiterOne integrations running on the Collector do not themselves need persistent storage; only the Collector requires persistent storage, to maintain a configuration file. The host should also have sufficient local storage to hold logs for the various services. |
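Before installing, the networking requirements above can be sanity-checked from the prospective host. The sketch below probes TCP/443 to the GitHub hosts the collector pulls images from (listed in the FAQ); add the concrete JupiterOne endpoint for your region, since the wildcard domains cannot be probed directly.

```shell
# Pre-install connectivity check: can this host open TCP/443 to the
# registries the collector pulls from? Extend the list with your
# region-specific JupiterOne endpoint (*.us.jupiterone.io or
# *.eu.jupiterone.io - the exact hostname depends on your account).
hosts="ghcr.io github.com pkg-containers.githubusercontent.com"
checked=0
for h in $hosts; do
  checked=$((checked + 1))
  # `nc -z` only tests that a TCP connection can be opened; -w 5 bounds the wait.
  if nc -z -w 5 "$h" 443 2>/dev/null; then
    echo "$h: reachable on 443"
  else
    echo "$h: BLOCKED - review firewall/proxy rules"
  fi
done
echo "$checked hosts checked"
```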
Container Runtime
We recommend using the official Docker Engine. The installation instructions depend on the host OS, and can be found here: https://docs.docker.com/engine/install/
You do not need to install Docker Desktop features. For Ubuntu you would use these instructions: https://docs.docker.com/engine/install/ubuntu/
You can confirm you have a working container runtime environment on your host machine using a test command:
sudo docker run hello-world
This should produce output beginning with "Hello from Docker!", confirming that the Docker engine is running and able to pull and launch containers.
Deploying a Collector
To set up a collector, you will need a JupiterOne account with the Collector features enabled.
- Navigate to Integrations > Collectors, choose New collector. Provide a name for the new collector and select Create:
- Take the terminal command and run it on your new collector instance. You may need to prefix the command with `sudo`, depending on your current user and how permissions are configured. This will kick off the process to set up the collector:
- You can confirm that the collector is running using the `docker ps` command; you should see two containers running (daemon and runner):
- Finally, examine the logs of the runner to ensure it's successfully connecting to the JupiterOne Collector service. The ID of the container can be read from the `docker ps` output above. The logs should show the runner polling for new integration jobs to run.
At this point the collector is set up and running. You will see the collector as "Active" in the collector overview after a few minutes:
The collector containers run with a restart policy of "unless-stopped", so they will restart automatically on failure and start automatically when the container runtime (re)starts, i.e., at system reboot.
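The "two containers running" check can be scripted for ongoing health monitoring. The sketch below parses a saved sample so the snippet is self-contained; on the host, substitute the live command shown in the comment. The `j1-collector-*` names are illustrative — match whatever `docker ps` actually reports on your machine.

```shell
# Health check sketch: count the collector's containers.
# On the host, replace the sample with live output from:
#   docker ps --format '{{.Names}}'
sample='j1-collector-daemon
j1-collector-runner'                  # illustrative names - yours may differ
count=$(printf '%s\n' "$sample" | grep -Ec 'daemon|runner')
echo "$count collector containers running (expect 2: daemon and runner)"
```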
Assigning an Integration
Assigning an integration job to a collector first requires that there are collectors registered and available. Once collectors are available, the process for defining an integration job and assigning it to a collector is straightforward.
For integrations that are collector compatible, complete the integration configuration as normal. During configuration, you'll notice there's an additional option to choose where the integration should run.
Select Collector on the integration instance, and choose the collector on which you'd like the integration to run.
Removing a Collector
A collector can be removed by completing all the following steps:
- Remove the collector from the JupiterOne console using the "Delete Collector" option.
NOTE: A collector cannot be removed while it still has configured integrations; reassign where those integrations run first.
- On the collector host machine, stop and remove (or force-remove) the collector daemon and runner containers.
If you try deleting a collector that has integrations configured you will see a message like the following:
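The container cleanup in step 2 can be assembled as a pair of commands. They are echoed rather than executed in this sketch, and the container names are illustrative — take the real names or IDs from `docker ps` on your host.

```shell
# Step 2 on the collector host: stop and remove the daemon and runner
# containers. Names below are hypothetical placeholders.
containers="j1-collector-daemon j1-collector-runner"
stop_cmd="docker stop $containers"
rm_cmd="docker rm -f $containers"   # -f force-removes a container even if running
printf '%s\n%s\n' "$stop_cmd" "$rm_cmd"
```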
Known Limitations (Beta)
There are some known limitations in the initial version of the JupiterOne Collectors; these will be addressed in future versions.
Unable to Migrate Integrations
It is not currently possible to migrate an existing integration job from the JupiterOne servers to a collector, or between collectors.
The most notable impact of this is that, should a collector be removed or become unavailable, the integration jobs will need to be reconfigured on a new collector.
Integration Jobs are run in Parallel
If a collector is configured to run multiple integration jobs at the same time, these will be run in parallel. The collector pulls its tasks from a job queue, and integration jobs can be placed on the queue simultaneously. When this happens the runner will read and start the jobs as soon as it sees them, irrespective of any jobs already running or the capacity available on the collector.
Note that multiple identical jobs will not queue up, nor will identical jobs run in parallel.
In a future version of the collectors we hope to use historical performance metrics (i.e., CPU and RAM usage) to run simultaneous integration tasks only where the collector capacity permits this.
Limited Options for High Availability (HA)
The JupiterOne collectors are single nodes at this time. This means it's not currently possible to have multiple instances of a collector processing a single integration queue. If the collector becomes unavailable, all integrations assigned to that collector will be held in the queue, although no new duplicate integration tasks will be added to the queue.
In a future version, it will be possible to allocate multiple collectors to a "pool" and have them round-robin the integration jobs, meaning that if one collector instance fails or becomes unavailable, the integration job queue will be processed by the remaining collectors.
Security Considerations
By running JupiterOne Collectors in your own infrastructure, you are entering into a shared responsibility model for the security of that infrastructure. Whilst we at JupiterOne do everything we can to protect your data and configurations, it is the customer's responsibility to ensure that the collector host machine is secure.
The following recommendations should be reviewed and considered:
- Ensure that the host OS and container runtime are kept up to date.
- Ensure that access to the collector host machine is properly managed.
The collector stores the private key used to encrypt the job configs in /etc/.j1config/. This directory is set to be readable only by root, but depending on how you operate the container runtime, these permissions should be reviewed. If this key is accessed, the integration jobs for the collector can be decrypted, potentially exposing the credentials used for the integrations. It is therefore important that all integration accounts are minimal, read-only accounts where possible.
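The permissions review mentioned above can be sketched as a small check. A temporary stand-in directory is used here so the snippet runs anywhere; on the collector host, point it at /etc/.j1config/ instead.

```shell
# Sketch: confirm the key directory is readable only by its owner
# (root on a real collector host). Uses a temp dir as a stand-in.
dir=$(mktemp -d)
chmod 700 "$dir"                      # the expected posture: owner-only access
perms=$(stat -c '%a' "$dir" 2>/dev/null || stat -f '%Lp' "$dir")  # GNU stat, then BSD fallback
if [ "$perms" = "700" ]; then
  echo "OK: $perms - only the owner can read the key material"
else
  echo "WARN: $perms - review who can read this directory"
fi
rm -rf "$dir"
```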
FAQ
Q: Can I run the Collector using EKS, ECS, or OpenShift?
No, the JupiterOne Collector is not just a container. The `daemon` and `runner` containers that get installed orchestrate the container environment using an approach similar to docker-on-docker; the JupiterOne Collector is essentially a mini container orchestration tool. The Collector should be installed on a dedicated machine (typically a small Ubuntu VM or similar) that is reserved for running integration workloads.
Q: Does J1 Collector do network scanning?
No, the JupiterOne collector does not do any additional scanning in your environment. The JupiterOne Collector can only run integration jobs that already exist.
What the Collector does provide is the ability for JupiterOne and the community to build new integration types that would not have been possible with a SaaS-only integration workload.
Q: The integration I want to configure doesn't have a collector option
Not all integrations are compatible with collectors, either because of the way they're built (e.g., the AWS integration and its use of AWS roles for authentication), or because the integration has not been packaged and tested yet.
If there is an integration that you would want to run on a collector that isn't currently compatible, please contact us.
Q: What network connectivity is required?
The collector needs to be able to reach the following domains:
- `ghcr.io`, `pkg-containers.githubusercontent.com`, and `github.com`: the GitHub package registry where the collector and integration images are hosted.
- `*.jupiterone.io`: where the collector registers, sends its heartbeats, reads the integration job queue, and finally sends the data collected by the integration jobs. This can optionally be restricted to the region-specific instance, such as `*.us.jupiterone.io` or `*.eu.jupiterone.io`.
- Whatever domains the integration itself needs to connect to, for example your local vSphere instance.
Q: Do all features of all integrations work?
No, there are some limitations when an integration runs on a JupiterOne collector, but generally these limitations don't impact real use cases. For example, some integrations offer the ability to "Test Credentials"; this operation only works if the integration target is reachable from the JupiterOne cloud infrastructure, in which case you're probably not going to run that integration on a collector.
There may also be issues with some of the authentication methods, for example where the integration authenticates based on a shared trust between JupiterOne infrastructure and the target of the integration.
Q: Can I run the JIRA integration (e.g., create tickets)?
No, the JupiterOne Collector only runs ingest-based integration workloads; it does not include any action- or alert-based integrations at this time.
Q: I installed a J1 Collector but I don’t see it registered in my JupiterOne console
This indicates that the JupiterOne Collector was unable to start or register with JupiterOne. The first step is to review the logs produced by the JupiterOne collector components.
Running the command `docker ps --all` will show which containers ran (or are running). You can expect to find references to "installer", "runner", and "daemon" containers. Use `docker logs <CONTAINER_ID>` to retrieve and review the logs associated with each container, attaching them to a support case if required.
Please be sure to redact any sensitive details prior to sending.
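The triage step above can be scripted: turn `docker ps --all` output into the `docker logs` commands worth running, for any container that exited. A saved sample is parsed here so the snippet is self-contained; on the host, feed it live output instead. The container IDs and names are illustrative.

```shell
# Troubleshooting sketch. On the host, replace the sample with live output:
#   docker ps --all --format '{{.ID}} {{.Names}} {{.Status}}'
sample='aaa111 j1-installer Exited (1) 2 minutes ago
bbb222 j1-runner Up 2 minutes
ccc333 j1-daemon Up 2 minutes'        # illustrative IDs and names
# Emit a log command for every container whose status starts with "Exited".
cmds=$(printf '%s\n' "$sample" | awk '$3 ~ /^Exited/ {print "docker logs " $1 "  # why did " $2 " stop?"}')
echo "$cmds"
```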