This article lists the most common service limits you might encounter as you use Hunters.
What is the purpose of these limits?
These limits protect the Hunters infrastructure from spam leads and help maintain high performance. They also shield Hunters customers from an influx of irrelevant leads and stories. In parallel, our research teams are constantly learning from the data and improving detectors to achieve higher fidelity in lead creation.
Detection limits
| Item | Limit | More about it |
|---|---|---|
| Lead creation from Hunters and custom detectors | 150 per detector per day | The detector |
| Lead creation from third-party detectors | 100K per detector per day | The detector |
📘Note
If a lead’s Start Time and End Time values are not found in the raw data (i.e., no event time is present in the event), Hunters resorts to the following fallback logic:
If `start_time` or `end_time` is null, Hunters populates these fields with the timestamp of the event’s insertion into the data lake (`METADATA$INSERTION_TIME`). If the insertion time is also missing, Hunters uses the lead creation time.
This fallback applies to lead creation only, not to event processing. For detectors where the event time is an inherent part of the logic (e.g., time-windowed detectors or statistical time-series analysis detectors), events with missing timestamps are not adjusted using the logic above; they are omitted and not processed as part of the detection logic.
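For illustration only, here is a minimal Python sketch of that fallback order. The function and parameter names are placeholders invented for this example and are not part of the Hunters product or API.

```python
from datetime import datetime, timezone
from typing import Optional


def resolve_lead_time(
    event_time: Optional[datetime],
    insertion_time: Optional[datetime],
) -> datetime:
    """Resolve a lead's start/end time using the documented fallback order."""
    if event_time is not None:
        return event_time  # 1. Use the event's own timestamp when it exists
    if insertion_time is not None:
        return insertion_time  # 2. Fall back to the data lake insertion time (METADATA$INSERTION_TIME)
    return datetime.now(timezone.utc)  # 3. Otherwise, fall back to the lead creation time
```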
Ingestion limits
Ingestion through an intermediary S3 bucket
| Item | Limit | More about it |
|---|---|---|
| File size | 50MB (compressed) | Files in the connected bucket must be below the specified limit. |
Our permanent ingestion pipeline is designed to handle various data sources and volumes effectively. However, processing extremely large files can lead to performance issues, resource constraints, and unexpected behaviors not only for this pipeline but also for your other interconnected pipelines. By capping the file size at 50MB (after compression), we can maintain optimal pipeline performance and minimize the risk of disruptions.
Consequences of exceeding the file size limit
Pipeline Disruptions: Uploading files larger than the specified limit can cause disruptions in the data processing flow, impacting your data's availability downstream in this pipeline and other interconnected pipelines.
Resource Strain: Processing large files consumes significant system resources, potentially leading to slowdowns not only for data processing in this pipeline but also affecting the performance of your other connected pipelines.
Increased Latency: Larger files take longer to process, which may increase data ingestion latency for your datasets in this pipeline and other connected pipelines.
Best practices
To ensure a smooth and efficient data ingestion process across all your pipelines, please follow these best practices:
Pre-processing Large Files: If you have files that exceed the 50MB limit, consider breaking them down into smaller, manageable chunks before submitting them to the ingestion pipeline (see the sketch after this list).
Compression: Compressing large files (e.g., using gzip) can significantly reduce their size without sacrificing data integrity. Ensure that your compressed files remain within the 50MB limit.
Batching: If you have multiple smaller files to upload, consider batching them together to minimize the number of individual uploads and reduce potential overhead.
Optimized File Formats: Select file formats that balance file size and data integrity. Some formats are more efficient than others for certain types of data.
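As a rough illustration of the pre-processing and compression practices above, the sketch below splits a large newline-delimited file into gzip-compressed chunks and checks that each chunk stays under the 50MB compressed limit. The chunk size and file naming are arbitrary choices for this example, not Hunters requirements.

```python
import gzip
import os

MAX_COMPRESSED_BYTES = 50 * 1024 * 1024  # 50MB limit on compressed files
LINES_PER_CHUNK = 500_000                # arbitrary chunk size; tune for your data


def split_and_compress(source_path: str, output_prefix: str) -> list[str]:
    """Split a large newline-delimited file into gzip-compressed chunks."""
    chunk_paths = []
    chunk_index = 0
    out = None
    try:
        with open(source_path, "rb") as source:
            for line_number, line in enumerate(source):
                # Start a new compressed chunk every LINES_PER_CHUNK lines.
                if line_number % LINES_PER_CHUNK == 0:
                    if out is not None:
                        out.close()
                    chunk_path = f"{output_prefix}.{chunk_index:04d}.gz"
                    chunk_paths.append(chunk_path)
                    out = gzip.open(chunk_path, "wb")
                    chunk_index += 1
                out.write(line)
    finally:
        if out is not None:
            out.close()

    # Verify every compressed chunk is below the ingestion limit before uploading.
    for path in chunk_paths:
        if os.path.getsize(path) > MAX_COMPRESSED_BYTES:
            raise ValueError(f"{path} exceeds the 50MB compressed limit; use a smaller chunk size")
    return chunk_paths
```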
Contacting support
If you encounter any issues or have questions related to data ingestion or file size limitations, our support team is here to assist you. Feel free to reach out to our support channels, and we'll be glad to help you.
By adhering to these file size limit guidelines, you contribute to a stable and reliable data ingestion process across all your pipelines. We appreciate your cooperation and look forward to a successful and uninterrupted data processing experience!