Use these estimators as a starting point for deciding on the number of machines needed for capture and OpenSearch/Elasticsearch nodes.
Calculating the number of machines needed for capture is relatively simple. It is based on the average traffic rate, the number of days of retention, how much space is available on each machine, and the average amount of traffic each machine can handle. If more than one machine is required, we highly recommend getting an NPB (network packet broker) to load balance the traffic across the cluster. We suggest RAID 5, RAID 50, RAID 6, or RAID 10 for capture disks.
Arkime makes it possible to not save encrypted packets, other than the session negotiation. If you plan on using this feature, select the percentage of TLS/QUIC traffic on the network. Most networks will see 10-40% of TLS traffic, resulting in huge disk space savings.
Arkime can compress PCAP when saving the files to disk using standard gzip or zstd format. Most networks will see a 20% savings in disk usage with compression turned on; however, this will increase CPU usage. Starting with Arkime 4.0, compression is turned on by default, but it can be disabled.
Space Required | All disks for data (RAID 0) | One disk extra (RAID 5) | Two disks extra (RAID 6 or RAID 5 + Hot Spare) |
---|---|---|---|
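The capture estimate described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the formula the Arkime estimator itself uses: the function names and parameters are hypothetical, the TLS/QUIC share is treated as not stored at all (the session negotiation packets are ignored), and compression savings are applied only to the traffic that remains.

```python
import math


def usable_disks(total_disks: int, extra_disks: int) -> int:
    """Disks left for data: extra_disks is 0 for RAID 0, 1 for RAID 5,
    2 for RAID 6 or RAID 5 + hot spare (matching the table columns)."""
    return max(total_disks - extra_disks, 0)


def capture_estimate(avg_gbps: float,
                     retention_days: int,
                     usable_tb_per_machine: float,
                     max_gbps_per_machine: float,
                     tls_fraction: float = 0.0,
                     compression_savings: float = 0.0):
    """Return (stored_tb, machines) for a capture cluster.

    Assumptions (hypothetical, for illustration only):
    - tls_fraction of the traffic is not written to disk at all
    - compression_savings applies to whatever is written
    - 1 TB = 10**12 bytes
    """
    bytes_per_day = avg_gbps * 1e9 / 8 * 86400
    raw_tb = bytes_per_day * retention_days / 1e12
    stored_tb = raw_tb * (1 - tls_fraction) * (1 - compression_savings)

    # A machine count must satisfy both the disk-space and the traffic-rate limit.
    by_space = math.ceil(stored_tb / usable_tb_per_machine)
    by_rate = math.ceil(avg_gbps / max_gbps_per_machine)
    return stored_tb, max(by_space, by_rate, 1)


if __name__ == "__main__":
    # 12 disks of 4 TB in RAID 6 leave 10 data disks, i.e. 40 TB per machine.
    per_machine_tb = usable_disks(12, 2) * 4
    stored, machines = capture_estimate(1.0, 14, per_machine_tb, 2.0,
                                        tls_fraction=0.3,
                                        compression_savings=0.2)
    print(f"{stored:.1f} TB stored, {machines} capture machines")
```

Taking the larger of the space-based and rate-based counts reflects that a cluster has to meet both constraints at once; as with the estimator, treat the result as a starting point.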
Calculating the number of machines needed for OpenSearch/Elasticsearch is a fine art. It depends heavily on the type of traffic that Arkime will be seeing, as well as the traffic rate and the number of days of retention. Each node requires 64GB-128GB of memory: 30GB for OpenSearch/Elasticsearch, and 34-96GB for OS disk cache. For large machines, plan on running multiple nodes per host. You may want to read more recommendations from Elastic's Reference and Blog.
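As a rough illustration of the memory guidance above, the following sketch works out how many nodes fit on one host and how much RAM is left per node for the OS disk cache. The function name and the `node_ram_gb` parameter are hypothetical; the only figures taken from the text are the 30GB given to OpenSearch/Elasticsearch and the 64GB lower bound per node.

```python
HEAP_GB = 30  # memory given to OpenSearch/Elasticsearch per node (from the text)


def nodes_per_host(host_ram_gb: int, node_ram_gb: int = 64):
    """Return (nodes, disk_cache_gb_per_node) for a single host.

    node_ram_gb is the memory budget per node; the text suggests 64-128.
    Returns (0, 0.0) when the host is too small for even one node.
    """
    nodes = host_ram_gb // node_ram_gb
    if nodes == 0:
        return 0, 0.0
    # Whatever is not used by the node itself is available as OS disk cache.
    cache_per_node = host_ram_gb / nodes - HEAP_GB
    return nodes, cache_per_node


if __name__ == "__main__":
    for ram in (48, 128, 256):
        print(ram, nodes_per_host(ram))
```

Raising `node_ram_gb` toward 128 trades node count for more disk cache per node; which end of the range suits a deployment depends on the traffic mix.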
Many scaling guides will recommend you do NOT use RAID 5, assuming you will use OpenSearch/Elasticsearch replication. However, by default Arkime does NOT enable replication, so it is strongly recommended that you DO use RAID 5 or RAID 6. If you decide to use OpenSearch/Elasticsearch replication, you will need more machines, but in theory you won't need RAID 5.
The calculated host counts are just estimates.
Traffic mix | Total Space Required | All disks for data (RAID 0) | One disk extra (RAID 5) | Two disks extra (RAID 6 or RAID 5 + Hot Spare) |
---|---|---|---|---|
Average traffic mix | | | | |
High DNS/HTTP traffic | | | | |
Pathological traffic mix | | | | |