I was asked to come up with some rough numbers on how long it would take to rebuild an indexer if one completely died, that is, if I removed an existing indexer from my multi-site cluster (2 sites) and put a new one in its place. I know there are a lot of variables, but I am asking for help on how to get some rough numbers. The last time this happened, it took about 12 hours for the cluster to meet RF/SF after replacing a single indexer.
**How can I calculate an estimate for this?**
To simplify the question, assume the following:
- 20 indexers (10 at each site)
- 10TB of data (hot+cold) on each indexer
- RF=3, SF=2
- [Splunk recommended hardware][1] (800 IOPS)
- Minimal WAN latency between the two sites (100-150ms)
- Default 5 fix-up tasks per indexer
- 50,000 buckets per indexer
- 10 Gigabit circuit
In essence, the cluster would need to reproduce 10TB of data: 5TB handled by the indexers in one data center and 5TB by the indexers in the other (assuming a 50% split in workload).
Would this just be 10TB = 80,000 Gb, and 80,000 Gb / 5 Gbps = 16,000 seconds (about 4.4 hours)? That's far lower than my real-life experience, where it took 12 hours. What am I missing in my calculation?
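To make the arithmetic easy to check, here is a minimal sketch of the calculation above. It assumes decimal units (1 TB = 8,000 Gb) and treats throughput as the only bottleneck. The function names are my own, and the second function simply backs out the average rate implied by the 12 hours I actually observed.

```python
def transfer_hours(data_tb: float, throughput_gbps: float) -> float:
    """Hours to move data_tb terabytes at throughput_gbps gigabits per second."""
    gigabits = data_tb * 1000 * 8  # decimal units: 1 TB = 1000 GB = 8000 Gb
    return gigabits / throughput_gbps / 3600


def implied_gbps(data_tb: float, hours: float) -> float:
    """Average throughput (Gbps) implied by moving data_tb terabytes in `hours`."""
    gigabits = data_tb * 1000 * 8
    return gigabits / (hours * 3600)


# Naive estimate from the question: 10 TB at an assumed 5 Gbps.
naive = transfer_hours(10, 5)

# Average rate implied by the observed 12-hour recovery.
effective = implied_gbps(10, 12)

print(f"naive estimate: {naive:.2f} hours")
print(f"implied average throughput: {effective:.2f} Gbps")
```

So the 12 hours I saw corresponds to an average of under 2 Gbps, well below the 5 Gbps I assumed. The sketch only quantifies the gap; it says nothing about which factor (disk IOPS, fix-up task limits, WAN latency, bucket count) accounts for it.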
[1]: https://docs.splunk.com/Documentation/Splunk/7.3.2/Capacity/Referencehardware