
Archive data to S3, understanding the options.

I have an indexer cluster with a minimum replication factor of 2 to prevent data loss. I would like to set up Splunk to archive frozen data to an S3 bucket once the retention period has passed (this will eventually be an S3 Glacier bucket, for minimal cost and reliable storage). This data DOES NOT need to be searchable; it just needs to be available for thawing in the future.

Splunk seems to provide a few options, each with advantages and disadvantages, so I am trying to understand which would be best in this scenario.

#### Using cold to frozen script

This seems to fit most of the criteria, but it requires a separate disk area to move the frozen data to first. I also have some questions about this method:

* What is the API of such a script? I cannot find any information. By that I mean: what arguments, if any, does Splunk supply to the script?
* The recommendation is to have the coldToFrozen script move the data and then a separate script move it to S3. Couldn't one instead set auto archiving (coldToFrozenDir) in Splunk and have the second script move the data from there to S3, thus saving one script?

#### Hadoop Data Roll

This one seems to be a bit of an oddball. The information on how it works is spread everywhere. One might think a Hadoop cluster is required, but some information seems to indicate that a Hadoop client on the Splunk host is enough to write to S3. Is this correct? Some other points:

* This is definitely more complicated to set up. Is there a definitive step-by-step guide, with examples, on how to go about it?
* It is a bit unclear how this works. Does it archive warm/cold buckets too? Does it archive frozen data at all?
* I want Splunk to search warm/cold data from the disks, not from S3, but it is unclear whether that is the case here.

So I am a bit confused about the best way to go here.
It feels simple to set up the coldToFrozen script (if I can figure out the API of the script call), but I am willing to get my hands dirty with the Hadoop Data Roll process if that means the archived data in S3 is also searchable. That only makes sense, however, if Splunk still searches hot/warm/cold buckets from disk and only frozen data from S3, given the obvious performance difference.

This is a bit of a long post, but any comments and suggestions that help clarify the issue are more than welcome.

Source:

* https://answers.splunk.com/answers/578596/archive-splunk-buckets-to-aws-s3.html
* https://docs.splunk.com/Documentation/Splunk/7.0.0/Indexer/HowHadoopDataRollworks
* https://docs.splunk.com/Documentation/Splunk/7.0.0/Indexer/Automatearchiving
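
On the question of the coldToFrozen script API: as far as the `indexes.conf` documentation describes it, Splunk calls the script with a single argument, the absolute path of the bucket directory being frozen, and treats exit code 0 as success (after which it deletes the bucket) and any non-zero exit code as a failure to retry later. Under that assumption, the core of such a script might look like the sketch below. The staging path `/opt/splunk_frozen`, the function names, and the assumed `<index>/<db|colddb>/<bucket>` directory layout are all illustrative, not taken from Splunk; a separate job would still sync the staging area to S3.

```python
import shutil
import sys
from pathlib import Path

# Hypothetical staging area; a separate job would sync it to S3.
ARCHIVE_ROOT = Path("/opt/splunk_frozen")


def archive_bucket(bucket_dir, archive_root):
    """Copy one bucket directory to archive_root/<index>/<bucket>; return the destination."""
    bucket = Path(bucket_dir)
    if not bucket.is_dir():
        raise ValueError(f"not a bucket directory: {bucket}")

    # Assumed layout: <index>/<db|colddb>/<bucket>, so the index name is two levels up.
    index_name = bucket.parent.parent.name
    dest = Path(archive_root) / index_name / bucket.name
    dest.parent.mkdir(parents=True, exist_ok=True)

    # Copy to a temporary name first so a half-written copy is never
    # mistaken for a complete archive if the script is interrupted.
    tmp = dest.with_name(dest.name + ".tmp")
    if tmp.exists():
        shutil.rmtree(tmp)
    shutil.copytree(bucket, tmp)

    # A retry after an earlier partial failure may find an old copy; replace it.
    if dest.exists():
        shutil.rmtree(dest)
    tmp.rename(dest)
    return dest


def main(argv, archive_root=ARCHIVE_ROOT):
    """Return 0 on success and 1 on any failure, so the caller keeps the bucket on error."""
    if len(argv) != 2:
        print("usage: coldToFrozen.py <bucket_dir>", file=sys.stderr)
        return 1
    try:
        archive_bucket(argv[1], archive_root)
    except Exception as exc:  # any failure must surface as a non-zero exit code
        print(f"archiving failed: {exc}", file=sys.stderr)
        return 1
    return 0
```

Saved as a standalone file, this would end with `sys.exit(main(sys.argv))` under an `if __name__ == "__main__":` guard and be referenced from the index's `coldToFrozenScript` setting. Copying rather than moving is deliberate here: the original bucket stays in place until the script has reported success. If the built-in `coldToFrozenDir` behavior is enough, the script can be dropped entirely and only the sync-to-S3 job is needed, which is the simplification the second bullet above asks about.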
