Channel: Questions in topic: "splunk-enterprise"

Should I use an index-time field extraction?

Dear fellow Splunkers,

I have seen the [docs](http://docs.splunk.com/Documentation/Splunk/6.1.3/Indexer/Indextimeversussearchtime) on index-time field extractions and a few related answers [here](https://answers.splunk.com/answers/67170/index-time-field-extraction.html), [there](https://answers.splunk.com/answers/57247/index-time-field-extraction.html) or [there](https://answers.splunk.com/answers/5817/search-time-versus-index-time-field-extractions.html), all with the general guidance that an index-time extraction is rarely needed or beneficial.

However, I have a dedicated index that holds Apache log files for a lot of different virtual hosts. I have set up search-time field extractions to get the `apache_virtualhost` and the HTTP status code. Now the following search

`index=web apache_virtualhost=some.virtual.host | timechart count by status`

is _very_ slow and does not complete even after a few minutes, keeping the CPU 100% busy. However,

`index=web source=/path/to/logs/for/this/vhost-only.log | timechart count by status`

returns a result in an acceptable amount of time.

Being used to relational DBs, I immediately thought "sure, in the second case Splunk can retrieve the small subset of matching rows from the index, whereas the first case needs to push _all_ rows through the regexp first". But that is probably not the way Splunk works.

So, is there a simple explanation for this? Have I found one of the rare cases where index-time field extraction would make sense?

Thanks for sharing your insights!
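For reference, here is a minimal sketch of what the index-time extraction I am considering might look like. It is an illustration only, not a tested setup: the sourcetype name `apache_vhost_access`, the transform name `apache_vhost_indexed`, and the regex are hypothetical, and the regex assumes the virtual host is the first whitespace-delimited token of each log line, which may not match the actual log format. The settings themselves (`REGEX`, `FORMAT`, `WRITE_META`, `TRANSFORMS-<class>`, `INDEXED`) are the ones the index-time extraction docs describe.

```ini
# transforms.conf
# Hypothetical transform: capture the first token of the event as the vhost.
[apache_vhost_indexed]
REGEX = ^(\S+)\s
# field::value form writes the capture as an indexed field
FORMAT = apache_virtualhost::$1
# required so the field is written to the index metadata
WRITE_META = true

# props.conf
# Hypothetical sourcetype name; use the one assigned to the Apache logs.
[apache_vhost_access]
TRANSFORMS-vhost = apache_vhost_indexed

# fields.conf
# Tells the search side to treat the field as indexed.
[apache_virtualhost]
INDEXED = true
```

As far as I understand, this would only apply to events indexed after the change, so existing data in `index=web` would keep relying on the search-time extraction.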
