Based on [THIS][1] old blog post and [THIS][2] Answers post, I have tried to use index-time modifiers to obtain a unique list of events regardless of their timestamp,
i.e. process each event at least once, and only once.
An example search might be:
index=_internal sourcetype=splunkd earliest=0 latest=now _index_earliest=-2m@m _index_latest=-1m@m | rename _indextime AS _time | bucket span=1m _time | stats count by _time
If we have this as a saved search run on a 1-minute schedule, saving to a summary index, we should get a count of the events that were indexed between 1 and 2 minutes ago.
This number should never change, right? i.e. there should never be new events with an index time in the past.
If I ran the same search with fixed epoch times, the numbers should stay the same run after run?
i.e. earliest=0 latest=now _index_earliest=1471218487 _index_latest=1471218547
What I am finding is that this isn't the case. Comparing the raw counts with the summarised counts shows differences:
index=_internal sourcetype=splunkd | rename _indextime AS _time | bucket span=1m _time | stats count AS raw_count by _time | append [search index=summary | timechart span=1m values(count) as summary_count] | stats values(*count) AS *count by _time | eval diff=raw_count-summary_count
I understand events can arrive late and still be in flight, and would then be inserted back into the timeline, but that is a limitation of searching on _time, not _indextime. Raw event latency falls inside the normal earliest and latest times (earliest=0 latest=now), so those events shouldn't be culled.
The _indextime value should be ever-increasing, so I don't know how events can be added with old _indextime values such that these searches return a different number of raw events.
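
For reference, here is a minimal Python sketch of how I understand the `_index_earliest=-2m@m _index_latest=-1m@m` window in the searches above to resolve at run time. The function name `index_window` is hypothetical, and treating the window as half-open (earliest inclusive, latest exclusive) is my assumption, not something I have confirmed.

```python
def index_window(now_epoch: int) -> tuple[int, int]:
    """Return (earliest, latest) epoch seconds for -2m@m .. -1m@m.

    Assumption: "@m" snaps down to the start of the minute, and the
    window is half-open, i.e. earliest <= _indextime < latest.
    """
    this_minute = now_epoch - (now_epoch % 60)  # snap "now" down to the minute
    earliest = this_minute - 120                # -2m@m
    latest = this_minute - 60                   # -1m@m
    return earliest, latest


if __name__ == "__main__":
    # A search firing 7 seconds past the minute still gets a minute-aligned window.
    earliest, latest = index_window(1471218607)
    print(earliest, latest, latest - earliest)
```

If that is right, consecutive 1-minute runs should produce adjacent, non-overlapping 60-second windows. Note that the fixed epoch window quoted above (1471218487 to 1471218547) is also 60 seconds wide, but it starts 7 seconds past the minute, so it does not line up with a single snapped window.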
[1]: http://blogs.splunk.com/2013/09/26/an-introduction-to-the-theory-or-relative-time-modifiers-for-_indextime/
[2]: https://answers.splunk.com/answers/171/using-indextime-to-specify-time-range.html