After living with this for a while, I decided today that I can't any longer. The signature_id field in the Splunk for Windows Add-On (TA) is extracted in a way that massively impacts search performance on Windows security data. The issue may not be noticeable in smaller environments, but in environments with 100 GB of Windows data or more, searching by signature_id is incredibly slow, and unnecessarily so. The problem is compounded by the fact that the extracted signature_id field is also used in all the eventtypes and tags within the TA.
The field is simply based on EventCode, which is a naturally occurring KV field within Windows events. Searching by EventCode is incredibly fast, but searching by signature_id is incredibly slow. To demonstrate this, search for a signature_id value in your Windows data over a reasonably long time range, then run the same search using EventCode instead. You will see a massive difference in search performance if you have a decent volume of Windows events.
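As a rough sketch of that comparison, the two searches below differ only in the field used for the filter. The index name, sourcetype, time range and the event code 4624 (a successful logon) are placeholders I picked for illustration, so substitute whatever matches your environment. Run each search separately and compare the run times in the Job Inspector.

```
index=wineventlog sourcetype="WinEventLog:Security" signature_id=4624 earliest=-7d
| stats count

index=wineventlog sourcetype="WinEventLog:Security" EventCode=4624 earliest=-7d
| stats count
```

Both searches should return the same count. Only the time taken to get there should differ.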
I tracked the issue to the default props.conf entry here:
## Since FIELDALIAS is destructive we need to preserve signature_id for certain SourceName values
EVAL-signature_id = if(SourceName="Microsoft-Windows-WindowsUpdateClient",signature_id,EventCode)
I have verified that evaluating signature_id like this is the main cause of the issue. For example, if a field alias is used instead, everything (tag searches, eventtype searches, signature_id searches) runs a gazillion times faster:
FIELDALIAS-signature_id = EventCode AS signature_id
Now, the comment in the props clearly states that the field alias method cannot be used because of one minor case involving Windows Update events, where signature_id has to be preserved rather than overwritten with EventCode. However, I would have thought an alternative approach could handle that Windows Update scenario while still extracting signature_id with a field alias, or something that performs just as well.
In the meantime, if you are not actually indexing the Windows Update event log, then I see no reason why you wouldn't just use the FIELDALIAS approach I outlined above instead. Hopefully this will drive a more permanent fix in a future update of the TA.
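A minimal sketch of what that workaround might look like in the TA's local/props.conf follows. The stanza header is a placeholder: it needs to be the same stanza that carries the EVAL-signature_id line in your version's default/props.conf. Blanking the EVAL with an empty value is my assumption about how to stop the default calculated field from applying, so confirm the merged result with `splunk btool props list --debug` before relying on it.

```
# local/props.conf (sketch only; replace the stanza header with the one
# that holds EVAL-signature_id in default/props.conf)
[<same stanza as in default/props.conf>]

# Assumption: an empty value overrides the default calculated field
EVAL-signature_id =

# Use the plain field alias from above instead
FIELDALIAS-signature_id = EventCode AS signature_id
```

Keeping the change in local rather than editing default means a TA upgrade will not silently undo it, although it also means the override stays in place if a later TA release fixes the extraction.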