I have JSON log files that I need to pull into my Splunk instance. They have some junk data at the beginning and end that I plan to remove with `SEDCMD`. My end goal is to clean up the file with `SEDCMD`, index it properly (line breaking and timestamp extraction), and auto-parse as many fields as possible.
The logs are on a system with a UF (universal forwarder) that sends to the indexers. I'm getting very confused about `INDEXED_EXTRACTIONS` and `KV_MODE`. I thought I would set `INDEXED_EXTRACTIONS` in `props.conf` on the UF and put everything else I need on my indexers, but [the docs][1] state:

> When you forward structured data to an indexer, it is not parsed when it arrives at the indexer, even if you have configured props.conf on that indexer with INDEXED_EXTRACTIONS. Forwarded data skips the following pipelines on the indexer, which precludes any parsing of that data on the indexer...
This leads me to believe that if I use `INDEXED_EXTRACTIONS` on the UF, none of the indexer props will be applied. So do I just use `INDEXED_EXTRACTIONS` on my indexers instead? Or does that only apply if I use one of the [pretrained sourcetypes][2]? Some answers I read said to use `KV_MODE` on the search heads instead. I'm pretty lost on this one.
I have this written up so far:
**inputs.conf ON UF**

    [monitor://path_to_files]
    index = my_json_index
    sourcetype = my_custom_sourcetype

**props.conf ON IDX**

    [my_custom_sourcetype]
    disabled = false
    INDEXED_EXTRACTIONS = JSON
    KV_MODE = none
    SHOULD_LINEMERGE = false
    TRUNCATE = 0
    LINE_BREAKER = (,)\{\"type\":\"\w+\",\"id\":\"\d+\",\"eventTime\":\"
    TIME_PREFIX = \{\"type\":\"\w+\",\"id\":\"\d+\",\"eventTime\":\"
    TIME_FORMAT = %FT%T.%3Q
    TIME_ZONE = UTC
    SEDCMD-1_del_header = s/.*\"events\":\[//g
    SEDCMD-2_clean_eof = s/\(.*\)\]\}/\1/g

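
To make the intended result concrete, here is a rough Python sketch of what the line breaking and cleanup settings above are meant to achieve: one clean JSON object per event, with the wrapper header and trailing `]}` removed. It only approximates the intent and does not reproduce how Splunk processes data internally. The sample payload, the `meta` and `user` fields, and the names `RAW`, `split_events`, and `EVENTS` are all made up for illustration.

```python
import json
import re

# Hypothetical file contents. The "type"/"id"/"eventTime" keys follow the
# regexes in props.conf; the wrapper and all values are invented.
RAW = (
    '{"meta":"junk","events":['
    '{"type":"login","id":"1","eventTime":"2019-01-02T03:04:05.678","user":"a"},'
    '{"type":"logout","id":"2","eventTime":"2019-01-02T03:05:06.789","user":"a"}'
    ']}'
)

# Approximates LINE_BREAKER: break on (and discard) the comma that
# directly precedes the start of each event object.
EVENT_BOUNDARY = re.compile(r',(?=\{"type":"\w+","id":"\d+","eventTime":")')
# Intended effect of SEDCMD-1_del_header: drop everything up to "events":[
HEADER = re.compile(r'^.*"events":\[')
# Intended effect of SEDCMD-2_clean_eof: drop the wrapper's closing ]}
FOOTER = re.compile(r'\]\}$')


def split_events(raw):
    """Split the raw text into events, then clean each event individually."""
    events = []
    for chunk in EVENT_BOUNDARY.split(raw):
        chunk = HEADER.sub("", chunk)
        chunk = FOOTER.sub("", chunk)
        events.append(chunk)
    return events


EVENTS = split_events(RAW)
for event in EVENTS:
    parsed = json.loads(event)  # each cleaned event should be valid JSON
    print(parsed["id"], parsed["eventTime"])
```

If every event parses with `json.loads` after the cleanup, the regexes are at least consistent with the sample. Whether Splunk applies them at the UF, the indexer, or not at all is the open question here.
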
[1]: https://docs.splunk.com/Documentation/Splunk/7.2.1/Data/Extractfieldsfromfileswithstructureddata#Forward_data_extracted_from_structured_data_files
[2]: https://docs.splunk.com/Documentation/Splunk/7.2.1/Data/Listofpretrainedsourcetypes