I am having some trouble working with JSON events. I use Splunk Enterprise 6.4.1. I'm using KV_MODE=json in my props.conf file. For regular fields and top level arrays, it's working great. However I have an array of objects (which contain arrays...see example data below). It looks like by using KV_MODE=json, I lose the relationship between the objects/fields.
For example, using the event below, I can tell how many players the team has at each position, but I can't tell how many players of each position are at a given school. My search is:
index=foo sourcetype=json_linebreaker players{}.school="resevoir" | stats count by players{}.positions
If even just one player on a team has school = resevoir, then ALL the players from that team are included in the count, so the results (assuming this is the only team) are:
2B 2
P 2
C 2
RF 2
SS 1
LF 1
3B 1
CF 1
1B 1
The results I want is to only see the positions of the players who go to resevoir:
3B 1
CF 1
1B 1
P 1
C 1
I've tried using spath (even tho the docs say that if you use KV_MODE=json you don't need spath)
*index=foo sourcetype=json_linebreaker players{}.school="resevoir" | spath players{} output=x | fields - _raw | fields x | mvexpand x | spath input=x | search school="resevoir" | stats count by positions{}*
This should return the results that I'm looking for, but for some events (say 10%), the first spath command generates an empty field named x. When I look at the raw event, there is a players{} array, but for some reason the fields within it are not extracted.
So, what is the best way to index/search this data? I get an average of 7 million events a day and have about 50 users currently. (yes the sample data is made up, but it has the same structure as the data I am actually ingesting). I'd like to make it as easy as possible for the users to write searches, but it also needs to be as performant as possible.
props.conf
[json_linebreaker]
JSON_TRIM_BRACES_IN_ARRAY_NAMES=true
KV_MODE=json
LINE_BREAKER=\s{4}\},(,[\n\r])\s{4}\{(.*)
MAX_TIMESTAMP_LOOKAHEAD=30
NO_BINARY_CHECK=true
SHOULD_LINEMERGE=true
TIME_FORMAT=%Y-%m-%dT%H:%M:%S%Z
TIME_PREFIX=regDate\"\x*:\s*\"
Sample JSON input:
{
"team" : "spirit",
"coach": "matt",
"division": "14U",
"tournaments": ["Save Second Base", "Autism", "Fear the Fastpitch"],
"teamType": "fastpitch",
"assistants": ["Jay", "Dwayne", "Bob", "Alyssa"],
"players": [
{
"name":"Marissa",
"number": 27,
"positions": ["2B", "P", "C", "RF"],
"lineup_spot":2,
"pitches": ["fastball", "changeup", "curve", "drop", "rise"],
"school": "Chesapeake Middle"
},
{
"name":"Sierra",
"number": 5,
"positions": ["SS","LF"],
"lineup_spot":7,
"school": "Elkridge"
},
{
"name":"Rhi",
"number": 12,
"positions": ["3B", "CF","1B"],
"lineup_spot":1,
"school":"resevoir"
},
{
"name":"Kasey",
"number": 10,
"positions": ["P", "C"],
"pitches": ["fastball", "changeup"],
"lineup_spot":9,
"school": "resevoir"
},
{
"name":"Stella",
"number": 46,
"positions": ["RF", "2B"],
"lineup_spot":10,
"school":"bollman"
}
]
}
↧