I am trying to determine the right SPL to dig through a financial data set and look for duplicate entries. The data is generally unique, but occasionally a vendor submits a duplicate request, which causes problems downstream.
Test data:
id=11111,vendor=blah,name=tacoco,value=201,date="1/1/18"
id=11112,vendor=abc,name=jump,value=321,date="2/1/18"
id=11113,vendor=sneeze,name=china,value=421,date="3/1/18"
id=11114,vendor=alpha,name=pooch,value=521,date="4/1/18"
id=11115,vendor=splunk,name=tacos,value=221,date="5/1/18"
id=11116,vendor=internet,name=golf,value=621,date="6/1/18"
id=11117,vendor=office,name=mexico,value=721,date="7/1/18"
id=11118,vendor=splunk,name=tacos,value=221,date="5/1/18"
id=11119,vendor=random,name=burger,value=821,date="8/1/18"
id=11120,vendor=opera,name=browser,value=921,date="9/1/18"
I would like to create a search that identifies any case where vendor, name, value, and date all have the same values but id differs (the vendor=splunk rows above, for example). There are other fields in the event data, but these four are what I'm matching on specifically. A rough sketch of the direction I have in mind is below.
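Here is what I've been sketching so far (untested; `index=finance` is a placeholder for wherever this data actually lives, and I'm assuming the key=value pairs auto-extract):

    index=finance
    | stats dc(id) AS distinct_ids values(id) AS ids BY vendor name value date
    | where distinct_ids > 1

The idea is to group events by the four fields that should match, count the distinct ids in each group, and keep only the groups where more than one id shares the same vendor/name/value/date. Is this on the right track, or is there a better way?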