I have a report that groups webpage requests from an IIS log by sc_status. The results are badly wrong because Splunk appears to lose track of which line, and which part of a line, it is reading, so values like "myurl.com" show up where an sc_status value such as "200" should be.
I have Splunk set up to monitor the folder where log files are stored in real time and I manually selected IIS logs when identifying the format of the files.
This is what Splunk has stored for one request:
2015-12-30 15:06:54 W3SVC3 MYWEBSERVER GET /App_Themes/Blue/Blue.css - 80 - HTTP/1.1 Mozilla/5.0+(compatible;+MSIE+9.0;+Windows+NT+6.1;+WOW64;+Trident/5.0) stuff_id=stuff;+user=stuff;+persistcookie=True;+stuffSelection=STUFF1,STUFF2,STUFF3,STUFF4,STUFF5,;+MYWEBSITE=R285025761;+ASP.NET_SessionId=3sgbsssgrvbwizta31fcynmx;+MyWebSite.ASPXAUTH=D2E24F7A75F2114DCF6AFB5DA65C739A2972D39870A74C1735EF0B3A819F27D5E743DE70EB6C5D7ADF944507DA71042D235483889FEA3A736EFBA2E81AB02F47A08BA93D51C6563422CE17055236EA5BBDCC03A03B4389CE042ADDFB89AA7A7D6C7246376DB20045AD709BE50444332F048A79BD65269C0919B0A5ADA4EE415EE1E96BCFBF3D5D33507D663A5671DE9E https://m5.0+(Macintosh;+Intel+Mac+OS+X+10_10)+AppleWebKit/600.1.25+(KHTML,+like+Gecko)+Version/8.0+Safari/600.1.25 MYWEBSITE=R285025761;+ASP.NET_SessionId=o2hgz2wa34vj2v0i2c5zdmis https://mywebsite.thisisawesome.com/Logon.aspx?ReturnUrl=%2f mywebsite.thisisawesome 200 0 0 24916 515 31
This request appears to be a mashup of two or more requests:
Part 1: 2015-12-30 15:06:54 W3SVC3 MYWEBSERVER GET /App_Themes/Blue/Blue.css - 80 - HTTP/1.1 Mozilla/5.0+(Windows+NT+6.1;+Trident/7.0;+rv:11.0)+like+Gecko stuff_id=stuff;+user=stuff;+persistcookie=True;+datalistSelection=OFAC,PEP_FO,;+MYWEBSITE=R285025761;+ASP.NET_SessionId=ykvwd2cgbhjcjck45jcy1w13 https://mywebsite.thisisawesome.com/logon.aspx mywebsite.thisisawesome.com 304 0 0 92 593 62
Part 2: 2015-12-30 15:06:38 W3SVC3 MYWEBSERVER GET /Includes/jquery-1.4.2.min.js - 80 - HTTP/1.1 Mozilla/5.0+(Macintosh;+Intel+Mac+OS+X+10_10)+AppleWebKit/600.1.25+(KHTML,+like+Gecko)+Version/8.0+Safari/600.1.25 MYWEBSITE=R285025761;+ASP.NET_SessionId=o2hgz2wa34vj2v0i2c5zdmis https://mywebsite.thisisawesome.com/Logon.aspx?ReturnUrl=%2f mywebsite.thisisawesome.com 200 0 0 24916 515 31
and part of another request in the middle.
I can see at least one place where the lines were mashed together. In this snippet, "5671DE9E https://m5.0+(Macintosh;+Int", "https://m" is the start of a URL and "5.0+" is part of a user agent, but they are joined without a space as if they were one field.
Other than that, I'm not sure where the data is coming from in the log file to put that one request together in Splunk.
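One way to confirm the merge outside Splunk is to count the space-separated fields in an event. This is only a rough sketch in Python, not anything Splunk itself does. The expected count of 20 and the sc_status position are assumptions taken from the well-formed "Part 1" line above (the real list is in the log file's "#Fields:" header), and the two sample lines are shortened, made-up stand-ins with placeholder values such as UA and COOKIE.

```python
from typing import Tuple

# Assumption: 20 fields per line, counted from the well-formed "Part 1" line.
# The authoritative list is the "#Fields:" header of the IIS log file.
EXPECTED_FIELDS = 20
STATUS_INDEX = 14  # zero-based position of sc-status in that 20-field layout


def check_event(raw: str) -> Tuple[bool, str]:
    """Return (looks well formed, value found at the sc-status position)."""
    fields = raw.split(" ")
    value = fields[STATUS_INDEX] if len(fields) > STATUS_INDEX else ""
    ok = len(fields) == EXPECTED_FIELDS and value.isdigit()
    return ok, value


# Hypothetical, abbreviated stand-ins for the events above (not real log data).
good = ("2015-12-30 15:06:54 W3SVC3 MYWEBSERVER GET /a.css - 80 - HTTP/1.1 "
        "UA COOKIE https://example.test/logon.aspx example.test "
        "304 0 0 92 593 62")
merged = ("2015-12-30 15:06:54 W3SVC3 MYWEBSERVER GET /a.css - 80 - HTTP/1.1 "
          "UA COOKIE1 https://mUA2 COOKIE2 https://example.test/Logon.aspx "
          "example.test 200 0 0 24916 515 31")

for name, line in (("good", good), ("merged", merged)):
    ok, value = check_event(line)
    print(name, "well formed:", ok, "| value at sc-status position:", value)
```

With the merged line, the extra fields push a URL into the position where the status code belongs, which matches what my report shows.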
My question is, how do I get Splunk to read my IIS logs properly and not mash up multiple lines into one line?