Hi,
I've been troubleshooting a problem where files are occasionally getting missed in Splunk. The app creates a lot of files and a lot of data - they roll over at 50 MB, roughly every 1-2 minutes. Just today I caught an "unable to open file" message, and when I checked the system the file wasn't there - probably because a cleanup job moves files off the box on a regular schedule. The file in question was over an hour old, so I'm beginning to wonder if Splunk is having a hard time keeping up.
How can I easily validate that Splunk isn't falling behind? This app has lots of servers and lots of files, so running btool after the fact isn't going to help me (nor will list monitor...). Looking for ideas/thoughts...
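One thing I'm considering is measuring indexing lag directly by comparing _indextime to _time per host - just a sketch, and the index name below is a placeholder for the app's real index:

index=my_app_index earliest=-1h
| eval lag_seconds = _indextime - _time
| stats avg(lag_seconds) AS avg_lag, max(lag_seconds) AS max_lag by host
| sort - max_lag

If max_lag regularly climbs toward the cleanup job's retention window, that would explain files disappearing before Splunk finishes reading them.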
Update:
I've noticed that on certain systems, the same file keeps getting "Removed from queue", which doesn't make sense since the file is still active (and very busy).
04-16-2016 22:44:05.213 -0400 INFO BatchReader - Removed from queue file='/gsysrtpp23/logs/ORS_RTP_Node2_PR/ORS_RTP_Node2_PR.20160416_223009_902.log'.
04-16-2016 22:44:06.202 -0400 INFO BatchReader - Removed from queue file='/gsysrtpp23/logs/ORS_RTP_Node2_PR/ORS_RTP_Node2_PR.20160416_223009_902.log'.
04-16-2016 22:44:07.212 -0400 INFO BatchReader - Removed from queue file='/gsysrtpp23/logs/ORS_RTP_Node2_PR/ORS_RTP_Node2_PR.20160416_223009_902.log'.
04-16-2016 22:44:08.221 -0400 INFO BatchReader - Removed from queue file='/gsysrtpp23/logs/ORS_RTP_Node2_PR/ORS_RTP_Node2_PR.20160416_223009_902.log'.
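To see how widespread this is across hosts, I'm thinking of running something like the following against the internal logs - just a rough search, with the rex pattern assuming the message format shown above:

index=_internal sourcetype=splunkd component=BatchReader "Removed from queue"
| rex "file='(?<file>[^']+)'"
| stats count AS removals, dc(file) AS distinct_files by host
| sort - removals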
Thanks!