I recently upgraded from Splunk 6.5.3 to 6.6.3. I have a CentOS 6.x virtual server on x64 Intel architecture acting as a heavy forwarder, and its main mission is to receive netflow and send it to the indexers. I am using the Splunk Add-On for Netflow, and it's been working well for several years.
However, after the upgrade I stopped getting netflow. I looked at the splunk/etc/apps/Splunk_TA_flowfix_slim folder, and the nfdump-binary and nfdump-ascii folders were gone, as was the bin/flowfix.sh run file.
So I re-ran the Splunk_TA_flowfix_slim/configure.sh script, restarted splunk, and those folders came back. But no data was being sent to the netflow index.
I checked nfdump-ascii and no files were being created in that folder.
I checked nfdump-binary and only one file was in that folder; it was 276 bytes and it wasn't growing.
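For anyone who wants to repeat the "is the file growing?" check without watching ls by hand, here is a minimal Python sketch. The function name file_is_growing and the interval are my own choices, not part of the add-on. Keep in mind that nfcapd rotates its capture files, so a file can also disappear or be replaced between the two samples.

```python
import os
import time


def file_is_growing(path, interval_s=5.0):
    """Return True if the file at `path` got larger during `interval_s` seconds.

    Hypothetical helper for illustration; it only compares two size samples.
    """
    size_before = os.path.getsize(path)
    time.sleep(interval_s)
    size_after = os.path.getsize(path)
    return size_after > size_before
```

Point it at whatever file is currently sitting in the nfdump-binary folder; a result of False over a minute or so while tcpdump shows traffic arriving suggests that nfcapd is not writing what it receives.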
Restarted Splunk, nothing.
I checked splunkd.log and I couldn't find any errors.
I checked netstat and nfcapd was listening on the port I had designated.
I checked tcpdump and netflow is being received on that port.
I checked iptables and that port is allowed.
I set SELinux to permissive mode (setenforce 0).
Still nothing.
So that left a couple of things. When I first ran the configure.sh, it started nfcapd as user splunk. So I stopped that process and ran flowfix.sh as root. Nothing.
I re-ran configure.sh as root. Nothing.
Repeated several of the steps above, nothing.
No logs, no errors.
Out of all my searches, the following two questions seemed the most relevant, but they were no help.
https://answers.splunk.com/answers/172341/installation-of-splunk-add-on-for-netflow-didnt-wo.html
https://answers.splunk.com/answers/36265/netflow-app-not-retrieving-any-data.html
I finally tried running the actual nfcapd command that flowfix.sh runs (found with ps -ef | grep nfcapd) and got the following:
Socket error: could not open the requested socket
Terminated due to errors.
Searching on THAT, I found the following:
https://answers.splunk.com/answers/59124/configuring-netflow-app-for-splunk.html
https://sourceforge.net/p/nfsen/mailman/message/4400104/
But what was weird was that when I ran netstat -np | grep as root, I found NOTHING listening on that port. Running netstat as splunk found nothing either. The nfdump-binary folder occasionally has a large file, but then has nothing. The nfdump-ascii folder STILL has no files. And nfcapd is running as splunk.
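One way to cross-check the netstat result is to try binding the UDP port yourself and see what the OS says. This is only a sketch: probe_udp_port is a made-up helper name, and I am assuming (not confirming) that the nfcapd socket error comes from a failed bind, for example because another nfcapd already holds the port or because the user lacks permission for it.

```python
import errno
import socket


def probe_udp_port(port, host="0.0.0.0"):
    """Try to bind a UDP socket to (host, port) and report the outcome.

    Returns (True, "free") if the bind works, otherwise (False, reason).
    Hypothetical helper; it closes the socket again right away.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind((host, port))
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return False, "address already in use"
        if exc.errno == errno.EACCES:
            return False, "permission denied"
        return False, str(exc)
    finally:
        sock.close()
    return True, "free"
```

If this reports "address already in use" while netstat shows nothing, something is holding the port after all and the netstat options or grep pattern are worth a second look. Run it as the same user that starts nfcapd so the permission check is meaningful.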
But I am now getting bursts of netflow data in Splunk. So it appears things are working. But the whole thing is weird and the lack of obvious errors has made figuring this out frustrating.
Anyway, if this helps others troubleshoot, then great.
I just hope I don't go through this again when I upgrade to 7.0.