Hello,
I'm trying to create a work-in-progress environment for our Splunk setup so we can test newly developed apps there before they go to the live environment. We're using AWS, and I'm running Chef to do the initial configuration of the instances. All the Splunk components (search head, deployment server, cluster master and 3 indexers) boot up just fine with Splunk installed and configured. However, the cluster master can't seem to connect to the indexers. Checking the indexers, I'm seeing a lot of these messages in /opt/splunk/var/log/splunk/splunkd.log:
`08-30-2016 09:55:07.910 +0000 INFO CMBundleMgr - Downloaded bundle path=/opt/splunk/var/run/splunk/cluster/remote-bundle/12871cbad4082a640701d4554d2a14b8-1472550907.bundle time_taken_ms=7, bundle size in KB=20.
08-30-2016 09:55:07.910 +0000 INFO CMBundleMgr - untarring bundle=/opt/splunk/var/run/splunk/cluster/remote-bundle/12871cbad4082a640701d4554d2a14b8-1472550907.bundle
08-30-2016 09:55:07.914 +0000 INFO ClusterBundleValidator - Validating bundle path=/opt/splunk/var/run/splunk/cluster/remote-bundle/12871cbad4082a640701d4554d2a14b8-1472550907/apps
08-30-2016 09:55:07.927 +0000 INFO CMBundleMgr - Removed the untarred bundle folder=/opt/splunk/var/run/splunk/cluster/remote-bundle/12871cbad4082a640701d4554d2a14b8-1472550907
08-30-2016 09:55:07.927 +0000 INFO CMBundleMgr - Removed the bundle downloaded from master to '/opt/splunk/var/run/splunk/cluster/remote-bundle/12871cbad4082a640701d4554d2a14b8-1472550907.bundle'
08-30-2016 09:55:08.936 +0000 INFO CMBundleMgr - Downloaded bundle path=/opt/splunk/var/run/splunk/cluster/remote-bundle/97127d74747c236f0b64b1d6e27a6f50-1472550908.bundle time_taken_ms=7, bundle size in KB=20.
08-30-2016 09:55:08.936 +0000 INFO CMBundleMgr - untarring bundle=/opt/splunk/var/run/splunk/cluster/remote-bundle/97127d74747c236f0b64b1d6e27a6f50-1472550908.bundle
08-30-2016 09:55:08.940 +0000 INFO ClusterBundleValidator - Validating bundle path=/opt/splunk/var/run/splunk/cluster/remote-bundle/97127d74747c236f0b64b1d6e27a6f50-1472550908/apps
08-30-2016 09:55:08.954 +0000 INFO CMBundleMgr - Removed the untarred bundle folder=/opt/splunk/var/run/splunk/cluster/remote-bundle/97127d74747c236f0b64b1d6e27a6f50-1472550908
08-30-2016 09:55:08.954 +0000 INFO CMBundleMgr - Removed the bundle downloaded from master to '/opt/splunk/var/run/splunk/cluster/remote-bundle/97127d74747c236f0b64b1d6e27a6f50-1472550908.bundle'
08-30-2016 09:55:09.964 +0000 INFO CMBundleMgr - Downloaded bundle path=/opt/splunk/var/run/splunk/cluster/remote-bundle/3e7a55382e5162fee3be759f7dcc5f67-1472550909.bundle time_taken_ms=9, bundle size in KB=20.
08-30-2016 09:55:09.965 +0000 INFO CMBundleMgr - untarring bundle=/opt/splunk/var/run/splunk/cluster/remote-bundle/3e7a55382e5162fee3be759f7dcc5f67-1472550909.bundle
08-30-2016 09:55:09.968 +0000 INFO ClusterBundleValidator - Validating bundle path=/opt/splunk/var/run/splunk/cluster/remote-bundle/3e7a55382e5162fee3be759f7dcc5f67-1472550909/apps
08-30-2016 09:55:09.983 +0000 INFO CMBundleMgr - Removed the untarred bundle folder=/opt/splunk/var/run/splunk/cluster/remote-bundle/3e7a55382e5162fee3be759f7dcc5f67-1472550909
08-30-2016 09:55:09.983 +0000 INFO CMBundleMgr - Removed the bundle downloaded from master to '/opt/splunk/var/run/splunk/cluster/remote-bundle/3e7a55382e5162fee3be759f7dcc5f67-1472550909.bundle'`
It keeps doing this over and over again. The cluster master reports the following:
`08-30-2016 09:08:36.182 +0000 WARN TcpOutputFd - Connect to 10.99.116.149:9997 failed. Connection refused
08-30-2016 09:08:36.182 +0000 ERROR TcpOutputFd - Connection to host=10.99.116.149:9997 failed
08-30-2016 09:08:36.183 +0000 WARN TcpOutputFd - Connect to 10.99.116.186:9997 failed. Connection refused
08-30-2016 09:08:36.183 +0000 ERROR TcpOutputFd - Connection to host=10.99.116.186:9997 failed
08-30-2016 09:08:36.184 +0000 WARN TcpOutputFd - Connect to 10.99.116.214:9997 failed. Connection refused
08-30-2016 09:08:36.184 +0000 ERROR TcpOutputFd - Connection to host=10.99.116.214:9997 failed`
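For what it's worth, this is roughly the check I ran on one of the indexers to confirm nothing is bound to that port (plain netstat/ss, nothing Splunk-specific):
`# is any process listening on the splunktcp port? (run on an indexer)
sudo netstat -tlnp | grep 9997    # or: sudo ss -tlnp | grep 9997
# returns nothing, so splunkd never opened the port`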
So the connection-refused errors make sense: the indexers really don't have anything listening on port 9997. What's weird is that the inputs.conf in my app, /opt/splunk/etc/apps/xx_cluster_indexer_base/local/inputs.conf, looks like this:
`# BASE SETTINGS
[splunktcp://9997]
# [splunktcp-ssl://9996]
# SSL SETTINGS
# [SSL]
# rootCA = $SPLUNK_HOME/etc/auth/cacert.pem
# serverCert = $SPLUNK_HOME/etc/auth/server.pem
# password = password
# requireClientCert = false
# If using compressed = true, it must be set on the forwarder outputs as well.
# compressed = true`
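To rule out a precedence problem, this is roughly how I've been checking on the indexers whether that stanza is actually picked up (btool, assuming the default /opt/splunk install path):
`# show the splunktcp stanzas splunkd actually resolves, and which .conf file each setting comes from
/opt/splunk/bin/splunk btool inputs list splunktcp --debug
# [splunktcp://9997] does show up here, yet the port still isn't opened`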
So, in theory, this should work, but for whatever reason I can't get the indexers to start listening on 9997. Does anyone have any ideas what might be wrong? The same setup works in my live environment. I also tried removing the bundle from the cluster master and doing a rolling restart (roughly the commands below), but that didn't help. Thanks in advance.
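For completeness, this is approximately what I did for the bundle removal and rolling restart, all run on the cluster master (the -auth credentials are placeholders):
`# after deleting the app from /opt/splunk/etc/master-apps on the cluster master, push the bundle again
/opt/splunk/bin/splunk apply cluster-bundle --answer-yes -auth admin:changeme
# check which bundle the master and the peers report as active
/opt/splunk/bin/splunk show cluster-bundle-status -auth admin:changeme
# then restart the peers in batches
/opt/splunk/bin/splunk rolling-restart cluster-peers -auth admin:changeme`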