We are currently on Splunk 6.3.x, with the following topology:
(syslog/bro data) --> (load balancer) --> (HFs for props and transforms) --> (indexers)
Here are the inputs, props, and transforms configurations on the heavy forwarders:
# inputs.conf
[tcp://1571]
connection_host = dns
index = bro
sourcetype = bro
source = my_syslog_appliance:tcp:1571
# props.conf
[bro]
TRANSFORMS-sourcetype = bro_sourcetype
TRANSFORMS-host = bro_hostoverride
TRUNCATE = 0
SHOULD_LINEMERGE = false
KV_MODE = none
NO_BINARY_CHECK = 1
MAX_TIMESTAMP_LOOKAHEAD = 45
TIME_FORMAT = %s%6N
# transforms.conf
[bro_hostoverride]
DEST_KEY = MetaData:Host
REGEX = ^\w+\t([\w\-]+)
FORMAT = host::$1
[bro_sourcetype]
DEST_KEY = MetaData:Sourcetype
REGEX = ^(\w+)\t
FORMAT = sourcetype::$1
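As a sanity check on the two regexes above, here is a minimal Python sketch of what each one captures. The sample event is hypothetical (it assumes a tab-delimited line whose first field is the log type and whose second field is the sensor host name), and Python's `re` module is only an approximation of the PCRE engine Splunk uses, so treat this as an illustration rather than a reproduction of Splunk's behavior.

```python
import re

# Same patterns as in transforms.conf above.
SOURCETYPE_RE = re.compile(r"^(\w+)\t")
HOST_RE = re.compile(r"^\w+\t([\w\-]+)")

# Hypothetical event: <log type> TAB <sensor host> TAB <rest of the record>
sample_event = "conn\tbro-sensor01\t1456789012.123456\tCabc123"


def extract(event):
    """Return (sourcetype, host) as captured by the two regexes, or None where a regex does not match."""
    st = SOURCETYPE_RE.search(event)
    host = HOST_RE.search(event)
    return (st.group(1) if st else None, host.group(1) if host else None)


print(extract(sample_event))  # ('conn', 'bro-sensor01')
```

Both patterns are anchored at the start of the event, so they only match when the raw line begins with the log type followed by a tab.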
The props and transforms have no problem extracting the sourcetype from the raw event, but the host override appears to either:
1.) not extract the host at all, so the default load balancer host is used instead, or
2.) randomly extract other keywords in the event as the host
Through trial and error, we unset "connection_host = dns" in inputs.conf, and the issue is now resolved. The host field is now being extracted properly.
However, the question remains: why would the connection_host setting affect the overwriting of the host metadata?
Additional info: we don't see this problem if we bypass the load balancer, even with "connection_host = dns" set:
(syslog/bro data) --> (HF) --> (indexers)