I see the same problem on other `_internal` logs, but I'll focus the example on the `license_usage.log`, since it's near and dear to all of our hearts. ;)
I found a few similar questions, but none with an actual resolution for the simple setup I have (sorry, I don't have enough karma to link yet):
- Why is Splunk Light Cloud indexing 58 duplicates of 1 raw entry? (Splunk Answer 457888)
Sounds like what I'm seeing, except that they're talking about their server logs whereas I'm seeing it on built-in Splunk `_internal` logs. No answers on that thread.
- Splunk shows duplicate events in search results when there are no duplicates in the source file (Splunk Answer 60693)
This is exactly the same problem I'm seeing, but the question is from 2012, so it may not apply. For example, I don't see an `_indextime` field, but the duplicates all have the same `_time` value, which looks like the extracted timestamp rather than the time of indexing. Regardless, I don't know what would be causing my configuration to index the file multiple times, though it obviously is.
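For what it's worth, my understanding is that `_indextime` is a hidden field rather than one that shows up in the field list, so a search along these lines (the `itime` name is just my own label) should show whether the copies were indexed at the same moment or at different times:

index=_internal source=*license_usage.log type="Usage"
| eval itime=strftime(_indextime, "%Y-%m-%d %H:%M:%S.%3N") | table _time itime host splunk_server _raw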
Details:
I have a clustered 6.4.2 environment with the Distributed Management Console and License Master running on the Cluster Master server, which forwards its internal logs to a cluster of four indexers.
On the Master node, in `/opt/splunk/var/log/splunk/license_usage.log`:
11-03-2016 17:17:05.010 +0000 INFO LicenseUsage - type=Usage s="tcp:5140" st="sourcetype1" h="src_ip1" o="" idx="index1" i="(GUID)" pool="auto_generated_pool_enterprise" b=14539620 poolsz=805306368000
11-03-2016 17:17:05.010 +0000 INFO LicenseUsage - type=Usage s="tcp:5140" st="sourcetype1" h="src_ip1" o="" idx="index1" i="(GUID)" pool="auto_generated_pool_enterprise" b=9896121 poolsz=805306368000
11-03-2016 17:17:05.010 +0000 INFO LicenseUsage - type=Usage s="tcp:5140" st="sourcetype1" h="src_ip1" o="" idx="index1" i="(GUID)" pool="auto_generated_pool_enterprise" b=12852051 poolsz=805306368000
When I do a search like `index=_internal source=*license_usage.log type="Usage"` (which is what DMC uses to display the 30-day license usage graph), I get results like this:
11-03-2016 17:17:05.010 +0000 INFO LicenseUsage - type=Usage s="tcp:5140" st="sourcetype1" h="src_ip3" o="" idx="index1" i="(GUID)" pool="auto_generated_pool_enterprise" b=12852051 poolsz=805306368000
11-03-2016 17:17:05.010 +0000 INFO LicenseUsage - type=Usage s="tcp:5140" st="sourcetype1" h="src_ip3" o="" idx="index1" i="(GUID)" pool="auto_generated_pool_enterprise" b=12852051 poolsz=805306368000
11-03-2016 17:17:05.010 +0000 INFO LicenseUsage - type=Usage s="tcp:5140" st="sourcetype1" h="src_ip3" o="" idx="index1" i="(GUID)" pool="auto_generated_pool_enterprise" b=12852051 poolsz=805306368000
11-03-2016 17:17:05.010 +0000 INFO LicenseUsage - type=Usage s="tcp:5140" st="sourcetype1" h="src_ip2" o="" idx="index1" i="(GUID)" pool="auto_generated_pool_enterprise" b=9896121 poolsz=805306368000
11-03-2016 17:17:05.010 +0000 INFO LicenseUsage - type=Usage s="tcp:5140" st="sourcetype1" h="src_ip2" o="" idx="index1" i="(GUID)" pool="auto_generated_pool_enterprise" b=9896121 poolsz=805306368000
11-03-2016 17:17:05.010 +0000 INFO LicenseUsage - type=Usage s="tcp:5140" st="sourcetype1" h="src_ip2" o="" idx="index1" i="(GUID)" pool="auto_generated_pool_enterprise" b=9896121 poolsz=805306368000
11-03-2016 17:17:05.010 +0000 INFO LicenseUsage - type=Usage s="tcp:5140" st="sourcetype1" h="src_ip1" o="" idx="index1" i="(GUID)" pool="auto_generated_pool_enterprise" b=14539620 poolsz=805306368000
11-03-2016 17:17:05.010 +0000 INFO LicenseUsage - type=Usage s="tcp:5140" st="sourcetype1" h="src_ip1" o="" idx="index1" i="(GUID)" pool="auto_generated_pool_enterprise" b=14539620 poolsz=805306368000
11-03-2016 17:17:05.010 +0000 INFO LicenseUsage - type=Usage s="tcp:5140" st="sourcetype1" h="src_ip1" o="" idx="index1" i="(GUID)" pool="auto_generated_pool_enterprise" b=14539620 poolsz=805306368000
That is, I get exactly the same log line (at least) three times, and I've seen cases where it's dramatically more than that for the exact same timestamp. Over time the multiplication factor tends to grow, and it is sometimes reset by doing a `splunk restart` on the indexers. Last week, for example, it was indexing over **180 (!) copies** of each line, consistently for several days, because I hadn't yet found the source of the problem.
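Something like this should quantify the multiplication factor, since each raw line in the source file is unique:

index=_internal source=*license_usage.log type="Usage"
| stats count by _time, _raw | where count > 1 | sort - count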
I'm assuming that the `i` GUID field identifies the indexer, because I typically see four distinct values. During the period when it was generating hundreds of copies, two of the indexers had reasonable numbers of events and the other two had about 300x as many, so those two were probably the culprits at the time.
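A breakdown like the one below should make the skew obvious, and assuming `splunk_server` reflects the indexer that actually holds each event (which I believe is the standard behavior), it would also confirm whether `i` lines up with the indexer:

index=_internal source=*license_usage.log type="Usage"
| stats count by i, splunk_server | sort - count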
I have a pretty vanilla `outputs.conf` on the master node that lets its logs be indexed by its own cluster via `indexerDiscovery`, with acknowledgement enabled:
[indexer_discovery:master1]
pass4SymmKey =
master_uri = https://:8089
# Turn off indexing on the master
[indexAndForward]
index = false
[tcpout]
defaultGroup = discovered_peer_nodes
forwardedindex.filter.disable = true
indexAndForward = false
[tcpout:discovered_peer_nodes]
indexerDiscovery = master1
autoLB = true
useACK = true
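Since `useACK = true` is set, one thing I've wondered about is whether the forwarding side on the master is re-sending unacknowledged batches. My understanding (an assumption on my part) is that splunkd logs a "Possible duplication of events" warning from TcpOutputProc when it re-sends after an acknowledgement timeout, so a search like this, with the host narrowed to the Cluster Master, might show whether that's happening:

index=_internal source=*splunkd.log host=<cluster_master> TcpOutputProc "Possible duplication of events"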
I don't have anything about this in my `local/inputs.conf`, so I think the `_internal` logs are just being picked up by `default/inputs.conf`, which has this in it:
[monitor://$SPLUNK_HOME/var/log/splunk]
index = _internal
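To rule out overlapping inputs (e.g. an app that also monitors `var/log/splunk`), I believe a btool check like this should list every stanza that resolves to that directory, along with the file each setting came from; the grep pattern is just my own filter:

/opt/splunk/bin/splunk btool inputs list --debug | grep "var/log/splunk"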