Hi guys,
since I still cannot open a support case, I can only try it here (I've tried many times to get that sorted out, but yeah, it's not like we're paying a lot of money for support).
We've had issues with duplicate events in the past and always managed to resolve them. Not this time, it seems, and it has become quite a problem.
Recently there was a storage outage in our datacenter. The Splunk VMs kept running the whole time (it's a single-site cluster with its VMs spread across two datacenters), but there were quite a few disruptions and replication issues.
We have two environments. Because of our many different (V)LAN zones, we have a couple of Heavy Forwarders set up in each zone we need to collect data from.
**Now here's the problem:**
In our testing environment there are four HFs in total, but **only one** is sending duplicate data to our two indexers. That is not a lot of data, so I could live with resolving it later.
In our production environment, however, there are five HFs in total, and we now have 80-100 hosts sending their data duplicated (in the last 15 minutes: 87 hosts from 15 different sources, into 37 different indexes).
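In case it helps, a quick-and-dirty search along these lines should show the same picture (hashing _raw is just one way to spot identical events, and it can get expensive over longer time ranges; host/source/index are the standard default fields):

```
index=* earliest=-15m
| eval raw_hash=md5(_raw)
| stats count AS copies BY raw_hash, host, source, index
| where copies > 1
| stats dc(host) AS dup_hosts, dc(source) AS dup_sources, dc(index) AS dup_indexes, sum(copies) AS dup_events
```

Dropping the final stats shows the individual duplicated events per host/source/index instead of just the totals.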
I have been searching through errors and warnings for a while now and found a few log events; I googled them, but the problem persists.
Restarting the indexer cluster, removing excess buckets from the master, and restarting the Heavy Forwarders did not resolve anything. No configuration has been changed (I did try disabling and re-enabling useACK on one HF, but with no luck whatsoever).
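For reference, the outputs.conf stanza in question looks roughly like this (group and server names are placeholders, not our real ones). As far as I understand, useACK matters here because a forwarder that never receives its ACK will resend the data, which is one classic way to end up with duplicates:

```
# outputs.conf on the Heavy Forwarder -- group and server names are placeholders
[tcpout]
defaultGroup = prod_indexers

[tcpout:prod_indexers]
server = indexer1.example.local:9997, indexer2.example.local:9997
useACK = true
```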
Here are a few errors:
SERVER2 14:13:25.737 +0200 ERROR S2SFileReceiver - event=statSize replicationType=eJournalReplication bid=_internal~380~B3A9C962-6814-49A6-A47E-593741B331A3 path=/opt/splunk/var/lib/splunk/_internaldb/db/380_B3A9C962-6814-49A6-A47E-593741B331A3/rawdata/journal.gz status=failed
SERVER2 14:12:44.855 +0200 ERROR S2SFileReceiver - event=statSize replicationType=eJournalReplication bid=_internal~337~E432BEC8-63C5-4DCE-A500-90756157F30F path=/opt/splunk/var/lib/splunk/_internaldb/db/337_E432BEC8-63C5-4DCE-A500-90756157F30F/rawdata/journal.gz status=failed
SERVER2 14:12:44.732 +0200 ERROR S2SFileReceiver - event=statSize replicationType=eJournalReplication bid=00_p_INDEX4_14~264~E432BEC8-63C5-4DCE-A500-90756157F30F path=/opt/splunk/var/lib/splunk/00_p_INDEX4_14/db/264_E432BEC8-63C5-4DCE-A500-90756157F30F/rawdata/journal.gz status=failed
SERVER2 14:12:44.735 +0200 ERROR S2SFileReceiver - event=statSize replicationType=eJournalReplication bid=_audit~112~E432BEC8-63C5-4DCE-A500-90756157F30F path=/opt/splunk/var/lib/splunk/audit/db/112_E432BEC8-63C5-4DCE-A500-90756157F30F/rawdata/journal.gz status=failed
SERVER2 14:12:44.718 +0200 ERROR S2SFileReceiver - event=statSize replicationType=eJournalReplication bid=60_p_INDEX1_14~96~E432BEC8-63C5-4DCE-A500-90756157F30F path=/opt/splunk/var/lib/splunk/60_p_INDEX1_14/db/96_E432BEC8-63C5-4DCE-A500-90756157F30F/rawdata/journal.gz status=failed
SERVER2 14:12:44.844 +0200 ERROR S2SFileReceiver - event=statSize replicationType=eJournalReplication bid=00_p_INDEX2_14~301~E432BEC8-63C5-4DCE-A500-90756157F30F path=/opt/splunk/var/lib/splunk/00_p_INDEX2_14/db/301_E432BEC8-63C5-4DCE-A500-90756157F30F/rawdata/journal.gz status=failed
SERVER2 14:12:44.825 +0200 ERROR S2SFileReceiver - event=statSize replicationType=eJournalReplication bid=00_p_INDEX3_14~533~E432BEC8-63C5-4DCE-A500-90756157F30F path=/opt/splunk/var/lib/splunk/00_p_INDEX3_14/db/533_E432BEC8-63C5-4DCE-A500-90756157F30F/rawdata/journal.gz status=failed
...and a few warnings:
SERVER2 14:12:44.825 +0200 WARN S2SFileReceiver - unable to remove dir=/opt/splunk/var/lib/splunk/00_p_INDEX2_14/db/533_E432BEC8-63C5-4DCE-A500-90756157F30F for bucket=00_p_INDEX2_14~533~E432BEC8-63C5-4DCE-A500-90756157F30
SERVER1 09-28-2017 14:13:25.737 +0200 WARN S2SFileReceiver - unable to remove dir=/opt/splunk/var/lib/splunk/_internaldb/db/380_B3A9C962-6814-49A6-A47E-593741B331A3 for bucket=_internal~380~B3A9C962-6814-49A6-A47E-593741B331A3
SERVER2 09-28-2017 14:12:44.855 +0200 WARN S2SFileReceiver - unable to remove dir=/opt/splunk/var/lib/splunk/_internaldb/db/337_E432BEC8-63C5-4DCE-A500-90756157F30F for bucket=_internal~337~E432BEC8-63C5-4DCE-A500-90756157F30F
SERVER2 14:12:44.844 +0200 WARN S2SFileReceiver - unable to remove dir=/opt/splunk/var/lib/splunk/00_p_INDEX1_14/db/301_E432BEC8-63C5-4DCE-A500-90756157F30F for bucket=00_p_INDEX1_14~301~E432BEC8-63C5-4DCE-A500-90756157F30F
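In case anyone wants to see how widespread this is, the messages can be pulled out of _internal with something like the following (component and log_level are standard splunkd fields; the rex just grabs the bucket id from the raw message):

```
index=_internal sourcetype=splunkd component=S2SFileReceiver (log_level=ERROR OR log_level=WARN)
| rex field=_raw "(?:bid|bucket)=(?<bucket_id>\S+)"
| stats count BY host, log_level, bucket_id
| sort - count
```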
These messages only occur after a rolling restart of the indexer cluster. Interestingly, the "Indexer Clustering Status" page says everything is fine, and the Health Check does not find any issue either.
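(For anyone wondering, the CLI equivalent on the cluster master would be something like this:)

```
# run on the cluster master: peer status plus replication/search factor fulfilment
splunk show cluster-status
splunk list excess-buckets
```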
Does this mean all of those buckets are corrupt (around 20-30 different ones are listed)? That will be interesting to have the storage people explain. But even if so: what does this have to do with newly incoming data being duplicated?
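To at least partially answer my own corruption question from the search layer: dbinspect has a corruptonly option, though I'm not sure it flags this particular journal.gz/statSize situation:

```
| dbinspect index=* corruptonly=true
| table splunk_server, index, bucketId, state, path
```

There is also a `splunk fsck scan --all-buckets-all-indexes` CLI command on the indexers themselves, if someone thinks that is the better way to check.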
**Edit**: Two indexers are clustered with RF=2, one indexer in datacenter X and one in datacenter Y, plus three Search Heads with SF=2. The Search Heads seem to be working fine (two in datacenter X, one in datacenter Y).
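For completeness, the clustering stanza on the master corresponds to that setup; this is just the generic form, not a dump of our actual server.conf:

```
# server.conf on the cluster master
[clustering]
mode = master
replication_factor = 2
search_factor = 2
```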
Skalli