Hello Splunkers.
I am facing an issue while migrating legacy standalone Splunk instances to a multisite cluster environment. I tried to perform the changes today but rolled them back after hitting the issue. Can you please advise?
Scenario: Migration of legacy data from non-clustered Splunk instances acting as standalone servers to a multisite cluster (2 nodes in each site).
Procedure followed:
- Upgraded the source server to match the target Splunk server version, v6.4.1.
1) Created the required indexes on the target Splunk Enterprise servers (multisite environment).
2) Rolled the buckets and rsynced the data from the non-clustered Splunk instances to all target Splunk indexers, with 1 copy of each bucket on all 4 target indexers.
3) Copied the data into the target indexes (source db -> target db and source colddb -> target colddb) and performed bucket scrubbing to prevent bucket ID collisions.
4) Rebuilt the buckets for each index (e.g. ./splunk _internal call /data/indexes/customindexname/rebuild-metadata-and-manifests).
5) Performed a rolling restart from the Cluster Master, and all indexers went down.
6) After the issue, checked the .bucketManifest file and found that the "origin_site" header was set to default. Tried updating origin_site for each index on every indexer, according to the site the indexer server belongs to.
7) Performed a rolling restart from the Cluster Master again, with the same result.
8) After the restart, the indexer servers crash with the assertion failure "Cannot disable indexes on a clustering slave."
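For reference, a minimal Python sketch of the kind of bucket ID collision check that step 3 refers to. It is not the exact scrubbing procedure used here. It assumes the usual warm/cold bucket directory naming (db_<newest>_<oldest>_<localid> for standalone buckets, with an optional _<guid> suffix and an rb_ prefix on cluster peers). The function names parse_bucket and find_id_collisions are made up for illustration.

```python
import re
from collections import defaultdict

# Assumed directory naming for warm/cold buckets:
#   db_<newest_epoch>_<oldest_epoch>_<local_id>            (standalone)
#   db_/rb_<newest_epoch>_<oldest_epoch>_<local_id>_<guid> (cluster peers)
BUCKET_RE = re.compile(
    r"^(?P<prefix>db|rb)_(?P<newest>\d+)_(?P<oldest>\d+)_(?P<local_id>\d+)"
    r"(?:_(?P<guid>[0-9A-Fa-f-]{36}))?$"
)


def parse_bucket(name):
    """Return (prefix, newest, oldest, local_id, guid) or None if not a bucket dir."""
    m = BUCKET_RE.match(name)
    if m is None:
        return None
    d = m.groupdict()
    return (d["prefix"], int(d["newest"]), int(d["oldest"]),
            int(d["local_id"]), d["guid"])


def find_id_collisions(names):
    """Group bucket directory names of ONE index by (local_id, guid).

    Any group with more than one member would collide inside that index.
    Names that do not look like warm/cold buckets (e.g. hot_v1_*) are skipped.
    """
    by_key = defaultdict(list)
    for name in names:
        parsed = parse_bucket(name)
        if parsed is None:
            continue
        _, _, _, local_id, guid = parsed
        by_key[(local_id, guid)].append(name)
    return {key: sorted(group) for key, group in by_key.items() if len(group) > 1}
```

To use it, pass the combined directory listing of one index's db and colddb paths (for example from os.listdir) on a target indexer. An empty result means no two bucket directories share the same local ID and GUID.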
Below is a snippet of the crash log from one of the indexers:
[build debde650d26e] 2016-10-09 09:03:36
Received fatal signal 6 (Aborted).
Cause:
Signal sent by PID 24368 running under UID 33335.
Crashing thread: SplunkdSpecificInitThread
Registers:
RIP: [0x00007F6DA4628625] gsignal + 53 (/lib64/libc.so.6 + 0x32625)
RDI: [0x0000000000005F30]
RSI: [0x0000000000005F3B]
RBP: [0x00007F6DA7690638]
RSP: [0x00007F6D9F5FE3F8]
RAX: [0x0000000000000000]
RBX: [0x00007F6DA5B08000]
RCX: [0xFFFFFFFFFFFFFFFF]
RDX: [0x0000000000000006]
R8: [0x00007F6D9BC00000]
R9: [0x00007F6DA0D5F880]
R10: [0x0000000000000008]
R11: [0x0000000000000206]
R12: [0x00007F6DA7690AE8]
R13: [0x00007F6DA7745980]
R14: [0x00007F6D9F04A460]
R15: [0x00007F6D9F5FE8D0]
EFL: [0x0000000000000206]
TRAPNO: [0x0000000000000000]
ERR: [0x0000000000000000]
CSGSFS: [0x0000000000000033]
OLDMASK: [0x0000000000000000]
OS: Linux
Arch: x86-64
Backtrace (PIC build):
[0x00007F6DA4628625] gsignal + 53 (/lib64/libc.so.6 + 0x32625)
[0x00007F6DA4629E05] abort + 373 (/lib64/libc.so.6 + 0x33E05)
[0x00007F6DA462174E] ? (/lib64/libc.so.6 + 0x2B74E)
[0x00007F6DA4621810] __assert_perror_fail + 0 (/lib64/libc.so.6 + 0x2B810)
[0x00007F6DA6511C4D] _ZN14IndexerService35disableIndexesAndReinitGlobalConfigERKN9__gnu_cxx17__normal_iteratorIPK3StrSt6vectorIS2_SaIS2_EEEESA_ + 1741 (splunkd + 0x9BAC4D)
[0x00007F6DA6512C27] _ZN14IndexerService18initPerIndexConfigEP9StrVectorb + 455 (splunkd + 0x9BBC27)
[0x00007F6DA65151F1] _ZN14IndexerService12reloadConfigERK14IndexConfigRef + 481 (splunkd + 0x9BE1F1)
[0x00007F6DA6AF06A0] _ZN9EventLoop20internal_runInThreadEP13InThreadActorb + 256 (splunkd + 0xF996A0)
[0x00007F6DA6511128] _ZN14IndexerService16loadLatestConfigEP14IndexConfigRef + 808 (splunkd + 0x9BA128)
[0x00007F6DA651129B] _ZN14IndexerService16loadLatestConfigEv + 43 (splunkd + 0x9BA29B)
[0x00007F6DA65158EB] _ZN14IndexerServiceC2Ev + 859 (splunkd + 0x9BE8EB)
[0x00007F6DA6515D87] _ZN14IndexerService14_new_singletonEv + 55 (splunkd + 0x9BED87)
[0x00007F6DA61B755F] _ZN25SplunkdSpecificInitThread4mainEv + 159 (splunkd + 0x66055F)
[0x00007F6DA6BADC00] _ZN6Thread8callMainEPv + 64 (splunkd + 0x1056C00)
[0x00007F6DA4991AA1] ? (/lib64/libpthread.so.0 + 0x7AA1)
[0x00007F6DA46DE93D] clone + 109 (/lib64/libc.so.6 + 0xE893D)
Linux / vvslm0123.vodafone.com.au / 2.6.32-504.30.3.el6.x86_64 / #1 SMP Thu Jul 9 15:20:47 EDT 2015 / x86_64
Last few lines of stderr (may contain info on assertion failure, but also could be old):
2016-10-09 08:57:40.610 +1000 splunkd started (build debde650d26e)
splunkd: /home/build/build-src/galaxy/src/pipeline/indexer/IndexerService.cpp:921: void IndexerService::disableIndexesAndReinitGlobalConfig(const const_iterator&, const const_iterator&): Assertion `0 && "Cannot disable indexes on a clustering slave."' failed.
2016-10-09 08:58:49.564 +1000 splunkd started (build debde650d26e)
2016-10-09 09:03:15.382 +1000 Interrupt signal received
2016-10-09 09:03:33.212 +1000 splunkd started (build debde650d26e)
splunkd: /home/build/build-src/galaxy/src/pipeline/indexer/IndexerService.cpp:921: void IndexerService::disableIndexesAndReinitGlobalConfig(const const_iterator&, const const_iterator&): Assertion `0 && "Cannot disable indexes on a clustering slave."' failed.
/etc/redhat-release: Red Hat Enterprise Linux Server release 6.6 (Santiago)
glibc version: 2.12
glibc release: stable
Last errno: 2
Threads running: 21
Runtime: 3.295799s
argv: [splunkd -p 8089 start]
Thread: "SplunkdSpecificInitThread", did_join=0, ready_to_run=Y, main_thread=N
First 8 bytes of Thread token @0x7f6da27c6710:
00000000 00 f7 5f 9f 6d 7f 00 00 |.._.m...|
00000008
InThreadActor @0x7f6d9f5fea20: _queuedOn=(nil), ran=N, wantWake=Y, wantFailIfLoopDone=N
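Since the crash log itself warns that the stderr lines "could be old", here is a small Python sketch that pairs each assertion failure with the most recent "splunkd started" timestamp before it, to show which startup attempts actually hit the assertion. The regular expressions are based only on the line formats visible in the log above, and the function name assertions_by_start is made up for illustration.

```python
import re

# Line formats taken from the stderr excerpt in the crash log above.
START_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3} [+-]\d{4}) splunkd started"
)
ASSERT_RE = re.compile(r"Assertion `(.*)' failed\.")


def assertions_by_start(lines):
    """Return a list of (start_timestamp, assertion_text) pairs.

    start_timestamp is the latest 'splunkd started' time seen before the
    assertion line, or None if no start line preceded it.
    """
    results = []
    current_start = None
    for raw in lines:
        line = raw.strip()
        started = START_RE.match(line)
        if started:
            current_start = started.group(1)
            continue
        failed = ASSERT_RE.search(line)
        if failed:
            results.append((current_start, failed.group(1)))
    return results
```

Run against the stderr excerpt above, this pairs the assertion with the 08:57:40 and 09:03:33 starts, while the 08:58:49 start ends with an interrupt signal instead.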
Please advise.
Thanks