While running a query via EMR on a bucket archived to s3 with hadoop data roll, I got the following error:
[hadoop] [ip-192-168-4-184] Streamed search execute failed because: Error reading compressed journal while streaming: gzip data truncated, provider=StdinGzDataProvider
Does this mean that one of the archived journal.gz files is corrupt? If so:
* How can I figure out how it got corrupted?
* How do I figure out which one and fix it?
This is still in test phase, so I have all the archived buckets on my indexer still. I'm trying to validate that the archival mechanism is safe and reliable.
↧