When Hunk pushes buckets to HDFS, it also pushes a few small supporting files (metadata, etc.). Given Hadoop's well-known problems with large numbers of small files, is that something that's been looked at?
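To give a sense of what I'm trying to quantify, here's a rough Python sketch that counts how many archived files fall under a "small file" threshold. The `/splunk/archive` path and the 1 MB cutoff are just placeholders for my environment, not anything Hunk-specific:

```python
import subprocess
from collections import Counter

ARCHIVE_PATH = "/splunk/archive"    # hypothetical archive root in HDFS
SMALL_FILE_BYTES = 1 * 1024 * 1024  # treat anything under 1 MB as "small"

# Recursively list the archive directory; `hdfs dfs -ls -R` prints one
# entry per line: perms, replication, owner, group, size, date, time, path.
out = subprocess.run(
    ["hdfs", "dfs", "-ls", "-R", ARCHIVE_PATH],
    capture_output=True, text=True, check=True,
).stdout

sizes = Counter()
for line in out.splitlines():
    parts = line.split()
    if len(parts) < 8 or parts[0].startswith("d"):
        continue  # skip directories and any non-entry lines
    size = int(parts[4])
    sizes["small" if size < SMALL_FILE_BYTES else "large"] += 1

print(f"files under threshold: {sizes['small']}")
print(f"files at/above threshold: {sizes['large']}")
```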
When using Hunk archiving on clustered indexers, my understanding is that if buckets have two or more searchable copies, searches against the archived data will return duplicate results. Is that correct?
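For context, here's a rough sketch of how I've been checking whether the same logical bucket shows up more than once in the archive. It assumes clustered bucket directories follow the usual `db_`/`rb_` naming, where copies of one bucket share the same local-ID and origin-GUID tail; the archive path is again a placeholder:

```python
import re
import subprocess
from collections import defaultdict

ARCHIVE_PATH = "/splunk/archive"  # hypothetical archive root in HDFS

# Clustered bucket directories are typically named
#   db_<newest>_<oldest>_<localid>_<origin-guid>  (primary copies)
#   rb_<newest>_<oldest>_<localid>_<origin-guid>  (replicated copies)
# so copies of one logical bucket share the <localid>_<guid> tail.
BUCKET_RE = re.compile(r"(?:db|rb)_\d+_\d+_(\d+_[0-9A-Fa-f-]+)$")

out = subprocess.run(
    ["hdfs", "dfs", "-ls", "-R", ARCHIVE_PATH],
    capture_output=True, text=True, check=True,
).stdout

copies = defaultdict(list)
for line in out.splitlines():
    parts = line.split()
    if not parts:
        continue
    m = BUCKET_RE.search(parts[-1])  # last field is the path
    if m:
        copies[m.group(1)].append(parts[-1])

for bucket_id, paths in copies.items():
    if len(paths) > 1:  # same logical bucket archived more than once
        print(f"duplicate copies of bucket {bucket_id}:")
        for p in paths:
            print(f"  {p}")
```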