
How to configure 6.5.0 data roll to search archived buckets in S3?

I followed the instructions in [the documentation for archiving to S3 in 6.5.0](http://docs.splunk.com/Documentation/Splunk/6.5.0/Indexer/ArchivingSplunkindexestoS3), but Splunk still can't find the jars it wants. How do I properly configure the jars for searching S3 archived buckets?

I ran the `| archivebuckets` command and it worked fine and archived the buckets, but the search errors out saying it can't find the jars:

    [HadoopProvider] Error in 'ExternalResultProvider': Hadoop CLI may not be set correctly.
    Please check HADOOP_HOME and Default Filesystem in the provider settings for this virtual index.
    Running /opt/hadoop/bin/hadoop fs -stat s3a://bucketname/prefix/ should return successfully, rc=255, error=-stat: Fatal internal error
    java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2195)

I ran the command it wanted myself, and I could only get it to work by providing the `-libjars` option:

    $ /opt/hadoop/bin/hadoop fs -libjars $HADOOP_TOOLS/hadoop-aws-2.7.2.jar,$HADOOP_TOOLS/aws-java-sdk-1.7.4.jar,$HADOOP_TOOLS/jackson-databind-2.2.3.jar,$HADOOP_TOOLS/jackson-core-2.2.3.jar,$HADOOP_TOOLS/jackson-annotations-2.2.3.jar -Dfs.s3a.access.key=value -Dfs.s3a.secret.key=value -stat s3a://bucketname/prefix/
    1970-01-01 00:00:00

Exporting the jars via `HADOOP_CLASSPATH` instead does not help:

    $ export HADOOP_CLASSPATH=$HADOOP_TOOLS/hadoop-aws-2.7.2.jar,$HADOOP_TOOLS/aws-java-sdk-1.7.4.jar,$HADOOP_TOOLS/jackson-databind-2.2.3.jar,$HADOOP_TOOLS/jackson-core-2.2.3.jar,$HADOOP_TOOLS/jackson-annotations-2.2.3.jar
    $ /opt/hadoop/bin/hadoop classpath
    /opt/hadoop/etc/hadoop:/opt/hadoop/share/hadoop/common/lib/*:/opt/hadoop/share/hadoop/common/*:/opt/hadoop/share/hadoop/hdfs:/opt/hadoop/share/hadoop/hdfs/lib/*:/opt/hadoop/share/hadoop/hdfs/*:/opt/hadoop/share/hadoop/yarn/lib/*:/opt/hadoop/share/hadoop/yarn/*:/opt/hadoop/share/hadoop/mapreduce/lib/*:/opt/hadoop/share/hadoop/mapreduce/*:/opt/hadoop/share/hadoop/tools/lib/hadoop-aws-2.7.2.jar,/opt/hadoop/share/hadoop/tools/lib/aws-java-sdk-1.7.4.jar,/opt/hadoop/share/hadoop/tools/lib/jackson-databind-2.2.3.jar,/opt/hadoop/share/hadoop/tools/lib/jackson-core-2.2.3.jar,/opt/hadoop/share/hadoop/tools/lib/jackson-annotations-2.2.3.jar:/opt/hadoop/contrib/capacity-scheduler/*.jar
    $ /opt/hadoop/bin/hadoop fs -Dfs.s3a.access.key=value -Dfs.s3a.secret.key=value -stat s3a://bucketname/prefix/
    -stat: Fatal internal error
    java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2195)
    Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found
        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101)
        ... 16 more
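(One thing I'm unsure about: if `HADOOP_CLASSPATH` is treated as a normal Java classpath, the entries should presumably be colon-separated rather than comma-separated, and the comma-joined string does show up verbatim as a single entry in the `hadoop classpath` output above. A sketch of what I assume the colon-separated form would look like, using the same jar paths:

    # Assumption: HADOOP_CLASSPATH uses ':' like any Java classpath on Unix
    $ export HADOOP_TOOLS=/opt/hadoop/share/hadoop/tools/lib
    $ export HADOOP_CLASSPATH=$HADOOP_TOOLS/hadoop-aws-2.7.2.jar:$HADOOP_TOOLS/aws-java-sdk-1.7.4.jar:$HADOOP_TOOLS/jackson-databind-2.2.3.jar:$HADOOP_TOOLS/jackson-core-2.2.3.jar:$HADOOP_TOOLS/jackson-annotations-2.2.3.jar
    $ /opt/hadoop/bin/hadoop fs -Dfs.s3a.access.key=value -Dfs.s3a.secret.key=value -stat s3a://bucketname/prefix/

Even if that works from the shell, I don't know whether the Hadoop CLI that Splunk invokes would pick the variable up.)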
Here is my provider configuration:

    [provider:HadoopProvider]
    vix.family = hadoop
    vix.splunk.setup.package = /opt/splunk_package.tgz
    vix.env.JAVA_HOME = /usr/lib/jvm/java-7-openjdk-amd64
    vix.env.HADOOP_HOME = /opt/hadoop
    vix.env.HADOOP_TOOLS = /opt/hadoop/share/hadoop/tools/lib
    vix.splunk.home.datanode = /opt/splunk
    vix.splunk.home.hdfs = /working-dir
    vix.splunk.jars = $HADOOP_TOOLS/hadoop-aws-2.7.2.jar,$HADOOP_TOOLS/aws-java-sdk-1.7.4.jar,$HADOOP_TOOLS/jackson-databind-2.2.3.jar,$HADOOP_TOOLS/jackson-core-2.2.3.jar,$HADOOP_TOOLS/jackson-annotations-2.2.3.jar
    vix.mapreduce.framework.name = yarn
    vix.yarn.resourcemanager.address = <%= ENV['HADOOP_MASTER'] %>:8032
    vix.yarn.resourcemanager.scheduler.address = <%= ENV['HADOOP_MASTER'] %>:8030
    vix.fs.s3a.access.key = <%= ENV['S3_ARCHIVE_ACCESS_KEY'] %>
    vix.fs.s3a.secret.key = <%= ENV['S3_ARCHIVE_SECRET_KEY'] %>
    vix.fs.default.name = s3a://<%= ENV['SPLUNK_HADOOP_BUCKET'] %>/prefix

    [main_archive]
    vix.provider = HadoopProvider
    vix.output.buckets.from.indexes = main
    vix.output.buckets.older.than = 1
    vix.output.buckets.path = s3a://<%= ENV['SPLUNK_HADOOP_BUCKET'] %>/prefix

I'm running against a vanilla Apache Hadoop tarball, version 2.7.2. I'm not sure which commands are being run against the Hadoop cluster, but I'm working against an AWS EMR cluster of the same Hadoop version.
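In case it helps, this is the kind of change I've been considering, as a sketch only: assuming a `vix.env.<NAME>` setting is exported as an environment variable for the Hadoop CLI that Splunk launches (the way `vix.env.JAVA_HOME` and `vix.env.HADOOP_HOME` above appear to be), adding a colon-separated `vix.env.HADOOP_CLASSPATH` pointing at the S3A jars might get them onto the CLI's classpath:

    [provider:HadoopProvider]
    # Hypothetical addition: colon-separated jar list exported as HADOOP_CLASSPATH
    # for the hadoop CLI. I have not confirmed this setting against the 6.5.0 docs.
    vix.env.HADOOP_CLASSPATH = /opt/hadoop/share/hadoop/tools/lib/hadoop-aws-2.7.2.jar:/opt/hadoop/share/hadoop/tools/lib/aws-java-sdk-1.7.4.jar:/opt/hadoop/share/hadoop/tools/lib/jackson-databind-2.2.3.jar:/opt/hadoop/share/hadoop/tools/lib/jackson-core-2.2.3.jar:/opt/hadoop/share/hadoop/tools/lib/jackson-annotations-2.2.3.jar

I haven't verified that `vix.env.HADOOP_CLASSPATH` is honored, so treat it as a guess rather than a confirmed setting. Is this the right approach, or is there a supported way to point the provider at the S3A jars?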



