I am looking for suggestions to improve my Python SDK script implementation. Would changing EXEC_MODE or OUTPUT_MODE to another value help?
I am using a Python SDK (splunk-sdk-python-1.6.2) script from the examples directory (search.py) on a heavy forwarder to collect search results from a Splunk Enterprise server, write them to a file, monitor the file, and forward the data to Splunk Cloud.
I've wrapped the search.py script in a BASH shell script, and it runs with some success from the splunk user's crontab every minute. Initially, data is collected and everything appears to work. However, after a few minutes I start to receive HTTP Error 503 (too many HTTP threads) and socket timeout errors (errno 110).
Eventually, the host's memory utilization is so high that it is no longer reachable and needs to be rebooted. I can see a variety of processes spawned, such as kthreadd, ksoftirqd/0, kworker/0:0H and the like.
I know the one minute, repeated execution is a lot and am working with the requestors to change that requirement. In addition, I have asked them to consider forwarding the data directly to Splunk Cloud. In the meantime, I am trying to get a stable implementation working.
***The BASH wrapper:***
# Modify this file if you need to change PYTHONPATH, host, port, username or password
SCRIPT_HOME=/opt/splunk/etc/apps/gcs-shippingapi/bin
source $SCRIPT_HOME/gcs-shippingapi-hostcred.cfg
# The time string can be either a UTC time (with fractional seconds), a relative time specifier (to now) or a formatted time string.
EARLIEST='-2m@m'
LATEST='-1m@m'
# Execution mode valid values: (blocking | oneshot | normal); default=normal
# Refer to the following for more information: http://dev.splunk.com/view/python-sdk/SP-CAAAEE5
EXEC_MODE='oneshot'
# Output mode valid values: (atom | csv | json | json_cols | json_rows | raw | xml); default=xml
OUTPUT_MODE='raw'
SEARCH='search Org=pitneybowes AND Env=prod AND EndpointName= AND responseStatus='
/opt/splunk/bin/python $SCRIPT_HOME/search.py "$SEARCH" --host=$SPLUNK_HOST --port=$PORT --username=$SPLUNK_USERNAME --password=$SPLUNK_PASSWORD --output_mode=$OUTPUT_MODE --earliest_time=$EARLIEST --latest_time=$LATEST
***Cron Error Message #1:***
Traceback (most recent call last):
File "/opt/splunk/etc/apps/gcs-shippingapi/bin/search.py", line 115, in <module>
main(sys.argv[1:])
File "/opt/splunk/etc/apps/gcs-shippingapi/bin/search.py", line 72, in main
service = client.connect(**kwargs_splunk)
File "/opt/splunk-sdk-python-1.6.2/splunklib/client.py", line 321, in connect
s.login()
File "/opt/splunk-sdk-python-1.6.2/splunklib/binding.py", line 857, in login
cookie="1") # In Splunk 6.2+, passing "cookie=1" will return the "set-cookie" header
File "/opt/splunk-sdk-python-1.6.2/splunklib/binding.py", line 1201, in post
return self.request(url, message)
File "/opt/splunk-sdk-python-1.6.2/splunklib/binding.py", line 1221, in request
raise HTTPError(response)
splunklib.binding.HTTPError: HTTP 503 Too many HTTP threads (628) already running, try again later --
Too many HTTP threads (628) already running, try again later
The server can not presently handle the given request.
***Cron Error Message #2:***
Traceback (most recent call last):
File "/opt/splunk/etc/apps/gcs-shippingapi/bin/search.py", line 115, in <module>
main(sys.argv[1:])
File "/opt/splunk/etc/apps/gcs-shippingapi/bin/search.py", line 72, in main
service = client.connect(**kwargs_splunk)
File "/opt/splunk-sdk-python-1.6.2/splunklib/client.py", line 321, in connect
s.login()
File "/opt/splunk-sdk-python-1.6.2/splunklib/binding.py", line 857, in login
cookie="1") # In Splunk 6.2+, passing "cookie=1" will return the "set-cookie" header
File "/opt/splunk-sdk-python-1.6.2/splunklib/binding.py", line 1201, in post
return self.request(url, message)
File "/opt/splunk-sdk-python-1.6.2/splunklib/binding.py", line 1218, in request
response = self.handler(url, message, **kwargs)
File "/opt/splunk-sdk-python-1.6.2/splunklib/binding.py", line 1357, in request
connection.request(method, path, body, head)
File "/opt/splunk/lib/python2.7/httplib.py", line 1042, in request
self._send_request(method, url, body, headers)
File "/opt/splunk/lib/python2.7/httplib.py", line 1082, in _send_request
self.endheaders(body)
File "/opt/splunk/lib/python2.7/httplib.py", line 1038, in endheaders
self._send_output(message_body)
File "/opt/splunk/lib/python2.7/httplib.py", line 882, in _send_output
self.send(msg)
File "/opt/splunk/lib/python2.7/httplib.py", line 844, in send
self.connect()
File "/opt/splunk/lib/python2.7/httplib.py", line 1255, in connect
HTTPConnection.connect(self)
File "/opt/splunk/lib/python2.7/httplib.py", line 821, in connect
self.timeout, self.source_address)
File "/opt/splunk/lib/python2.7/socket.py", line 575, in create_connection
raise err
socket.error: [Errno 110] Connection timed out
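One way to keep per-minute cron runs from piling up while the server is slow is to skip a run when the previous one is still active. The following is a minimal sketch using only the Python standard library on Linux. The function name and the lock file path are hypothetical, and nothing here is part of the SDK's search.py; it only illustrates the guard that could wrap the existing call.

```python
import fcntl
import os
import tempfile


def try_acquire_lock(lock_path):
    """Return an open file holding an exclusive lock, or None if another run holds it."""
    handle = open(lock_path, "w")
    try:
        # LOCK_NB makes flock fail immediately instead of waiting for the other run.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except (IOError, OSError):
        handle.close()
        return None
    return handle


if __name__ == "__main__":
    # Hypothetical lock file location; any path writable by the splunk user would do.
    path = os.path.join(tempfile.gettempdir(), "gcs-shippingapi-search.lock")
    lock = try_acquire_lock(path)
    if lock is None:
        print("previous run still active, skipping this run")
    else:
        print("lock acquired, safe to run the search")
        lock.close()  # closing the file releases the lock
```

With a guard like this, a slow or failing run cannot be joined by a second and third copy a minute later, which is one plausible way the thread count and memory use could build up.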
↧
Splunk Python SDK - Causing HTTP 503 (HTTP Too Many Threads) and Socket Errno=110
↧
How to search for number of license violations over time
I'm looking to display my license violations (over my capacity) as a dashboard panel that I can show over time.
↧
↧
Splunk CLI search parse _raw into fields
I am using a locally installed Splunk instance to perform a remote search using the CLI.
splunk search "index=sandbox sourcetype=access http_status_code<400 earliest="10/01/2017:00:00:00" latest="10/02/2017:00:00:00"" -output csv -maxout 0 -max_time 0 -auth user:password -app remote_app -uri https://hostname:port > output.csv
"access" is a sourcetype that is defined on the remote Splunk enterprise server. When I get the results, how can I parse the _raw field into the individual fields that have field extractions defined on the remote Splunk server.
↧
Index retains old warm buckets
One of my indexes has a couple of old warm buckets that were closed for writing in 2014; the next oldest bucket is from 2017. When I use dbinspect to determine data age per index, these buckets throw off the accuracy of my report. Each one is less than 0.5 MB on disk.
How can I get rid of them? Can I manually roll them to cold, then frozen?
↧
↧
Chart Display value
Hi All,
I found that when a dashboard chart has too many columns, the x-axis values cannot be displayed.
Can we make the chart larger so that they are displayed?
↧
VM templating of Splunk instances
We plan to create virtual machine (VM) templates with Splunk pre-installed for internal use.
We assume the following steps should be taken with Splunk VM templates:
- Use the hostname or FQDN in conf files instead of the IP address directly.
- Separate credentials and apply them after VM instantiation:
default auth method, SSL/TLS certs, keys for clustering (e.g. pass4SymmKey)
- Allocate enough (dedicated) resources for VMs.
https://www.splunk.com/web_assets/pdfs/secure/Splunk_and_VMware_VMs_Tech_Brief.pdf
Does anyone have other ideas or points to note for VM templates?
From a DevOps point of view, templates should be minimal, like a pure OS image; Splunk and other applications should be installed and configured via provisioning tools (Ansible, Chef, Puppet, Salt Stack, ...) after the VMs are launched.
↧
Running one of two searches based on time picker selection
I am trying to create a dashboard panel which will run one of the following email searches. There are a number of inputs which allow a user to filter exactly what he/she wants to search for.
- One input allows a user to select the search criteria (sender, recipient, source IP, message id, etc.)
- Another input allows the user to enter the value being searched for.
- The last input is a time picker.
Each input is a separate token. So, if one wants to search for sender=john.doe@xyz.org, for example, those values (sender, john.doe@xyz.org) would each be passed to the search with tokens.
If the time selected from the time picker is within the last 24h, a search based on raw events (including index=, eventtype=, stats, etc.) should be run. If the time selected is historic (ie. more than 24h ago), I want to run a search based on a summary index (index=summary report=x).
I have been working to figure this out, but each attempt has been unsuccessful. Any assistance would be greatly appreciated.
Thank you.
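The selection rule described above can be sketched outside of Splunk as a small function. This only illustrates the decision logic and is not dashboard XML; the function name, the return labels and the 24-hour constant are assumptions made for the example.

```python
import time

DAY_SECONDS = 24 * 60 * 60


def choose_search(earliest_epoch, now=None):
    """Return "raw" when the range starts within the last 24h, otherwise "summary"."""
    if now is None:
        now = time.time()
    if now - earliest_epoch <= DAY_SECONDS:
        return "raw"      # run the search over raw events
    return "summary"      # run the search over the summary index


if __name__ == "__main__":
    # A range starting one hour ago selects the raw-event search.
    print(choose_search(time.time() - 3600))
```

In the dashboard, the same comparison would have to be applied to the earliest value of the time picker token, with the result stored in another token that decides which search runs.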
↧
Can we use the same property names (say "[setnull]", "[setparsing]") to define event data filtering criteria in two different apps (with different filtering criteria) residing on the same server?
I have two clustered environments, each consisting of 3 SHs, 3 indexers and 1 HWF, running Splunk 6.4.1. I need to filter out certain unwanted events coming from JMS queues and send them to the nullQueue.
We have two applications running on the same servers, with different criteria for filtering event data.
I added the following to props.conf on the HWF for one of the apps:
[my_sourcetype]
TRANSFORMS-set= setnull,setparsing
and this in transforms.conf
[setnull]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue
[setparsing]
REGEX = (?<=mbody=.{51}TQ-123|mbody=.{51}TQ-145)
DEST_KEY = queue
FORMAT = indexQueue
Can I use the same property names, i.e. "setnull" and "setparsing", in the transforms.conf of the other app and define a new regex there? Will doing so have any impact on filtering?
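The intended effect of the two transforms (send every event to nullQueue, then route matching events back to indexQueue) can be sketched in Python to check the regex against sample events. This only simulates the routing decision and says nothing about how Splunk resolves identically named stanzas across apps. The pattern below drops the lookbehind, which does not change whether an event matches; the function name and sample events are made up for the example.

```python
import re

# Same match condition as the setparsing REGEX, written without the lookbehind.
KEEP = re.compile(r"mbody=.{51}TQ-(?:123|145)")


def route(event):
    """Return the queue an event would end up in after setnull, then setparsing."""
    queue = "indexQueue"                 # default when no transform matches
    if re.search(r".", event):
        queue = "nullQueue"              # setnull: REGEX = . matches any non-empty event
    if KEEP.search(event):
        queue = "indexQueue"             # setparsing: applied second, overrides setnull
    return queue


if __name__ == "__main__":
    wanted = "mbody=" + "x" * 51 + "TQ-123 payload"
    unwanted = "mbody=" + "x" * 51 + "TQ-999 payload"
    print(route(wanted), route(unwanted))
```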
↧
↧
How to get data to a Splunk indexer without a forwarder for continuous monitoring?
Basically, I need to monitor Dell iDRAC and CMC logs.
↧
Splunk add-on for Servicenow
Hi All
I want to download the Splunk add-on for ServiceNow Event Management integration. According to the documentation (store.servicenow.com/sn_appstore_store.do#!/store/application/bac6db564f6a3100a0fc7d2ca310c721/1.1.4?referer=sn_appstore_store.do%23!%2Fstore%2Fsearch%3Fq%3Dsplunk), I should get the add-on from splunkbase.splunk.com/app/1928/. This link is not functional: the "Agree to download" button does not work.
Could someone please help me with the correct link to download the add-on?
Thanks..
↧
Nessus scan vulnerability duration
I am trying to find all vulnerabilities in Nessus scans that were first reported more than 15 days ago and are still present. My current search query works, but I can't help feeling that it is inefficient. Here it is:
sourcetype="nessus:scan" | fields severity, plugin_name, _time, host-ip | stats earliest(_time) as firsttime, latest(_time) as lasttime by plugin_name, severity, host-ip | eval now=now()| eval Days =((lasttime-firsttime)/86400), test=((now-lasttime)/86400), First_Sighting =strftime(firsttime,"%Y/%m/%d %I:%M:%S"), Last_Sighting =strftime(lasttime,"%Y/%m/%d %I:%M:%S") | where test<15 | eval Real-Days=round(Days, 0) | table plugin_name, severity, host-ip, First_Sighting, Last_Sighting, Real-Days | sort -Last_Sighting
Thanks
↧
Nessus exploitable vulnerabilities
Here, I am trying to find all vulnerabilities found during a Nessus scan that are exploitable. The exploit_available field is present only in the nessus:plugin sourcetype. I would like to correlate the exploitable vulnerabilities with the hosts in my network, which appear only in the nessus:scan sourcetype. My search query works but, once again, it takes a while to run. Any help fine-tuning this is welcome.
sourcetype="nessus:scan" plugin_id="*" [search sourcetype="nessus:plugin" exploit_available="true" id="*" | table plugin_name] | dedup plugin_id |table plugin_name, plugin_id, severity
Thanks
↧
↧
How to configure Splunk to extract key value pairs with JSON log data from Http Event Collector?
We have started using the Http Event Collector (HEC) for logging directly from our Java apps. HEC takes data in JSON format but we have a lot of legacy code that logs key/value pairs and some searches/dashboards that utilize these. Data logged to HEC is by default indexed as the _json sourcetype and I have tried to configure this with KV_MODE=auto (for key/value) and json (for json-format) but none of these seem to trigger Splunk to index key/values. Example log statement:
logger.info("corrId=11-1111-566 aa=88");
However, I have not been able to search on the keys, e.g. _search aa=88_
The event looks like this:
![alt text][1]
[1]: /storage/temp/217736-screenshot-2017-10-03-095137.png
Raw format: {"severity":"INFO","logger":"splunk.logger","thread":"main","message":"corrId=11-1111-566 aa=88"}
Any ideas?
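To make the goal concrete: the key/value pairs live inside the message string of the JSON event, so they only become fields if that string is parsed a second time. The following Python sketch shows that two-step parse on the raw event above. The regex and function name are illustrative assumptions, not Splunk's extraction logic.

```python
import json
import re

RAW = ('{"severity":"INFO","logger":"splunk.logger","thread":"main",'
       '"message":"corrId=11-1111-566 aa=88"}')

# Illustrative pattern: a word, an equals sign, then a run of non-space characters.
PAIR = re.compile(r"(\w+)=(\S+)")


def extract_pairs(raw_event):
    """Parse the JSON envelope first, then pull key=value pairs out of "message"."""
    message = json.loads(raw_event).get("message", "")
    return dict(PAIR.findall(message))


if __name__ == "__main__":
    print(extract_pairs(RAW))
```

For the sample event, this yields corrId and aa as separate keys, which is the result the search aa=88 would need.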
↧
unable to run query sendemail
sendemail command is not working in scheduled searches.
Query used.
| inputlookup testing.csv | map search=" | sendemail to=$email$ message=\"
Hi $realname$,
This is a test message
Many Thanks,
Splunk@abc.com
\" subject=\"Action Required: Testing\" " maxsearches=50.
Note: The lookup file contains fields such as emailid, realname, etc.
I get the following error messages when I check the job inspector:
error : command="sendemail", {} while sending mail to:
warn : The search result count (20) exceeds maximum (10), using max. To override it, set maxsearches appropriately.
warn: Unable to run query ' | sendemail to=$email$ message=" Hi "", This is a test message
↧
I am indexing reports as an excel file but after indexing I am getting field value for tag as error also event type as error. Can somebody please help me as the TA is not working and we are manually ingesting the excel file.
↧
Splunk showing gateway timeout
We're running Splunk in our environment. We can only access the Splunk instance via the IP address, but not the DNS address we have mapped to it.
For instance, we can go to this URL using the IP:
http://10.40.50.17:8000/en-US/app/launcher/home
Splunk is working fine. However if I go to:
http://splunk.mycompany.com:8000/en-US/app/launcher/home
I get a gateway timeout error:
> Gateway Timeout
Server error - server 10.40.50.17 is unreachable at this moment.
Please retry the request or contact your adminstrator.
I'm wondering where the issue may lie. Is this a problem with Splunk? Do I need to make [Splunk Administration][1] aware of the DNS address that I'm giving it?
[1]: https://goo.gl/ByMTq6
↧
↧
Change Notifications from AWS Config Service
Hi,
After a great .conf 2017, I decided to install the Splunk App for AWS and the associated AWS TA and I am having issues with getting Change Notifications into Splunk.
I think they are supported, at least according to this page: https://docs.splunk.com/Documentation/AddOns/released/AWS/Config but the handler.py for the ConfigNoticeParser has the following:
> _UNSUPPORTED_MESSAGE_TYPE = [
>     'ConfigurationItemChangeNotification',
>     'ConfigurationSnapshotDeliveryStarted',
>     'ComplianceChangeNotification',
>     'ConfigRulesEvaluationStarted',
> ]
In addition the overview dashboard has 0 configuration changes. I am reasonably certain that I am outputting Change notifications for the Config service, and that the configuration in AWS is for the right region, and includes all Supported resources, and global resources.
Am I doing something wrong?
PS, Splunk AWS app in general is great :)
↧
Doing stats on multivalued json fields
Hi Ninjas
I'm dealing with some deeply nested JSON events like:
"sendTime":"2017-09-21T17:02:06.583+02:00","runningProcess":[{"Name":"_Total","PercentProcessorTime":"100","WorkingSetPrivate":"1557368"},{"Name":"Bananaservice","PercentProcessorTime":"0","WorkingSetPrivate":"593"},{"Name":"Cherryservice","PercentProcessorTime":"0","WorkingSetPrivate":"7671"},{"Name":"Pineappleservice","PercentProcessorTime":"0","WorkingSetPrivate":"466"},{"Name":"Kiwiservice","PercentProcessorTime":"0","WorkingSetPrivate":"442"},{"Name":"Appleservice","PercentProcessorTime":"0","WorkingSetPrivate":"630"},{"Name":"Peachservice","PercentProcessorTime":"0","WorkingSetPrivate":"1470"}
All I want is to get the average values over time for each process, something like:
| stats avg(runningProcess{}.PercentProcessorTime) as CPU by runningProcess{}.Name, _time
| stats list(*) as * by _time
But without mvexpand and so on, I'm not getting the right data, because the search takes only the value of the first entry of the multivalue field in each event.
As said, I'm aware that it can be done with mvexpand etc., but that slows down the search dramatically, and I was wondering whether there is a more elegant way to get the right data here.
Thanks
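For reference, the result being asked for (one average per process name, pairing each Name with its own PercentProcessorTime) can be sketched in Python. The sample events are shortened, well-formed stand-ins for the event above, and the function name is made up for the example; this shows the expected output, not an SPL solution.

```python
import json
from collections import defaultdict


def average_cpu_by_process(events):
    """Average PercentProcessorTime per process Name across all events."""
    totals = defaultdict(lambda: [0.0, 0])  # name -> [sum, count]
    for raw in events:
        for proc in json.loads(raw)["runningProcess"]:
            entry = totals[proc["Name"]]
            entry[0] += float(proc["PercentProcessorTime"])
            entry[1] += 1
    return {name: total / count for name, (total, count) in totals.items()}


# Shortened stand-in events (hypothetical values).
SAMPLE = [
    '{"runningProcess":[{"Name":"_Total","PercentProcessorTime":"100"},'
    '{"Name":"Bananaservice","PercentProcessorTime":"0"}]}',
    '{"runningProcess":[{"Name":"_Total","PercentProcessorTime":"50"},'
    '{"Name":"Bananaservice","PercentProcessorTime":"10"}]}',
]

if __name__ == "__main__":
    print(average_cpu_by_process(SAMPLE))
```

The key point is that every array element is visited, which is the step that is lost when only the first entry of the multivalue field is used.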
↧
Incomplete JSON ingested.
Hi,
I am using the REST API modular input addon to monitor an elasticsearch instance on the stats api endpoint. The output is in JSON format and has an average of 1200 lines.
I am using Heavy Forwarders and I have the following settings
inputs.conf
[rest://elastic-stats]
source = elastic-stats
auth_type = none
endpoint = http://localhost:9200/_nodes/stats
http_method = GET
index = main
index_error_response_codes = 0
polling_interval = 60
request_timeout = 55
response_type = json
sequential_mode = 0
sourcetype = elastic-stats
streaming_request = 0
props.conf
[elastic-stats]
SHOULD_LINEMERGE = false
LINE_BREAKER = {"cluster_name":
category = json_no_timestamp
pulldown_type = 1
disabled = false
The JSON output of the Elasticsearch API endpoint contains:
Total Characters    Total Words    Total Lines    Size
26793               2230           1229           27.36 KB
But only the first 10000 characters get indexed.
Please assist & thank you in advance.
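The 10000-character cutoff matches the default value of the TRUNCATE setting in props.conf, so that is a likely place to look, although this is an assumption about the cause rather than a confirmed diagnosis. The Python sketch below only shows why a document cut at that length can no longer be parsed as JSON; the payload is synthetic and the function names are made up for the example.

```python
import json


def is_complete_json(text):
    """Return True if text parses as a complete JSON document."""
    try:
        json.loads(text)
    except ValueError:
        return False
    return True


def truncate(text, limit=10000):
    """Keep only the first `limit` characters, as a length cap on an event would."""
    return text[:limit]


# Synthetic stand-in for a large stats response (not real Elasticsearch output).
payload = json.dumps(
    {"cluster_name": "demo", "nodes": {str(i): {"docs": i} for i in range(2000)}}
)

if __name__ == "__main__":
    print(len(payload), is_complete_json(payload), is_complete_json(truncate(payload)))
```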
↧