Questions in topic: "splunk-enterprise"

Splunk Python SDK - Causing HTTP 503 (HTTP Too Many Threads) and Socket Errno=110

Suggestions for improving the Python SDK script implementation are requested. Would modifying EXEC_MODE or OUTPUT_MODE to another value help? I am using a Python SDK (splunk-sdk-python-1.6.2) script from the examples directory (search.py) on a heavy forwarder to collect search results from a Splunk Enterprise server, write them to a file, monitor the file, and forward it to Splunk Cloud. I've wrapped the search.py script in a BASH shell script, and it executes somewhat successfully from the splunk user's crontab every minute. Initially, data is collected and everything appears to work fine. However, after a few minutes I start to receive HTTP Error 503 (too many HTTP threads) and socket timeout errors (errno 110). Eventually, the host's memory utilization gets so high that the host is no longer reachable and needs to be rebooted. I can see a variety of spawned processes, like kthreadd, ksoftirqd/0, kworker/0:0H and the like. I know the repeated one-minute execution is a lot, and I am working with the requestors to change that requirement. In addition, I have asked them to consider forwarding the data directly to Splunk Cloud. In the meantime, I am trying to get a stable implementation working.

The BASH wrapper:

    # Modify this file if you need to change PYTHONPATH, host, port, username or password
    SCRIPT_HOME=/opt/splunk/etc/apps/gcs-shippingapi/bin
    source $SCRIPT_HOME/gcs-shippingapi-hostcred.cfg
    # The time string can be either a UTC time (with fractional seconds),
    # a relative time specifier (to now) or a formatted time string.
    EARLIEST='-2m@m'
    LATEST='-1m@m'
    # Execution mode valid values: (blocking | oneshot | normal); default=normal
    # Refer to the following for more information: http://dev.splunk.com/view/python-sdk/SP-CAAAEE5
    EXEC_MODE='oneshot'
    # Output mode valid values: (atom | csv | json | json_cols | json_rows | raw | xml); default=xml
    OUTPUT_MODE='raw'
    SEARCH='search Org=pitneybowes AND Env=prod AND EndpointName= AND responseStatus='
    /opt/splunk/bin/python $SCRIPT_HOME/search.py "$SEARCH" --host=$SPLUNK_HOST --port=$PORT \
        --username=$SPLUNK_USERNAME --password=$SPLUNK_PASSWORD --output_mode=$OUTPUT_MODE \
        --earliest_time=$EARLIEST --latest_time=$LATEST

Cron error message #1:

    Traceback (most recent call last):
      File "/opt/splunk/etc/apps/gcs-shippingapi/bin/search.py", line 115, in <module>
        main(sys.argv[1:])
      File "/opt/splunk/etc/apps/gcs-shippingapi/bin/search.py", line 72, in main
        service = client.connect(**kwargs_splunk)
      File "/opt/splunk-sdk-python-1.6.2/splunklib/client.py", line 321, in connect
        s.login()
      File "/opt/splunk-sdk-python-1.6.2/splunklib/binding.py", line 857, in login
        cookie="1")  # In Splunk 6.2+, passing "cookie=1" will return the "set-cookie" header
      File "/opt/splunk-sdk-python-1.6.2/splunklib/binding.py", line 1201, in post
        return self.request(url, message)
      File "/opt/splunk-sdk-python-1.6.2/splunklib/binding.py", line 1221, in request
        raise HTTPError(response)
    splunklib.binding.HTTPError: HTTP 503 Too many HTTP threads (628) already running, try again later
    -- Too many HTTP threads (628) already running, try again later
    The server can not presently handle the given request.
Cron error message #2:

    Traceback (most recent call last):
      File "/opt/splunk/etc/apps/gcs-shippingapi/bin/search.py", line 115, in <module>
        main(sys.argv[1:])
      File "/opt/splunk/etc/apps/gcs-shippingapi/bin/search.py", line 72, in main
        service = client.connect(**kwargs_splunk)
      File "/opt/splunk-sdk-python-1.6.2/splunklib/client.py", line 321, in connect
        s.login()
      File "/opt/splunk-sdk-python-1.6.2/splunklib/binding.py", line 857, in login
        cookie="1")  # In Splunk 6.2+, passing "cookie=1" will return the "set-cookie" header
      File "/opt/splunk-sdk-python-1.6.2/splunklib/binding.py", line 1201, in post
        return self.request(url, message)
      File "/opt/splunk-sdk-python-1.6.2/splunklib/binding.py", line 1218, in request
        response = self.handler(url, message, **kwargs)
      File "/opt/splunk-sdk-python-1.6.2/splunklib/binding.py", line 1357, in request
        connection.request(method, path, body, head)
      File "/opt/splunk/lib/python2.7/httplib.py", line 1042, in request
        self._send_request(method, url, body, headers)
      File "/opt/splunk/lib/python2.7/httplib.py", line 1082, in _send_request
        self.endheaders(body)
      File "/opt/splunk/lib/python2.7/httplib.py", line 1038, in endheaders
        self._send_output(message_body)
      File "/opt/splunk/lib/python2.7/httplib.py", line 882, in _send_output
        self.send(msg)
      File "/opt/splunk/lib/python2.7/httplib.py", line 844, in send
        self.connect()
      File "/opt/splunk/lib/python2.7/httplib.py", line 1255, in connect
        HTTPConnection.connect(self)
      File "/opt/splunk/lib/python2.7/httplib.py", line 821, in connect
        self.timeout, self.source_address)
      File "/opt/splunk/lib/python2.7/socket.py", line 575, in create_connection
        raise err
    socket.error: [Errno 110] Connection timed out
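One mitigation worth trying while the one-minute cadence stands: make sure each cron run gives its REST session back instead of leaving it open on the server. The following is a minimal sketch, not the stock examples/search.py; host, credentials, and the search text are placeholders, and it assumes splunk-sdk-python is on PYTHONPATH. It runs the search as a oneshot job and logs out in a finally block:

    # oneshot_search.py -- a minimal sketch, not the stock examples/search.py.
    # Assumptions: splunk-sdk-python on PYTHONPATH; host and credentials are
    # placeholders to be filled in from your hostcred config.
    import sys
    import splunklib.client as client

    def run():
        service = None
        try:
            service = client.connect(
                host="splunk.example.com",  # placeholder
                port=8089,
                username="svc_user",        # placeholder
                password="changeme")        # placeholder
            # jobs.oneshot blocks until the search completes and returns the
            # results stream; no search job is left behind to clean up.
            stream = service.jobs.oneshot(
                "search Org=pitneybowes AND Env=prod",  # placeholder search
                earliest_time="-2m@m",
                latest_time="-1m@m",
                output_mode="csv")
            sys.stdout.write(stream.read())
        finally:
            # Explicitly release the session token. A run that errors out or
            # is killed can otherwise leave its session (and an HTTP thread)
            # open on the server, compounding minute over minute.
            if service is not None:
                service.logout()

    if __name__ == "__main__":
        run()

A cron-side guard such as flock, so overlapping runs cannot stack, is another option that needs no code changes.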

How to search for number of license violations over time

I'm looking to display my license violations (usage over my licensed capacity) over time as a dashboard panel.
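A hedged starting point, assuming the search runs on (or can reach) the license master and a single pool: the daily RolloverSummary records in license_usage.log carry bytes indexed (b) and the pool size (poolsz), so days over quota can be flagged by comparing the two. Index retention and span are yours to adjust:

    index=_internal source=*license_usage.log* type=RolloverSummary
    | eval used_GB=round(b/1024/1024/1024, 2), quota_GB=round(poolsz/1024/1024/1024, 2)
    | eval violation=if(used_GB > quota_GB, 1, 0)
    | timechart span=1d sum(violation) AS violations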

Splunk CLI remote search: parse _raw into fields

I am using a locally installed Splunk instance to perform a remote search using the CLI:

    splunk search 'index=sandbox sourcetype=access http_status_code<400 earliest="10/01/2017:00:00:00" latest="10/02/2017:00:00:00"' -output csv -maxout 0 -max_time 0 -auth user:password -app remote_app -uri https://hostname:port > output.csv

"access" is a sourcetype defined on the remote Splunk Enterprise server. When I get the results, how can I parse the _raw field into the individual fields that have field extractions defined on the remote Splunk server?
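One hedged approach: those extractions are applied at search time on the remote server (in the app context passed via -app), so asking the search itself to return the extracted fields sidesteps any local parsing of _raw. A sketch:

    splunk search 'index=sandbox sourcetype=access http_status_code<400 earliest="10/01/2017:00:00:00" latest="10/02/2017:00:00:00" | fields *' -output csv -maxout 0 -max_time 0 -auth user:password -app remote_app -uri https://hostname:port > output.csv

With -output csv, the extracted fields then appear as columns instead of being buried in _raw; swapping `fields *` for an explicit field list keeps the CSV narrow.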

Index retains old warm buckets

One of my indexes has a couple of old buckets in warm that were closed for writing in 2014; the next oldest is from 2017. When I use dbinspect to determine data age per index, they skew the accuracy of my report. Each one is less than 0.5 MB on disk. How can I get rid of them? Can I manually roll them to cold, then frozen?
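For locating the stray buckets first, a sketch along these lines may help (the index name and age threshold are placeholders; state, startEpoch, endEpoch, sizeOnDiskMB, and path are standard dbinspect output fields):

    | dbinspect index=myindex
    | eval age_days=round((now() - endEpoch) / 86400, 0)
    | where state="warm" AND age_days > 365
    | table bucketId, state, startEpoch, endEpoch, sizeOnDiskMB, path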

Chart Display value

Hi All, I've found that when a dashboard chart has too many columns, it cannot display all the x-axis values. Can we make the chart larger so they display?
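If the dashboard is Simple XML, a couple of documented chart options may help; a sketch (the height value and rotation angle are arbitrary choices):

    <chart>
      <search><query>... your search ...</query></search>
      <option name="height">600</option>
      <!-- rotate x-axis labels so more of them fit -->
      <option name="charting.axisLabelsX.majorLabelStyle.rotation">-45</option>
      <!-- show every label instead of dropping some when space is tight -->
      <option name="charting.axisLabelsX.majorLabelStyle.overflowMode">ellipsisNone</option>
    </chart>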

VM templating of Splunk instances

We plan to create Splunk pre-installed virtual machine (VM) templates for internal use. We have assumed the following steps should be taken with Splunk VM templates:

- Use hostname or FQDN in conf files instead of IP addresses directly.
- Separate credentials and apply them after VM instantiation: default auth method, SSL/TLS certs, keys for clustering (e.g. pass4SymmKey).
- Allocate enough (dedicated) resources for VMs: https://www.splunk.com/web_assets/pdfs/secure/Splunk_and_VMware_VMs_Tech_Brief.pdf

Does anyone have any other ideas or points to note for VM templates? From a DevOps point of view, templates should be minimal (pure OS); Splunk and other applications should be installed and configured via provisioning tools (Ansible, Chef, Puppet, SaltStack, ...) after the VMs launch.

Running one of two searches based on time picker selection

I am trying to create a dashboard panel which will run one of the following email searches. There are a number of inputs which allow a user to filter exactly what he/she wants to search for:

- One input allows a user to select the search criteria (sender, recipient, source IP, message id, etc.).
- Another input allows the user to enter the data being searched for.
- The last input is a time picker.

Each input is a separate token. So, if one wants to search for sender=john.doe@xyz.org, for example, those values (sender, john.doe@xyz.org) would each be passed to the search with tokens. If the time selected from the time picker is within the last 24h, a search based on raw events (including index=, eventtype=, stats, etc.) should be run. If the time selected is historic (i.e. more than 24h ago), I want to run a search based on a summary index (index=summary report=x). I have been working to figure this out, but each attempt has been unsuccessful. Assistance with this would be greatly appreciated. Thank you.
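A sketch of one possible Simple XML approach, assuming a Splunk version whose change handlers support eval tokens: convert the picker's earliest value to epoch, compare it against now minus 24h, and drop the appropriate base search into a token the panel query consumes. The token names (tp, base_search, criteria, value) and the raw-event search are invented placeholders; index=summary report=x is from the question.

    <input type="time" token="tp" searchWhenChanged="true">
      <label>Time range</label>
      <default><earliest>-24h@h</earliest><latest>now</latest></default>
      <change>
        <!-- tonumber() is null for relative specifiers like "-4h@h",
             so fall back to relative_time() to get an epoch either way -->
        <eval token="base_search">if(
            if(isnull(tonumber('tp.earliest')),
               relative_time(now(), 'tp.earliest'),
               tonumber('tp.earliest')) >= relative_time(now(), "-24h"),
            "index=mail eventtype=email_events",
            "index=summary report=x")</eval>
      </change>
    </input>
    ...
    <query>search $base_search$ $criteria$="$value$" | stats count by $criteria$</query>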

Can we use the same stanza names (e.g. "[setnull]", "[setparsing]") to define event-data filtering criteria in two different apps (with different filtering criteria) residing on the same server?

I have two clustered environments, each consisting of 3 SHs, 3 indexers, and 1 HWF, running Splunk 6.4.1. I need to filter out certain unwanted events coming from JMS queues and send them to the nullQueue. We have two applications running on the same servers, with different criteria for filtering event data. I added the code below on the HWF in props.conf for one of the apps:

    [my_sourcetype]
    TRANSFORMS-set = setnull, setparsing

and this in transforms.conf:

    [setnull]
    REGEX = .
    DEST_KEY = queue
    FORMAT = nullQueue

    [setparsing]
    REGEX = (?<=mbody=.{51}TQ-123|mbody=.{51}TQ-145)
    DEST_KEY = queue
    FORMAT = indexQueue

Can I use the same stanza names, i.e. "setnull" and "setparsing", in the transforms.conf of the other app and define a new regex there? Will doing so have any impact on filtering?
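Worth noting as a hedge: Splunk merges .conf files across apps, and stanzas with the same name are combined by configuration precedence rather than kept app-local, so a second [setnull] with a different regex would override or be overridden instead of coexisting. Uniquely prefixed stanza names per app avoid the collision; a sketch with invented names:

    # appB/local/transforms.conf -- hypothetical second app
    [appB_setnull]
    REGEX = .
    DEST_KEY = queue
    FORMAT = nullQueue

    [appB_setparsing]
    REGEX = (?<=mbody=.{51}TQ-987)
    DEST_KEY = queue
    FORMAT = indexQueue

    # appB/local/props.conf
    [my_other_sourcetype]
    TRANSFORMS-set = appB_setnull, appB_setparsing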

How to get data to a Splunk indexer without a forwarder, for continuous monitoring?

Basically, I need to monitor Dell iDRAC and CMC logs.
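One forwarder-free option, offered as a hedged sketch: iDRAC and CMC can both send syslog, and a Splunk indexer can listen for it directly via a network input in inputs.conf. The port, sourcetype, and index below are assumptions:

    # inputs.conf on the indexer -- port, sourcetype, and index are assumptions
    [udp://514]
    sourcetype = dell:idrac:syslog
    index = infra
    connection_host = ip

For production, a syslog server writing files that Splunk monitors is generally preferred over a direct network input, but the stanza above is the minimal path.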

Splunk Add-on for ServiceNow

Hi All, I want to download the Splunk add-on for the ServiceNow Event Management integration. As per the documentation (store.servicenow.com/sn_appstore_store.do#!/store/application/bac6db564f6a3100a0fc7d2ca310c721/1.1.4?referer=sn_appstore_store.do%23!%2Fstore%2Fsearch%3Fq%3Dsplunk), I should get the add-on from splunkbase.splunk.com/app/1928/. That link is not functional: the "Agree to download" button is not working. Could someone please help me with the correct link to download the add-on? Thanks.

Nessus scan vulnerability duration

I am trying to find all vulnerabilities present in Nessus scans that were first reported more than 15 days ago and are still present. My current search query works, but I can't help feeling that it is inefficient. Here it is:

    sourcetype="nessus:scan"
    | fields severity, plugin_name, _time, host-ip
    | stats earliest(_time) as firsttime, latest(_time) as lasttime by plugin_name, severity, host-ip
    | eval now=now()
    | eval Days=((lasttime-firsttime)/86400), test=((now-lasttime)/86400), First_Sighting=strftime(firsttime,"%Y/%m/%d %I:%M:%S"), Last_Sighting=strftime(lasttime,"%Y/%m/%d %I:%M:%S")
    | where test<15
    | eval 'Real-Days'=round(Days, 0)
    | table plugin_name, severity, host-ip, First_Sighting, Last_Sighting, Real-Days
    | sort -Last_Sighting

Thanks
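A possible tightening, offered as a sketch under the same field assumptions: filter on the epoch values before formatting anything, compute only what the table needs, and use an underscore in Real_Days to avoid eval's quoting rules for hyphenated names. It also adds the "first reported more than 15 days ago" half of the stated requirement, which the original where clause did not enforce:

    sourcetype="nessus:scan"
    | fields severity, plugin_name, _time, host-ip
    | stats earliest(_time) as firsttime, latest(_time) as lasttime by plugin_name, severity, host-ip
    | where lasttime >= relative_time(now(), "-15d") AND firsttime <= relative_time(now(), "-15d")
    | eval First_Sighting=strftime(firsttime, "%Y/%m/%d %I:%M:%S"),
           Last_Sighting=strftime(lasttime, "%Y/%m/%d %I:%M:%S"),
           Real_Days=round((lasttime - firsttime) / 86400, 0)
    | table plugin_name, severity, host-ip, First_Sighting, Last_Sighting, Real_Days
    | sort - Last_Sighting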

Nessus exploitable vulnerabilities

Here, I am trying to find all vulnerabilities found during a Nessus scan that are exploitable. The exploit_available field appears only in the nessus:plugin sourcetype, and I would like to correlate the exploitable vulnerabilities with the hosts in my network, which are shown only in the nessus:scan sourcetype. My search query works, but once again it takes a while to run. Any help fine-tuning this is welcome.

    sourcetype="nessus:scan" plugin_id="*"
        [ search sourcetype="nessus:plugin" exploit_available="true" id="*"
          | table plugin_name ]
    | dedup plugin_id
    | table plugin_name, plugin_id, severity

Thanks
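A hedged refinement, assuming the plugin sourcetype's id corresponds to the scan sourcetype's plugin_id: have the subsearch emit plugin_id terms directly, so the outer search filters on that key instead of matching plugin names, and drop the plugin_id="*" wildcard:

    sourcetype="nessus:scan"
        [ search sourcetype="nessus:plugin" exploit_available="true"
          | dedup id
          | rename id AS plugin_id
          | fields plugin_id ]
    | dedup plugin_id
    | table plugin_name, plugin_id, severity

The subsearch expands to an OR of plugin_id="..." terms, which the outer search can apply up front.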

How to configure Splunk to extract key/value pairs within JSON log data from the HTTP Event Collector?

We have started using the HTTP Event Collector (HEC) for logging directly from our Java apps. HEC takes data in JSON format, but we have a lot of legacy code that logs key/value pairs, and some searches/dashboards that rely on those. Data logged to HEC is by default indexed as the _json sourcetype; I have tried configuring this with KV_MODE=auto (for key/value) and KV_MODE=json (for JSON format), but neither seems to trigger Splunk to extract the key/values. Example log statement:

    logger.info("corrId=11-1111-566 aa=88");

However, I have not been able to search on the keys, e.g. search aa=88. Raw format of the event:

    {"severity":"INFO","logger":"splunk.logger","thread":"main","message":"corrId=11-1111-566 aa=88"}

Any ideas?
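Part of the catch, offered as a hedged guess: the key/value pairs live inside the JSON message field, so neither auto nor json KV_MODE sees them as top-level pairs (and a sourcetype honors only one KV_MODE). At search time you can lift message out and run the extract command over it; a sketch (the index name is a placeholder, and overwriting _raw only affects this search's results):

    index=myapp sourcetype=_json
    | spath output=msg path=message
    | eval _raw=msg
    | extract pairdelim=" ", kvdelim="="
    | search aa=88

If that confirms the data supports it, a permanent search-time extraction on the sourcetype is the cleaner follow-up.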

Unable to run sendemail query

The sendemail command is not working in scheduled searches. Query used:

    | inputlookup testing.csv
    | map search="| sendemail to=$email$ message=\"Hi $realname$, This is a test message. Many Thanks, Splunk@abc.com\" subject=\"Action Required: Testing\"" maxsearches=50

Note: the lookup file contains fields like emailid, realname, etc. I get the following error messages when I check the job inspector:

    error: command="sendemail", {} while sending mail to:
    warn: The search result count (20) exceeds maximum (10), using max. To override it, set maxsearches appropriately.
    warn: Unable to run query ' | sendemail to=$email$ message=" Hi "", This is a test message
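Two things stand out, offered as hedged guesses. First, map's tokens must match the lookup's column names exactly: the lookup reportedly contains emailid rather than email, so $email$ may expand to nothing, which would match the blank recipient in the error. Second, an expanded $realname$ containing spaces collides with the unquoted/escaped quoting in the message. A sketch that quotes the substituted tokens and uses the lookup's field name (assumed to be emailid):

    | inputlookup testing.csv
    | map maxsearches=50 search="| sendemail to=\"$emailid$\" subject=\"Action Required: Testing\" message=\"Hi $realname$, this is a test message. Many thanks, Splunk@abc.com\""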

Indexing reports as an Excel file: tag and event type field values show as "error"

I am indexing reports as an Excel file, but after indexing I am getting a field value of "error" for tag, and the event type is also "error". Can somebody please help me, as the TA is not working and we are manually ingesting the Excel file?

Splunk showing gateway timeout

We're running Splunk in our environment. We can only access the Splunk instance via its IP address, not the DNS name we have mapped to it. For instance, we can go to this URL using the IP: http://10.40.50.17:8000/en-US/app/launcher/home and Splunk works fine. However, if I go to http://splunk.mycompany.com:8000/en-US/app/launcher/home I get a gateway timeout error:

    Gateway Timeout
    Server error - server 10.40.50.17 is unreachable at this moment.
    Please retry the request or contact your adminstrator.

I'm wondering where the issue may lie. Is this a problem with Splunk? Do I need to make the Splunk administration aware of the DNS name that I'm giving it?

Change Notifications from AWS Config Service

Hi, after a great .conf 2017, I decided to install the Splunk App for AWS and the associated AWS TA, and I am having issues getting Change Notifications into Splunk. I think they are supported, at least according to this page: https://docs.splunk.com/Documentation/AddOns/released/AWS/Config but the handler.py for the ConfigNoticeParser has the following:

    _UNSUPPORTED_MESSAGE_TYPE = [
        'ConfigurationItemChangeNotification',
        'ConfigurationSnapshotDeliveryStarted',
        'ComplianceChangeNotification',
        'ConfigRulesEvaluationStarted',
    ]

In addition, the overview dashboard shows 0 configuration changes. I am reasonably certain that I am outputting change notifications for the Config service, that the configuration in AWS is for the right region, and that it includes all supported resources and global resources. Am I doing something wrong? PS: the Splunk AWS app in general is great :)

Doing stats on multivalued JSON fields

Hi Ninjas, I'm dealing with some deeply nested JSON events like:

    "sendTime":"2017-09-21T17:02:06.583+02:00","runningProcess":[
      {"Name":"_Total","PercentProcessorTime":"100","WorkingSetPrivate":"1557368"},
      {"Name":"Bananaservice","PercentProcessorTime":"0","WorkingSetPrivate":"593"},
      {"Name":"Cherryservice","PercentProcessorTime":"0","WorkingSetPrivate":"7671"},
      {"Name":"Pineappleservice","PercentProcessorTime":"0","WorkingSetPrivate":"466"},
      {"Name":"Kiwiservice","PercentProcessorTime":"0","WorkingSetPrivate":"442"},
      {"Name":"Appleservice","PercentProcessorTime":"0","WorkingSetPrivate":"630"},
      {"Name":"Peachservice","PercentProcessorTime":"0","WorkingSetPrivate":"1470"}

All I want to do is get the average values over time by each process, something like:

    | stats avg(runningProcess{}.PercentProcessorTime) as CPU by runningProcess{}.Name, _time
    | stats list(*) as * by _time

But without mvexpand and so on I'm not getting the right data, as this just takes the value of the first entry of the multivalue field for each event. As said, I'm aware of doing it with mvexpand etc., but that slows down the search dramatically, and I was wondering whether there is a more elegant solution to get the right data here. Thanks
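One common middle ground, sketched here under the assumption that the two multivalue fields are already extracted (index and sourcetype are placeholders): strip everything else first, zip Name and PercentProcessorTime together, and expand only that compact pair. It still uses mvexpand, but on a record reduced to two small fields, which keeps the memory and runtime cost far below expanding whole events:

    index=perf sourcetype=myjson
    | fields _time, runningProcess{}.Name, runningProcess{}.PercentProcessorTime
    | eval pair=mvzip('runningProcess{}.Name', 'runningProcess{}.PercentProcessorTime', "|")
    | fields _time, pair
    | mvexpand pair
    | rex field=pair "^(?<proc>[^|]+)\|(?<cpu>.*)$"
    | timechart span=5m avg(cpu) AS CPU by proc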

Incomplete JSON ingested.

Hi, I am using the REST API Modular Input add-on to monitor an Elasticsearch instance via the stats API endpoint. The output is in JSON format and averages about 1200 lines. I am using heavy forwarders and I have the following settings.

inputs.conf:

    [rest://elastic-stats]
    source = elastic-stats
    auth_type = none
    endpoint = http://localhost:9200/_nodes/stats
    http_method = GET
    index = main
    index_error_response_codes = 0
    polling_interval = 60
    request_timeout = 55
    response_type = json
    sequential_mode = 0
    sourcetype = elastic-stats
    streaming_request = 0

props.conf:

    [elastic-stats]
    SHOULD_LINEMERGE = false
    LINE_BREAKER = {"cluster_name":
    category = json_no_timestamp
    pulldown_type = 1
    disabled = false

The Elasticsearch API endpoint's JSON output contains 26793 characters, 2230 words, and 1229 lines (27.36 KB), but only the first 10000 characters get indexed. Please assist, and thank you in advance.
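The 10000-character cutoff matches props.conf's TRUNCATE setting, which defaults to 10000 bytes per event. A hedged sketch of the fix, added to the same sourcetype stanza on the heavy forwarder doing the parsing (the cap value is an arbitrary choice):

    [elastic-stats]
    SHOULD_LINEMERGE = false
    LINE_BREAKER = {"cluster_name":
    # Raise the per-event byte cap; 0 disables truncation entirely, but a
    # fixed cap such as 200000 is safer than unlimited.
    TRUNCATE = 200000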