I'm seeing this error in the splunkd.log. Anyone else having this issue?
MongoModificationsTracker - Could not load configuration for collection 'acknotescoll' in application 'TA-FireEye_v3'. Collection will be ignored.
↧
FireEye TA - Could not load configuration for collection 'acknotescoll'
↧
How do I stop duplicate entries from PM2 logs from being indexed in Splunk?
We use Splunk to forward log files written out by PM2 (a Node.js process manager) to our Splunk indexers. PM2 has its own log rotation feature and creates backup log files when its configured limits are reached. These backup files sit in the same folder, and we are indexing *.log. We DO want the rotated files to be evaluated, because there may be a time when the forwarders are down, and we don't want to miss anything that was logged during that window.
Example:
prog.log
prog_2018-01-03.log
prog_2018-01-02.log
prog_2018-01-01.log
In the above scenario, how do we keep events that have already been indexed from prog.log from being indexed again when the file is rotated to prog_date.log? Keep in mind that we do want to ensure we don't miss any entries during outages, so we want to continue processing the dated logs as a backup.
We just upgraded to the Splunk universal forwarder 7.0.4, since we were under the impression it would help with this, but we are still seeing the same results.
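For reference, this is roughly the monitor stanza we have (the path, index, and sourcetype are placeholders). My understanding is that the forwarder recognizes a renamed rotated file by the CRC of its first bytes, so the key points would be to leave crcSalt unset (crcSalt = <SOURCE> would mix the filename into the checksum and cause the renamed file to be re-read) and, if our logs start with a long identical banner, to raise initCrcLength:
[monitor:///var/log/pm2/*.log]
index = main
sourcetype = pm2
# Raise only if the first 256 bytes of every log file are identical,
# so the CRC can still tell rotated copies apart from genuinely new files.
initCrcLength = 1024
# Deliberately NOT setting crcSalt = <SOURCE> here, since that would
# re-index the file as soon as PM2 renames it.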
↧
Alerting on multiple emails from grouping IPs
I'm running into an issue where I am receiving a flood of emails for an alert.
The alert works as expected when I alert on values greater than one; however, raising the value breaks the alert.
Search:
`sourcetype="backend" | regex "User with email .* used an invalid password." | rex "User with email (?<email>.*) used an invalid password." | rex "client_ip=(?<client_ip>\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b)" | transaction client_ip maxspan=1000s | search eventcount > 2 | stats values(email), count(eval(email)) as EmailCount by client_ip | where EmailCount > 1`
Example email output (attached csv):
`>>>> client_ip | values(emails) | EmailCount`
` 123.123.123.123 | a@gmail.com b@gmail.com | 2`
When I raise `EmailCount > 2`, I get about 50 emails in the span of 5 minutes with csvs that look like this:
`_time | _raw | client_ip | values(email) | EmailCount | email | index | ...`
` ... | ... | x.x.x.x | [BLANK] | [BLANK] | a@b.com | ... | ...`
I get more information (like `_raw` and `_time`), but `values(email)` and `EmailCount` are left blank. A new `email` field appears, and it only contains one email. Running the same search as a plain real-time search, no stats are reported at all.
Why is this happening? Why does raising the value by one break the alerting, when keeping the value at 1 produces the expected result?
Notes:
- I am using 1000s temporarily. This will decrease to less than 5s after debugging this search
- I tried using `dc()` instead of `count()`, but that also didn't work
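For reference, this is the stats-only form of the same logic that I plan to fall back to if transaction turns out to be the culprit (same assumed field names as above):
sourcetype="backend" "used an invalid password"
| rex "User with email (?<email>\S+) used an invalid password\."
| rex "client_ip=(?<client_ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"
| stats values(email) as emails, dc(email) as EmailCount by client_ip
| where EmailCount > 1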
↧
How to manipulate stats or chart results mathematically?
Hey everyone,
I've got a search
search = *
| eval _time=_time - (6*60*60)
| bucket _time span=1d
# Takes the current time and rolls it back six hours. We operate on a 6am-6am reporting schedule.
| eval MaterialType = case(match(lotNumber,"regex") OR lotNumber = "WasteLots","Waste",match(field1,"regex"),"Production")
# Designates each event as a waste event (using the Lot #) or a production event (using the value in field1)
| where isnotnull(MaterialType)
| eval time = strftime(_time,"%m/%d/%y")
| chart sum(netWeightQty) by time, MaterialType
| eval _time=_time + (6*60*60)
Now this | chart generates the following:
![Big money big money][1]
[1]: /storage/temp/252239-capture07182018092200.png
__How can I get a value, for each date, of Waste% = 100 * Waste / (Production + Waste)?__
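In other words, something like the following tacked onto the end of the search above is what I'm after, assuming the chart really does produce columns literally named Waste and Production (WastePct is just my placeholder name for the result):
| fillnull value=0 Waste Production
| eval WastePct = round(100 * Waste / (Production + Waste), 1)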
Thanks!
↧
What is a query to get a comparison between last week's results and this week's results?
Hi, I have a query as follows
index="summary" search_name="ABC" | dedup hostname | table hostname
Now I want to see the hostnames which are in last week's results but not in this week's results, and vice versa.
What earliest and latest times should be specified in the subsearch and the main search? What would the query look like to get that result?
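One direction I've been considering, which avoids a subsearch entirely, is to search both weeks at once and flag which week each hostname appears in (the time modifiers are assumptions; adjust the snap-to as needed):
index="summary" search_name="ABC" earliest=-1w@w latest=now
| eval week=if(_time < relative_time(now(), "@w"), "last_week", "this_week")
| stats values(week) as weeks, dc(week) as week_count by hostname
| where week_count = 1
| table hostname weeks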
↧
Converting a chart back to its original format
I have the following SPL:
some search | table _time, col1, col2 | timechart span=2m useother=f values(col2) as col2 by col1 | fillnull value=0
This creates a separate column for each value of col1. I now want to convert this data back into the original format, i.e. a table of the form |_time|col1|col2|. I'm basically using the timechart command to fill in null values for every timestamp that had no value associated with it in the original data.
Is there any way I can do this? I guess a better way would be to not use a timechart at all, but I'm not sure how. Not using a timechart would give me the advantage of working with all the values of col1 (rather than only 10, as in the case of timechart).
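For clarity, this is the round trip I have in mind; my understanding is that untable flips the wide timechart output back into rows, and limit=0 keeps timechart from capping the number of col1 series at 10:
some search | table _time, col1, col2
| timechart span=2m useother=f limit=0 values(col2) as col2 by col1
| fillnull value=0
| untable _time col1 col2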
↧
How to get hourly stats into a graph
I have some fields in my Splunk search, and I want to use them in a search query so that I can pull this information into a graph. In Splunk I want to show, per hour (the hour field), how many d_in, d_to, d_up, err, and p_to there are. Below are the fields which I have:
d_in = 4027
d_to = 336210
d_up = 332183
hour = 12
err = 0
p_to = 264749
d_in = 427
d_to = 3210
d_up = 2183
hour = 13
err = 2
p_to = 249
I am new to Splunk, please help me with this. I am using the query below to get the above fields:
eventtype="abc"
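Something along these lines is what I'm picturing (assuming each event carries one set of the fields above); the result could then be rendered as a column or line chart with hour on the x-axis:
eventtype="abc"
| stats sum(d_in) as d_in, sum(d_to) as d_to, sum(d_up) as d_up, sum(err) as err, sum(p_to) as p_to by hour
| sort hour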
↧
Splunk ingestion from queues
I am using Splunk heavy forwarder to read data from the MQ/Solace queues. For this I am using app "Splunk JMS modular input".
But when the data read from the queues is indexed, the newlines in the message are converted to a series of white spaces, i.e. a multi-line event is simply collapsed into a single-line event. May I know how to handle this scenario without parsing the events further?
Ex:
**Actual event:**
how are you
hi
good to know
**Indexed event:**
how are you hi good to know
↧
Splunk query to merge 2 timecharts as overlay
Hello,
I have 2 timecharts that work independently. Can you help me merge both into one query (as an overlay)? The modified query should show the timecharts based on 2 different source types and different criteria.
Query 1 : index=index1 sourcetype="sourcetype1" "SearchString1"|timechart count span=1h
Query 2 : index=index1 sourcetype=sourcetype2 "SearchString2"=* | timechart count by "SearchString2"
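Roughly the combined shape I'm after (the sourcetype and search-string names are kept as placeholders from the two queries above):
index=index1 ((sourcetype="sourcetype1" "SearchString1") OR (sourcetype="sourcetype2" "SearchString2"=*))
| eval series=if(sourcetype=="sourcetype1", "Query1", 'SearchString2')
| timechart span=1h count by series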
↧
Using the bin command, can you put all values above a cutoff into the last bin?
I'm using the bin command to get a distribution of values, and each grouping is in increments of 10,000.
I have a few outliers way at the upper end, and I don't want to see all the empty buckets in the middle. Is it possible to put all values above a designated cutoff value into the final bin?
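The workaround I'm considering is to clamp everything above the cutoff before binning, so all the outliers land in the top bucket (the field name and the 100000 cutoff are just placeholders):
some search
| eval value=if(value > 100000, 100000, value)
| bin value span=10000
| stats count by value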
Thanks,
Jonathan
↧
Splunk saves logs. Does it fulfill STIG requirements?
When Splunk saves logs as raw data, does it fulfill the STIG requirements?
Full requirements for logging:
1. Logs must be tamper-evident
2. Log functionality must support logging of sensitive data (i.e. encrypted, and viewable/decrypted only by authorized users)
3. The system shall support “centralized” log functionality
4. The system must support authorization for viewing/configuring logs
↧
How to calculate the difference between two rows with multiple fields?
I have a search that returns two rows of records (check the result of the following query):
| makeresults
| eval date="2018-07-16", col1=4, col2=5, col3=6, col4=7
| append [| makeresults
| eval date="2018-07-17", col1=8, col2=9, col3=16, col4=17]
| fields - _time
| table date col1 col2 col3 col4
Is there a way to get the difference between the two dates across all the columns? Here is the expected result:
| makeresults
| eval date="2018-07-16", col1=4, col2=5, col3=6, col4=7
| append [| makeresults
| eval date="2018-07-17", col1=8, col2=9, col3=16, col4=17]
| append [| makeresults
| eval date="diff", col1=4, col2=4, col3=10, col4=10]
| fields - _time
| table date col1 col2 col3 col4
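The approach I've been toying with is to transpose, subtract, and transpose back (the two dates are hard-coded here purely for the example):
| makeresults
| eval date="2018-07-16", col1=4, col2=5, col3=6, col4=7
| append [| makeresults
| eval date="2018-07-17", col1=8, col2=9, col3=16, col4=17]
| fields - _time
| table date col1 col2 col3 col4
| transpose 0 header_field=date column_name=field
| eval diff = '2018-07-17' - '2018-07-16'
| transpose 0 header_field=field column_name=date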
Thanks
↧
Two y Axis
I have a very simple index which tracks a competitive game. Over the course of the year, people earn points and the top 16 positions switch places. In a CSV I track their Name, rank, and score at regular intervals.
Below is a screenshot of my data in a simple table and an example of the kind of graph I wish to build.
How would I go about making this?
Name on the right, Rank on the left, time along the bottom.
![alt text][1]
[1]: /storage/temp/252248-x.png
↧
How can I add and parse XML data into Splunk?
I am getting XML in the below format: Georgia Georgia Atlanta Atlanta
I found a solution at [https://answers.splunk.com/answers/187195/how-to-add-and-parse-xml-data-in-splunk.html][1], but it does not help with field extractions for the above XML format during data ingestion.
[1]: https://answers.splunk.com/answers/187195/how-to-add-and-parse-xml-data-in-splunk.html
↧
Linux servers - Universal Forwarder or Syslog
What is the best practice to capture data from our *nix servers? Install the Splunk forwarder agent and the Splunk for Unix app which feeds directly to our indexers? I thought a best practice was to have everything syslog-ed before being indexed.
If we only use remote syslog on our servers (not having the Splunk forwarder agent on our servers), I'm assuming we won't get the metrics that the Splunk for Unix app polls for.
↧
Insert Python script in the dashboard to preprocess data?
I've created a dashboard to visualize a business software log analysis. Before adding the Flume agent I was processing the data in Excel and then through a Python program, because there are a lot of tedious field extractions to do, such as extracting about ten fields from one column that is JSON-formatted. Then I would manually upload the processed `.csv` file and search it in Splunk.
But after adding the Flume agent, the _raw data is really messy and does not respond to my rex search. Also, if I were to use rex, the search string could be about 100 lines for each dashboard panel. So I was wondering if there is a way to insert my Python program into the dashboard source to preprocess the data?
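For context, the kind of extraction I currently do in Python is roughly this in SPL, pulling fields out of the JSON-formatted column (the column name json_col and the field names are made up):
some search
| spath input=json_col
| table _time field1 field2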
My props.conf:
[log_session]
INDEXED_EXTRACTIONS = csv
FIELD_DELIMITER = ,

[source::/datalake/log/******]
sourcetype = my_named_source
↧
VALUE FORMAT
Hi,
I have a value like 2018067155420 in a field and I want to format it as yyyymmddhhmmss.
Could you help me please?
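In case it helps to show the direction I've been trying, and this is only a guess on my part: the 13 digits would fit a day-of-year layout (%Y%j%H%M%S), which could then be reformatted as yyyymmddhhmmss (raw_value is a placeholder for the field name):
some search
| eval ts = strptime(raw_value, "%Y%j%H%M%S")
| eval formatted = strftime(ts, "%Y%m%d%H%M%S")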
↧
Horizon Chart - How can I change the axis based on user's timezone?
Recently I observed that the axis in the Horizon Chart does not follow the timezone set by the user. Please refer to the images below:
- The user with timezone setting as GMT+0800
![alt text][1]
- The user with timezone setting as GMT
![alt text][2]
Is there any method by which I can apply the timezone configured by the user in Splunk to the axis?
[1]: /storage/temp/251226-gmt0800-event-result.png
[2]: /storage/temp/251227-gmt-event-result.png
↧
Can I run a curl command on the search head?
Can I run a curl command on the search head to access the REST API logs?
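For example, something like the following run locally on the search head is what I have in mind (the credentials, port, and endpoint are placeholders for whichever REST endpoint exposes the logs I need):
curl -k -u admin:changeme https://localhost:8089/services/server/info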
↧
Calculating utilization of nodes connected in Tree network topology
I have been busting my brain on this for a few weeks with no clear solution, turning to the brainiacs in the Splunk community for help now.
I have bandwidth utilization data for a few thousand nodes of telecom stations. They are all connected in a tree network topology. For a clearer understanding of what a tree topology is, the image below is an example. A subnetwork begins when node A, which is connected to a backhaul, branches off links to child nodes. At any point in the branches, other sub-branches may branch off as well.
![alt text][1]
We are trying to build a dashboard of utilization data for the links between nodes. The challenge is that my data is hourly utilization for individual nodes, but the utilization of each link needs to accumulate the utilization of all nodes under it. Based on this example, the utilization for link A-C is the total utilization of nodes C+F+H+G.
I have a lookup that provides the details for each node in this form:
![alt text][2]
My objective is to build
1) a scheduled search that is able to iterate over multiple trees to calculate the utilization of each link (around 7k+ links total). The utilization of a link is the total utilization of all its subnodes.
2) an interactive dashboard where the user can input the name of a link (e.g. A-B or F-H) and it will calculate the current utilization of that link.
Can anyone help? Your wildest ideas are accepted.
[1]: /storage/temp/252255-tree-topology.png
[2]: /storage/temp/252256-lookup-table.png
↧