One of our third-party apps has some pretty unfriendly logging. The app itself carries out somewhere between 20 and 30 jobs, each of which has its own log. The issue we have is that all logs are written to one directory, and the log files themselves are named like this:
20200213.445933.log
The only way to distinguish between job log files is by a header within each log that includes a description. A further issue is that every line in the file is prefixed with a date and time. This results in Splunk splitting every line into a separate event, even when the true event may be several lines long. For example:
[2020-02-13 15:00:34] #########################################################
[2020-02-13 15:00:34] # Log File Path: /data/logs/jobs/20200213.445933.log
[2020-02-13 15:00:34] # Creation Date: Thu Feb 13 15:00:34 GMT 2020
[2020-02-13 15:00:34] # Description: DQ:Import DQ CAR Files
[2020-02-13 15:00:34] # Parameters: --terminatetime 175000 -mapping 52000 -daemon yes -rb true
[2020-02-13 15:00:34] #########################################################
[2020-02-13 15:00:34] 'INIT' actions:
[2020-02-13 15:00:34] Collect Files
[2020-02-13 15:00:34] Collect Files Action
[2020-02-13 15:00:34] Connected: ftp://***********************
[2020-02-13 15:00:34] Filter: ^BT.*\.CAR
[2020-02-13 15:00:35] Files found: 0
[2020-02-13 15:00:35] Retrieving batches for mapping : DQ CAR Records
[2020-02-13 15:00:35] Found no Batch files to import
[2020-02-13 15:00:35] No 'CLSE' actions
[2020-02-13 15:01:35] 'INIT' actions:
[2020-02-13 15:01:35] Collect Files
[2020-02-13 15:01:35] Collect Files Action
[2020-02-13 15:01:35] Connected: ftp://***********************
[2020-02-13 15:01:35] Filter: ^BT.*\.CAR
[2020-02-13 15:01:35] Files found: 0
[2020-02-13 15:01:35] Retrieving batches for mapping : DQ CAR Records
[2020-02-13 15:01:35] Found no Batch files to import
[2020-02-13 15:01:35] No 'CLSE' actions
[2020-02-13 15:02:45] 'INIT' actions:
[2020-02-13 15:02:45] Collect Files
[2020-02-13 15:02:46] Collect Files Action
[2020-02-13 15:02:46] Connected: ftp://***********************
[2020-02-13 15:02:46] Filter: ^BT.*\.CAR
[2020-02-13 15:02:46] Files found: 0
[2020-02-13 15:02:46] Retrieving batches for mapping : DQ CAR Records
[2020-02-13 15:02:46] Found no Batch files to import
[2020-02-13 15:02:46] No 'CLSE' actions
[2020-02-13 15:03:47] 'INIT' actions:
[2020-02-13 15:03:47] Collect Files
[2020-02-13 15:03:47] Collect Files Action
[2020-02-13 15:03:47] Connected: ftp://***********************
[2020-02-13 15:03:47] Filter: ^BT.*\.CAR
[2020-02-13 15:03:47] Files found: 0
[2020-02-13 15:03:47] Retrieving batches for mapping : DQ CAR Records
[2020-02-13 15:03:47] Found no Batch files to import
[2020-02-13 15:03:47] No 'CLSE' actions
One event would actually look like this:
[2020-02-13 15:00:34] 'INIT' actions:
[2020-02-13 15:00:34] Collect Files
[2020-02-13 15:00:34] Collect Files Action
[2020-02-13 15:00:34] Connected: ftp://***********************
[2020-02-13 15:00:34] Filter: ^BT.*\.CAR
[2020-02-13 15:00:35] Files found: 0
[2020-02-13 15:00:35] Retrieving batches for mapping : DQ CAR Records
[2020-02-13 15:00:35] Found no Batch files to import
[2020-02-13 15:00:35] No 'CLSE' actions
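To make the intended grouping concrete, here is a minimal Python sketch of the splitting we are after. It is only an illustration, not Splunk configuration. It assumes that every event for this job starts with a line containing `'INIT' actions:` and that header lines always begin with `#` after the timestamp. Other jobs may use different markers. The function name `group_events` is made up for this example.

```python
import re

# Assumption: a new event starts on a line like
# "[2020-02-13 15:00:34] 'INIT' actions:"
EVENT_START = re.compile(
    r"^\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\] 'INIT' actions:"
)
# Assumption: header lines have "#" right after the timestamp prefix.
HEADER_LINE = re.compile(r"^\[[^\]]+\] #")


def group_events(lines):
    """Group timestamp-prefixed lines into multi-line events."""
    events = []
    current = []
    for raw in lines:
        line = raw.rstrip("\n")
        if HEADER_LINE.match(line):
            continue  # skip the banner at the top of the file
        is_start = bool(EVENT_START.match(line))
        if is_start and current:
            # A new event begins, so close the previous one.
            events.append(current)
            current = []
        if is_start or current:
            # Lines seen before the first start marker are ignored.
            current.append(line)
    if current:
        events.append(current)
    return events
```

Run against the sample above, this would yield one event per `'INIT' actions:` block, each ending with its `No 'CLSE' actions` line.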
Our third-party developer has advised that this cannot be changed, so the only option is to work around it in Splunk somehow.
I was wondering if it is possible to regex out the description in each log and assign it as a sourcetype. Each sourcetype could then have its own event splitting rules. Is this possible?
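For reference, the kind of regex I have in mind for pulling out the description looks roughly like the Python sketch below. It only shows the pattern, not how Splunk would apply it. It assumes the `# Description:` line always appears within the first few lines of the file. The helper name `job_description` and the `max_header_lines` limit are hypothetical.

```python
import re
from itertools import islice

# Captures the text after "# Description:" on a timestamp-prefixed header line.
DESCRIPTION = re.compile(r"^\[[^\]]+\] # Description:\s*(.+?)\s*$")


def job_description(lines, max_header_lines=10):
    """Return the job description from the log header, or None if absent."""
    # Assumption: the header sits within the first max_header_lines lines.
    for line in islice(lines, max_header_lines):
        match = DESCRIPTION.match(line.rstrip("\n"))
        if match:
            return match.group(1)
    return None
```

For the sample header above, the captured value would be `DQ:Import DQ CAR Files`, which is the string I would like to map to a sourcetype.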