We are forwarding a directory containing hundreds of batch job execution logs. However, Splunk indexes each log by splitting it into multiple events (3, 4, sometimes as many as 10). As a result, the event count, and with it the indexed data volume, is multiplying several times over. The logs vary in nature and size, but the header and footer details follow a similar format. Below is a snapshot of a sample log file and of how Splunk splits and indexes it:
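For context, we monitor the directory with a stanza roughly like the following in inputs.conf on the universal forwarder (the path, index, and sourcetype name here are illustrative, not our actual values):

```
[monitor:///opt/app/batch/logs]
sourcetype = batch_job_log
index = batch
```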
Actual Log File:
===============================================================
= JOB : ABCD[(0900 03/23/16),(0AAAAAAAAAAARCVF)].tttttt
= USER : deb Sponsor svvbnmn,SHELL=/bin/ksh
= JCLFILE : $HOME/jobs/xyz.sh
= Job Number: 20
= Thu 03/24/16 00:43:18 EDT
===============================================================
ABC for UNIX/ghcgv 11.2
HGF Starting /opt/app/hghj/dxdxfd/VCX/ghcgv $HOME/jobs/xyz.sh
Tivoli Workload Scheduler (UNIX)/ghcgv 11.2 (20130417)
Installed for user "dxdxfd".
Locale LANG set to the following: "en"
stty: : No such device or address
stty: : No such device or address
stty: : No such device or address
+------------------------------------------------------------+
xyz.sh; Message; Program started at: 03/24/2016 00:43:18
Machine Job Starting...........
Waiting for job...
Finished waiting for job
Job Status : (1)
Status code = 1
Job submitted successfully
MachineJob Ending.............
xyz.sh; Message; Program ended successfully at: 03/24/2016 00:44:54
===============================================================
= Exit Status : 0
= System Time (Seconds) : 0 Elapsed Time (Minutes) : 1
= User Time (Seconds) : 0
= Thu 03/24/16 00:44:54 EDT
===============================================================
How Splunk indexes the log file:
Event-1:
3/24/16
12:43:18.000 AM
===============================================================
= JOB : ABCD[(0900 03/23/16),(0AAAAAAAAAAARCVF)].tttttt
= USER : deb Sponsor svvbnmn,SHELL=/bin/ksh
= JCLFILE : $HOME/jobs/xyz.sh
= Job Number: 20
= Thu 03/24/16 00:43:18 EDT
===============================================================
ABC for UNIX/ghcgv 11.2
HGF Starting /opt/app/hghj/dxdxfd/VCX/ghcgv $HOME/jobs/xyz.sh
Tivoli Workload Scheduler (UNIX)/ghcgv 11.2 (20130417)
Installed for user "dxdxfd".
Locale LANG set to the following: "en"
stty: : No such device or address
stty: : No such device or address
stty: : No such device or address
+------------------------------------------------------------+
xyz.sh; Message; Program started at: 03/24/2016 00:43:18
DataStage Job Starting...........
Waiting for job...
Event-2:
3/24/16
12:43:18.000 AM
===============================================================
= JOB : ABCD[(0900 03/23/16),(0AAAAAAAAAAARCVF)].tttttt
= USER : deb Sponsor svvbnmn,SHELL=/bin/ksh
= JCLFILE : $HOME/jobs/xyz.sh
= Job Number: 20
= Thu 03/24/16 00:43:18 EDT
===============================================================
ABC for UNIX/ghcgv 11.2
HGF Starting /opt/app/hghj/dxdxfd/VCX/ghcgv $HOME/jobs/xyz.sh
Tivoli Workload Scheduler (UNIX)/ghcgv 11.2 (20130417)
Installed for user "dxdxfd".
Locale LANG set to the following: "en"
stty: : No such device or address
stty: : No such device or address
stty: : No such device or address
+------------------------------------------------------------+
xyz.sh; Message; Program started at: 03/24/2016 00:43:18
Machine Job Starting...........
Waiting for job...
Event-3:
3/24/16
12:44:54.000 AM
===============================================================
= JOB : ABCD[(0900 03/23/16),(0AAAAAAAAAAARCVF)].tttttt
= USER : deb Sponsor svvbnmn,SHELL=/bin/ksh
= JCLFILE : $HOME/jobs/xyz.sh
= Job Number: 20
= Thu 03/24/16 00:43:18 EDT
===============================================================
ABC for UNIX/ghcgv 11.2
HGF Starting /opt/app/hghj/dxdxfd/VCX/ghcgv $HOME/jobs/xyz.sh
Tivoli Workload Scheduler (UNIX)/ghcgv 11.2 (20130417)
Installed for user "dxdxfd".
Locale LANG set to the following: "en"
stty: : No such device or address
stty: : No such device or address
stty: : No such device or address
+------------------------------------------------------------+
xyz.sh; Message; Program started at: 03/24/2016 00:43:18
Machine Job Starting...........
Waiting for job...
Finished waiting for job
Job Status : (1)
Status code = 1
Job submitted successfully
MachineJob Ending.............
xyz.sh; Message; Program ended successfully at: 03/24/2016 00:44:54
===============================================================
= Exit Status : 0
= System Time (Seconds) : 0 Elapsed Time (Minutes) : 1
= User Time (Seconds) : 0
= Thu 03/24/16 00:44:54 EDT
===============================================================
Question: We would like Splunk to index each log file as a single event instead of multiple events. Could you suggest an approach for this scenario? We were not able to find a solution in the blogs we read through and would really love to hear from you.
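Based on the fixed header format, one direction we wondered about is a props.conf stanza on the indexer that breaks events only at the `= JOB` header line, along the lines of the sketch below (the sourcetype name is illustrative and we have not tested these settings) -- would this be the right approach?

```
[batch_job_log]
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE = ^= JOB\s
MAX_EVENTS = 1000
TRUNCATE = 0
```

Our understanding is that, since each job writes its own file and events are not merged across files, breaking only before the `= JOB` header (with MAX_EVENTS raised above the default line-merge limit and TRUNCATE disabled for long logs) should keep each file as one event, but we would appreciate confirmation or a better alternative.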