Hi,
I wanted to see if anyone else had come across some strange behaviour when using the (?J) mode modifier in the 'rex' command.
This modifier should allow you to use the same capture group name more than once in the same regular expression. If you try to do this without the modifier, you get the error:
Regex: two named subpatterns have the same name
In some 'rex' work that I'm doing, I'm using the Regular Expression conditionals syntax for 'If, then, else'.
The syntax for this is:
(?(?=regex)then|else)
I'm using a number of these in a nested way to match some code in Cisco ACLs that has very poor (read awful) syntax structure.
(Anyway, that's another story).
The problem I'm seeing in Splunk is that if the same capture group name appears in both the 'Then' and 'Else' parts, it will only extract for the 'Else' case.
If it matches in the 'Then' part, the field appears to get 'nulled' by the second definition in the 'Else' part. This feels wrong: if the 'Then' case is matched, the regex engine shouldn't be evaluating the 'Else' part at all.
You can test this behaviour in Splunk with the following test case:
| makeresults
| eval case1="a then match"
| eval case2="a else match"
| rex field=case1 "(?J)a (?(?=then)(?<case1_match>then)|(?<case1_match>else)) match"
| rex field=case2 "(?J)a (?(?=then)(?<case2_match>then)|(?<case2_match>else)) match"
| table case*
In the resulting table, you should get:
case1 = "a then match"
case1_match = "then"
case2 = "a else match"
case2_match = "else"
What actually happens is that the field 'case1_match' is blank / null.
I've tried the expression in the online Regex101 site (unfortunately I can't post URLs yet, but copy/paste 'regex101.com/r/lX2uY8/2').
Has anyone else come across this issue before?
Is it by design or is it a bug?
I'm sure that there are other ways for me to tackle what I'm looking at (I'm not too worried about that). What I want to know is whether this is functionality that 'should' work in Splunk.
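For reference, here is a minimal sketch of one such alternative. It assumes that giving each branch its own capture group name avoids the problem, which has not been verified on this version. The names 'then_match' and 'else_match' are made up for illustration. Each branch captures into its own field, and 'coalesce' then picks whichever one was populated, so (?J) is not needed:

```
| makeresults
| eval case1="a then match"
| eval case2="a else match"
| rex field=case1 "a (?(?=then)(?<then_match>then)|(?<else_match>else)) match"
| eval case1_match=coalesce(then_match, else_match)
| fields - then_match, else_match
| rex field=case2 "a (?(?=then)(?<then_match>then)|(?<else_match>else)) match"
| eval case2_match=coalesce(then_match, else_match)
| fields - then_match, else_match
| table case*
```

The 'fields -' steps just clear the temporary fields between the two 'rex' calls so that a value left over from case1 cannot leak into case2_match.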
This is in version 6.4.2 of Splunk Enterprise.
Many thanks,
Graham.