|
ReedN Wizard
Joined: 04 Jan 2006 Posts: 1279 Location: Portland, Oregon
|
Posted: Tue Mar 18, 2008 3:02 pm
Question about matching a blank line |
I am sometimes in the situation where I want to match either a blank line or a line that starts with one of several possibilities. I've never been able to capture a completely blank line with anything except for:
^$
If I try to combine it with other things such as:
^(?:First|Second)?$
or
^(?:First|Second|$)
It never works for capturing a blank line. Anyone know why? |
|
|
|
shalimar GURU
Joined: 04 Aug 2002 Posts: 4715 Location: Pensacola, FL, USA
|
Posted: Tue Mar 18, 2008 3:09 pm |
That's regex, right? You've got the regex box checked, I assume?
|
|
_________________ Discord: Shalimarwildcat |
|
|
|
ReedN Wizard
Joined: 04 Jan 2006 Posts: 1279 Location: Portland, Oregon
|
Posted: Tue Mar 18, 2008 11:03 pm |
yes, of course.
|
|
|
|
Anaristos Sorcerer
Joined: 17 Jul 2007 Posts: 821 Location: California
|
Posted: Tue Mar 18, 2008 11:42 pm |
Both ^ and $ are anchors. They don't match any characters, they match positions. So (?:First|Second|$) will fail if "First" or "Second" is not found. The $ in the expression is simply ignored (from your POV). You could try ^(?:First|Second|\s+)$. It might do what you want.
|
|
_________________ Sic itur ad astra. |
|
|
|
JQuilici Adept
Joined: 21 Sep 2005 Posts: 250 Location: Austin, TX
|
Posted: Tue Mar 18, 2008 11:56 pm |
This looks like a bug, and it is present in 2.20 as well, so it should get reported. Zugg has actually done some recent work on the regex code.
Anaristos' comment is correct regarding the anchor characters. However, the simple pattern '^ ?$' should match (a) a blank line, or (b) a line with a single space on it, and it does not appear to do so. It will match (b), but not (a), according to my tests.
e.g. try this at the command-line
#REGEX {^$} {#say blank 1}
#REGEX {^ ?$} {#say blank 2}
#show ""
#show " "
Blank 2 should fire on both #shows, but it will only fire on the second. |
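
For comparison outside the client, here is a minimal sketch of what a standard regex engine does with '^ ?$'. It uses Python's `re` module rather than CMUD, and the helper name `is_blank_or_single_space` is made up for this example.

```python
import re

# '^ ?$' = start of line, an optional single space, end of line.
OPTIONAL_SPACE = re.compile(r"^ ?$")

def is_blank_or_single_space(line: str) -> bool:
    """Return True when the whole line is empty or exactly one space."""
    return OPTIONAL_SPACE.search(line) is not None

for line in ["", " ", "  ", "x"]:
    print(repr(line), is_blank_or_single_space(line))
```

In a standard engine the first two lines match and the last two do not, which is the behaviour "blank 2" was expected to show on both #show commands.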
|
_________________ Come visit Mozart Mud...and tell an imm that Aerith sent you! |
|
|
|
Anaristos Sorcerer
Joined: 17 Jul 2007 Posts: 821 Location: California
|
Posted: Tue Mar 18, 2008 11:59 pm |
The problem with ^(?:First|Second)?$ is that it will only succeed if you have "First" or "Second" once in the line, starting at the beginning of the line, and nothing else. Everything else is a failure, so the trigger won't fire.
Both ^ and $ are anchors. They don't match any characters, they match positions. So (?:First|Second|$) will fail if "First" or "Second" is not found anywhere in the line. The $ in the expression is simply ignored (from your POV). You could try ^(?:First|Second|\s+)$. It might do what you want. |
|
|
|
|
Anaristos Sorcerer
Joined: 17 Jul 2007 Posts: 821 Location: California
|
Posted: Wed Mar 19, 2008 12:08 am |
Sorry, JQuilici, I was editing my post. You are correct, of course. (^$) is the standard way to match a blank line. Regarding ^ ?$, it does work for me on tests outside of CMUD. So maybe there is a problem.
|
|
|
|
|
shalimar GURU
Joined: 04 Aug 2002 Posts: 4715 Location: Pensacola, FL, USA
|
Posted: Wed Mar 19, 2008 3:20 am |
maybe if you removed the space from between ^ and ?
|
|
|
|
|
Zhiroc Adept
Joined: 04 Feb 2005 Posts: 246
|
Posted: Wed Mar 19, 2008 4:08 am |
All your regexps are OK. These should all match a blank line, as you expected:
^$
^(?:First|Second)?$
^(?:First|Second|$)
Testing using perl on UNIX:
Code: |
$ perl -ne 'print(/^(?:First|Second)?$/ ? "True" : "False", "\n");'
True
f
False
First
True
Second
True
$ perl -ne 'print(/^(?:First|Second|$)/ ? "True" : "False", "\n");'
True
f
False
First
True
Second
True
|
You can indeed put zero-width assertions into a group, or as an alternative. |
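
For anyone without Perl at hand, the same check can be sketched in Python, whose `re` module treats these particular patterns the same way. The names `PATTERNS`, `LINES` and `results` are just illustrative.

```python
import re

# The two patterns from the original post.
PATTERNS = {
    "optional group": re.compile(r"^(?:First|Second)?$"),
    "anchor as alternative": re.compile(r"^(?:First|Second|$)"),
}

# Same inputs as the Perl session: an empty line, "f", "First", "Second".
LINES = ["", "f", "First", "Second"]

def results(pattern: re.Pattern) -> list:
    """Return True/False for each test line, in order."""
    return [pattern.search(line) is not None for line in LINES]

for name, pattern in PATTERNS.items():
    print(name, results(pattern))
```

Both patterns give True, False, True, True for these inputs. They are not interchangeable, though: the second has no trailing anchor after the word alternatives, so it also matches a line such as "First place", while the first only matches when "First" or "Second" is the entire line.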
|
|
|
JQuilici Adept
Joined: 21 Sep 2005 Posts: 250 Location: Austin, TX
|
Posted: Wed Mar 19, 2008 4:10 am |
Shalimar, the point is that the '?' character in a regular expression means 'match the previous thing 0 or 1 times'. Or, equivalently, 'match whether or not the previous thing appears'. So, '^ ?$' means 'match <start of line><optional space><end of line>', and should thus match a blank line, or a line with a single space on it. '^?$' would mean something entirely different (and would be almost worthless - it would match any line that had an ending!)
The pattern in the original post ('^(?:First|Second)?$') should have matched a blank line for the same reason. It should also have matched lines that contained, in their entirety, "First" or "Second". I gather that "First" and "Second" are really placeholders for something else...what he's trying to achieve is a pattern that will match one of a handful of things, or a blank line, but nothing else. |
|
|
|
|
Anaristos Sorcerer
Joined: 17 Jul 2007 Posts: 821 Location: California
|
Posted: Wed Mar 19, 2008 4:19 am |
^(?:First|Second|$) will return true for a null line, but not for a blank line. ^(?:First|Second)?$ will not return true for a blank line, either. $ will always be taken by the regex engine to mean "anchor this to the end of the line".
|
|
|
|
|
ReedN Wizard
Joined: 04 Jan 2006 Posts: 1279 Location: Portland, Oregon
|
Posted: Wed Mar 19, 2008 4:36 am |
So is this a bug? The absolute only way I can get it to match on a blank line is with:
^$
Nothing else works, and I've tried a lot of stuff including:
^\s*$
Which should work and I can't explain why it doesn't. If it did work I bet the other more complicated ones involving multiple items discussed above would work too. |
|
|
|
Anaristos Sorcerer
Joined: 17 Jul 2007 Posts: 821 Location: California
|
Posted: Wed Mar 19, 2008 4:40 am |
^\s*$ works for me outside of CMUD. I haven't tested it with the client, but it is a perfectly valid way to test for a blank line. Remember, though: ^ and $ are not part of the line; they just tell the engine the starting and ending positions for the search.
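
To make the difference between a truly empty line and a whitespace-only line concrete, here is a small sketch in Python (not CMUD) comparing ^\s*$ with the earlier ^(?:First|Second|\s+)$ suggestion. The names `STAR`, `PLUS` and `matches` are made up for the example.

```python
import re

STAR = re.compile(r"^\s*$")                   # zero or more whitespace characters
PLUS = re.compile(r"^(?:First|Second|\s+)$")  # \s+ needs at least one character

def matches(pattern: re.Pattern, line: str) -> bool:
    """Return True when the pattern matches somewhere in the line."""
    return pattern.search(line) is not None

for line in ["", "   ", "First", "other"]:
    print(repr(line), matches(STAR, line), matches(PLUS, line))
```

With a standard engine, \s* accepts both the empty line and the whitespace-only line, while the \s+ variant rejects the empty line because \s+ has to consume at least one character. Writing the alternative as \s* instead of \s+ would cover both cases.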
|
|
|
|
|
Vijilante SubAdmin
Joined: 18 Nov 2001 Posts: 5182
|
Posted: Wed Mar 19, 2008 9:33 am |
I will do some testing of it and see if I can figure out why it isn't matching. I just have to adjust my test app to give me more information, but I am guessing it is because of the options used.
|
|
_________________ The only good questions are the ones we have never answered before.
|
|
|
Zhiroc Adept
Joined: 04 Feb 2005 Posts: 246
|
Posted: Wed Mar 19, 2008 9:18 pm |
Anaristos wrote: |
^(?:First|Second|$) will return true for a null line, but not for a blank line. ^(?:First|Second)?$ will not return true for a blank line, either. $ will always be taken by the regex engine to mean "anchor this to the end of the line". |
Both of those work fine for a blank line in Perl on UNIX. See my example above. I just noticed I forgot to chomp the input, but I tried that, and it still works. |
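
The reason chomping makes no difference is that $ also matches just before a newline at the very end of the string. Python's `re` follows the same rule as Perl here, so a short sketch (with the made-up helper `matches_line`) shows the effect:

```python
import re

PATTERN = re.compile(r"^(?:First|Second)?$")

def matches_line(line: str) -> bool:
    """Return True when the pattern matches the given input line."""
    return PATTERN.search(line) is not None

raw = "\n"                   # a blank line as read from input, newline still attached
chomped = raw.rstrip("\n")   # the same line with the newline removed

print(matches_line(raw), matches_line(chomped))
print(matches_line("First\n"), matches_line("First"))
```

All four calls return True: the trailing newline does not stop $ from matching, so the unchomped test was still a fair test of the blank-line case.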
|
|
|
Vijilante SubAdmin
Joined: 18 Nov 2001 Posts: 5182
|
Posted: Wed Mar 19, 2008 10:45 pm |
Zugg answered in another topic in the Beta forum stating that CMUD bypasses the regex engine to match a blank line. He stated that the only pattern supported currently is "^$", and come to think of it I will suggest a small change to him that might make this better.
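
To see how such a bypass would produce the symptoms reported in this thread, here is a purely hypothetical sketch in Python. It is not CMUD's actual code; the function `fires` and its special case are assumptions made only to illustrate the idea that an empty line never reaches the regex engine.

```python
import re

def fires(pattern: str, line: str) -> bool:
    """Hypothetical trigger check that special-cases empty lines."""
    # ASSUMPTION: an empty line skips the regex engine entirely, and only
    # the literal pattern text "^$" is recognised for it.
    if line == "":
        return pattern == "^$"
    # Every other line goes through the normal regex engine.
    return re.search(pattern, line) is not None

for pattern in ["^$", "^ ?$", r"^\s*$", "^(?:First|Second)?$"]:
    print(pattern, fires(pattern, ""), fires(pattern, " "))
```

Under that assumption only "^$" fires on an empty line, while '^ ?$' and '^\s*$' still fire on a line holding a single space, which is what the tests above reported.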
|
|
|
|
|
shalimar GURU
Joined: 04 Aug 2002 Posts: 4715 Location: Pensacola, FL, USA
|
Posted: Thu Mar 20, 2008 2:37 am |
I thought a pattern of just '$' would also get a blank line. At least, I remember it used to.
|
|
|
|
|
Zhiroc Adept
Joined: 04 Feb 2005 Posts: 246
|
Posted: Thu Mar 20, 2008 2:33 pm |
No, "$" should match every line, since every line has an end. Just "^" does as well.
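
A quick way to check this in a standard engine is the sketch below, written in Python rather than as a CMUD trigger. The sample lines and the helper `lines_matching` are made up for the example.

```python
import re

SAMPLE_LINES = ["", " ", "First", "anything at all"]

def lines_matching(pattern: str) -> list:
    """Return the sample lines that the given regex matches."""
    return [line for line in SAMPLE_LINES if re.search(pattern, line)]

print(lines_matching(r"$"))    # every line has an end
print(lines_matching(r"^"))    # every line has a start
print(lines_matching(r"^$"))   # only the empty line has both at the same position
```

The first two calls return all four sample lines, and only '^$' narrows the result down to the empty line.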
|
|
|
|
shalimar GURU
Joined: 04 Aug 2002 Posts: 4715 Location: Pensacola, FL, USA
|
Posted: Thu Mar 20, 2008 2:37 pm |
then explain why
#GAG $
doesn't eat up every line :)
I knew it worked! |
|
|
|
|
Anaristos Sorcerer
Joined: 17 Jul 2007 Posts: 821 Location: California
|
Posted: Fri Mar 21, 2008 12:01 am |
You are simply anchoring the (blank) line to the end.
|
|
|
|
|
Zhiroc Adept
Joined: 04 Feb 2005 Posts: 246
|
Posted: Fri Mar 21, 2008 4:52 pm |
shalimar wrote: |
then explain why
#GAG $
doesn't eat up every line :)
I knew it worked! |
A non-regex trigger of "$" does seem to match only blank lines. But if you look at the test Pattern tab, that "$" is actually the regex "^$". If you create a regex trigger "$" it will match every line. Do this:
#REGEX {$} {#WIN debug {saw %trigger}}
and you'll see it in action. "#GAG $" creates a non-regex trigger. |
|
|
|
|
|