ReedN
Wizard


Joined: 04 Jan 2006
Posts: 1279
Location: Portland, Oregon

Posted: Wed Jun 18, 2008 9:41 am

What's the proper way to escape '@' in a regex trigger?
 
I'm having an issue where I need to match '@' in the text, but the regex is treating it as a variable.

String I want to match:

S--@h++,H++,CE100%,W<-SE@16kts,C/S->SE@0,V -

Regex:

^S..@h.{2},H.{2},CE\d{1,3}\%,W<-\w{1,3}@\d{1,2}kts,C/S->\w{1,3}@\d{1,2},\w{1,5} -$

With the above regex the '@' is messing up the matching. I tried to put a '\' in front of it but that didn't work. The only thing I could think of to get it to work was to match the '@' with a '.', as in:

^S...h.{2},H.{2},CE\d{1,3}\%,W<-\w{1,3}.\d{1,2}kts,C/S->\w{1,3}.\d{1,2},\w{1,5} -$

Does anyone know the proper way of doing this when you have actual '@'s you want to match?
Fang Xianfu
GURU


Joined: 26 Jan 2004
Posts: 5155
Location: United Kingdom

Posted: Wed Jun 18, 2008 10:52 am
 
I think you need to quote them with ~ so that the CMUD parser will ignore them - this should be the only time that ~ is stripped too in case you need to use it. I don't have CMUD here to check that, but I guess you can try it and see :P

In case that doesn't work, in the meantime you can use a range to define a bunch of characters you don't want it to match, like [^\w\d!"£$%\^&*()_\-+=[]{}',.~#?/] or something. I don't think any of those need quoting other than the ones I've already done.
_________________
Rorso's syntax colouriser.

- Happy bunny is happy! (1/25)
Larkin
Wizard


Joined: 25 Mar 2003
Posts: 1113
Location: USA

Posted: Wed Jun 18, 2008 10:57 am
 
I just tested in CMUD with the following code, and it fired just fine for me:
Code:
#REGEX {^Hello \@ home\.$} {#SAY "Hiya."}
#SHOW "Hello @ home."


Using ~ does not work in a regex pattern for escaping things like this. :P
Vijilante
SubAdmin


Joined: 18 Nov 2001
Posts: 5182

Posted: Wed Jun 18, 2008 11:02 am
 
It is a bit of a hoop to jump through. Use the octal code for them so CMUD's parser has no idea what you are doing. The only time you really should have to do it is when the characters that follow could be interpreted as a variable name. In your regex that would be the "@h", but I changed them all anyway.
^S..\080h.{2},H.{2},CE\d{1,3}\%,W<-\w{1,3}\080\d{1,2}kts,C/S->\w{1,3}\080\d{1,2},\w{1,5} -$
_________________
The only good questions are the ones we have never answered before.
Larkin
Wizard


Joined: 25 Mar 2003
Posts: 1113
Location: USA

Posted: Wed Jun 18, 2008 1:49 pm
 
I can see now that mine worked because I followed it with a space, but I think there's a bug here somewhere.

I tried this, and it didn't fire (which I consider to be a bug, personally):
Code:
#REGEX {^Hello\@home\.$} {#SAY "Hiya."}
#SHOW "Hello@home."


So, I tried this instead, and it still didn't fire for me:
Code:
#REGEX {^Hello\080home\.$} {#SAY "Hiya."}
#SHOW "Hello@home."


Am I missing something here?
ReedN
Wizard


Joined: 04 Jan 2006
Posts: 1279
Location: Portland, Oregon

Posted: Wed Jun 18, 2008 2:06 pm
 
I tried using the \080 but, like Larkin, I was unable to get that to work. I also verified that '~' doesn't work. It does work if I use [@], a character class containing only '@'. I guess that should be efficient enough, and it is what I'm currently using.

It does have me perplexed that there doesn't seem to be a way to escape this character in CMUD. Does this seem like a bug?
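
For anyone who wants to check the [@] workaround outside CMUD, here is a minimal sketch using Python's re module as a stand-in for PCRE. That substitution is an assumption: it only exercises the regex itself, not CMUD's variable expansion, and the helper name matches_status_line is made up for illustration. The sample line is the one from the first post.

```python
import re

# Sample line from the first post.
LINE = "S--@h++,H++,CE100%,W<-SE@16kts,C/S->SE@0,V -"

# Same pattern as the original, with each '@' wrapped in a one-character class
# so that nothing resembling "@name" appears in the pattern text.
PATTERN = re.compile(
    r"^S..[@]h.{2},H.{2},CE\d{1,3}\%,W<-\w{1,3}[@]\d{1,2}kts,"
    r"C/S->\w{1,3}[@]\d{1,2},\w{1,5} -$"
)


def matches_status_line(text: str) -> bool:
    """Return True when text matches the status-line pattern (hypothetical helper)."""
    return PATTERN.match(text) is not None


if __name__ == "__main__":
    print(matches_status_line(LINE))
```

Unlike the earlier '.' workaround, the class only accepts a real '@', so a line with some other character in those positions is rejected.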
Rahab
Wizard


Joined: 22 Mar 2007
Posts: 2320

Posted: Wed Jun 18, 2008 3:13 pm
 
Zugg has already said in another posting that v2.28 will have a %quoteregex() function. Would this solve your problem?
Zugg
MASTER


Joined: 25 Sep 2000
Posts: 23379
Location: Colorado, USA

Posted: Wed Jun 18, 2008 5:04 pm
 
It's a bug.

Normally you should use ~@Home because you are escaping the @ from CMUD parsing of the CMUD variable, which has nothing to do with regular expressions. The \ is used to escape characters within a regular expression from the PCRE engine.

Patterns are expanded by CMUD first to get the variable references, and that is what you want to stop from happening in this case, which is why you'd use a ~ character for it.

You can use the Compiled Pattern tab to see exactly what is happening with these two cases. When using the \@Home you will see that CMUD is still compiling a variable reference. When using the ~@Home you will see that the variable reference is gone, but the bug is that the ~ is not removed from the string pattern.

I'll try to get this bug fixed in 2.28.

Edited: I'm not sure why the ^Hello\080home\.$ doesn't work. I debugged CMUD and verified that it is sending that pattern directly to the PCRE engine. We might need Vijilante to advise us about why this doesn't work or whether it's a bug in the PCRE.DLL.
Zugg
MASTER


Joined: 25 Sep 2000
Posts: 23379
Location: Colorado, USA

Posted: Wed Jun 18, 2008 5:28 pm
 
OK, I changed my mind. When I looked at the CMUD code, it was already properly handling the difference between a normal zScript trigger pattern and a regular expression. It just wasn't handling the \ properly in the code.

So, ignore what I said above. In a *regular expression* the proper way to quote a @ is using the \ just like you normally would. You don't use ~ to quote anything in a regular expression. Sorry for the confusion.

But this still doesn't explain the issue with \080.
Vijilante
SubAdmin


Joined: 18 Nov 2001
Posts: 5182

Posted: Wed Jun 18, 2008 7:10 pm
 
I am examining the compiled pattern data from within my test app, and it looks like the recent change to use the 30-bit link size caused both octal and hex notations to break. It is going to take me a while to figure out why, since this really shouldn't have happened.
_________________
The only good questions are the ones we have never answered before.
Zugg
MASTER


Joined: 25 Sep 2000
Posts: 23379
Location: Colorado, USA

Posted: Wed Jun 18, 2008 7:48 pm
 
Geez, you'd think something as widely used as the PCRE.DLL would be debugged by now. For CMUD use, I think I'd rather have the 30-bit for larger string-list patterns than the octal/hex notation though. But definitely let me know if you find a solution.
Vijilante
SubAdmin


Joined: 18 Nov 2001
Posts: 5182

Posted: Wed Jun 18, 2008 7:54 pm
 
Well, that was fun. I guess I shouldn't post in the morning on my first cup of coffee. Octal, of course, means base 8, so each digit has a valid range of 0 to 7. As you can see, 080 is invalid; it should have been 100. Sadly, it took reading through all the PCRE source to find where the conversion is done before I realized this. My statement about the hex notation being broken was due to a hasty test. I am used to writing 0x when programming, so I tried \0x40; regex only uses the x, so it should have been \x40. Everything looks to be working right now that I've got my brain moving in the right direction.
^S..\100h.{2},H.{2},CE\d{1,3}\%,W<-\w{1,3}\100\d{1,2}kts,C/S->\w{1,3}\100\d{1,2},\w{1,5} -$
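
The octal and hex escapes can be sanity-checked in any regex engine that follows the same escape rules. The sketch below uses Python's re module rather than PCRE.DLL, which is an assumption; for these particular escapes Python documents the same behaviour (an escape starting with 0 takes only octal digits, and three octal digits are read as a character code).

```python
import re

AT = "@"  # ASCII 64 decimal = 100 octal = 40 hex

# \100 (three octal digits) and \x40 (hex) both denote '@'.
octal_ok = re.fullmatch(r"\100", AT) is not None
hex_ok = re.fullmatch(r"\x40", AT) is not None

# \080 is not valid octal: the escape stops at the first non-octal digit,
# so it is read as \0 (NUL) followed by the literal characters "80".
bad_octal_matches_at = re.fullmatch(r"\080", AT) is not None
bad_octal_matches_nul80 = re.fullmatch(r"\080", "\x0080") is not None

# The corrected pattern against the sample line from the first post.
LINE = "S--@h++,H++,CE100%,W<-SE@16kts,C/S->SE@0,V -"
FIXED = (
    r"^S..\100h.{2},H.{2},CE\d{1,3}\%,W<-\w{1,3}\100\d{1,2}kts,"
    r"C/S->\w{1,3}\100\d{1,2},\w{1,5} -$"
)
full_line_ok = re.match(FIXED, LINE) is not None

if __name__ == "__main__":
    print(octal_ok, hex_ok, bad_octal_matches_at, bad_octal_matches_nul80, full_line_ok)
```

This would also explain why the \080 version never fired: the pattern was asking for a NUL byte followed by "80", not for '@'.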
_________________
The only good questions are the ones we have never answered before.
Zugg
MASTER


Joined: 25 Sep 2000
Posts: 23379
Location: Colorado, USA

Posted: Wed Jun 18, 2008 10:22 pm
 
Don't worry, I've had days like that too ;)

Thanks for giving all of us the explanation since nobody else noticed that \080 wasn't proper octal either.
ReedN
Wizard


Joined: 04 Jan 2006
Posts: 1279
Location: Portland, Oregon

Posted: Thu Jun 19, 2008 1:13 am
 
Where do you look up those codes?
Vijilante
SubAdmin


Joined: 18 Nov 2001
Posts: 5182

Posted: Thu Jun 19, 2008 8:10 am
 
Those codes are the ASCII numbers for the character. You can get the number using #SHOW %ascii("character"). Converting it to octal or hex can then be done in any number of ways. I tend to do it in my head, but you can use the calculator or any of the script snippets that are floating around.
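
As one example of such a snippet, here is a minimal sketch in Python rather than zScript. The function name regex_escapes is hypothetical; it takes a single ASCII character and builds the octal and hex escapes discussed in this thread.

```python
import re


def regex_escapes(ch: str) -> tuple:
    """Return the (octal, hex) regex escapes for one ASCII character."""
    if len(ch) != 1 or ord(ch) > 0x7F:
        raise ValueError("expected a single ASCII character")
    code = ord(ch)  # the same number %ascii() is described as returning
    # Three octal digits, zero-padded, and two hex digits.
    return f"\\{code:03o}", f"\\x{code:02x}"


if __name__ == "__main__":
    octal_escape, hex_escape = regex_escapes("@")
    print(octal_escape, hex_escape)
    # Both escapes should match a literal '@' when used as a pattern.
    print(bool(re.fullmatch(octal_escape, "@")), bool(re.fullmatch(hex_escape, "@")))
```

Padding the octal form to three digits keeps the escape unambiguous, since shorter digit runs can be read as backreferences by some engines.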
_________________
The only good questions are the ones we have never answered before.
mr_kent
Enchanter


Joined: 10 Oct 2000
Posts: 698

Posted: Thu Jun 19, 2008 8:29 am
 
This link was given on these forums at one point and I had saved it in my browser. Hope it helps.
ReedN
Wizard


Joined: 04 Jan 2006
Posts: 1279
Location: Portland, Oregon

Posted: Thu Jun 19, 2008 8:55 am
 
I didn't think to use %ascii. And that link was helpful too, thanks!

Converting it isn't a problem. I'm an EE so converting bases is second nature.
© 2009 Zugg Software. Hosted by Wolfpaw.net