Zugg
MASTER


Joined: 25 Sep 2000
Posts: 23379
Location: Colorado, USA

Posted: Fri Jul 06, 2007 4:58 pm

Regular expression syntax in CMUD v2.0
 
This is a continuation of the discussion about allowing a short-cut syntax for specifying regular expression patterns in the upcoming 2.0 version of CMUD.

The reason for allowing normal trigger patterns to contain "embedded" regular expression syntax is so that we don't have to have duplicate commands and functions...one for normal zscript trigger patterns, and one for regular expressions. Also, there are some commands that don't currently support regular expressions: for example, the (#IF @var =~ "pattern") expression.

Several people have suggested using the syntax "/regex/" to specify a regular expression. I really like this, but it causes a problem for any current trigger that contains a "/" character. And the "/" character is pretty common, especially on MUD prompts like "100/500hp 200/300 mana" etc.

I want to allow "embedded" regular expressions. So in the above example, CMUD might think that "/500hp 200/" was an embedded regular expression, and this would break a lot of existing triggers.
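To make the ambiguity concrete, here is a minimal Python sketch. It is not CMUD code: the rule "anything between two / characters is an embedded regex" and the name `naive_embedded_regexes` are assumptions made purely for illustration.

```python
import re

# Hypothetical naive rule (an assumption, not CMUD's parser):
# any text between two "/" characters is an embedded regex.
NAIVE_EMBEDDED = re.compile(r"/(.*?)/")


def naive_embedded_regexes(pattern: str) -> list[str]:
    """Return the spans a naive '/regex/' rule would treat as regexes."""
    return NAIVE_EMBEDDED.findall(pattern)


# A typical MUD prompt pattern with no regex intended at all.
prompt_pattern = "100/500hp 200/300 mana"
print(naive_embedded_regexes(prompt_pattern))  # ['500hp 200']
```

The prompt contains no regex, yet the naive rule picks out "500hp 200" as one, which is exactly the kind of breakage described above.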

Even though I personally like the look of "/regex/", maybe we can use something like "#regex#" instead? I think Perl uses that syntax (actually, I think Perl allows any delimiter to be used).

Related to this is how to specify options, such as case sensitivity. In Perl, PHP, and other languages that use regular expression syntax, you can write something like "/regex/i" to specify a case-insensitive regular expression. Again, this doesn't work very well for regular expressions embedded within a normal trigger pattern.

So, I'm looking for suggestions on how people would like to see this problem solved. I think it would be a really useful new feature if we can come up with a good syntax that isn't too kludged.
Arminas
Wizard


Joined: 11 Jul 2002
Posts: 1265
Location: USA

Posted: Fri Jul 06, 2007 5:38 pm
 
Does the delimiter have to be a single char? What about %/regex/% with the %/ as the opener and the /% for the closing? Then tell the parser to look ahead if it finds a %/ and if it does not find a /% to treat the pattern normally instead of parsing for regex?
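The look-ahead idea could be sketched roughly like this in Python. This is only an illustration of the suggestion, not how CMUD does it: the function name `split_embedded` and the ("text"/"regex") segment labels are made up, and the sketch ignores escaping of the closer inside the regex.

```python
def split_embedded(pattern: str, opener: str = "%/", closer: str = "/%"):
    """Split a trigger pattern into ("text", s) and ("regex", s) segments.

    Hypothetical helper: if an opener has no closer ahead of it, the rest
    of the pattern is treated as a normal pattern, as suggested above.
    """
    segments = []
    pos = 0
    while pos < len(pattern):
        start = pattern.find(opener, pos)
        if start == -1:
            # No opener left: the remainder is plain pattern text.
            segments.append(("text", pattern[pos:]))
            break
        end = pattern.find(closer, start + len(opener))
        if end == -1:
            # Opener without a closer: fall back to normal parsing.
            segments.append(("text", pattern[pos:]))
            break
        if start > pos:
            segments.append(("text", pattern[pos:start]))
        segments.append(("regex", pattern[start + len(opener):end]))
        pos = end + len(closer)
    return segments


print(split_embedded(r"You hit %/\w+/% hard"))
print(split_embedded("100/500hp 200/300 mana"))
```

With this approach a prompt such as "100/500hp 200/300 mana" comes back as a single text segment, so existing triggers containing "/" are left alone.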
_________________
Arminas, The Invisible horseman
Windows 7 Pro 32 bit
AMD 64 X2 2.51 Dual Core, 2 GB of Ram
Thinjon100
Apprentice


Joined: 12 Jul 2004
Posts: 190
Location: Canada

Posted: Fri Jul 06, 2007 6:14 pm
 
If you can use multiple characters, and you're going to encapsulate the global/insensitive/etc. flags, you'll probably want to go with something more along the lines of %/regex/flags/%. The middle / shouldn't clash with the regex itself, since a literal / inside a delimited regex normally has to be escaped with a \.
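As a rough sketch of how a %/regex/flags/% token might be taken apart, here is some Python. Everything in it is an assumption for illustration: the helper names, the flag letters in `FLAG_MAP`, and the choice to accept a token with no flags section at all. The separator is taken to be the last "/" that is not escaped with a backslash.

```python
import re

# Assumed flag letters, mapped onto Python's re flags for illustration only.
FLAG_MAP = {"i": re.IGNORECASE, "m": re.MULTILINE, "s": re.DOTALL}


def _is_escaped(text: str, index: int) -> bool:
    """True if text[index] is preceded by an odd number of backslashes."""
    backslashes = 0
    j = index - 1
    while j >= 0 and text[j] == "\\":
        backslashes += 1
        j -= 1
    return backslashes % 2 == 1


def parse_embedded(token: str) -> tuple[str, str]:
    """Split '%/regex/flags/%' into (regex_source, flag_letters)."""
    if len(token) < 4 or not (token.startswith("%/") and token.endswith("/%")):
        raise ValueError("not an embedded regex token")
    inner = token[2:-2]  # e.g. 'regex/flags'
    # Look for the last unescaped "/" separating the regex from its flags.
    for idx in range(len(inner) - 1, -1, -1):
        if inner[idx] == "/" and not _is_escaped(inner, idx):
            source, letters = inner[:idx], inner[idx + 1:]
            break
    else:
        source, letters = inner, ""  # no flags section present
    unknown = [c for c in letters if c not in FLAG_MAP]
    if unknown:
        raise ValueError(f"unknown flag(s): {''.join(unknown)}")
    return source, letters


def compile_embedded(token: str) -> re.Pattern:
    """Compile the token using the assumed flag mapping."""
    source, letters = parse_embedded(token)
    flags = 0
    for letter in letters:
        flags |= FLAG_MAP[letter]
    return re.compile(source, flags)


print(parse_embedded("%/hello/i/%"))
print(bool(compile_embedded("%/hello/i/%").search("HELLO there")))
```

Because an escaped \/ is skipped when looking for the separator, a regex that needs a literal slash can still carry flags.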

This does raise one issue, though: if you're going to allow inline embedded regex, which escape character is used? In normal zScript the escape is ~, while regex uses the more traditional \. Will the conflicting escape characters force us to double-escape things so that zScript and regex don't each apply their special meaning? Or will the zScript parser ignore everything within an embedded regex, leaving the backslash as the only escape there?
_________________
If you're ever around Aardwolf, I'm that invisible guy you can never see. Wizi ftw! :)
Zugg
MASTER


Joined: 25 Sep 2000
Posts: 23379
Location: Colorado, USA

Posted: Fri Jul 06, 2007 6:55 pm
 
What I plan to do with the escape characters is to allow both. If you use the traditional \ character, it gets passed untouched to the regular expression. But CMUD will also parse the ~ character and will change it into a \ character. This will allow people who want to consistently use the ~ character to work properly. Of course, to embed a literal ~ character you can use ~~ or \~. Both of those will be parsed correctly.
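The escape rules described above could be sketched like this in Python. This is a guess at the behaviour from the description, not CMUD's actual implementation, and `translate_escapes` is a hypothetical name: a \ and the character after it pass through untouched, ~~ becomes a literal ~, and any other ~x becomes \x.

```python
import re


def translate_escapes(pattern: str) -> str:
    """Rewrite zScript-style ~ escapes into regex-style \\ escapes."""
    out = []
    i = 0
    n = len(pattern)
    while i < n:
        ch = pattern[i]
        if ch == "\\" and i + 1 < n:
            # Traditional escape: pass both characters through untouched.
            out.append(pattern[i:i + 2])
            i += 2
        elif ch == "~" and i + 1 < n:
            nxt = pattern[i + 1]
            # "~~" is a literal "~"; any other "~x" becomes "\x".
            out.append("~" if nxt == "~" else "\\" + nxt)
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


print(translate_escapes("~d+ gold"))   # \d+ gold
print(translate_escapes("a~~b"))       # a~b
print(translate_escapes("a\\~b"))      # a\~b
```

Both "a~~b" and "a\~b" end up as regexes that match the text "a~b", so either spelling of a literal ~ works under these assumed rules.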

The %/regex/flags/% idea is pretty good. I don't think that interferes with any existing syntax. The % is already used for function calls and for %1..%99, but using %/ should be easy to parse. The advantage of this over the #regex# syntax is that it allows easy implementation of the regex flags feature.
Zhiroc
Adept


Joined: 04 Feb 2005
Posts: 246

Posted: Fri Jul 06, 2007 9:17 pm
 
Well, as this is CMUD 2.0, if the parser is extended to allow for /regex/flags directly, without the use of quotes, then there is no confusion, which seems to be the best of both worlds.

#if ($str =~ "pattern")...

#if ($str =~ /regexp/flags)...

Or is changing the parser not really possible?

By the way, you can change the delimiters in Perl, but you then have to use the explicit match syntax: m#regexp#, or m(regexp), which shows that bracketing delimiters work too.
Tech
GURU


Joined: 18 Oct 2000
Posts: 2733
Location: Atlanta, USA

Posted: Fri Jul 06, 2007 11:49 pm
 
Since using the /regex/ approach could cause issues with existing triggers, I'm in favor of Arminas' suggestion. %/regex/flags% works for me.
_________________
Asati di tempari!
Zugg
MASTER


Joined: 25 Sep 2000
Posts: 23379
Location: Colorado, USA

Posted: Sat Jul 07, 2007 12:04 am
 
Zhiroc: changing the parser like that doesn't allow the regex to be embedded into the middle of an existing pattern. It's not just the #IF statement that I'm trying to fix...I'm trying to come up with something general that will work for *all* trigger patterns to allow embedded regex syntax.
haiku
Wanderer


Joined: 19 Nov 2004
Posts: 70

Posted: Sun Aug 05, 2007 1:10 am
 
Ruby's regex syntax can be either /regex/ or %r{regex}. Would that help?
Fang Xianfu
GURU


Joined: 26 Jan 2004
Posts: 5155
Location: United Kingdom

Posted: Sun Aug 05, 2007 3:19 am
 
I suppose it's a bit belated now, but I quite like that %r idea. It seems a very zScript-like syntax. I could imagine there being technical problems with that kind of function-ish syntax, though, since it's not something that's ever been used before.
_________________
Rorso's syntax colouriser.

- Happy bunny is happy! (1/25)
Daagar
Magician


Joined: 25 Oct 2000
Posts: 461
Location: USA

Posted: Sun Aug 05, 2007 2:36 pm
 
Zugg's latest blog entry sounded like he has already gone with the %/regex/% format.
All times are GMT