oldguy2 Posted: Mon Jul 21, 2008 9:39 am
[2.32-33] Triggers in stringlist variables
Zugg
MASTER


Joined: 25 Sep 2000
Posts: 23379
Location: Colorado, USA

Posted: Tue Jul 29, 2008 7:19 pm
 
As you said, using variables and string lists in RE is not a standard part of the PCRE engine. I'd rather that @list work the same in zScript and RegEx since more people are using RegEx triggers these days. The fact that zMUD doesn't handle this correctly in REGEX triggers is a bug in zMUD, not a problem in CMUD.

As I said, using %%string is an easy workaround for people who understand what they are doing. So just use %%string when you want to preserve the order of values in the list. And 99% of the time you won't want to use "set|setting" instead of "setting|set" anyway. I am not going to change how this works in CMUD.

The issue with %subregex is a bit different. The argument sent to this function is a *string* and not a pattern. Yes, this is an obscure implementation detail, but the @tmp is only sorted when used within a pattern and not when expanded within a string argument. For example, if you used %concat(@tmp,"whatever") you would *not* want @tmp sorted in reverse order. There is no way for CMUD to tell the difference between using @tmp in %concat vs using it in the %subregex function. Both functions take string arguments and not patterns. I could kludge the parser to make an exception for the %subregex function, but I hate kludging the parser like that.

The reason CMUD sorts the string list in the first place is so that CMUD handles the default behavior that the vast majority of players want. I'm trying very hard to make CMUD a *MUD Client* and not a *programming language*. So I want it to work "out of the box" the way a novice MUD player would expect it to. Using %subregex is not something that a novice user is going to use. In fact, it's one of the most complex functions in CMUD. So again, I expect the power-users using %subregex to be able to understand differences in how things work.
Zhiroc
Adept


Joined: 04 Feb 2005
Posts: 246

Posted: Wed Jul 30, 2008 3:37 pm
 
The more I think about it, the more I think that this is the wrong thing to do.

1) Having a trigger potentially change its behavior when a variable changes from Autotype to String list is bad, particularly since the PE formats an autotyped stringlist as a stringlist, not a string. If a trigger is working with an autotype variable, it may break if you do a #ADDITEM, because that changes it to a string list.
2) Having the same-looking RE act differently in the trigger and in a %subregex or %regex (or now, any RE if-expression?) is bad.
3) You are not guaranteeing that the alternation picks the longest string, just the longest alternation pattern (e.g. "ab*|abbb" when matched on "abbbbbbb" would still only match "abbb"). And you get other oddities: "abb*|abbb" would still match only "abbb". But if they added one more character to the first, "abbb*|abbb" matches "abbbbbbb".
4) Assuming that CMUD is the first exposure to REs that novices get, you shouldn't be "training" them to think that this is how RE alternations work. They won't really understand the nuances of stringlists vs. strings vs. static patterns, or of CMUD vs. all other RE engines. You are doing a disservice to them as a tutorial.
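
The three cases in point 3 can be reproduced outside CMUD. This is only a sketch in Python, whose `re` module tries alternatives left to right the same way PCRE does. It assumes the list is reordered by a plain descending string sort, which is what the "reverse order" wording in this thread suggests but does not confirm; `sorted_alternation` and `leading_match` are hypothetical helper names, not CMUD functions.

```python
import re

def sorted_alternation(items):
    # Assumption: CMUD's reordering is modeled as a plain descending string sort.
    return "|".join(sorted(items, reverse=True))

def leading_match(items, text):
    # Join the sorted items with "|" and report what the alternation
    # matches at the start of the text.
    m = re.match("(?:" + sorted_alternation(items) + ")", text)
    return m.group(0) if m else None

subject = "abbbbbbb"
print(leading_match(["ab*", "abbb"], subject))    # abbb
print(leading_match(["abb*", "abbb"], subject))   # abbb
print(leading_match(["abbb*", "abbb"], subject))  # abbbbbbb
```

In the first two cases the literal "abbb" sorts ahead of the starred pattern and wins; in the third, "abbb*" sorts first and its greedy star consumes the rest of the line.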

I see no positives, only negatives from a language design point of view. Consistency, especially within zscript, is simply not there, and consistency from the point of view of other RE engines is certainly not there. I know you value standards and consistency as well (referring to the previous discussion on telnet protocols). While this doesn't rise to the level of a standard violation, it violates the "principle of least surprise".
oldguy2
Wizard


Joined: 17 Jun 2006
Posts: 1201

Posted: Wed Jul 30, 2008 6:31 pm
 
Quote:
just the longest alternation pattern (e.g. "ab*|abbb" when matched on "abbbbbbb" would still only match "abbb").


No it doesn't. The trigger "^(abbb|ab*)$" matches abbbbbbbbbbbbbbbbbbbbbbbbbbbbb even.

Tested with your previous trigger setup in the other thread using that alternation above.

ab
fired - ab
abb
fired - abb
abbb
fired - abbb
abbbb
fired - abbbb
abbbbb
fired - abbbbb
abbbbbb
fired - abbbbbb
abbbbbbb
fired - abbbbbbb
abbbbbbbb
fired - abbbbbbbb
abbbbbbbbb
fired - abbbbbbbbb
abbbbbbbbbb
fired - abbbbbbbbbb
abbbbbbbbbbb
fired - abbbbbbbbbbb
abbbbbbbbbbbb
fired - abbbbbbbbbbbb
abbbbbbbbbbbbb
fired - abbbbbbbbbbbbb
abbbbbbbbbbbbbb
fired - abbbbbbbbbbbbbb
abbbbbbbbbbbbbbb
fired - abbbbbbbbbbbbbbb
abbbbbbbbbbbbbbbb
fired - abbbbbbbbbbbbbbbb
abbbbbbbbbbbbbbbbb
fired - abbbbbbbbbbbbbbbbb
abbbbbbbbbbbbbbbbbb
fired - abbbbbbbbbbbbbbbbbb
abbbbbbbbbbbbbbbbbbb
fired - abbbbbbbbbbbbbbbbbbb
abbbbbbbbbbbbbbbbbbbb
fired - abbbbbbbbbbbbbbbbbbbb
abbbbbbbbbbbbbbbbbbbbb
fired - abbbbbbbbbbbbbbbbbbbbb
abbbbbbbbbbbbbbbbbbbbb
fired - abbbbbbbbbbbbbbbbbbbbb
Zugg
MASTER


Joined: 25 Sep 2000
Posts: 23379
Location: Colorado, USA

Posted: Wed Jul 30, 2008 7:20 pm
 
I think we need to agree to disagree. This isn't going to change.

I am not "teaching" people anything wrong about RE. As you said yourself, RE doesn't have any way for using variables in the first place. So a syntax like (@tmp) is already *not* a valid regular expression. No other regular expression engines have anything like the string-list that zMUD/CMUD has. So you can't compare this with other RE engines in the first place. If you put your string-list inline and get rid of the @tmp reference, then CMUD doesn't touch it (in a regular expression), just like you'd expect.

I will come back to my example that is *very* common in most MUD setups: you want to color the names of your friends. So you create a string list to contain your friends' names. But you don't create the list all at once, you create it over time by adding new people to the list as you wish. For example:

#ADDITEM Friends Zugg
#ADDITEM Friends Sam
#ADDITEM Friends Samuel

Now you create a trigger. You decide to use a regular expression trigger, so you do this:

#REGEX {(@Friends)} {#CW red}

Now the MUD displays:

"Samuel tells you Hello!"

Without the automatic sorting that CMUD performs, only "Sam" would be highlighted in the above example, and that is obviously *not* what the player wants. In fact, if the player sorts the @Friends list with the #SORT command (or using the Sort checkbox in the settings editor), then this trigger will never work properly. It would never highlight the full "Samuel" because "Sam" would come first in the string list.
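
The Friends example can be checked with any engine that uses ordered alternation. The sketch below uses Python's `re` rather than CMUD itself, and models the automatic sorting as a descending string sort, which is an assumption about CMUD's behavior; `highlighted` is a hypothetical helper.

```python
import re

friends = ["Zugg", "Sam", "Samuel"]   # order in which the names were added
line = "Samuel tells you Hello!"

def highlighted(names, text):
    # Build "(name1|name2|...)" and return the text the group captures.
    # The names here contain no regex metacharacters, so they are not escaped.
    m = re.search("(" + "|".join(names) + ")", text)
    return m.group(1) if m else None

print(highlighted(friends, line))                        # Sam
print(highlighted(sorted(friends, reverse=True), line))  # Samuel
```

With the names in insertion order the engine stops at "Sam"; with the descending order "Samuel" is tried first and the whole name is captured.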

Without the automatic sorting, there would be no way to fix this without introducing some new function that reverse-sorts a string list and returns it. So there would be no workaround. (Yes, in some situations you could use word boundaries, but not in other cases.) The only way to make it work would be to turn off sorting and then go into the settings editor and re-order the items in the list so that "Samuel" came before "Sam".
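
For the word-boundary remark, here is a small Python sketch (again using `re` as a stand-in for PCRE). The first pattern shows `\b` rescuing the unsorted list. The second list, containing "Sam's", is an illustration invented for this note and is not from the thread; it shows one case where `\b` does not help.

```python
import re

# \b forces the engine to reject "Sam" when more word characters follow,
# so it backtracks into the longer alternative.
m = re.search(r"\b(Zugg|Sam|Samuel)\b", "Samuel tells you Hello!")
with_boundary = m.group(1)
print(with_boundary)   # Samuel

# Hypothetical counter-case: the apostrophe is a non-word character, so a
# word boundary already exists right after "Sam" and the shorter item wins.
m = re.search(r"\b(Sam|Sam's)\b", "Sam's sword glows.")
apostrophe_case = m.group(1)
print(apostrophe_case)  # Sam
```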

When would you *ever* want it to just highlight "Sam" and not "Samuel" in the above example? Now imagine a very large Friends list with hundreds of different names. Without the automatic sorting, you'd never be able to manage your string list and your triggers would always be failing to match the correct patterns.

My point is that the number of times you will actually care about the order of the items in the string list is very small. It really only comes into play when you start putting complex expressions within the string list and start using the new feature that allows nested wildcards (which is a very recent feature that only power-users use). It's very easy for power-users to just use %%string(@tmp) in this case.

Why should I make playing MUDs more complicated for 99% of the players who just want their triggers to work?

Anyway, this discussion is pointless because as I said, it's not going to change. If anything changes then it will be to make the %subregex work the same as the #REGEX command and also automatically sort nested string lists.
Vijilante
SubAdmin


Joined: 18 Nov 2001
Posts: 5182

Posted: Thu Jul 31, 2008 1:56 am
 
I have to second Zugg's statement that this should not change.

The use of (@var) and (?:@var) directly within a regex pattern is a CMud extension. Zugg wants to make it more useful to more people, and doing the sorting as it is done is the right way to do it. When a variable is a string list the order tends to remain FIFO, but that is not guaranteed. The extension also supports record variables which nearly always change key order with each addition. Sorting is needed in these cases.

In my experience there are only a few times the order is important. One is when a match needs to include additional characters if they are present instead of allowing a following wildcard to eat them, for example "(abc|ab|a).*". In that example we see CMud is doing the user a favor by structuring their variable in the best fashion for its extension. Another time is when a subpattern needs to choose between matching a character or stopping, for example "(?<A>(?>[^\045\042]+|(?<=\176)[\045\042]|(?(R&A)\042|\042(?&A)))+)". Here the order is important because how we handle a specific 2 character sequence does not depend on whether we are in a recursion or not; in fact we don't want that 2 character sequence to change the recursion state, so it must be checked first. If someone can handle writing and understanding that second one then I am sure they can find pattern matching help, and should see something about the CMud extensions there. If it isn't in there then it will be added shortly.
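
The "(abc|ab|a).*" case is easy to see in isolation. A minimal Python sketch, using `re` as a stand-in for the PCRE engine and a made-up subject string:

```python
import re

subject = "abcdef"

# Longest alternative first: the group keeps "abc" and .* takes the rest.
longest_first = re.match(r"(abc|ab|a).*", subject).group(1)

# Shortest alternative first: the group settles for "a" and the
# following wildcard eats "bcdef".
shortest_first = re.match(r"(a|ab|abc).*", subject).group(1)

print(longest_first)   # abc
print(shortest_first)  # a
```

Because ".*" can always absorb the remainder, nothing forces the engine to backtrack into a longer alternative, so the order of the list decides what is captured.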

Oldguy2, that is because your pattern is anchored on the end with a $. All of Zugg's examples are based on newer users that would not realize they should use the %q or regex \b pattern items. The addition of a termination anchor causes the regex to fail the first part of your alternation and try the second. Explaining the need for such word boundary anchors has been among the top 10 support questions for a while.
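
The effect of the end anchor can be reproduced with the same alternation in Python's `re` (a sketch, not CMUD output):

```python
import re

subject = "abbbbbbb"

# Anchored: "abbb" matches first, but $ then fails because more b's follow,
# so the engine backtracks and tries "ab*", which reaches the end of the line.
anchored = re.match(r"^(abbb|ab*)$", subject).group(1)

# Unanchored: nothing fails after "abbb", so the first alternative is kept.
unanchored = re.match(r"(abbb|ab*)", subject).group(1)

print(anchored)    # abbbbbbb
print(unanchored)  # abbb
```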
_________________
The only good questions are the ones we have never answered before.
oldguy2
Wizard


Joined: 17 Jun 2006
Posts: 1201

Posted: Thu Jul 31, 2008 7:09 pm
 
Yeah you're right. It does only match on the abbb without the anchor. Sorry if I confused anyone again. lol Geesh what was I doing...oh yeah.

On a side note, there isn't much in the Help Files that explains stringlist usage in this way, and the #regex file doesn't even mention word boundaries like \b.

The only thing it says is:

Quote:
(exp) group the regular expression "exp" into a single pattern. The matching pattern is stored in the %1..%99 variables.
(?:exp) group the regular expression as above, but do NOT store the value in the %1..%99 variables.

exp1 | exp2 match expression exp1 OR expression exp2. Any number of expressions can be listed, separated by |


It would be nice if a list was added such as the table on Regular-Expressions or at least more examples. I guess users can look it up themselves though and it would be considered an advanced topic.
Zhiroc
Adept


Joined: 04 Feb 2005
Posts: 246

Posted: Thu Jul 31, 2008 8:03 pm
 
It's not a bad idea to have the documentation on RE's kept local to CMUD. The other sites won't necessarily track the version of the RE library being used, and the CMUD version can document the specific behavior of string list variables (as well as things like %%string, etc.)
Vijilante
SubAdmin


Joined: 18 Nov 2001
Posts: 5182

Posted: Thu Jul 31, 2008 10:48 pm
 
I will make some additions to the helps as soon as I get a chance.
_________________
The only good questions are the ones we have never answered before.
All times are GMT
© 2009 Zugg Software. Hosted by Wolfpaw.net